GitHub - thejonwong/warpkernel: warpkernel - GPU Sparse Matrix Vector Product library

thejonwong / warpkernel Public

Notifications You must be signed in to change notification settings
Fork 0
Star 0

warpkernel - GPU Sparse Matrix Vector Product library

Notifications

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
LICENSE		LICENSE
README		README
cusparse.cu		cusparse.cu
makefile		makefile
optimization.cu		optimization.cu
optimize.cu		optimize.cu
optimize2.cu		optimize2.cu
reorder.cu		reorder.cu
serialize.cu		serialize.cu
test.cu		test.cu
testloop.cu		testloop.cu
warpkernel.hpp		warpkernel.hpp
wpk1.cu		wpk1.cu
wpk2.cu		wpk2.cu

Repository files navigation

Requirements for warpkernel.hpp:
1. CUDA 4.0
2. Thrust (http://docs.nvidia.com/cuda/thrust/index.html)

Optional Requirements for test codes. Packages listed below are used for comparison of results:
3. CUSP (http://code.google.com/p/cusp-library/)
4. ModernGPU sparse

Installation:
Run "make all"

Warpkernel Library Usage:
1. scan
Prototype:
void scan(uint & nz_, uint & nrows_, ValueType * A, IndexType * rows, IndexType *colinds, [IndexType threshold])
This scans the matrix A and analyzes its structure to generate the propper mappings for the warpkernels. For warpkernel::structure (K1) and warpkernel::structure2 (K2) have similar but separate scan calls that convert a matrix into the WPK1/2 formats repsectively.

2. Process
Performs the re-ordering of values and columns into the corresponding WPK format. warpkernel::engine instatiation will call this by default.

3. Run
warpkernel::engine::run will perform the SPMV multiplication on arrays x and y. On the other hand warpkernel::engine::run_x will perform the multiplication with the entries of X reordered (Kr). This means that for each entry of the reordered x array x', the result from run_x yields the proper corresponding entry in the original solution array y, y'. run computes A*x=y and run_x computes A*x'=y'.

4. Verify
warpkernel::engine::verify compare the computed vs another GPU SPMV computed result. verify_x remaps the compared result such that the ordering matches the ordering for the result vector for hte computed result if Kr SPMV algorithms are used.