SpMV_struct: Implementation of a structured spmv algorithm #446

Merged
merged 1 commit into from
Jul 23, 2019

@lucbv lucbv commented Jul 1, 2019

The implementation relies on the regularity of the matrix sparsity pattern, which
lets the kernel access vector entries without reading the view of column indices.
The new kernel is faster for the 5 stencils implemented: 1D, 2D and 3D FD and FE.
It also checks for a multivector containing a single vector, in which case it
calls the vector implementation of spmv_struct. For now, the general multivector
case falls back to the unstructured implementation.
A unit test and a performance test are added to assess the correctness and performance
of the kernel. Three utility functions are implemented in:

test_common/KokkosKernels_Test_Structured_Matrix.hpp

They return matrices corresponding to the discretization of the Laplace operator on
1D, 2D and 3D grids for FD and FE discretization.
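
To illustrate the idea for the simplest case, here is a minimal plain C++ sketch of a 1D FD Laplacian and a structured product that never reads the column indices. It is not the Kokkos Kernels code from this PR: the `Crs` struct and the functions `laplace1d_fd`, `spmv_crs` and `spmv_struct_1d` are hypothetical names, and the [-1, 2, -1] stencil with truncated boundary rows is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal CRS container (not KokkosSparse::CrsMatrix).
struct Crs {
  std::vector<std::size_t> row_ptr;  // size n + 1
  std::vector<std::size_t> col_idx;  // size nnz
  std::vector<double> values;        // size nnz
};

// Assumed test matrix: 1D finite-difference Laplacian on n points,
// stencil [-1, 2, -1], boundary rows simply truncated.
Crs laplace1d_fd(std::size_t n) {
  Crs A;
  A.row_ptr.push_back(0);
  for (std::size_t i = 0; i < n; ++i) {
    if (i > 0) { A.col_idx.push_back(i - 1); A.values.push_back(-1.0); }
    A.col_idx.push_back(i); A.values.push_back(2.0);
    if (i + 1 < n) { A.col_idx.push_back(i + 1); A.values.push_back(-1.0); }
    A.row_ptr.push_back(A.col_idx.size());
  }
  return A;
}

// Unstructured reference: every access to x goes through col_idx.
std::vector<double> spmv_crs(const Crs& A, const std::vector<double>& x) {
  const std::size_t n = A.row_ptr.size() - 1;
  std::vector<double> y(n, 0.0);
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t k = A.row_ptr[i]; k < A.row_ptr[i + 1]; ++k)
      y[i] += A.values[k] * x[A.col_idx[k]];
  return y;
}

// Structured variant: the positions in x and in values are derived from
// the row index alone, so col_idx and row_ptr are never read.
std::vector<double> spmv_struct_1d(const Crs& A, const std::vector<double>& x) {
  const std::size_t n = x.size();
  std::vector<double> y(n, 0.0);
  if (n == 0) return y;
  const std::vector<double>& v = A.values;
  if (n == 1) { y[0] = v[0] * x[0]; return y; }
  // First row holds two entries: columns 0 and 1.
  y[0] = v[0] * x[0] + v[1] * x[1];
  // Interior row i holds three entries starting at offset 3*i - 1.
  for (std::size_t i = 1; i + 1 < n; ++i) {
    const std::size_t k = 3 * i - 1;
    y[i] = v[k] * x[i - 1] + v[k + 1] * x[i] + v[k + 2] * x[i + 1];
  }
  // Last row holds two entries: columns n-2 and n-1.
  const std::size_t k = 3 * (n - 1) - 1;
  y[n - 1] = v[k] * x[n - 2] + v[k + 1] * x[n - 1];
  return y;
}
```

Comparing `spmv_struct_1d(A, x)` against `spmv_crs(A, x)` on a matrix from `laplace1d_fd` gives a small correctness check in the spirit of the unit test mentioned above. The 2D and 3D stencils follow the same pattern with more boundary cases.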

Reactivating the spmv unit-test on Cuda

@lucbv lucbv requested review from srajama1 and ndellingwood July 1, 2019 18:59
@lucbv lucbv self-assigned this Jul 1, 2019
@lucbv lucbv mentioned this pull request Jul 1, 2019

lucbv commented Jul 1, 2019

@srajama1 this is a refresh on the previous PR #365 as I needed to rebase and do some clean-up...
I am running the spot_check on white and bowman, I think white is done but I'm still waiting on results from bowman.
I will post everything when it has all completed.

srajama1 commented Jul 1, 2019

@lucbv Thanks ! Let me take a look.

lucbv commented Jul 2, 2019

So, results on bowman:

Running on machine: bowman
Going to test compilers:  intel/16.4.258 intel/17.2.174 intel/18.2.199
Testing compiler intel/16.4.258
  Starting job intel-16.4.258-Serial-release
  Starting job intel-16.4.258-Pthread-release
  PASSED intel-16.4.258-Serial-release
Testing compiler intel/17.2.174
  Starting job intel-16.4.258-Pthread_Serial-release
  PASSED intel-16.4.258-Pthread-release
  Starting job intel-17.2.174-OpenMP-release
  PASSED intel-17.2.174-OpenMP-release
  Starting job intel-17.2.174-Pthread-release
  PASSED intel-16.4.258-Pthread_Serial-release
  Starting job intel-17.2.174-Serial-release
  PASSED intel-17.2.174-Pthread-release
  Starting job intel-17.2.174-OpenMP_Serial-release
  PASSED intel-17.2.174-Serial-release
Testing compiler intel/18.2.199
  Starting job intel-17.2.174-Pthread_Serial-release
  PASSED intel-17.2.174-OpenMP_Serial-release
  Starting job intel-18.2.199-OpenMP-release
  PASSED intel-17.2.174-Pthread_Serial-release
  Starting job intel-18.2.199-Pthread-release
  PASSED intel-18.2.199-OpenMP-release
  Starting job intel-18.2.199-Serial-release
  PASSED intel-18.2.199-Pthread-release
  Starting job intel-18.2.199-OpenMP_Serial-release
  PASSED intel-18.2.199-Serial-release
  Starting job intel-18.2.199-Pthread_Serial-release
  PASSED intel-18.2.199-OpenMP_Serial-release
  PASSED intel-18.2.199-Pthread_Serial-release
#######################################################
PASSED TESTS
#######################################################
intel-16.4.258-Pthread-release build_time=1512 run_time=571
intel-16.4.258-Pthread_Serial-release build_time=2414 run_time=1212
intel-16.4.258-Serial-release build_time=1458 run_time=601
intel-17.2.174-OpenMP-release build_time=1890 run_time=356
intel-17.2.174-OpenMP_Serial-release build_time=2748 run_time=1063
intel-17.2.174-Pthread-release build_time=1329 run_time=600
intel-17.2.174-Pthread_Serial-release build_time=2326 run_time=1226
intel-17.2.174-Serial-release build_time=1513 run_time=640
intel-18.2.199-OpenMP-release build_time=1530 run_time=387
intel-18.2.199-OpenMP_Serial-release build_time=2688 run_time=963
intel-18.2.199-Pthread-release build_time=1231 run_time=603
intel-18.2.199-Pthread_Serial-release build_time=2311 run_time=1189
intel-18.2.199-Serial-release build_time=1192 run_time=645
#######################################################
FAILED TESTS
#######################################################

lucbv commented Jul 2, 2019

Results on white:

#######################################################
PASSED TESTS
#######################################################
cuda-10.0.130-Cuda_Serial-release build_time=1093 run_time=345
cuda-9.2.88-Cuda_OpenMP-release build_time=1158 run_time=283
gcc-6.4.0-OpenMP_Serial-release build_time=591 run_time=285
gcc-7.2.0-OpenMP-release build_time=406 run_time=112
gcc-7.2.0-OpenMP_Serial-release build_time=642 run_time=317
gcc-7.2.0-Serial-release build_time=260 run_time=200
ibm-16.1.0-Serial-release build_time=1377 run_time=257
#######################################################
FAILED TESTS
#######################################################

@srajama1 srajama1 left a comment

Thanks, Luc, for getting this PR in! Are the performance results in the wiki up to date with these changes? I know there were multiple passes over the code, so I just want to make sure the final performance is in the wiki.

@srajama1 srajama1 merged commit 5765b47 into kokkos:develop Jul 23, 2019
@lucbv lucbv deleted the spmv_struct_mv branch July 24, 2019 15:27
@srajama1

@brian-kelley Spot checks passed here, why would it fail now ?
