Add fallback condition to use spmv_native when cuSPARSE won't work #834

Merged (1 commit) on Oct 19, 2020


brian-kelley
Contributor

  • Improve SpMV unit test:
    • generate random complex values with nonzero imaginary component
    • catch exceptions in spmv
  • Add "spmv_native" support to the cusparse spmv file (this calls KokkosKernels spmv)
  • Fall back to native for mode 'C' (which cuSPARSE doesn't support), and for modes 'T'/'H' on CUDA 9 (fixes #833, "cuda.sparse_spmv test failure on Kepler arch with cusparse in Trilinos nightly build")
  • Fix some minor type casts/checks in the spgemm/sptrsv cuSPARSE wrappers; those builds and tests now succeed with all 4 possible combinations of offset/ordinal width
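
For readers skimming the description, the fallback rule in the third bullet could be sketched roughly like this. It is only an illustration and not the code in this PR: `SpmvBackend`, `choose_spmv_backend`, and the `cuda_major` parameter are hypothetical names, and the actual wrapper presumably decides with compile-time version checks rather than a runtime integer.

```cpp
#include <stdexcept>

// Hypothetical tag for illustration; not a KokkosKernels type.
enum class SpmvBackend { Native, CuSparse };

// Sketch of the selection rule described above.
// mode: 'N' (no transpose), 'T' (transpose),
//       'C' (conjugate), 'H' (conjugate transpose).
// cuda_major: assumed stand-in for the CUDA toolkit major version.
inline SpmvBackend choose_spmv_backend(char mode, int cuda_major) {
  switch (mode) {
    case 'N':
      return SpmvBackend::CuSparse;
    case 'C':
      // Conjugate without transpose: cuSPARSE has no such operation.
      return SpmvBackend::Native;
    case 'T':
    case 'H':
      // Transposed modes go to the native kernel on CUDA 9 only.
      return cuda_major == 9 ? SpmvBackend::Native : SpmvBackend::CuSparse;
    default:
      throw std::invalid_argument("unknown spmv mode");
  }
}
```

Under these assumptions, `choose_spmv_backend('C', 11)` selects the native path on any CUDA version, while `choose_spmv_backend('T', 10)` still goes to cuSPARSE.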

@brian-kelley
Contributor Author

Kokkos-dev2, non-TPL:
#######################################################
PASSED TESTS
#######################################################
clang-8.0-Cuda_OpenMP-release build_time=690 run_time=146
clang-8.0-Pthread_Serial-release build_time=243 run_time=157
clang-9.0.0-Pthread-release build_time=154 run_time=69
clang-9.0.0-Serial-release build_time=147 run_time=58
cuda-10.1-Cuda_OpenMP-release build_time=955 run_time=135
cuda-11.0-Cuda_OpenMP-release build_time=1026 run_time=133
cuda-9.2-Cuda_Serial-release build_time=894 run_time=194
gcc-7.3.0-OpenMP-release build_time=163 run_time=52
gcc-7.3.0-Pthread-release build_time=126 run_time=75
gcc-8.3.0-Serial-release build_time=164 run_time=61
gcc-9.1-OpenMP-release build_time=192 run_time=49
gcc-9.1-Serial-release build_time=174 run_time=53
intel-17.0.1-Serial-release build_time=307 run_time=59
intel-18.0.5-OpenMP-release build_time=426 run_time=52
intel-19.0.5-Pthread-release build_time=431 run_time=75

Kokkos-dev2, TPL:
#######################################################
PASSED TESTS
#######################################################
clang-8.0-Cuda_OpenMP-release build_time=705 run_time=146
clang-8.0-Pthread_Serial-release build_time=240 run_time=158
clang-9.0.0-Pthread-release build_time=137 run_time=75
clang-9.0.0-Serial-release build_time=155 run_time=58
cuda-10.1-Cuda_OpenMP-release build_time=1727 run_time=124
cuda-11.0-Cuda_OpenMP-release build_time=1732 run_time=136
gcc-7.3.0-OpenMP-release build_time=170 run_time=51
gcc-7.3.0-Pthread-release build_time=144 run_time=72
gcc-8.3.0-Serial-release build_time=170 run_time=56
gcc-9.1-OpenMP-release build_time=194 run_time=50
gcc-9.1-Serial-release build_time=173 run_time=56
intel-17.0.1-Serial-release build_time=299 run_time=59
intel-18.0.5-OpenMP-release build_time=397 run_time=49
intel-19.0.5-Pthread-release build_time=449 run_time=70

RIDE, non-TPL:
#######################################################
PASSED TESTS
#######################################################
cuda-10.1.105-Cuda_OpenMP-release build_time=713 run_time=149
cuda-9.2.88-Cuda_OpenMP-release build_time=711 run_time=176
cuda-9.2.88-Cuda_Serial-release build_time=692 run_time=231
gcc-6.4.0-OpenMP_Serial-release build_time=238 run_time=177
gcc-7.2.0-OpenMP-release build_time=161 run_time=59
gcc-7.2.0-OpenMP_Serial-release build_time=234 run_time=174
gcc-7.2.0-Serial-release build_time=160 run_time=69
#######################################################
FAILED TESTS
#######################################################
cuda-10.1.105-Cuda_Serial-release (test failed)
#######################################################

RIDE, TPL:
#######################################################
PASSED TESTS
#######################################################
cuda-10.1.105-Cuda_Serial-release build_time=925 run_time=203
gcc-7.2.0-OpenMP-release build_time=171 run_time=62
gcc-7.2.0-Serial-release build_time=161 run_time=71
gcc-7.4.0-OpenMP-release build_time=176 run_time=59
#######################################################
FAILED TESTS
#######################################################
cuda-9.2.88-Cuda_OpenMP-release (test failed)
#######################################################

The only failures on RIDE were due to #799 (cuda.batched_scalar_teamvector_solve_utv2_double).

Contributor

@ndellingwood ndellingwood left a comment


Changes look good to me, thanks @brian-kelley!
