Add guard for cusparse spmv_mv_tpl_spec_avail #2176

Merged: 1 commit, Apr 12, 2024


ndellingwood
Contributor

Addresses issue #2175.
Configuring with the MAGMA TPL enabled and cuSPARSE disabled mistakenly causes the cuSPARSE TPL availability check to evaluate to true. This PR guards the KOKKOSSPARSE_SPMV_MV_TPL_SPEC_AVAIL_CUSPARSE macros so that they take effect only when cuSPARSE is enabled, which prevents this.
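For readers unfamiliar with the pattern, here is a minimal sketch of what guarding an availability macro can look like. It is not the actual Kokkos Kernels source: the `EXAMPLE_*` macro names, the `example` namespace, and the single-parameter trait are hypothetical stand-ins, and the real headers may be structured differently. The sketch assumes the configuration from the issue (MAGMA enabled, cuSPARSE disabled).

```cpp
#include <type_traits>

// Hypothetical stand-ins for macros a build system would define.
// This mirrors the reported configuration: MAGMA on, cuSPARSE off.
#define EXAMPLE_ENABLE_TPL_MAGMA
// #define EXAMPLE_ENABLE_TPL_CUSPARSE   // intentionally left undefined

namespace example {

// Primary template: by default, no TPL specialization is available.
template <class Scalar>
struct spmv_mv_tpl_spec_avail : std::false_type {};

// The availability macro and its uses are compiled only when the
// cuSPARSE TPL itself is enabled. Without this guard, another TPL
// being enabled could leave these specializations visible and make
// the availability check report true.
#if defined(EXAMPLE_ENABLE_TPL_CUSPARSE)
#define EXAMPLE_SPMV_MV_TPL_SPEC_AVAIL_CUSPARSE(SCALAR)        \
  template <>                                                  \
  struct spmv_mv_tpl_spec_avail<SCALAR> : std::true_type {};

EXAMPLE_SPMV_MV_TPL_SPEC_AVAIL_CUSPARSE(double)
EXAMPLE_SPMV_MV_TPL_SPEC_AVAIL_CUSPARSE(float)
#endif  // EXAMPLE_ENABLE_TPL_CUSPARSE

}  // namespace example

// With cuSPARSE disabled, the check stays false, so no code path
// tries to link against a cuSPARSE-backed implementation.
static_assert(!example::spmv_mv_tpl_spec_avail<double>::value,
              "cuSPARSE specialization must not be reported as available");
```

Uncommenting the `EXAMPLE_ENABLE_TPL_CUSPARSE` definition in this sketch flips the trait to true for `double` and `float`, at which point the `static_assert` would need to be inverted.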

@ndellingwood
Contributor Author

Testing info with magma, using a local install (not covered by CI):

Reproducer configuration (weaver)

source /etc/profile.d/modules.sh
source /projects/ppc64le-pwr9-rhel8/legacy-env.sh
module purge
module load cuda/11.2.2/gcc/8.3.1 cmake/3.23.1 openblas/0.3.18/gcc/8.3.1

../../cm_generate_makefile.bash --with-cuda --compiler=$HOME/kokkos/bin/nvcc_wrapper --arch=Volta70 --with-tpls=magma

Test output:

[ndellin@weaver1 WeaverCuda112-magma]$ ctest --output-on-failure 2>&1 | tee out.ctest
Test project /home/ndellin/kokkos-kernels/testing/WeaverCuda112-magma
      Start  1: common_cuda
 1/33 Test  #1: common_cuda ......................   Passed   41.18 sec
      Start  2: common_serial
 2/33 Test  #2: common_serial ....................   Passed   16.21 sec
      Start  3: batched_dla_cuda
 3/33 Test  #3: batched_dla_cuda .................   Passed   33.50 sec
      Start  4: batched_gemm_cuda
 4/33 Test  #4: batched_gemm_cuda ................   Passed   78.94 sec
      Start  5: batched_dla_serial
 5/33 Test  #5: batched_dla_serial ...............   Passed   16.06 sec
      Start  6: batched_gemm_serial
 6/33 Test  #6: batched_gemm_serial ..............   Passed  485.57 sec
      Start  7: batched_sla_cuda
 7/33 Test  #7: batched_sla_cuda .................   Passed    5.61 sec
      Start  8: batched_sla_serial
 8/33 Test  #8: batched_sla_serial ...............   Passed    5.39 sec
      Start  9: blas_cuda
 9/33 Test  #9: blas_cuda ........................   Passed   68.35 sec
      Start 10: blas_serial
10/33 Test #10: blas_serial ......................   Passed   72.46 sec
      Start 11: lapack_cuda
11/33 Test #11: lapack_cuda ......................   Passed   10.21 sec
      Start 12: lapack_serial
12/33 Test #12: lapack_serial ....................   Passed    7.95 sec
      Start 13: graph_cuda
13/33 Test #13: graph_cuda .......................   Passed   24.84 sec
      Start 14: graph_serial
14/33 Test #14: graph_serial .....................   Passed   54.48 sec
      Start 15: sparse_cuda
15/33 Test #15: sparse_cuda ......................   Passed  113.35 sec
      Start 16: blocksparse_cuda
16/33 Test #16: blocksparse_cuda .................   Passed   29.83 sec
      Start 17: sparse_serial
17/33 Test #17: sparse_serial ....................   Passed  166.94 sec
      Start 18: blocksparse_serial
18/33 Test #18: blocksparse_serial ...............   Passed   59.88 sec
      Start 19: ode_cuda
19/33 Test #19: ode_cuda .........................   Passed    6.81 sec
      Start 20: ode_serial
20/33 Test #20: ode_serial .......................   Passed    6.26 sec
      Start 21: wiki_blas2_ger
21/33 Test #21: wiki_blas2_ger ...................   Passed    4.51 sec
      Start 22: wiki_blas2_syr
22/33 Test #22: wiki_blas2_syr ...................   Passed    4.76 sec
      Start 23: wiki_blas2_syr2
23/33 Test #23: wiki_blas2_syr2 ..................   Passed    5.17 sec
      Start 24: wiki_crsmatrix
24/33 Test #24: wiki_crsmatrix ...................   Passed    5.27 sec
      Start 25: wiki_spmv
25/33 Test #25: wiki_spmv ........................   Passed    5.61 sec
      Start 26: wiki_spadd
26/33 Test #26: wiki_spadd .......................   Passed    5.65 sec
      Start 27: wiki_spgemm
27/33 Test #27: wiki_spgemm ......................   Passed    5.79 sec
      Start 28: wiki_gauss_seidel
28/33 Test #28: wiki_gauss_seidel ................   Passed    6.47 sec
      Start 29: wiki_coloring
29/33 Test #29: wiki_coloring ....................   Passed    6.04 sec
      Start 30: wiki_mis2
30/33 Test #30: wiki_mis2 ........................   Passed    6.34 sec
      Start 31: wiki_coarsening
31/33 Test #31: wiki_coarsening ..................   Passed    6.52 sec
      Start 32: wiki_rcm
32/33 Test #32: wiki_rcm .........................   Passed    6.71 sec
      Start 33: gmres_test_prec
33/33 Test #33: gmres_test_prec ..................   Passed    6.76 sec

100% tests passed, 0 tests failed out of 33

Total Test time (real) = 1379.63 sec

@ndellingwood ndellingwood merged commit cb824ca into kokkos:develop Apr 12, 2024
7 checks passed
@ndellingwood ndellingwood deleted the issue-2175 branch April 12, 2024 22:34
ndellingwood added a commit to ndellingwood/Trilinos that referenced this pull request Apr 12, 2024
Patch created from kokkos/kokkos-kernels#2176
Add guard for cusparse spmv_mv_tpl_spec_avail
Addresses linker errors for builds with magma enabled and cusparse
disabled
@ndellingwood ndellingwood added the TrilinosPatchMatch Apply this label for PR's mirroring changes submitted directly to Trilinos label Apr 12, 2024
ndellingwood added a commit to ndellingwood/Trilinos that referenced this pull request Apr 18, 2024
Resolve issues with Magma and Cuda singleton multiple definitions
Reported in kokkos/kokkos-kernels#2175
Patch generated from kokkos/kokkos-kernels#2176
ndellingwood added a commit that referenced this pull request May 1, 2024
Add guard for cusparse spmv_mv_tpl_spec_avail

(cherry picked from commit cb824ca)