Spgemm non-reuse: unification layer and TPLs #1678
This version has a simpler interface (A, B) -> C, so the user doesn't have to manage a handle. The default algorithm (SPGEMM_KK) is always used. The native implementation just calls symbolic and then numeric.
For these versions, the no-reuse wrapper would be identical to the symbolic wrapper plus the numeric wrapper, so just call those.
@@ -57,6 +56,7 @@ void spmv_cusparse(const KokkosKernels::Experimental::Controls& controls,
#if defined(CUSPARSE_VERSION) && (10300 <= CUSPARSE_VERSION)
using entry_type = typename AMatrix::non_const_ordinal_type;
This is not related to the SpGEMM changes, but when building with clang+cuda10 I got an unused local typedef warning so this fixes it.
Is there nothing to be done for rocsparse? I guess they are not implementing an all at once spgemm?
@lucbv Nope, rocsparse only gives one interface (rocsparse_csrgemm_nnz + rocsparse_Xcsrgemm_numeric), so there's no speedup from an all-at-once wrapper. If I wrote one, it would be the same as sticking the symbolic and numeric wrappers together.
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request.
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING
Test names:
* KokkosKernels_PullRequest_GCC930_Light_Tpls_GCC930
* KokkosKernels_PullRequest_CUDA11_CUDA11_LayoutRight
* KokkosKernels_PullRequest_GCC1020
* KokkosKernels_PullRequest_GCC1020_Light_LayoutRight
* KokkosKernels_PullRequest_Tpls_GCC1020
* KokkosKernels_PullRequest_Tpls_INTEL19
* KokkosKernels_PullRequest_CLANG1001
* KokkosKernels_PullRequest_CLANG13CUDA10
* KokkosKernels_PullRequest_A64FX_Tpls_ARMPL2110
* KokkosKernels_PullRequest_A64FX_GCC1020
* KokkosKernels_PullRequest_VEGA908_ROCM520
Pull Request Author: brian-kelley
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED. Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on the next available autotester run. Pull Request Auto Testing has FAILED.
Okay, it looks like our testing did not like something about it. Ping me when you want me to look at this again :)
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED. Pull Request Auto Testing has PASSED.
Status Flag 'Pre-Merge Inspection' - This Pull Request Requires Inspection. The code must be inspected by a member of the Team before Testing/Merging.
All Jobs Finished; status = PASSED. However, Inspection must be performed before merge can occur.
Looks good to me, thanks @brian-kelley
Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ lucbv ]!
Status Flag 'Pull Request AutoTester' - Pull Request MUST BE MERGED MANUALLY BY Project Team - This Repo does not support Automerge
* Add unification layer, tests for non-reuse SpGEMM
  This version has a simpler interface (A, B) -> C, so the user doesn't have to manage a handle. The default algorithm (SPGEMM_KK) is always used. The native implementation just calls symbolic and then numeric.
* Add cusparse 11.0+ spgemm noreuse wrapper
* Fix unused local typedef warning/error
* Add cusparse 10.x spgemm noreuse wrapper
* Formatting
* Remove pointless no-reuse spgemm wrapper for cusparse 10, rocsparse
  For these versions, the no-reuse wrapper would be identical to the symbolic wrapper plus the numeric wrapper, so just call those.
* Add MKL non-reuse spgemm wrapper
* Formatting
* Don't try to call 10 spgemm noreuse from cusparse
(cherry picked from commit b0965b7)
Add a unification layer for the simplified, handle-free spgemm function:
Also add testing for this function.
The native implementation just creates a handle with default settings and then calls symbolic and numeric. Those two phases can still dispatch to TPLs even when spgemm() itself does not.
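To make the symbolic-then-numeric structure concrete, here is a minimal, library-free sketch of a handle-free (A, B) -> C product. It does not use the KokkosKernels API: the `Csr` struct and the `*_sketch` function names are hypothetical stand-ins, the code is serial, and it does not reproduce the SPGEMM_KK algorithm. It only illustrates how a no-reuse wrapper can be composed from the two phases.

```cpp
#include <cassert>
#include <vector>

// Hypothetical plain CSR matrix; a stand-in for a real sparse matrix type.
struct Csr {
  int rows = 0, cols = 0;
  std::vector<int> rowptr;     // size rows + 1
  std::vector<int> colind;     // size nnz
  std::vector<double> values;  // size nnz
};

// Symbolic phase: compute only the sparsity pattern of C = A * B.
void spgemm_symbolic_sketch(const Csr& A, const Csr& B, Csr& C) {
  C.rows = A.rows;
  C.cols = B.cols;
  C.rowptr.assign(A.rows + 1, 0);
  C.colind.clear();
  std::vector<int> marker(B.cols, -1);  // last row in which a column appeared
  for (int i = 0; i < A.rows; ++i) {
    for (int ja = A.rowptr[i]; ja < A.rowptr[i + 1]; ++ja) {
      const int k = A.colind[ja];
      for (int jb = B.rowptr[k]; jb < B.rowptr[k + 1]; ++jb) {
        const int j = B.colind[jb];
        if (marker[j] != i) {  // first time column j shows up in row i
          marker[j] = i;
          C.colind.push_back(j);
        }
      }
    }
    C.rowptr[i + 1] = static_cast<int>(C.colind.size());
  }
  C.values.assign(C.colind.size(), 0.0);
}

// Numeric phase: fill in the values of C using the pattern found above.
void spgemm_numeric_sketch(const Csr& A, const Csr& B, Csr& C) {
  std::vector<double> acc(B.cols, 0.0);  // dense accumulator for one row
  for (int i = 0; i < A.rows; ++i) {
    for (int ja = A.rowptr[i]; ja < A.rowptr[i + 1]; ++ja) {
      const int k = A.colind[ja];
      const double a = A.values[ja];
      for (int jb = B.rowptr[k]; jb < B.rowptr[k + 1]; ++jb) {
        acc[B.colind[jb]] += a * B.values[jb];
      }
    }
    for (int jc = C.rowptr[i]; jc < C.rowptr[i + 1]; ++jc) {
      C.values[jc] = acc[C.colind[jc]];
      acc[C.colind[jc]] = 0.0;  // reset for the next row
    }
  }
}

// Handle-free wrapper: the caller never sees intermediate state.
Csr spgemm_sketch(const Csr& A, const Csr& B) {
  Csr C;
  spgemm_symbolic_sketch(A, B, C);
  spgemm_numeric_sketch(A, B, C);
  return C;
}
```

The trade-off is the one the PR title implies: because nothing from the symbolic phase is kept after the call returns, a second product with the same sparsity pattern repeats the symbolic work.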
Add TPL wrappers for cuSPARSE 11+ and MKL, since they provide efficiency gains. There was no need to add wrappers for cuSPARSE 10 or rocSPARSE, because those would have been identical to calling symbolic and then numeric.
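The dispatch idea behind a unification layer can be sketched with a compile-time trait: backends with a worthwhile all-at-once vendor routine get a specialization, and everything else falls through to the native path. All names below (`HostBackend`, `VendorBackend`, `tpl_avail`, `SpgemmNoReuse`, `spgemm_dispatch`) are hypothetical and chosen for illustration; they are not the identifiers used in this PR, and the returned strings merely label which path was taken.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical backend tags.
struct HostBackend {};    // no all-at-once vendor routine available
struct VendorBackend {};  // a vendor library provides one

// Trait: is a TPL no-reuse wrapper available for this backend?
template <class Backend>
struct tpl_avail : std::false_type {};
template <>
struct tpl_avail<VendorBackend> : std::true_type {};

// Primary template: native path (symbolic followed by numeric).
template <class Backend, bool UseTpl = tpl_avail<Backend>::value>
struct SpgemmNoReuse {
  static std::string run() { return "native: symbolic + numeric"; }
};

// Specialization selected when a TPL wrapper exists.
template <class Backend>
struct SpgemmNoReuse<Backend, true> {
  static std::string run() { return "tpl: single vendor call"; }
};

// Single entry point; the choice is made at compile time.
template <class Backend>
std::string spgemm_dispatch() {
  return SpgemmNoReuse<Backend>::run();
}
```

Under this scheme, a backend whose vendor routine offers nothing beyond symbolic plus numeric simply gets no `tpl_avail` specialization and uses the native path.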