-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Spgemm TPL refactor #1618
Spgemm TPL refactor #1618
Conversation
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request. |
dece8ab
to
fa717e3
Compare
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: KokkosKernels_PullRequest_GCC930_Light_Tpls_GCC930
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CUDA11_CUDA11_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC1020_Light_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_INTEL19
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG1001
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG13CUDA10
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_A64FX_Tpls_ARMPL2110
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_A64FX_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_VEGA908_ROCM520
Jenkins Parameters
Using Repos:
Pull Request Author: brian-kelley |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run. Pull Request Auto Testing has FAILED (click to expand)Build InformationTest Name: KokkosKernels_PullRequest_GCC930_Light_Tpls_GCC930
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CUDA11_CUDA11_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC1020_Light_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_INTEL19
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG1001
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG13CUDA10
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_A64FX_Tpls_ARMPL2110
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_A64FX_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_VEGA908_ROCM520
Jenkins Parameters
Console Output (last 100 lines) : KokkosKernels_PullRequest_GCC930_Light_Tpls_GCC930 # 180 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_CUDA11_CUDA11_LayoutRight # 188 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_GCC1020 # 141 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_GCC1020_Light_LayoutRight # 140 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_Tpls_GCC1020 # 103 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_Tpls_INTEL19 # 190 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_CLANG1001 # 239 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_CLANG13CUDA10 # 125 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_A64FX_Tpls_ARMPL2110 # 19 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_A64FX_GCC1020 # 19 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_VEGA908_ROCM520 # 26 (click to expand)
|
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: KokkosKernels_PullRequest_GCC930_Light_Tpls_GCC930
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CUDA11_CUDA11_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC1020_Light_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_INTEL19
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG1001
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG13CUDA10
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_A64FX_Tpls_ARMPL2110
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_A64FX_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_VEGA908_ROCM520
Jenkins Parameters
Using Repos:
Pull Request Author: brian-kelley |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run. Pull Request Auto Testing has FAILED (click to expand)Build InformationTest Name: KokkosKernels_PullRequest_GCC930_Light_Tpls_GCC930
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CUDA11_CUDA11_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC1020_Light_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_INTEL19
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG1001
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG13CUDA10
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_A64FX_Tpls_ARMPL2110
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_A64FX_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_VEGA908_ROCM520
Jenkins Parameters
Console Output (last 100 lines) : KokkosKernels_PullRequest_GCC930_Light_Tpls_GCC930 # 181 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_CUDA11_CUDA11_LayoutRight # 189 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_GCC1020 # 142 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_GCC1020_Light_LayoutRight # 141 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_Tpls_GCC1020 # 104 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_Tpls_INTEL19 # 191 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_CLANG1001 # 240 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_CLANG13CUDA10 # 126 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_A64FX_Tpls_ARMPL2110 # 20 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_A64FX_GCC1020 # 20 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_VEGA908_ROCM520 # 27 (click to expand)
|
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: KokkosKernels_PullRequest_GCC930_Light_Tpls_GCC930
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CUDA11_CUDA11_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC1020_Light_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_INTEL19
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG1001
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG13CUDA10
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_A64FX_Tpls_ARMPL2110
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_A64FX_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_VEGA908_ROCM520
Jenkins Parameters
Using Repos:
Pull Request Author: brian-kelley |
6c337a7
to
a172579
Compare
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run. Pull Request Auto Testing has FAILED (click to expand)Build InformationTest Name: KokkosKernels_PullRequest_GCC930_Light_Tpls_GCC930
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CUDA11_CUDA11_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC1020_Light_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_INTEL19
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG1001
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG13CUDA10
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_A64FX_Tpls_ARMPL2110
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_A64FX_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_VEGA908_ROCM520
Jenkins Parameters
Console Output (last 100 lines) : KokkosKernels_PullRequest_GCC930_Light_Tpls_GCC930 # 182 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_CUDA11_CUDA11_LayoutRight # 190 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_GCC1020 # 143 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_GCC1020_Light_LayoutRight # 142 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_Tpls_GCC1020 # 105 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_Tpls_INTEL19 # 192 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_CLANG1001 # 241 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_CLANG13CUDA10 # 127 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_A64FX_Tpls_ARMPL2110 # 21 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_A64FX_GCC1020 # 21 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_VEGA908_ROCM520 # 28 (click to expand)
|
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: KokkosKernels_PullRequest_GCC930_Light_Tpls_GCC930
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CUDA11_CUDA11_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC1020_Light_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_INTEL19
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG1001
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG13CUDA10
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_A64FX_Tpls_ARMPL2110
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_A64FX_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_VEGA908_ROCM520
Jenkins Parameters
Using Repos:
Pull Request Author: brian-kelley |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run. Pull Request Auto Testing has FAILED (click to expand)Build InformationTest Name: KokkosKernels_PullRequest_GCC930_Light_Tpls_GCC930
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CUDA11_CUDA11_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC1020_Light_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_INTEL19
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG1001
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG13CUDA10
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_A64FX_Tpls_ARMPL2110
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_A64FX_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_VEGA908_ROCM520
Jenkins Parameters
Console Output (last 100 lines) : KokkosKernels_PullRequest_GCC930_Light_Tpls_GCC930 # 183 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_CUDA11_CUDA11_LayoutRight # 191 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_GCC1020 # 144 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_GCC1020_Light_LayoutRight # 143 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_Tpls_GCC1020 # 106 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_Tpls_INTEL19 # 193 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_CLANG1001 # 242 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_CLANG13CUDA10 # 128 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_A64FX_Tpls_ARMPL2110 # 22 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_A64FX_GCC1020 # 22 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_VEGA908_ROCM520 # 29 (click to expand)
|
Split symbolic, numeric, fused-jacobi into separate TPL avail/decl files.
- Symbolic now backwards compatible with bspgemm_numeric and spgemm_jacobi
- Remove mentions of deprecated enum value SPGEMM_MKL and SPGEMM_MKL2PHASE - Add MKL wrapper under unification layer with correct symbolic/numeric separation - Remove SPGEMM_CUSP and SPGEMM_VIENNA enum values
<rocblas.h> is deprecated and throws warnings; now we have to use <rocblas/rocblas.h>
Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ e10harvey ]! |
Status Flag 'Pull Request AutoTester' - Pull Request MUST BE MERGED MANUALLY BY Project Team - This Repo does not support Automerge |
It was causing SPGEMM_KK_LP (a completely differnet, parallel algo) to be used, rather than SPGEMM_DEBUG (the host serial reference implementation).
@e10harvey Thanks for the review, I just pushed your suggestions. |
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: KokkosKernels_PullRequest_GCC930_Light_Tpls_GCC930
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CUDA11_CUDA11_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC1020_Light_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_INTEL19
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG1001
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG13CUDA10
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_A64FX_Tpls_ARMPL2110
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_A64FX_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_VEGA908_ROCM520
Jenkins Parameters
Using Repos:
Pull Request Author: brian-kelley |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED Pull Request Auto Testing has PASSED (click to expand)Build InformationTest Name: KokkosKernels_PullRequest_GCC930_Light_Tpls_GCC930
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CUDA11_CUDA11_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC1020_Light_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_INTEL19
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG1001
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG13CUDA10
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_A64FX_Tpls_ARMPL2110
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_A64FX_GCC1020
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_VEGA908_ROCM520
Jenkins Parameters
|
Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging |
All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur... |
All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur... |
2 similar comments
All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur... |
All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur... |
Does symbolic phase populate column indices? |
@jczhang07 No, symbolic only counts nonzeros by default (so that the user can control the allocation of entries/values), but it will also populate rowptrs if you set the new last parameter |
I'll add, another reason to not populate entries in symbolic is that none of the TPLs supported yet can save time by just doing values in numeric. (MKL sp2m does have a mode to just do values, but it turns out sp2m also doesn't sort the matrix by default, so we can't re-use the entries because after we sort, they are permuted differently than MKL expects). And then the native KK implementation still always does entries+values together. In the future though, if a TPL or our own impl comes along that can just do values, the logic could look like this:
|
@brian-kelley Does that mean one has to call the numeric phase to get column indices (even matrix values might not be ready for numeric computation at that moment)? BTW, I think rocsparse spgemm has intuitive APIs, https://rocsparse.readthedocs.io/en/latest/usermanual.html#rocsparse-csrgemm-numeric, whose symbolic phase fills column indices. |
I think you have weakened the meaning of symbolic. Usually we regard it as "except numeric values, all things are set up". Usually when users are doing higher-level symbolic computation, matrix value arrays might have not been allocated yet, or even allocated, they contain garbage values. Thus users have to refactor their code to use the current spgemm_symbolic, which is difficult. |
@jczhang07 we agree with what you are saying but again we do not have TPL support for a true symbolic calculation.
but it is not supported yet by us or the TPLs |
No, see rocsparse_csrgemm_symbolic() |
OK sorry, I was incorrect here about rocsparse but the code in the numeric wrapper is correct - it only calls |
Also, see discussion at NVIDIA/CUDALibrarySamples#84 cusparseSpGEMMreuse_copy() acts like a symbolic phase and sets up the column indices. cusparseSpGEMMreuse_compute() is only for numeric computation. |
I think rocsparse's API is the right way to go. If users only need to do spgemm once, then they call |
I think there's a misunderstanding of the changes in this PR, I didn't make any major changes to what the KokkosKernels symbolic and numeric do.
This is true, but when you call that you still must set all of C's CSR pointers (rowptrs, entries, values). You must set all 3 - leaving values NULL is not allowed (I just tried it on cuSparse 11.6), and got
so this means that in cuSparse, C's values must be allocated before populating C's entries. I agree that rocSparse has the best interface of all of these, and it could easily compute entries in symbolic. I could provide the function symbolic2, which was actually my plan before I worked through this refactor:
But I cannot currently provide an efficient implementation of this for KK (our own impl), MKL or cuSparse. Only rocSparse makes this way efficient. The code from this PR, using one call to symbolic and many calls to numeric, is as efficient as possible on all of these with no redundant work. If more redundant work is acceptable, then I could still provide that |
True, but you don't need to assign meaningful values to C. |
Refactor TPL wrappers for spgemm symbolic/numeric. All backwards compatible from user's point of view, except that we start requiring spgemm input matrices to be sorted. Also now guarantee output to be sorted.
mkl_sparse_sp2m
to do proper symbolic/numeric separation. Previous 2-phase wrapper used deprecated interface, and previous single phase wrapper had to compute the entire C with dummy values in symbolic.SPGEMM_CUSPARSE
,SPGEMM_MKL
,SPGEMM_MKL2PHASE
,SPGEMM_ROCSPARSE
since the TPLs will be called automatically whenever available.kk_is_relatively_identical_view
) to be more tolerant when one or both values being compared is close to zero (i.e. the result of summing many terms that mostly cancel out). If absolute value from both views is less than epsilon, assume they match. And when getting the relative error, divide by the sum of magnitudes, not just the magnitude from the second view (makes the relation symmetric). Print out a message for every pair of values that doesn't match to aid debugging.