Small fixes to spgemm, and plug gaps in testing #1159
- Simplified interface: don't destroy spgemm subhandle after numeric. Want to be able to reuse, and user is responsible for creating the handle anyway.
- Don't default to SPGEMM_CUSPARSE unless cusparse TPL is actually enabled.
- Don't divide by 0 when A has 0 rows.
- In testing, test the case where A has 0 rows.
- In testing, actually test default algo SPGEMM_KK, since this takes a slightly different code path than SPGEMM_KK_MEMORY.
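For readers unfamiliar with the zero-row failure mode, here is a minimal sketch of the kind of guard involved. The description does not say which quantity was being divided, so the average-row-length computation and the name `average_row_length` are assumptions made for illustration, not code from this PR.

```cpp
#include <cstddef>

// Hypothetical helper (not from KokkosKernels): average number of entries
// per row, as a heuristic might compute it. When the matrix has zero rows
// the division is skipped and 0 is returned instead.
inline std::size_t average_row_length(std::size_t num_rows, std::size_t nnz) {
  return num_rows == 0 ? 0 : nnz / num_rows;
}
```

A matrix with zero rows also has zero entries, so returning 0 is a natural value for any heuristic that consumes this number.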
d46f646 to 94fa382
@e10harvey Does the new autotester not leave a comment with which builds passed/failed?
Correct. However, it can be enabled again.
@e10harvey Is there another way to see that information without going to each build individually on Jenkins? If not, I think we would have to bring it back, even if it does cause more spam :)
You will start getting spam soon :)
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing.
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING
Build Information, Test Names:
- KokkosKernels_PullRequest_GCC720_Light
- KokkosKernels_PullRequest_GCC720
- KokkosKernels_PullRequest_GCC720_Light_LayoutRight
- KokkosKernels_PullRequest_Tpls_GCC720
- KokkosKernels_PullRequest_Tpls_CUDA10
- KokkosKernels_PullRequest_Tpls_INTEL18
- KokkosKernels_PullRequest_Tpls_CUDA10_LayoutRight
- KokkosKernels_PullRequest_Tpls_CUDA9
- KokkosKernels_PullRequest_Tpls_GCC720_GCC740
Pull Request Author: brian-kelley
I think all the weaver builds are going to fail due to kokkos/kokkos#4485
Done.
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED
Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.
Pull Request Auto Testing has FAILED. Builds:
- KokkosKernels_PullRequest_GCC720_Light # 178
- KokkosKernels_PullRequest_GCC720 # 556
- KokkosKernels_PullRequest_GCC720_Light_LayoutRight # 203
- KokkosKernels_PullRequest_Tpls_GCC720 # 547
- KokkosKernels_PullRequest_Tpls_CUDA10 # 178
- KokkosKernels_PullRequest_Tpls_INTEL18 # 536
- KokkosKernels_PullRequest_Tpls_CUDA10_LayoutRight # 180
- KokkosKernels_PullRequest_Tpls_CUDA9 # 172
- KokkosKernels_PullRequest_Tpls_GCC720_GCC740 # 163
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED
Pull Request Auto Testing has PASSED
Status Flag 'Pre-Merge Inspection' - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...
Thanks, @brian-kelley! I left a few comments / questions below. My main concerns are the increased maintenance burden of keeping the old spgemm interface, and a runtime check that may need to be moved from the tests into the production code. Let me know what you think; perhaps we should address these in one or more follow-on PRs.
 crsMat_t output_mat2;
 if (oldInterface)
-run_spgemm_old_interface<crsMat_t, device>(input_mat, input_mat,
-SPGEMM_DEBUG, output_mat2);
+run_spgemm_old_interface<crsMat_t, device>(A, B, SPGEMM_DEBUG, output_mat2);
Can the old interface be removed? Has the old interface already been marked for deprecation?
@e10harvey For now the "old" interface is not deprecated - it's just an alternative. The CrsMatrix based interface is implemented in terms of it. We test both just to be safe.
Also, some users who want close control over memory allocations prefer the "old" interface.
That way they can create their own Kokkos Views and give them pre-allocated memory.
@@ -309,7 +310,7 @@ void test_spgemm(lno_t numRows, size_type nnz, lno_t bandwidth,
 }
 // if size_type is larger than int, mkl casts it to int.
 // it will fail if casting cause overflow.
-if (input_mat.values.extent(0) > max_integer) {
+if (A.values.extent(0) > max_integer) {
Do we need to check all extents against max_integer here? If this is protecting a code path in the production code, this check should really be moved to the production code, where an exception can be raised, and then we can try-catch the exception here.
I noticed this too - luckily the MKL wrapper also has this check:
//if size_type != int...
const int max_integer = 2147483647;
if (entriesB.extent(0) > max_integer ||
entriesA.extent(0) > max_integer) {
throw std::runtime_error(
"MKL requires integer values for size type for SPGEMM. Copying to "
"integer will cause overflow.\n");
return;
}
//then convert rowmaps to int
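The check-then-convert pattern discussed here can be sketched in isolation, with the test side catching the exception as the reviewer suggests. This is only an illustration on `std::vector`, not the MKL wrapper itself: the names `to_int_checked` and `converts_without_overflow` are hypothetical, and the sketch checks each value rather than the extents the wrapper checks.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical helper: copy size_t offsets into int, throwing instead of
// silently overflowing when a value does not fit.
inline std::vector<int> to_int_checked(const std::vector<std::size_t>& src) {
  const std::size_t max_integer =
      static_cast<std::size_t>(std::numeric_limits<int>::max());
  std::vector<int> dst;
  dst.reserve(src.size());
  for (std::size_t v : src) {
    if (v > max_integer) {
      throw std::overflow_error("value does not fit in int");
    }
    dst.push_back(static_cast<int>(v));
  }
  return dst;
}

// Caller side (for example a test): catch the exception and treat the
// case as unsupported rather than duplicating the size check.
inline bool converts_without_overflow(const std::vector<std::size_t>& src) {
  try {
    to_int_checked(src);
    return true;
  } catch (const std::overflow_error&) {
    return false;
  }
}
```

Keeping the check next to the conversion means a caller cannot reach the narrowing cast without passing through it.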
@@ -92,7 +92,6 @@ void spgemm_numeric(KernelHandle& kh, const AMatrix& A, const bool Amode,
 &kh, A.numRows(), B.numRows(), B.numCols(), A.graph.row_map,
 A.graph.entries, A.values, Amode, B.graph.row_map, B.graph.entries,
 B.values, Bmode, C.graph.row_map, C.graph.entries, C.values);
-kh.destroy_spgemm_handle();
Is the handle destroyed when kh goes out of scope or was this just a typo?
No, kh is passed in from the user (by reference, so it doesn't go out of scope), and it must have had create_spgemm_handle(...) called on it. Removing this destroy for 2 reasons:
- for symmetry: if the user is responsible for creating the subhandle, they should also be responsible for destroying it
- to enable reuse: the simplified interface should be able to do one symbolic followed by multiple numerics
The reuse issue affected a user on slack btw, that's how I found out about it
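The ownership and reuse pattern described in this thread can be sketched with a mock handle. Everything here is a hypothetical stand-in: `MockHandle`, `mock_symbolic`, `mock_numeric` and `run_numerics` are not the KokkosKernels API; only the method names `create_spgemm_handle` and `destroy_spgemm_handle` are borrowed from the discussion.

```cpp
#include <stdexcept>

// Hypothetical stand-in for a kernel handle; NOT the KokkosKernels API.
struct MockHandle {
  bool has_spgemm = false;
  bool symbolic_done = false;
  void create_spgemm_handle() { has_spgemm = true; symbolic_done = false; }
  void destroy_spgemm_handle() { has_spgemm = false; symbolic_done = false; }
};

// Symbolic phase: requires that the caller already created the subhandle.
inline void mock_symbolic(MockHandle& kh) {
  if (!kh.has_spgemm) throw std::runtime_error("spgemm subhandle not created");
  kh.symbolic_done = true;
}

// Numeric phase: leaves the subhandle intact, so it can be called again.
inline void mock_numeric(MockHandle& kh) {
  if (!kh.has_spgemm || !kh.symbolic_done) {
    throw std::runtime_error("symbolic must run before numeric");
  }
}

// The caller creates the subhandle, runs one symbolic followed by several
// numerics, and destroys the subhandle itself at the end.
inline int run_numerics(MockHandle& kh, int count) {
  kh.create_spgemm_handle();
  mock_symbolic(kh);
  int completed = 0;
  for (int i = 0; i < count; ++i) {
    mock_numeric(kh);
    ++completed;
  }
  kh.destroy_spgemm_handle();
  return completed;
}
```

If `mock_numeric` destroyed the subhandle, as the removed line did in the real interface, the second iteration of the loop would throw.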
Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ e10harvey ]!
Status Flag 'Pull Request AutoTester' - Pull Request MUST BE MERGED MANUALLY BY Project Team - This Repo does not support Automerge
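- Simplified interface: don't destroy spgemm subhandle after numeric. Want to be able to reuse, and user is responsible for creating the handle anyway, so they should also be responsible for destroying.
- Don't default to SPGEMM_CUSPARSE unless cusparse TPL is actually enabled.
- run_spgemm() to accept m/k/n (A is m * k, B is k * n, C is m * n) instead of forcing all matrices to be square.
- In testing, actually test default algo SPGEMM_KK, since this takes a slightly different code path than SPGEMM_KK_MEMORY.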
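To make the m/k/n convention concrete, here is a small reference product on a plain CSR struct. It is a sketch for illustration only: the `Csr` type and `multiply` function are hypothetical and are not how run_spgemm() or the KokkosKernels kernels are implemented.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <vector>

// Hypothetical minimal CSR container, for illustration only.
struct Csr {
  std::size_t num_rows = 0;
  std::size_t num_cols = 0;
  std::vector<std::size_t> row_map{0};  // length num_rows + 1
  std::vector<std::size_t> entries;     // column indices
  std::vector<double> values;
};

// Reference product C = A * B, where A is m x k, B is k x n, C is m x n.
// Nothing here assumes square matrices, and m == 0 simply skips the loop.
inline Csr multiply(const Csr& A, const Csr& B) {
  if (A.num_cols != B.num_rows) {
    throw std::invalid_argument("inner dimensions differ");
  }
  Csr C;
  C.num_rows = A.num_rows;
  C.num_cols = B.num_cols;
  for (std::size_t i = 0; i < A.num_rows; ++i) {
    std::map<std::size_t, double> acc;  // sorted accumulator for row i of C
    for (std::size_t p = A.row_map[i]; p < A.row_map[i + 1]; ++p) {
      const std::size_t j = A.entries[p];
      for (std::size_t q = B.row_map[j]; q < B.row_map[j + 1]; ++q) {
        acc[B.entries[q]] += A.values[p] * B.values[q];
      }
    }
    for (const auto& kv : acc) {
      C.entries.push_back(kv.first);
      C.values.push_back(kv.second);
    }
    C.row_map.push_back(C.entries.size());
  }
  return C;
}
```

A reference like this is the sort of thing a test can compare a faster kernel against, including the rectangular and zero-row cases.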