Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup spadd (#685, #694) #773

Merged
merged 3 commits into from
Jul 30, 2020
Merged

Conversation

brian-kelley
Copy link
Contributor

@brian-kelley brian-kelley commented Jul 29, 2020

  • Remove "max_nnz_in_result" and its getter/setter from SpADD handle. It was really a duplicate of "c_nnz". Max is a bad name for this since it is not an upper bound, it is always exact. This will need to be propagated to Tpetra_MatrixMatrix_def.hpp when we do promotion.
  • Remove team size/vector length stuff from SpADD handle, since SpADD only uses range policy. This was just copied from SpGEMM handle without needing it
  • No longer require C's scalar values to be zero-initialized when they are passed in (SpAdd requires output values to be zero-initialized, but this shouldn't be needed #694)
  • If A and/or B have unmerged rows (aka repeated column in a row), the resulting C is now always merged. Before, it would produce the correct sum but could only merge one entry from A and one entry from B. Now, it can merge an arbitrary number of entries from A and B (SpAdd doesn't merge entries correctly #685)
  • Improve test coverage: test with unmerged inputs (both sorted and unsorted). The test now checks that the result is sorted and merged. Also add a test for a matrix with nonzero dimensions but zero entries.

Note that these are all just enhancements, not bugfixes. I noticed these while trying to track down an EMPIRE issue, where add's MergeEntriesFunctor is crashing after my recent cuSPARSE TPL changes. I think the underlying cause of the bug is somewhere else since cuSPARSE SpMV is completely unrelated to add.

Testing:
#######################################################
PASSED TESTS
#######################################################
clang-8.0-Pthread_Serial-release build_time=336 run_time=173
clang-9.0.0-Pthread-release build_time=136 run_time=77
clang-9.0.0-Serial-release build_time=174 run_time=64
cuda-10.1-Cuda_OpenMP-release build_time=848 run_time=167
cuda-9.2-Cuda_Serial-release build_time=794 run_time=210
gcc-4.8.4-OpenMP-release build_time=117 run_time=64
gcc-7.3.0-OpenMP-release build_time=179 run_time=2532
gcc-7.3.0-Pthread-release build_time=132 run_time=80
gcc-8.3.0-Serial-release build_time=206 run_time=66
gcc-9.1-OpenMP-release build_time=190 run_time=392
gcc-9.1-Serial-release build_time=175 run_time=67
intel-17.0.1-Serial-release build_time=263 run_time=59
intel-18.0.5-OpenMP-release build_time=425 run_time=67
intel-19.0.5-Pthread-release build_time=467 run_time=72
clang-8.0-Cuda_OpenMP-release (test failed) <== sparse_openmp timed out due to machine congestion, but I re-ran it today from the same build and it passed
#######################################################

-remove vector/team size stuff from handle, because it only uses range policy
-remove "get_max_result_nnz", should only use "get_c_nnz" now. Max is a
bad name for what this is, since C always has exactly this many
nonzeros, it's not an upper bound.
-address kokkos#694 (don't require values to be initialized)
-address kokkos#685 (produce a fully sorted and merged C even if A and/or B
aren't merged)
-improve testing: test matrix with zero rows per entry, and test matrix
with duplicate entries
@brian-kelley brian-kelley self-assigned this Jul 29, 2020
@brian-kelley brian-kelley requested review from ndellingwood and lucbv and removed request for ndellingwood July 29, 2020 17:34
Copy link
Contributor

@lucbv lucbv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, I just have one comment regarding value initialization.

src/sparse/KokkosSparse_spadd.hpp Outdated Show resolved Hide resolved
just in case filling with 0 doesn't work for UQ/PCE types
(they probably do but I didn't try it)
@brian-kelley
Copy link
Contributor Author

Still good after making the 1-line change:

#######################################################
PASSED TESTS
#######################################################
clang-8.0-Cuda_OpenMP-release build_time=658 run_time=170
clang-8.0-Pthread_Serial-release build_time=210 run_time=141
clang-9.0.0-Pthread-release build_time=148 run_time=71
clang-9.0.0-Serial-release build_time=169 run_time=63
cuda-10.1-Cuda_OpenMP-release build_time=942 run_time=186
cuda-9.2-Cuda_Serial-release build_time=901 run_time=212
gcc-4.8.4-OpenMP-release build_time=122 run_time=67
gcc-7.3.0-OpenMP-release build_time=158 run_time=113
gcc-7.3.0-Pthread-release build_time=185 run_time=76
gcc-8.3.0-Serial-release build_time=146 run_time=63
gcc-9.1-OpenMP-release build_time=215 run_time=151
gcc-9.1-Serial-release build_time=168 run_time=61
intel-17.0.1-Serial-release build_time=255 run_time=58
intel-18.0.5-OpenMP-release build_time=353 run_time=60
intel-19.0.5-Pthread-release build_time=452 run_time=67

@brian-kelley brian-kelley merged commit b02741b into kokkos:develop Jul 30, 2020
@brian-kelley brian-kelley deleted the CleanupSpadd branch July 30, 2020 06:56
brian-kelley added a commit to brian-kelley/kokkos-kernels that referenced this pull request Aug 3, 2020
Remove some deprecated stuff that no longer exists/works after kokkos#773
ndellingwood added a commit to trilinos/Trilinos that referenced this pull request Aug 19, 2020
brian-kelley added a commit to brian-kelley/kokkos-kernels that referenced this pull request Sep 21, 2020
Remove some deprecated stuff that no longer exists/works after kokkos#773
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants