Tpetra: improve performance of TAFC (Transfer and fillComplete) #11689
Sample timer data:
|
Why is it running the Legacy code path? There is an MMOpt, if I remember correctly. |
My understanding is that See
Trilinos/packages/tpetra/core/src/Tpetra_CrsMatrix_def.hpp Lines 7756 to 7757 in 99e0860
|
Looking at the code, that seems to be correct. |
This was referenced Mar 17, 2023
lucbv added a commit to lucbv/Trilinos that referenced this issue Sep 14, 2023
After the refactor done in PR trilinos#11751, these new changes remove the copy to host after sorting the CRS matrix and instead just call the setAllValues overload that accepts Kokkos Views. Should improve performance a bit and this provides a step toward completing issue trilinos#11689.
cwpearson pushed a commit to cwpearson/Trilinos that referenced this issue Sep 19, 2023
After the refactor done in PR trilinos#11751, these new changes remove the copy to host after sorting the CRS matrix and instead just call the setAllValues overload that accepts Kokkos Views. Should improve performance a bit and this provides a step toward completing issue trilinos#11689.
closing as complete |
Enhancement
@trilinos/tpetra
TAFC is currently a host-based operation. This makes it potentially very expensive, for example, in a matrix product such as $R*(AP)$ in multigrid setup. Within `Tpetra::CrsMatrix::transferAndFillComplete()`, I've measured the tall poles to be:
- Tpetra: improve performance of unpackAndCombineWithOwningPIDsCount #11694
- Tpetra: improve performance of lowCommunicationMakeColMapAndReindex #11695
- Tpetra: improve performance of sortAndMergeCrsEntries #11696
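
For readers unfamiliar with the last item, here is a rough, host-only sketch of what a "sort and merge CRS entries" step does conceptually: within each row, sort entries by column index and sum duplicates, then compact the arrays. This is not Tpetra's implementation or API. The function name `sortAndMergeRows`, the use of `std::vector`, and the index types are assumptions made purely for illustration, and the sketch assumes `rowptr[0] == 0`.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Tpetra): sorts each CRS row by column
// index and sums entries that share a column, compacting the arrays in place.
// Assumes rowptr[0] == 0 and rowptr.back() == colind.size() == vals.size().
void sortAndMergeRows(std::vector<std::size_t>& rowptr,
                      std::vector<int>& colind,
                      std::vector<double>& vals) {
  if (rowptr.empty()) return;
  const std::size_t numRows = rowptr.size() - 1;
  std::size_t out = 0;               // next write position after compaction
  std::size_t rowStart = rowptr[0];  // original start of the current row
  for (std::size_t r = 0; r < numRows; ++r) {
    const std::size_t rowEnd = rowptr[r + 1];
    // Copy the row out so that in-place compaction cannot clobber it.
    std::vector<std::pair<int, double>> row;
    row.reserve(rowEnd - rowStart);
    for (std::size_t k = rowStart; k < rowEnd; ++k) {
      row.emplace_back(colind[k], vals[k]);
    }
    std::stable_sort(row.begin(), row.end(),
                     [](const std::pair<int, double>& a,
                        const std::pair<int, double>& b) {
                       return a.first < b.first;
                     });
    rowptr[r] = out;  // new start of this row
    for (std::size_t k = 0; k < row.size(); ++k) {
      if (out > rowptr[r] && colind[out - 1] == row[k].first) {
        vals[out - 1] += row[k].second;  // duplicate column: merge by summing
      } else {
        colind[out] = row[k].first;
        vals[out] = row[k].second;
        ++out;
      }
    }
    rowStart = rowEnd;
  }
  rowptr[numRows] = out;
  colind.resize(out);
  vals.resize(out);
}
```

The loop over rows is serial and runs on the host, which is the kind of pattern this issue is concerned with; a device-friendly version would process rows in parallel and is not attempted here.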