Feature request: Support multiple RHS in SGS efficiently #488
Comments
It's implemented, but it's slow (it just loops over columns on the outside). I'm pretty sure Ichi was talking about the non-multithreaded case.
@mhoemmen Are you talking about inside Ifpack2 Relaxation? It looks like Relaxation::MTGaussSeidel() does a loop over vectors. There would be a significant win from moving that loop all the way inside the KK SGS implementation: fewer kernel launches and better locality in the matrix accesses.
@brian-kelley You would probably gain time on a GPU by not having to load the matrix from memory multiple times.
@iyamazaki Are we talking about MTGaussSeidel or the non-MT Gauss-Seidel?
Hi, @mhoemmen. I was looking at MTGaussSeidel, which seems to loop over the vectors in Ifpack2. But I did not look at what KokkosKernels does.
I checked the MT-SGS case, and I should have been clear in the original posting. It is implemented, but not in an optimal way. I will edit the description above.
Non-MT Gauss-Seidel has the same issue, but it's separate code.
Within KokkosKernels MT-SGS, multiple RHS support doesn't exist at all. Within Ifpack2, the inefficient implementation (outermost loop over vectors) exists.
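The two loop orders discussed in the comments above can be illustrated with a minimal serial sketch. This is not the Ifpack2 or KokkosKernels code: `CsrMatrix`, `gsSweepVectorLoopOutside`, and `gsSweepVectorLoopInside` are hypothetical names, the multivector is assumed to be stored row-major, every row is assumed to have a nonzero diagonal entry, and only a forward Gauss-Seidel sweep is shown (SGS would follow it with a backward sweep).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal CSR matrix; not a KokkosKernels or Tpetra type.
struct CsrMatrix {
  std::size_t numRows;
  std::vector<std::size_t> rowPtr;  // numRows + 1 offsets into colInd/values
  std::vector<std::size_t> colInd;
  std::vector<double> values;
};

// Multivectors are stored row-major: entry (row i, vector k) is v[i * numVecs + k].

// Variant A: vector loop outermost, so the matrix is traversed numVecs times.
void gsSweepVectorLoopOutside(const CsrMatrix& A, const std::vector<double>& b,
                              std::vector<double>& x, std::size_t numVecs) {
  assert(b.size() == A.numRows * numVecs && x.size() == b.size());
  for (std::size_t k = 0; k < numVecs; ++k) {
    for (std::size_t i = 0; i < A.numRows; ++i) {
      double sum = b[i * numVecs + k];
      double diag = 0.0;  // assumed to be set by a stored diagonal entry
      for (std::size_t j = A.rowPtr[i]; j < A.rowPtr[i + 1]; ++j) {
        const std::size_t col = A.colInd[j];
        if (col == i) diag = A.values[j];
        else sum -= A.values[j] * x[col * numVecs + k];
      }
      x[i * numVecs + k] = sum / diag;
    }
  }
}

// Variant B: vector loop innermost, so each matrix entry is read once per sweep
// and reused for all right-hand sides.
void gsSweepVectorLoopInside(const CsrMatrix& A, const std::vector<double>& b,
                             std::vector<double>& x, std::size_t numVecs) {
  assert(b.size() == A.numRows * numVecs && x.size() == b.size());
  std::vector<double> sum(numVecs);
  for (std::size_t i = 0; i < A.numRows; ++i) {
    for (std::size_t k = 0; k < numVecs; ++k) sum[k] = b[i * numVecs + k];
    double diag = 0.0;  // assumed to be set by a stored diagonal entry
    for (std::size_t j = A.rowPtr[i]; j < A.rowPtr[i + 1]; ++j) {
      const std::size_t col = A.colInd[j];
      const double a = A.values[j];
      if (col == i) { diag = a; continue; }
      for (std::size_t k = 0; k < numVecs; ++k) sum[k] -= a * x[col * numVecs + k];
    }
    for (std::size_t k = 0; k < numVecs; ++k) x[i * numVecs + k] = sum[k] / diag;
  }
}
```

Both variants perform the same arithmetic per entry and should produce the same iterate; the difference is only in how often the matrix is streamed from memory. In a multithreaded or GPU setting, the second ordering would also correspond to a single kernel launch per sweep rather than one per vector.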
Feature request: Support multiple RHS in MT-SGS efficiently. Currently it is implemented, but it could be better if we passed the multivector to Kokkos Kernels, which would accept it and iterate over all the vectors. Based on a comment by @mhoemmen, serial SGS has the same issue.
Customer: Exawind
@brian-kelley: One more thing while you are working on SGS.
@iyamaza found this was not implemented. @lucbv