Add two BLAS/LAPACK calls needed by: Sptrsv supernode #552 #589
Comments
@e10harvey : Yes for no. 1. I don't understand the question for permute_vector.
Is part of this issue to also provide the blas and lapack capabilities necessary to replace the cblas and lapacke calls in the supernodal sptrsv, or will that be discussed and addressed by a different issue?
@ndellingwood This issue basically does that.
This is mainly just a note to myself (which I should have made clear) -- I'm looking into which internal KokkosKernels routines we can use instead of these
Got it -- I'll have to look into this one further as well.
@srajama1, @ndellingwood: I am confused. This issue is to address items no. 1 and no. 2 above, which were gleaned from the PR feedback left by Mark. What specifically in the supernodal sptrsv code needs to be replaced?
@iyamazaki Can you elaborate?
@iyamazaki: Do you have your Matrix Market formatted test files for the perf_test
Thank you so much for looking at this, @e10harvey !! In the current code, I use xTRMM and xTRTRI to set up our solver. Since the setup is currently on the host (sequential), I am calling cblas_xtrmm and LAPACKE_xtrtri.
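For readers unfamiliar with the two routines named above: xTRTRI inverts a triangular matrix and xTRMM multiplies a matrix by a triangular matrix. The following is a minimal plain-loop sketch of both operations, assuming the lower-triangular, non-unit-diagonal, left-side case with row-major storage. The names trtri_lower and trmm_lower_left are hypothetical and are not the CBLAS, LAPACKE, or KokkosBlas interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major dense matrix stored in a flat vector (hypothetical helper alias).
using Matrix = std::vector<double>;

// In-place inverse of an n x n lower-triangular matrix with a general diagonal.
// Only the lower triangle is read and written.
// Returns false if a zero on the diagonal makes the matrix singular.
bool trtri_lower(Matrix& A, std::size_t n) {
  assert(A.size() == n * n);
  for (std::size_t j = 0; j < n; ++j) {
    if (A[j * n + j] == 0.0) return false;
  }
  // Columns are processed left to right, so entries in columns > j still
  // hold the original matrix while column j is overwritten with the inverse.
  for (std::size_t j = 0; j < n; ++j) {
    A[j * n + j] = 1.0 / A[j * n + j];
    for (std::size_t i = j + 1; i < n; ++i) {
      double sum = 0.0;
      for (std::size_t k = j; k < i; ++k) {
        sum += A[i * n + k] * A[k * n + j];
      }
      A[i * n + j] = -sum / A[i * n + i];
    }
  }
  return true;
}

// In-place B := alpha * A * B, where A is n x n lower triangular and B is n x m.
// Rows are updated bottom-up so each row only reads rows not yet overwritten.
void trmm_lower_left(double alpha, const Matrix& A, std::size_t n,
                     Matrix& B, std::size_t m) {
  assert(A.size() == n * n && B.size() == n * m);
  for (std::size_t i = n; i-- > 0;) {
    for (std::size_t c = 0; c < m; ++c) {
      double sum = 0.0;
      for (std::size_t k = 0; k <= i; ++k) {
        sum += A[i * n + k] * B[k * m + c];
      }
      B[i * m + c] = alpha * sum;
    }
  }
}
```

A quick sanity check for such a sketch is to invert a small triangular matrix with trtri_lower and then multiply the inverse by the original with trmm_lower_left, which should give the identity up to rounding.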
Hi, @e10harvey, again. I need to check with @ndellingwood. I have enabled my code only through make-script (e.g., compileKokkosKernelsSimple.sh), and have not tried using cmake.
@iyamazaki: No problem, thanks for the swift responses! Would you please point me to where in the code you're using "xTRMM and xTRTRI to setup our solver"? Also, where are "cblas_xtrmm and LAPACKE_xtrtri" called?
Hi, @iyamazaki. Does
@e10harvey: The cblas and lapacke functions are called in src/sparse/KokkosSparse_sptrsv_supernode.hpp and src/sparse/KokkosSparse_sptrsv_superlu.hpp. That is for the general sparse-triangular solve. The perf_tests for the supernodal triangular solve are KokkosSparse_sptrsv_superlu.exe and KokkosSparse_sptrsv_cholmod.exe.
@iyamazaki: Thanks! I have:
$ ./KokkosKernels_sparse_sptrsv.exe --help
Options:
  --test [OPTION] : Use different kernel implementations
                    Options:
                      lvlrp, lvltp1, lvltp2
                      cusparse (Vendor Libraries)
  -lf [file]      : Read in Matrix Market formatted text file 'file'.
  -uf [file]      : Read in Matrix Market formatted text file 'file'.
  --offset [O]    : Subtract O from every index.
                    Useful in case the matrix market file is not 0 based.
  -rpt [K]        : Number of Rows assigned to a thread.
  -ts [T]         : Number of threads per team.
  -vl [V]         : Vector-length (i.e. how many Cuda threads are a Kokkos 'thread').
  --loop [LOOP]   : How many spmv to run to aggregate average time.
How do I run
Hi, @e10harvey. That perf_test is for a general sparse-triangular solve by @ndellingwood. If you want to run the supernodal version (that uses CBLAS and LAPACKE), then you want to run KokkosSparse_sptrsv_superlu.exe or KokkosSparse_sptrsv_cholmod.exe
@e10harvey The SuperLU TPL changes for CMake are not in develop. @jjwilke has a PR but that has a conflict. See Use make system for now.
This issue is to address a subset of feedback from #552.
The following items should be addressed:
- KokkosSparse_sptrsv_aux.hpp: definitions of {for,back}wardP_supernode should be replaced with the correct KokkosBLAS / LAPACK / KokkosKernels routines. Will KokkosKernels::Impl::permute_vector or KokkosKernels::Impl::permute_block_vector work?
- print_crsmat should be replaced with a new routine (print_2Dview) in KokkosKernels::Impl::print_2Dview that uses KokkosKernels::Impl::print_1Dview
- trmm: ETI and TPL_CBLAS. Similarly we should have two specializations for trtri: ETI and TPL_LAPACKE (this is HostBlas). If I restructure this code correctly, the following behavior (in general) should be realized by adding KokkosBlas support for trmm and trtri:
  KokkosBlas3::trmm:
  KokkosBlas3::trtri:
EDIT: @srajama1: is no 3. correct?
@srajama1, please let me know if this is what you had in mind.
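The "two specializations" idea in the trmm/trtri item (one ETI path, where ETI stands for explicit template instantiation, and one TPL path backed by a vendor library) can be sketched roughly as below. This is only an illustration of compile-time dispatch under stated assumptions: the trait trmm_tpl_available, the struct Trmm, the helper reference_trmm, and the macro EXAMPLE_ENABLE_TPL_CBLAS are all hypothetical names, not KokkosKernels identifiers. The TPL branch reuses the reference loop so the sketch stays runnable without CBLAS.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical trait: reports whether a vendor-library (TPL) path exists for Scalar.
template <class Scalar>
struct trmm_tpl_available {
  static constexpr bool value = false;
};

#ifdef EXAMPLE_ENABLE_TPL_CBLAS  // hypothetical build flag for this sketch only
template <>
struct trmm_tpl_available<double> {
  static constexpr bool value = true;
};
#endif

// Shared reference loop: B := alpha * A * B, with A lower triangular (n x n)
// and B of size n x m, both row-major. Rows are updated bottom-up.
template <class Scalar>
void reference_trmm(Scalar alpha, const std::vector<Scalar>& A, std::size_t n,
                    std::vector<Scalar>& B, std::size_t m) {
  assert(A.size() == n * n && B.size() == n * m);
  for (std::size_t i = n; i-- > 0;) {
    for (std::size_t c = 0; c < m; ++c) {
      Scalar sum = Scalar(0);
      for (std::size_t k = 0; k <= i; ++k) sum += A[i * n + k] * B[k * m + c];
      B[i * m + c] = alpha * sum;
    }
  }
}

// Primary template: the native fallback path.
template <class Scalar, bool UseTpl = trmm_tpl_available<Scalar>::value>
struct Trmm {
  static std::string path() { return "fallback"; }
  static void run(Scalar alpha, const std::vector<Scalar>& A, std::size_t n,
                  std::vector<Scalar>& B, std::size_t m) {
    reference_trmm(alpha, A, n, B, m);
  }
};

// Partial specialization selected when the trait reports a TPL is available.
template <class Scalar>
struct Trmm<Scalar, true> {
  static std::string path() { return "tpl"; }
  static void run(Scalar alpha, const std::vector<Scalar>& A, std::size_t n,
                  std::vector<Scalar>& B, std::size_t m) {
    // A real build would forward to the vendor routine here (for example
    // cblas_dtrmm); the reference loop keeps this sketch self-contained.
    reference_trmm(alpha, A, n, B, m);
  }
};

// Explicit template instantiation of the fallback path for double.
template struct Trmm<double, false>;
```

With this layout, calling Trmm<double>::run(...) picks the path at compile time, and a trtri wrapper could follow the same shape with its own availability trait.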