The Serial implementation should use no parallelism and should not dispatch to TPLs, so that it can be called from within any parallel context;
The Team implementation should take a team member argument, be callable inside a functor (wherever TeamThreadRange can be used) and use a TeamThreadRange + ThreadVectorRange combination;
TeamVector: WAIT for confirmation. Should it use ThreadVectorRange only and be callable from inside TeamThreadRange? It should NOT use TeamVectorRange!
Note: ThreadVectorRange can also be called like TeamThreadRange (not inside it), and it then works like TeamVectorRange (which is probably the better choice; TODO: learn what the difference is).
Scope
Try to give a common "feel" to both interfaces, 2a and 2b
2a. Execution Space
Blas kernels to cover:
The objective would be to do things like:
2b. Parallelization level dispatch
Have the parallelization level (serial, team and team-vector) as a parameter, like ArgMode in https://github.com/kokkos/kokkos-kernels/blob/develop/src/batched/dense/KokkosBatched_Gemm_Decl.hpp#L98-L119 (inspiration).
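To make the "parallelization level as a parameter" idea concrete, here is a minimal sketch in plain C++ with no Kokkos dependency. Every name in it (ModeSerial, ModeTeam, ModeTeamVector, FakeTeamMember, Axpy) is hypothetical and chosen for illustration only; none of them are the actual KokkosBatched types. The team loop is emulated sequentially, whereas a real implementation would presumably use Kokkos::parallel_for over a TeamThreadRange. The team-vector mode is left unspecialized because its semantics are still awaiting confirmation above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical mode tags, in the spirit of ArgMode (not the real KokkosBatched types).
struct ModeSerial {};
struct ModeTeam {};
struct ModeTeamVector {};  // declared only: its semantics are still undecided

// Hypothetical stand-in for a team member handle; a real one would come
// from a Kokkos team policy.
struct FakeTeamMember {
  int team_size;
};

// Primary template: the parallelization level is a template parameter.
template <typename ArgMode>
struct Axpy;

// Serial: no team member argument and no parallelism, so it is safe to
// call from inside any parallel context.
template <>
struct Axpy<ModeSerial> {
  static int invoke(double alpha, const std::vector<double>& x,
                    std::vector<double>& y) {
    for (std::size_t i = 0; i < x.size(); ++i) y[i] += alpha * x[i];
    return 0;
  }
};

// Team: takes a team member argument. The strided loop below emulates how
// work would be split across the threads of a team; it runs sequentially here.
template <>
struct Axpy<ModeTeam> {
  template <typename Member>
  static int invoke(const Member& member, double alpha,
                    const std::vector<double>& x, std::vector<double>& y) {
    const std::size_t stride = static_cast<std::size_t>(member.team_size);
    for (std::size_t t = 0; t < stride; ++t)
      for (std::size_t i = t; i < x.size(); i += stride) y[i] += alpha * x[i];
    return 0;
  }
};
```

With this shape, a caller selects the level at compile time, for example Axpy<ModeSerial>::invoke(alpha, x, y) inside an already parallel region, or Axpy<ModeTeam>::invoke(member, alpha, x, y) from a team-level functor. Both calls share the same name and argument order apart from the leading team member.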