BLAS interface(s)
uhetmaniuk edited this page Nov 30, 2021 · 1 revision
Kokkos Kernels has multiple adapters for BLAS routines.
For the sake of simplicity, this page looks only at {S,D,C,Z}GEMM with `op(B) = B`.
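As a point of reference, the operation in scope, `C = alpha * op(A) * B + beta * C` with `op(B) = B`, can be sketched as a plain triple loop. This is only an illustration of the mathematics, not the Kokkos Kernels implementation: the names `reference_gemm` and `conj_if_complex`, the row-major `std::vector` storage, and the single-character `trans` argument are assumptions made for this sketch.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper: conjugation that is a no-op for real scalars.
inline float conj_if_complex(float x) { return x; }
inline double conj_if_complex(double x) { return x; }
template <class T>
std::complex<T> conj_if_complex(const std::complex<T>& x) {
  return std::conj(x);
}

// Reference C = alpha * op(A) * B + beta * C with op(B) = B.
// trans: 'N' -> op(A) = A, 'T' -> op(A) = A^T, anything else -> op(A) = A^H.
// All matrices are dense and row-major (an assumption of this sketch).
// C is M x N, B is K x N, A is M x K for 'N' and K x M otherwise.
template <class Scalar>
void reference_gemm(char trans, std::size_t M, std::size_t N, std::size_t K,
                    Scalar alpha, const std::vector<Scalar>& A,
                    const std::vector<Scalar>& B, Scalar beta,
                    std::vector<Scalar>& C) {
  for (std::size_t i = 0; i < M; ++i) {
    for (std::size_t j = 0; j < N; ++j) {
      Scalar sum = Scalar(0);
      for (std::size_t k = 0; k < K; ++k) {
        Scalar a;
        if (trans == 'N') {
          a = A[i * K + k];                   // entry (i, k) of A
        } else if (trans == 'T') {
          a = A[k * M + i];                   // entry (k, i) of A
        } else {
          a = conj_if_complex(A[k * M + i]);  // conjugate of entry (k, i)
        }
        sum += a * B[k * N + j];
      }
      C[i * N + j] = alpha * sum + beta * C[i * N + j];
    }
  }
}
```

Instantiating `Scalar` as `float`, `double`, `std::complex<float>`, or `std::complex<double>` corresponds to the S, D, C, and Z variants. In Kokkos Kernels itself the transpose mode is passed as a string, for example `KokkosBlas::gemm("T", "N", alpha, A, B, beta, C)`.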
- When `KokkosKernels_ENABLE_TPL_BLAS` is set to ON,
  - the struct `GEMM` uses the TPL implementation when the `Kokkos::View` objects `A`, `B`, and `C` have the same layout.
- When `KokkosKernels_ENABLE_TPL_CUBLAS` is set to ON,
  - the struct `GEMM` uses the TPL implementation when the `Kokkos::View` objects `A`, `B`, and `C` have the same layout.
- The function `KokkosBlas::gemm` implements the cases `op(A) = A`, `op(A) = A^T`, and `op(A) = A^H`.
- The function `KokkosBlas::gemm` uses multiple variants:
  - when `A` has a LayoutRight, `op(A) = A^T`, and `MN < 1600`, then we use `DotBasedGEMM`.
  - else the structure `KokkosBlas::Impl::GEMMImpl` is used.
    - the structure allocates 3 scratch matrices and deep-copies blocks of `A` and `B`.
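
The selection rules above can be summarized in a small decision function. This is a hedged sketch, not code from Kokkos Kernels: `GemmPath`, `Layout`, and `choose_gemm_path` are hypothetical names. The sketch also assumes that the TPL check comes before the native variants, that `MN` means the product `M * N`, and that `M` and `N` are the dimensions of `C`. None of these three points is stated on this page.

```cpp
// Illustrative labels only; these are not Kokkos Kernels types.
enum class GemmPath { TPL, DotBased, GEMMImpl };
enum class Layout { Left, Right };

// Hypothetical summary of the rules listed on this page.
// tpl_enabled stands for KokkosKernels_ENABLE_TPL_BLAS or
// KokkosKernels_ENABLE_TPL_CUBLAS being ON for the backend in use.
inline GemmPath choose_gemm_path(bool tpl_enabled, Layout a, Layout b,
                                 Layout c, char transA, long M, long N) {
  // Assumption: the TPL path takes precedence when A, B, and C share a layout.
  if (tpl_enabled && a == b && b == c) {
    return GemmPath::TPL;
  }
  const bool a_transposed = (transA == 'T' || transA == 't');
  // Assumption: "MN < 1600" is read as M * N < 1600.
  if (a == Layout::Right && a_transposed && M * N < 1600) {
    return GemmPath::DotBased;
  }
  // Otherwise the general blocked implementation is used.
  return GemmPath::GEMMImpl;
}
```

For example, under these assumptions a LayoutRight `A` with `op(A) = A^T`, `M = 30`, and `N = 40` (product 1200) selects the dot-based path when no TPL is enabled, while `M = N = 40` (product 1600) falls through to `GEMMImpl`.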