BLAS interface(s)

uhetmaniuk edited this page Nov 30, 2021 · 1 revision

Kokkos Kernels has multiple adapters for BLAS routines. For simplicity, this page considers only {S,D,C,Z}GEMM with op(B) = B.

  • When KokkosKernels_ENABLE_TPL_BLAS is set to ON,

    • the struct GEMM uses the TPL implementation when the Kokkos::View objects A, B, and C have the same layout.
  • When KokkosKernels_ENABLE_TPL_CUBLAS is set to ON,

    • the struct GEMM uses the TPL implementation when the Kokkos::View objects A, B, and C have the same layout.
  • The function KokkosBlas::gemm implements the cases op(A) = A, op(A) = A^T, and op(A) = A^H.

  • The function KokkosBlas::gemm uses multiple variants:

    • when A has LayoutRight, op(A) = A^T, and M*N < 1600, DotBasedGEMM is used.
    • otherwise, the structure KokkosBlas::Impl::GEMMImpl is used.
      • this structure allocates three scratch matrices and deep-copies blocks of A and B.
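
The selection rules listed above can be summarized in a small standalone sketch. This is not Kokkos Kernels code: the names `Layout`, `OpA`, `Variant`, `tpl_eligible`, and `choose_variant` are hypothetical and exist only for illustration. It also assumes that M and N are the row and column counts of C, which this page does not state explicitly.

```cpp
#include <cstdint>

// Hypothetical types for illustration; none of these are Kokkos Kernels names.
enum class Layout { Left, Right };
enum class OpA { NoTrans, Trans, ConjTrans };
enum class Variant { DotBased, Blocked };  // Blocked stands in for GEMMImpl

// Threshold on M*N quoted on this page.
constexpr std::int64_t kDotBasedThreshold = 1600;

// The TPL path (BLAS or cuBLAS) is described as applying only when
// A, B, and C all share one layout.
constexpr bool tpl_eligible(Layout a, Layout b, Layout c) {
  return a == b && b == c;
}

// Native path: the dot-based variant for a small C with op(A) = A^T and
// LayoutRight A; the blocked variant in every other case.
// Assumption: m and n are the dimensions of C.
constexpr Variant choose_variant(Layout a, OpA op, std::int64_t m,
                                 std::int64_t n) {
  return (a == Layout::Right && op == OpA::Trans && m * n < kDotBasedThreshold)
             ? Variant::DotBased
             : Variant::Blocked;
}
```

For example, under these assumptions a 30 x 40 result with LayoutRight A and op(A) = A^T selects the dot-based variant, while a 40 x 40 result (M*N = 1600, not below the threshold) falls back to the blocked one.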