Compile error in batched gemm perf test #998

Closed
vqd8a opened this issue May 27, 2021 · 7 comments
@vqd8a
Contributor

vqd8a commented May 27, 2021

I encountered these errors when building Kokkos Kernels with tests enabled:

/ascldap/users/vqdang/Kokkos/kokkos-kernels/perf_test/blas/blas3/KokkosBlas3_gemm_perf_test.hpp: In function ‘void __gemm_copy_simd_view_to_3d_view(gemm_simd_args_t, dstViewType, options_t)’:
/ascldap/users/vqdang/Kokkos/kokkos-kernels/perf_test/blas/blas3/KokkosBlas3_gemm_perf_test.hpp:1460:1861: error: type/value mismatch at argument 2 in template parameter list for ‘template<class DataType, class ... Properties> struct Kokkos::ViewTraits’
   using h_subview_type_2d = Kokkos::View<src_scalar_type **, Kokkos::LayoutStride, Kokkos::HostSpace>;
/ascldap/users/vqdang/Kokkos/kokkos-kernels/perf_test/blas/blas3/KokkosBlas3_gemm_perf_test.hpp:1460:1861: note:   expected a type, got ‘std::conditional<((((Kokkos::Impl::ViewMapping<void, Kokkos::ViewTraits<double***, Kokkos::LayoutLeft, Kokkos::Cuda>, int, Kokkos::Impl::ALL_t, Kokkos::Impl::ALL_t>::rank == 0) || Kokkos::Impl::SubviewLegalArgsCompileTime<Kokkos::LayoutLeft, Kokkos::LayoutLeft, 2, 3, 0, int, Kokkos::Impl::ALL_t, Kokkos::Impl::ALL_t>::value) || (((Kokkos::Impl::ViewMapping<void, Kokkos::ViewTraits<double***, Kokkos::LayoutLeft, Kokkos::Cuda>, int, Kokkos::Impl::ALL_t, Kokkos::Impl::ALL_t>::rank <= 2) && Kokkos::Impl::ViewMapping<void, Kokkos::ViewTraits<double***, Kokkos::LayoutLeft, Kokkos::Cuda>, int, Kokkos::Impl::ALL_t, Kokkos::Impl::ALL_t>::R0) && std::integral_constant<bool, true>::value)) || (((Kokkos::Impl::ViewMapping<void, Kokkos::ViewTraits<double***, Kokkos::LayoutLeft, Kokkos::Cuda>, int, Kokkos::Impl::ALL_t, Kokkos::Impl::ALL_t>::rank <= 2) && Kokkos::Impl::ViewMapping<void, Kokkos::ViewTraits<double***, Kokkos::LayoutLeft, Kokkos::Cuda>, int, Kokkos::Impl::ALL_t, Kokkos::Impl::ALL_t>::R0_rev) && std::integral_constant<bool, false>::value)), Kokkos::LayoutLeft, Kokkos::LayoutStride>::type’
/ascldap/users/vqdang/Kokkos/kokkos-kernels/perf_test/blas/blas3/KokkosBlas3_gemm_perf_test.hpp:1460:3099: error: template argument 1 is invalid
   using h_subview_type_2d = Kokkos::View<src_scalar_type **, Kokkos::LayoutStride, Kokkos::HostSpace>;
/ascldap/users/vqdang/Kokkos/kokkos-kernels/perf_test/blas/blas3/KokkosBlas3_gemm_perf_test.hpp:1466:1: error: ‘h_subview_type_2d’ was not declared in this scope
   h_subview_type_2d h_sv2;
 ^ ~~~~~~~~~~~~~~~
/ascldap/users/vqdang/Kokkos/kokkos-kernels/perf_test/blas/blas3/KokkosBlas3_gemm_perf_test.hpp:1466:1: note: suggested alternative: ‘h_subview_type_4d’
   h_subview_type_2d h_sv2;
 ^ ~~~~~~~~~~~~~~~
 h_subview_type_4d
/ascldap/users/vqdang/Kokkos/kokkos-kernels/perf_test/blas/blas3/KokkosBlas3_gemm_perf_test.hpp:1496:2: error: ‘h_sv2’ was not declared in this scope
           h_sv2 = Kokkos::subview(h_sv1, Kokkos::ALL(), Kokkos::ALL(), simd_batch_size_idx);
  ^    
/ascldap/users/vqdang/Kokkos/kokkos-kernels/perf_test/blas/blas3/KokkosBlas3_gemm_perf_test.hpp:1496:2: note: suggested alternative: ‘h_sv1’
           h_sv2 = Kokkos::subview(h_sv1, Kokkos::ALL(), Kokkos::ALL(), simd_batch_size_idx);
  ^    
  h_sv1
/ascldap/users/vqdang/Kokkos/kokkos-kernels/perf_test/blas/blas3/KokkosBlas3_gemm_perf_test.hpp:1498:2: error: ‘h_sv2’ was not declared in this scope
           h_sv2 = Kokkos::subview(h_sv1, simd_batch_size_idx, Kokkos::ALL(), Kokkos::ALL());
  ^    
/ascldap/users/vqdang/Kokkos/kokkos-kernels/perf_test/blas/blas3/KokkosBlas3_gemm_perf_test.hpp:1498:2: note: suggested alternative: ‘h_sv1’
           h_sv2 = Kokkos::subview(h_sv1, simd_batch_size_idx, Kokkos::ALL(), Kokkos::ALL());
  ^    
  h_sv1
/ascldap/users/vqdang/Kokkos/kokkos-kernels/perf_test/blas/blas3/KokkosBlas3_gemm_perf_test.hpp:1502:81: error: there are no arguments to ‘h_sv2’ that depend on a template parameter, so a declaration of ‘h_sv2’ must be available [-fpermissive]
               h_dst(m, n, simd_internal_vec_idx + simd_batch_size_idx + vector_batch_idx) = h_sv2(m, n);
                                                                                 ^~~~~
/ascldap/users/vqdang/Kokkos/kokkos-kernels/perf_test/blas/blas3/KokkosBlas3_gemm_perf_test.hpp:1502:81: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated)
/ascldap/users/vqdang/Kokkos/kokkos-kernels/perf_test/blas/blas3/KokkosBlas3_gemm_perf_test.hpp:1504:81: error: there are no arguments to ‘h_sv2’ that depend on a template parameter, so a declaration of ‘h_sv2’ must be available [-fpermissive]
               h_dst(simd_internal_vec_idx + simd_batch_size_idx + vector_batch_idx, m, n) = h_sv2(m, n);
                                                                                 ^~~~~
/ascldap/users/vqdang/Kokkos/kokkos-kernels/perf_test/blas/blas3/KokkosBlas3_gemm_perf_test.hpp: In instantiation of ‘void __gemm_copy_simd_view_to_3d_view(gemm_simd_args_t, dstViewType, options_t) [with dstViewType = Kokkos::View<double***, Kokkos::LayoutLeft, Kokkos::Cuda>; gemm_simd_args_t = gemm_simd_args; options_t = perf_test_options]’:
/ascldap/users/vqdang/Kokkos/kokkos-kernels/perf_test/blas/blas3/KokkosBlas3_gemm_perf_test.hpp:1579:97:   required from ‘void __gemm_do_verify(options_t, gemm_args_t, void (*)(options_t, gemm_args_t)) [with ScalarType = double; LayoutType = Kokkos::LayoutLeft; DeviceType = Kokkos::Cuda; options_t = perf_test_options; gemm_args_t = gemm_args]’
/ascldap/users/vqdang/Kokkos/kokkos-kernels/perf_test/blas/blas3/KokkosBlas3_gemm_perf_test.hpp:1825:84:   required from here
/ascldap/users/vqdang/Kokkos/kokkos-kernels/perf_test/blas/blas3/KokkosBlas3_gemm_perf_test.hpp:1502:86: error: ‘h_sv2’ was not declared in this scope
               h_dst(m, n, simd_internal_vec_idx + simd_batch_size_idx + vector_batch_idx) = h_sv2(m, n);
                                                                                 ~~~~~^~~~~~
/ascldap/users/vqdang/Kokkos/kokkos-kernels/perf_test/blas/blas3/KokkosBlas3_gemm_perf_test.hpp:1502:86: note: suggested alternative: ‘h_sv1’
               h_dst(m, n, simd_internal_vec_idx + simd_batch_size_idx + vector_batch_idx) = h_sv2(m, n);
                                                                                 ~~~~~^~~~~~
                                                                                 h_sv1
/ascldap/users/vqdang/Kokkos/kokkos-kernels/perf_test/blas/blas3/KokkosBlas3_gemm_perf_test.hpp:1504:86: error: ‘h_sv2’ was not declared in this scope
               h_dst(simd_internal_vec_idx + simd_batch_size_idx + vector_batch_idx, m, n) = h_sv2(m, n);
                                                                                 ~~~~~^~~~~~
/ascldap/users/vqdang/Kokkos/kokkos-kernels/perf_test/blas/blas3/KokkosBlas3_gemm_perf_test.hpp:1504:86: note: suggested alternative: ‘h_sv1’
               h_dst(simd_internal_vec_idx + simd_batch_size_idx + vector_batch_idx, m, n) = h_sv2(m, n);
                                                                                 ~~~~~^~~~~~
                                                                                 h_sv1
make[2]: *** [perf_test/blas/blas3/CMakeFiles/KokkosBlas3_perf_test.dir/KokkosBlas3_perf_test.cpp.o] Error 1
make[1]: *** [perf_test/blas/blas3/CMakeFiles/KokkosBlas3_perf_test.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[100%] Linking CXX executable KokkosKernels_graph_openmp
[100%] Linking CXX executable KokkosKernels_blas_openmp
[100%] Built target KokkosKernels_graph_openmp
[100%] Built target KokkosKernels_blas_openmp
[100%] Linking CXX executable KokkosKernels_batched_dla_openmp
[100%] Linking CXX executable KokkosKernels_sparse_openmp
[100%] Built target KokkosKernels_batched_dla_openmp
[100%] Built target KokkosKernels_sparse_openmp
make: *** [all] Error 2

How to reproduce the errors on Weaver:
Module load:

module purge
module load devpack/20210226/openmpi/4.0.5/gcc/7.2.0/cuda/10.2.2

Kokkos build:

cmake ~/Kokkos/kokkos -DCMAKE_CXX_COMPILER=/home/vqdang/Kokkos/kokkos/bin/nvcc_wrapper -DCMAKE_INSTALL_PREFIX=~/Kokkos/kokkos-install-weaver -DCMAKE_BUILD_TYPE:STRING=RELEASE -DKokkos_ENABLE_OPENMP:BOOL=ON -DKokkos_ARCH_POWER9=ON -DKokkos_ENABLE_CUDA:BOOL=ON -DKokkos_ENABLE_CUDA_LAMBDA:BOOL=ON -DKokkos_ENABLE_CUDA_RELOCATABLE_DEVICE_CODE:BOOL=ON -DKokkos_ARCH_VOLTA70=ON
make -j install

Kokkos Kernels build:

cmake ~/Kokkos/kokkos-kernels -DCMAKE_CXX_COMPILER=/home/vqdang/Kokkos/kokkos/bin/nvcc_wrapper -DCMAKE_INSTALL_PREFIX=~/Kokkos/kokkoskernels-install-weaver-magma -DCMAKE_BUILD_TYPE:STRING=RELEASE -DKokkos_DIR=~/Kokkos/kokkos-install-weaver/lib64/cmake/Kokkos -DKokkosKernels_INST_COMPLEX_DOUBLE:BOOL=ON -DKokkosKernels_ENABLE_TESTS:BOOL=ON
make -j install
@e10harvey
Contributor

Thanks, @vqd8a. This appears to be a software stack bug. I found the following:

  • I can reproduce this using your reproducer above
  • I cannot reproduce this with your reproducer after one change: module load cmake/3.18.0 gcc/7.2.0 cuda/10.1.105 instead of module load devpack/20210226/openmpi/4.0.5/gcc/7.2.0/cuda/10.2.2

I will briefly look into a workaround for devpack/20210226/openmpi/4.0.5/gcc/7.2.0/cuda/10.2.2.

@e10harvey
Contributor

e10harvey commented May 27, 2021

I also found the following:

  • I built kokkos and kokkos-kernels with cmake/3.18.0 gcc/7.2.0 cuda/10.1.105 following the reproducer above, removed the KokkosBlas3_perf_test binary, ran module purge; module load devpack/20210226/openmpi/4.0.5/gcc/7.2.0/cuda/10.2.2, and then rebuilt KokkosBlas3_perf_test. That rebuild did not generate build errors.

@vqd8a: This and the error message above, particularly KokkosBlas3_gemm_perf_test.hpp:1460:1861: note: expected a type, got ‘std::conditional<((((Kokkos::Impl::ViewMapping<void, Kokkos::ViewTraits<double***, Kokkos::LayoutLeft, Kokkos::Cuda>, int, Kokkos::Impl::ALL_t, Kokkos::Impl::ALL_t>::rank == 0) || Kokkos::Impl::SubviewLegalArgsCompileTime<Kokkos::LayoutLeft, Kokkos::LayoutLeft, 2, 3, 0, int, Kokkos::Impl::ALL_t, Kokkos::Impl::ALL_t>::value) || (((Kokkos::Impl::ViewMapping<void, Kokkos::ViewTraits<double***, Kokkos::LayoutLeft, Kokkos::Cuda>, int, Kokkos::Impl::ALL_t, Kokkos::Impl::ALL_t>::rank <= 2) && Kokkos::Impl::ViewMapping<void, Kokkos::ViewTraits<double***, Kokkos::LayoutLeft, Kokkos::Cuda>, int, Kokkos::Impl::ALL_t, Kokkos::Impl::ALL_t>::R0) && std::integral_constant<bool, true>::value)) || (((Kokkos::Impl::ViewMapping<void, Kokkos::ViewTraits<double***, Kokkos::LayoutLeft, Kokkos::Cuda>, int, Kokkos::Impl::ALL_t, Kokkos::Impl::ALL_t>::rank <= 2) && Kokkos::Impl::ViewMapping<void, Kokkos::ViewTraits<double***, Kokkos::LayoutLeft, Kokkos::Cuda>, int, Kokkos::Impl::ALL_t, Kokkos::Impl::ALL_t>::R0_rev) && std::integral_constant<bool, false>::value)), Kokkos::LayoutLeft, Kokkos::LayoutStride>::type’ indicates that the bug is from building kokkos with devpack/20210226/openmpi/4.0.5/gcc/7.2.0/cuda/10.2.2; do you agree?

@crtrott
Member

crtrott commented May 27, 2021

this is very weird. That conditional stuff should really only happen in a subview call? Where did this come from?

@e10harvey
Contributor

> this is very weird. That conditional stuff should really only happen in a subview call? Where did this come from?

Using kokkos@524a10e and kokkos-kernels@118fe76, it came from:

using h_subview_type_2d = Kokkos::View<src_scalar_type **, Kokkos::LayoutStride, Kokkos::HostSpace>;

according to nvcc_wrapper.
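
For readers trying to parse the diagnostic: the "expected a type, got ‘std::conditional<..., Kokkos::LayoutLeft, Kokkos::LayoutStride>::type’" note has the shape of a compile-time layout selection being used as a template argument. The sketch below reproduces that pattern in plain C++ so it can be compiled without Kokkos. Every name in it (LayoutLeftTag, LayoutStrideTag, ViewSketch, Subview2D) is a hypothetical stand-in, not a Kokkos type, and it only shows how such a diagnostic reads. It makes no claim about what the CUDA 10.2.2 toolchain does internally.

```cpp
#include <type_traits>

// Stand-in layout tags. These are NOT the Kokkos layouts, only empty
// placeholders so the sketch builds without Kokkos.
struct LayoutLeftTag {};
struct LayoutStrideTag {};

// Hypothetical, much simplified view type: a data type plus a layout.
template <class DataType, class Layout>
struct ViewSketch {
  using data_type    = DataType;
  using array_layout = Layout;
};

// Hypothetical trait that picks the layout of a rank-2 slice at compile
// time: keep the source layout if the slice stays contiguous, otherwise
// fall back to a strided layout.
template <bool KeepsSourceLayout>
struct Subview2D {
  // The `typename` keyword tells the compiler that the dependent name
  // `::type` is a type. When a dependent `::type` in a template argument
  // is parsed as a value instead, GCC usually reports a "type/value
  // mismatch ... expected a type, got '...::type'" error of the same
  // shape as the one quoted in this thread.
  using type = ViewSketch<
      double**,
      typename std::conditional<KeepsSourceLayout, LayoutLeftTag,
                                LayoutStrideTag>::type>;
};

// Compile-time checks of the selection logic.
static_assert(std::is_same<Subview2D<true>::type::array_layout,
                           LayoutLeftTag>::value,
              "contiguous slice keeps the source layout");
static_assert(std::is_same<Subview2D<false>::type::array_layout,
                           LayoutStrideTag>::value,
              "non-contiguous slice falls back to the strided layout");
```

The block has no main on purpose: it is a translation unit whose static_asserts are evaluated when it is compiled, for example with g++ -std=c++11 -c.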

@e10harvey
Contributor

I cannot reproduce this using cmake/3.19.1 gcc/7.3.0 cuda/11.0

@kliegeois
Contributor

I have faced the same issue on weaver with the following modules:

Currently Loaded Modulefiles:
  1) cmake/3.19.3      2) binutils/2.30.0   3) gcc/7.2.0         4) cuda/10.2.2

It works fine with cuda/10.1.243 instead of cuda/10.2.2.

@kliegeois
Contributor

The nvprof from cuda/10.1.243 no longer works on weaver.
It is now recommended to use the nvprof from cuda/10.2.2.
However, the above-mentioned issue is still present when building KK with cuda/10.2.2 on weaver.
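
The reports in this thread line up on the CUDA version: the builds with cuda/10.2.2 fail, while cuda/10.1.105, cuda/10.1.243 and cuda/11.0 build cleanly. A minimal sketch of a compile-time check for that version is shown below, assuming the CUDA minor version is the deciding factor (gcc/7.2.0 was also part of every failing combination, so this is an assumption, not a diagnosis). The names reported_failing_cuda and built_with_reported_failing_cuda are hypothetical; __CUDACC_VER_MAJOR__ and __CUDACC_VER_MINOR__ are the version macros that nvcc defines.

```cpp
// Returns true for the CUDA toolkit version that the reports in this
// thread associate with the compile error (10.2.x). Assumption: only the
// CUDA major/minor version matters, not the host compiler.
constexpr bool reported_failing_cuda(int major, int minor) {
  return major == 10 && minor == 2;
}

// nvcc defines __CUDACC_VER_MAJOR__ and __CUDACC_VER_MINOR__; a plain
// host compiler does not, in which case the flag is simply false.
#if defined(__CUDACC_VER_MAJOR__) && defined(__CUDACC_VER_MINOR__)
constexpr bool built_with_reported_failing_cuda =
    reported_failing_cuda(__CUDACC_VER_MAJOR__, __CUDACC_VER_MINOR__);
#else
constexpr bool built_with_reported_failing_cuda = false;
#endif

// The version combinations mentioned in the thread.
static_assert(reported_failing_cuda(10, 2), "cuda/10.2.2 was reported to fail");
static_assert(!reported_failing_cuda(10, 1), "cuda/10.1.x was reported to work");
static_assert(!reported_failing_cuda(11, 0), "cuda/11.0 was reported to work");
```

A build could use built_with_reported_failing_cuda to skip or warn about the affected perf test, but whether that is an acceptable workaround is a decision for the maintainers.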
