GEMM Serial test failures #528

Closed
ndellingwood opened this issue Dec 13, 2019 · 6 comments


@ndellingwood
Contributor

ndellingwood commented Dec 13, 2019

The GEMM unit test is failing in serial builds on White with the XL compiler following the merge of PR #524.

The change likely causing the serial failures was the modification in
unit_test/blas/Test_Blas3_gemm.hpp

from this

    Kokkos::fill_random(A,rand_pool,ScalarA(10));
    Kokkos::fill_random(B,rand_pool,ScalarB(10));
    Kokkos::fill_random(C,rand_pool,ScalarC(10));

to this

    Kokkos::fill_random(A,rand_pool, Kokkos::rand<typename Kokkos::Random_XorShift64_Pool<execution_space>::generator_type,ScalarA>::max());
    Kokkos::fill_random(B,rand_pool, Kokkos::rand<typename Kokkos::Random_XorShift64_Pool<execution_space>::generator_type,ScalarB>::max());
    Kokkos::fill_random(C,rand_pool, Kokkos::rand<typename Kokkos::Random_XorShift64_Pool<execution_space>::generator_type,ScalarC>::max());

@seheracer do you think adding if-guards, so that the new fill_random calls are used when Cuda is enabled and the previous behavior is kept otherwise, would achieve the same goal, or is another approach needed?

Edit: Adding link to failing Jenkins build

@ndellingwood
Contributor Author

ndellingwood commented Dec 13, 2019

Also failing with these builds:

KokkosKernels_White_XL_16_1_1_Serial_cpp14 
KokkosKernels_White_OpenMP_gcc_720_cpp14 
KokkosKernels_KokkosDev2_GCC720 
KokkosKernels_White_OpenMP_gcc_640
KokkosKernels_Apollo_Clang6_Threads
...

Edit: Nearly all host builds are seeing this test fail.

@ndellingwood
Contributor Author

I think this will get tests back to passing:

#ifdef KOKKOS_ENABLE_CUDA
// Cuda execution space: keep the new fill_random calls
if (std::is_same<execution_space,Kokkos::Cuda>::value) {
    Kokkos::fill_random(A,rand_pool, Kokkos::rand<typename Kokkos::Random_XorShift64_Pool<execution_space>::generator_type,ScalarA>::max());
    Kokkos::fill_random(B,rand_pool, Kokkos::rand<typename Kokkos::Random_XorShift64_Pool<execution_space>::generator_type,ScalarB>::max());
    Kokkos::fill_random(C,rand_pool, Kokkos::rand<typename Kokkos::Random_XorShift64_Pool<execution_space>::generator_type,ScalarC>::max());
}
else
#endif
// All other execution spaces (and builds without Cuda): previous behavior
{
    Kokkos::fill_random(A,rand_pool,ScalarA(10));
    Kokkos::fill_random(B,rand_pool,ScalarB(10));
    Kokkos::fill_random(C,rand_pool,ScalarC(10));
}

I'll put in a PR shortly
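
For reference, the same guard can be sketched without Kokkos by choosing the upper bound once and passing it to a single set of fill calls. This is only an illustration of the dispatch pattern, not the code that went into the PR: `DeviceSpace`, `HostSpace`, `generator_max` and `fill_upper_bound` are hypothetical names, and `generator_max` is a placeholder that does not reproduce the value returned by `Kokkos::rand<...>::max()`.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for execution-space tags (not the real Kokkos types).
struct DeviceSpace {};
struct HostSpace {};

// Hypothetical placeholder for the generator's max(); in the real test this
// value would come from Kokkos::rand<generator_type, Scalar>::max().
template <class Scalar>
Scalar generator_max() { return Scalar(1); }

// Choose the upper bound for the random fill based on the execution space:
// the device space gets the new bound, everything else keeps Scalar(10).
template <class ExecSpace, class Scalar>
Scalar fill_upper_bound() {
  if (std::is_same<ExecSpace, DeviceSpace>::value) {
    return generator_max<Scalar>();
  }
  return Scalar(10);
}

// Small self-check showing how the helper would be used.
inline void check_fill_upper_bound() {
  assert((fill_upper_bound<HostSpace, double>() == 10.0));
  assert((fill_upper_bound<DeviceSpace, double>() == 1.0));
}
```

With a helper like this, each view would need only one fill call taking the selected bound, instead of two copies of the three calls.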

@seheracer
Contributor

@ndellingwood Thanks! I shouldn't have skipped the extensive testing, lessons learned..

@ndellingwood
Contributor Author

> @ndellingwood Thanks! I shouldn't have skipped the extensive testing, lessons learned..

Yeah, my mistake for requesting to skip it because the Trilinos tests, which included host builds, had passed. Lessons learned indeed...

@ndellingwood
Contributor Author

The change above worked; a single host run on kokkos-dev-2 confirmed it:

[ndellin@kokkos-dev-2 TestAllSandia]$ ../../scripts/test_all_sandia --spot-check gcc/7.3 --skip-hwloc
Running on machine: kokkos-dev-2
Going to test compilers:  gcc/7.3.0
Testing compiler gcc/7.3.0
  Starting job gcc-7.3.0-Pthread-release
  Starting job gcc-7.3.0-OpenMP-release
  PASSED gcc-7.3.0-OpenMP-release
  PASSED gcc-7.3.0-Pthread-release
#######################################################
PASSED TESTS
#######################################################
gcc-7.3.0-OpenMP-release build_time=218 run_time=272
gcc-7.3.0-Pthread-release build_time=204 run_time=455
#######################################################

I'll have a PR ready pretty quick here

@ndellingwood
Contributor Author

PR #529 issued.
