Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failing spgemm unit test with MKL #289

Closed
ndellingwood opened this issue Sep 11, 2018 · 23 comments
Closed

Failing spgemm unit test with MKL #289

ndellingwood opened this issue Sep 11, 2018 · 23 comments
Assignees

Comments

@ndellingwood
Copy link
Contributor

Sample output with either intel/17 or intel/18.2.199 on Bowman with OpenMP backend:

[ RUN      ] openmp.sparse_spgemm_double_int_int_TestExecSpace
/home/ndellin/kokkos-kernels/unit_test/sparse/Test_Sparse_spgemm.hpp:366: Failure
Value of: (failed == is_expected_to_fail)
  Actual: false
Expected: true
[  FAILED  ] openmp.sparse_spgemm_double_int_int_TestExecSpace (9309 ms)
[----------] 1 test from openmp (9309 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (9311 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] openmp.sparse_spgemm_double_int_int_TestExecSpace
@ndellingwood ndellingwood self-assigned this Sep 11, 2018
@ndellingwood
Copy link
Contributor Author

Think I have this chased down, looks like there is a macro defined in the config file HAVE_KOKKOSKERNELS_MKL that never gets set but is expected in the failing unit test at this line:
line 311

@ndellingwood
Copy link
Contributor Author

Updating that line to instead check that
#if !defined(HAVE_KOKKOSKERNELS_MKL) && !defined(KOKKOSKERNELS_ENABLE_TPL_BLAS)
seems to fix it, will put in PR shortly. Will probably combine with update requested by #268

@srajama1
Copy link
Contributor

I don't think !defined(KOKKOSKERNELS_ENABLE_TPL_BLAS) is needed for this test.

@ndellingwood
Copy link
Contributor Author

@srajama1 To have the expected setting for is_expected_to_fail the HAVE_KOKKOSKERNELS_MKL should be defined. Since it is never set in the CMakeLists.txt file I thought adding the check for !defined(KOKKOSKERNELS_ENABLE_TPL_BLAS) was the right thing to do. Should the CMakeLists.txt file be changed to define HAVE_KOKKOSKERNELS_MKL when KOKKOSKERNELS_ENABLE_TPL_BLAS is defined?

@ndellingwood
Copy link
Contributor Author

The change above fixes the failing tests for intel/18.2.199 and intel/17.

@srajama1
Copy link
Contributor

Is there something wrong in setting HAVE_KOKKOSKERNELS_MKL in the CMakeLists.txt when MKL is enabled ? TPL_BLAS could be enabled even if cuBLAS is enabled, right ?

@ndellingwood
Copy link
Contributor Author

Also, the logic in the config file never does anything with HAVE_KOKKOSKERNELS_MKL; the CMakeLists.txt would only have an affect if building with Trilinos I spose, I'll add an update to the macro logic in the config file to set HAVE_KOKKOSKERNELS_MKL when KOKKOSKERNELS_ENABLE_TPL_BLAS is defined; then this should work either as stand-alone kokkos-kernels or within Trilinos, I think...

@ndellingwood
Copy link
Contributor Author

@srajama1 pointed out I used the wrong macro above (and in my sandbox), should have used KOKKOSKERNELS_ENABLE_TPL_MKL. Made the change and retested, things are passing :)

ndellingwood added a commit that referenced this issue Sep 11, 2018
Address issue #289 - fix macro use for unit test,
and #268 - update macro check for INTEL 18 to also check that update is 2,
needed due to changes in MKL interface with updated version.
@ndellingwood
Copy link
Contributor Author

Issue PR #290 with fix

@kyungjoo-kim
Copy link
Contributor

No one gets the following compilation error in spotchecking on bowman ?

make[2]: *** [KokkosSparse_spgemm_numeric_eti_spec_inst_double_size_t_int64_t_LayoutRight_OpenMP_HostSpace_HostSpace.o] Error 4
In file included from /home/kyukim/Work/lib/kokkoskernels/master/src/sparse/impl/KokkosSparse_spgemm_numeric_spec.hpp(58),
                 from /home/kyukim/Work/lib/kokkoskernels/master/src/impl/generated_specializations_cpp/spgemm_numeric/KokkosSparse_spgemm_numeric_eti_spec_inst_double_int_int_LayoutRight_OpenMP_HostSpace_HostSpace.cpp(55):
/home/kyukim/Work/lib/kokkoskernels/master/src/sparse/impl/KokkosSparse_spgemm_mkl_impl.hpp(49): catastrophic error: cannot open source file "mkl_spblas.h"
  #include "mkl_spblas.h"
                         ^

ndellingwood added a commit that referenced this issue Sep 12, 2018
Part of resolving issue #268 and #289
@ndellingwood
Copy link
Contributor Author

@kyungjoo-kim which version of intel were you using? We can add the correct if-guards to protect against this.

@kyungjoo-kim
Copy link
Contributor

intel/compilers/18.2.199

@ndellingwood
Copy link
Contributor Author

Huh, that's the one I'm using...
I've only been testing the unit_test directory for now, did this show up with the perf_tests?

@kyungjoo-kim
Copy link
Contributor

no.... this shows up in eti compiling

ndellingwood added a commit that referenced this issue Sep 12, 2018
Address comments in issue #289
@ndellingwood
Copy link
Contributor Author

@kyungjoo-kim can you post your script for calling generate_makefile? I'm not running into this issue with my usual configuration so I'll need to match yours.

@kyungjoo-kim
Copy link
Contributor

../../master/scripts/test_all_sandia --kokkos-path=/home/kyukim/Work/lib/kokkos/master --kokkoskernels-path=/home/kyukim/Work/lib/kokkoskernels/master --spot-check --with-tpls=mkl  

@kyungjoo-kim
Copy link
Contributor

@ndellingwood if you cannot reproduce the same error, let me know when you are available in your office. I will come by. Lately I figured out that my custum bash environement can generate some errors that others cannot see.

@ndellingwood
Copy link
Contributor Author

Do you also need to set export MKL_PATH=$MKLROOT and pass -mkl as argument to --cxxflags?

@kyungjoo-kim
Copy link
Contributor

I thought that such environment variable should be set corresponding to the testing machine. I did not set MKL_PATH additionally.

@ndellingwood
Copy link
Contributor Author

Merged in #290 to address first set of problems in this issue. Will open new PR once we get to the bottom of the issue with #include "mkl_spblas.h"

@kyungjoo-kim
Copy link
Contributor

And I believe that test_all_sandia script load appropriate modules; thus, the environment variables should be set in the script too as the varialbe does not exist before the modules are loaded.

@ndellingwood
Copy link
Contributor Author

@kyungjoo-kim I'm not sure if it is needed to set the env variable or the cxxflags, but wondering if this needs to be done like when generating a makefile to test with mkl. I haven't tried the test_all_sandia script while enabling mkl, glad you're trying it out so we can get it working for future Jenkins testing.

@ndellingwood
Copy link
Contributor Author

Looked at the other include issue with @kyungjoo-kim, we believe the issue is with using test_all_sandia which automatically loads modules but does not set MKL_PATH, and thus it defaults to expecting a default path associated with that set in a SEMS environment (see Makefile.kokkos-kernels). I'm going to mark this issue as in develop, we can fix the test_all_sandia testing issue with MKL on non_SEMS machines in a separate issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants