Question on how to build Kokkos with Tpetra #1412

Closed
kvmkrao opened this issue Jun 9, 2017 · 10 comments
Labels: Framework tasks, impacting: configure or build, type: question

Comments

@kvmkrao

kvmkrao commented Jun 9, 2017

CC: @wfspotz
I set the Trilinos path and compiled a simple Kokkos-with-Tpetra code (https://github.com/trilinos/Trilinos_tutorial/wiki/KokkosExample01) using a CMakeLists.txt. The compilation failed with the following errors:

Scanning dependencies of target example
[ 50%] Building CXX object CMakeFiles/example.dir/kokkosexample01.cpp.o
/oasis/scratch/comet/vkotteda/Software/trilinos/install/include/impl/Kokkos_Error.hpp(76): error: namespace "Kokkos::Impl" has no member "cuda_abort"

/oasis/scratch/comet/vkotteda/Software/trilinos/install/include/impl/Kokkos_Atomic_Increment.hpp(61): error: namespace "Kokkos" has no member "atomic_fetch_add"

/oasis/scratch/comet/vkotteda/Software/trilinos/install/include/impl/Kokkos_Atomic_Increment.hpp(76): error: namespace "Kokkos" has no member "atomic_fetch_add"

/oasis/scratch/comet/vkotteda/Software/trilinos/install/include/impl/Kokkos_Atomic_Increment.hpp(91): error: namespace "Kokkos" has no member "atomic_fetch_add"

/oasis/scratch/comet/vkotteda/Software/trilinos/install/include/impl/Kokkos_Atomic_Increment.hpp(106): error: namespace "Kokkos" has no member "atomic_fetch_add"

/oasis/scratch/comet/vkotteda/Software/trilinos/install/include/impl/Kokkos_Atomic_Decrement.hpp(63): error: namespace "Kokkos" has no member "atomic_fetch_sub"

/oasis/scratch/comet/vkotteda/Software/trilinos/install/include/impl/Kokkos_Atomic_Decrement.hpp(78): error: namespace "Kokkos" has no member "atomic_fetch_sub"

/oasis/scratch/comet/vkotteda/Software/trilinos/install/include/impl/Kokkos_Atomic_Decrement.hpp(93): error: namespace "Kokkos" has no member "atomic_fetch_sub"

/oasis/scratch/comet/vkotteda/Software/trilinos/install/include/impl/Kokkos_Atomic_Decrement.hpp(108): error: namespace "Kokkos" has no member "atomic_fetch_sub"

9 errors detected in the compilation of "/tmp/tmpxft_00001cb6_00000000-7_kokkosexample01.cpp1.ii".
make[2]: *** [CMakeFiles/example.dir/kokkosexample01.cpp.o] Error 2
make[1]: *** [CMakeFiles/example.dir/all] Error 2
make: *** [all] Error 2

I assumed these errors were unrelated to the Trilinos installation, so I created a file based on Makefile.export.Trilinos (please see the attachment, #1409) to compile the .cpp code, since I could not find documentation on compiling Tpetra codes with Kokkos (CUDA).

I use CMakeLists.txt to compile MPI/E(T)petra codes and a Makefile to compile the Kokkos codes.
Can I use the attachment to compile those codes?

compile.txt
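
When application code is built against an installed Trilinos, it usually helps to reuse the compiler and flags that Trilinos itself was built with. The sketch below reads them out of the installed Makefile.export.Trilinos. It assumes the file stores variables as `NAME = value` lines and that it sits under `include/` in the install prefix. The helper name `trilinos_export_var` and the `TRILINOS_PREFIX` variable are hypothetical, introduced only for illustration.

```shell
# Print the value of one variable from a Makefile.export.Trilinos-style file.
# $1: path to the export file, $2: variable name.
# Assumes "NAME = value" lines (an assumption about the file layout).
trilinos_export_var() {
  sed -n "s/^$2[[:space:]]*=[[:space:]]*//p" "$1" | head -n 1
}

# Hypothetical install prefix; substitute your own.
export_file="${TRILINOS_PREFIX:-$HOME/trilinos/install}/include/Makefile.export.Trilinos"
if [ -f "$export_file" ]; then
  echo "compiler: $(trilinos_export_var "$export_file" Trilinos_CXX_COMPILER)"
  echo "flags:    $(trilinos_export_var "$export_file" Trilinos_CXX_COMPILER_FLAGS)"
fi
```

If the reported compiler is an MPI wrapper around nvcc_wrapper, compiling the example with a different compiler is one plausible source of mismatched-header errors like the ones above.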

@mhoemmen
Contributor

Are you able to build Kokkos examples? Please try that first.

@mhoemmen changed the title from "Kokkos with Tpetra : compilation error" to "Question on how to build Kokkos with Tpetra" on Jun 12, 2017
@kvmkrao
Author

kvmkrao commented Jun 12, 2017

I loaded the gnu/4.9.2, mvapich2/2.2, and CUDA/7.5 modules and used the following script to build Kokkos:

home=$(pwd)
module load gnu/4.9.2
module load cuda/7.5
module load mvapich2_gdr/2.2

export CUDA_LAUNCH_BLOCKING=1
cd build
../src/generate_makefile.bash --prefix=$home/kokkos/install
make -j KOKKOS_DEVICES=Cuda
make install

Then I compiled some of the tutorials in the src/example/tutorial/ directory using make -j KOKKOS_DEVICES=Cuda as well as plain make -j, and successfully ran both the resulting .host and .cuda executables.

Output from .cuda extension file for 02_simple_reduce_lambda case

Kokkos::Cuda::initialize WARNING: running kernels compiled for compute capability 3.5 on device with compute capability 3.7 , this will likely reduce potential performance.
Kokkos::Cuda::initialize WARNING: running kernels compiled for compute capability 3.5 on device with compute capability 3.7 , this will likely reduce potential performance.
Sum of squares of integers from 0 to 9, computed in parallel, is 285
Sum of squares of integers from 0 to 9, computed sequentially, is 285
Sum of squares of integers from 0 to 9, computed in parallel, is 285
Sum of squares of integers from 0 to 9, computed sequentially, is 285
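
The CUDA warning above says the kernels were compiled for compute capability 3.5 while the device reports 3.7. A minimal sketch of picking a matching target through the `KOKKOS_ARCH` make variable follows. The helper name `kokkos_arch_for` is hypothetical, the arch names are the ones Kokkos' Makefile-based build accepted around this time as far as I know, and the make command is only printed, not executed.

```shell
# Map a CUDA compute capability (e.g. "3.7") to a KOKKOS_ARCH name.
# Only a few Kepler/Maxwell/Pascal values are listed; extend as needed.
kokkos_arch_for() {
  case "$1" in
    3.0) echo Kepler30 ;;
    3.5) echo Kepler35 ;;
    3.7) echo Kepler37 ;;
    5.0) echo Maxwell50 ;;
    5.2) echo Maxwell52 ;;
    6.0) echo Pascal60 ;;
    6.1) echo Pascal61 ;;
    *)   echo "unknown compute capability: $1" >&2; return 1 ;;
  esac
}

# The warning above reports a device with compute capability 3.7.
arch=$(kokkos_arch_for 3.7)
echo "make -j KOKKOS_DEVICES=Cuda KOKKOS_ARCH=$arch"
```

Rebuilding with the matching architecture should remove the warning; the results of the tutorial itself are already correct either way.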

Output from .host extension file
IBRUN: Command is ./02_simple_reduce_lambda.host
IBRUN: Command is /oasis/scratch/comet/vkotteda/temp_project/Software/trilinos_3rd/kokkos/src/example/tutorial/02_simple_reduce_lambda/02_simple_reduce_lambda.host
IBRUN: no hostfile mod needed
IBRUN: Nodefile is /tmp/2V9_3rCXu1

IBRUN: MPI binding policy: scatter/core for 1 threads per rank (12 cores per socket)
IBRUN: Adding MV2_USE_OLD_BCAST=1 to the environment
IBRUN: Adding MV2_CPU_BINDING_LEVEL=core to the environment
IBRUN: Adding MV2_ENABLE_AFFINITY=1 to the environment
IBRUN: Adding MV2_DEFAULT_TIME_OUT=23 to the environment
IBRUN: Adding MV2_CPU_BINDING_POLICY=scatter to the environment
IBRUN: Adding MV2_USE_HUGEPAGES=0 to the environment
IBRUN: Adding MV2_HOMOGENEOUS_CLUSTER=0 to the environment
IBRUN: Adding MV2_USE_UD_HYBRID=0 to the environment
IBRUN: Added 8 new environment variables to the execution environment
IBRUN: Command string is [mpirun_rsh -np 12 -hostfile /tmp/2V9_3rCXu1 -export /oasis/scratch/comet/vkotteda/Software/trilinos_3rd/kokkos/src/example/tutorial/02_simple_reduce_lambda/02_simple_reduce_lambda.host]
Kokkos::OpenMP::initialize WARNING: You are likely oversubscribing your CPU cores.
Detected: 24 cores per node.
Detected: 12 MPI_ranks per node.
Requested: 24 threads per process.
Sum of squares of integers from 0 to 9, computed in parallel, is 285
Sum of squares of integers from 0 to 9, computed sequentially, is 285
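
The oversubscription warning above comes from 12 MPI ranks per node each requesting 24 OpenMP threads on a 24-core node. A small sketch of the arithmetic, assuming the Kokkos OpenMP backend picks up `OMP_NUM_THREADS` like a standard OpenMP runtime does; the helper name `threads_per_rank` is hypothetical.

```shell
# Threads per MPI rank so that ranks * threads does not exceed the core count.
# $1: cores per node, $2: MPI ranks per node.
threads_per_rank() {
  n=$(( $1 / $2 ))
  # Never go below one thread per rank.
  if [ "$n" -lt 1 ]; then
    n=1
  fi
  echo "$n"
}

# Values reported by the warning above: 24 cores, 12 MPI ranks per node.
OMP_NUM_THREADS=$(threads_per_rank 24 12)
export OMP_NUM_THREADS
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"   # prints OMP_NUM_THREADS=2
```

Alternatively, launching fewer ranks per node keeps more threads available to each rank.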

I used the procedure described above to build and run Kokkos examples. Kindly let me know if it is not correct.
Thank you.

@mhoemmen added the "Framework tasks" label on Jun 12, 2017
@mhoemmen
Contributor

@trilinos/framework

@mhoemmen added the "impacting: configure or build" label on Jun 12, 2017
@kvmkrao
Author

kvmkrao commented Jun 20, 2017

CC: @wfspotz
I modified the do-config file and reinstalled Trilinos with Kokkos (CUDA).
The modifications are as follows:

module load gcc/4.9.2
module load openmpi
module load cuda/8.0
export NVCC_WRAPPER_DEFAULT_COMPILER=/opt/gnu/gcc/bin/g++
export OMPI_CXX=/oasis/scratch/comet/vkotteda/temp_project/Software/trilinos_3rd/openmpi/trilinos/src12p10/packages/kokkos/config/nvcc_wrapper
export CUDA_LAUNCH_BLOCKING=1

cmake -D CMAKE_INSTALL_PREFIX:PATH=$BUILD_DIR/install \
  -D CMAKE_CXX_COMPILER="mpicxx"

...................................................
..

There were no errors during the installation. So I tried to compile the code at https://github.com/trilinos/Trilinos/blob/master/packages/tpetra/core/example/Lesson07-Kokkos-Fill/03_fill.cpp with the following command line:
$OMPI_CXX -std=c++11 03_fill.cpp -L$tripath/lib -I$tripath/include .............packages in Trilinos installation directory....... -I/usr/local/cuda-8.0/include $LAPACK/lib64/liblapack.so.3 $LAPACK/lib64/libblas.so.3 /usr/local/cuda-8.0/lib64/libcusparse.so /usr/local/cuda-8.0/lib64/libcudart.so /usr/local/cuda-8.0/lib64/libcublas.so /usr/local/cuda-8.0/lib64/libcufft.so

and got some errors. I think they occur whenever I compile code that uses KOKKOS_LAMBDA.

Errors during the compilation of 03_fill.cpp:
03_fill.cpp(42): warning: variable "T_left" was declared but never referenced

03_fill.cpp(44): warning: variable "T_right" was declared but never referenced

/oasis/scratch/comet/vkotteda/temp_project/Software/trilinos_3rd/openmpi/trilinos/install/include/Cuda/Kokkos_CudaExec.hpp(287): error: The closure type for a lambda ("lambda ->void") cannot be used in the template argument type of a global function template instantiation, unless the lambda is defined within a device or global function, or the lambda is a 'extended lambda' and the flag --expt-extended-lambda is specified
detected during:

instantiation of "Kokkos::Impl::cuda_parallel_launch_local_memory" based on template argument <Kokkos::Impl::ParallelFor<lambda ->void, Kokkos::RangePolicyKokkos::CudaUVMSpace::execution_space, Kokkos::CudaUVMSpace::execution_space>>
(287): here

instantiation of "Kokkos::Impl::CudaParallelLaunch<DriverType, false>::CudaParallelLaunch(const DriverType &, const dim3 &, const dim3 &, int, cudaStream_t) [with DriverType=Kokkos::Impl::ParallelFor<lambda ->void, Kokkos::RangePolicyKokkos::CudaUVMSpace::execution_space, Kokkos::CudaUVMSpace::execution_space>]"
/oasis/scratch/comet/vkotteda/temp_project/Software/trilinos_3rd/openmpi/trilinos/install/include/Cuda/Kokkos_Cuda_Parallel.hpp(366): here

Do you have any suggestions for eliminating or suppressing these errors?
Thank you.

@mhoemmen
Contributor

What version of Trilinos is this? Try the master branch.

@kvmkrao
Author

kvmkrao commented Jun 20, 2017

It is 12.10 (*** Base Git Repo: Trilinos 545269e [Mon Jun 19 16:49:55 2017 -0600] kyukim@sandia.gov Intrepid2 - check point inclusion )

I also used 12.08 version (*** Base Git Repo: Trilinos d2e490d [Mon Sep 26 18:05:28 2016 -0700] amota@sandia.gov Intrepid2: Fix dimension in return value for equality constraint.) and got the same errors.
I am now building with the master branch.
Thank you.

@kvmkrao
Author

kvmkrao commented Jun 20, 2017

I checked out the master branch and used it to build Trilinos. The error messages are the same.

/oasis/scratch/comet/vkotteda/temp_project/Software/trilinos_3rd/openmpi/trilinos/install/include/Cuda/Kokkos_CudaExec.hpp(287): error: The closure type for a lambda ("lambda ->void") cannot be used in the template argument type of a global function template instantiation, unless the lambda is defined within a device or global function, or the lambda is a 'extended lambda' and the flag --expt-extended-lambda is specified

However, I am able to compile code that does not include Kokkos lambdas. The errors occur only when I compile codes that use KOKKOS_LAMBDA.

I would appreciate any suggestions for suppressing these errors when compiling the codes at
https://github.com/trilinos/Trilinos/tree/master/packages/tpetra/core/example

@mhoemmen
Contributor

Use CUDA 8.0 or add the flag to your CUDA 7.5 build that enables lambdas.
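
The flag in question is the one the nvcc error message earlier in this thread names, `--expt-extended-lambda`. A hedged sketch of adding it to a compile line follows. The helper name `lambda_compile_cmd` is hypothetical, the include and library paths from the original command are left out, and the command is only printed. nvcc also documents a single-dash short form of this flag, and which spelling a given nvcc_wrapper version forwards to nvcc is worth checking.

```shell
# Flag spelling taken from the nvcc error message in this thread.
LAMBDA_FLAG="--expt-extended-lambda"

# Print (dry run) a compile command for a source file that uses KOKKOS_LAMBDA.
# $1: compiler (e.g. the nvcc_wrapper path held in OMPI_CXX), $2: source file.
lambda_compile_cmd() {
  printf '%s\n' "$1 -std=c++11 $LAMBDA_FLAG -c $2"
}

lambda_compile_cmd "${OMPI_CXX:-nvcc_wrapper}" 03_fill.cpp
```

The same flag can be passed through CMAKE_CXX_FLAGS when configuring Trilinos, so that the installed headers and the application code are compiled consistently.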

@kvmkrao
Author

kvmkrao commented Jun 21, 2017

CC : @wfspotz @mhoemmen
I loaded the cuda/8.0 module and set the CXX flags to build Trilinos, as you suggested (see kokkos/kokkos#343).

Trilinos installed without errors, and the examples compiled cleanly with the same CXX flags. I have submitted the job on the machine and hope it will run without errors.
Thank you.

@mhoemmen
Contributor

@kvmkrao Excellent. Thank you for reporting this.

4 participants
@mhoemmen @wfspotz @kvmkrao and others