faster transformer compile error with docker #51

Closed
ganguagua opened this issue Jul 18, 2019 · 5 comments

Comments

@ganguagua

Docker image: nvidia/cuda:10.0-cudnn7-devel-ubuntu16.04
Command: cmake -DSM=70 -DCMAKE_BUILD_TYPE=Release -DBUILD_TF=ON -DTF_PATH=/usr/lib/python2.7/site-packages/tensorflow ..
cmake output:
-- The CXX compiler identification is GNU 5.4.0
-- The CUDA compiler identification is NVIDIA 10.0.130
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc -- works
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Looking for C++ include pthread.h
-- Looking for C++ include pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found CUDA: /usr/local/cuda (found suitable version "10.0", minimum required is "10.0")
-- Found CUDA: /usr/local/cuda (found version "10.0")
-- Assign GPU architecture (sm=70)
-- Configuring done
-- Generating done
-- Build files have been written to: /root/DeepLearningExamples/FasterTransformer/build

make output:
CMakeFiles/gemm_fp32.dir/gemm_fp32.cu.o: In function `__sti____cudaRegisterAll()':
tmpxft_0000054d_00000000-5_gemm_fp32.cudafe1.cpp:(.text.startup+0x15): undefined reference to `__cudaRegisterLinkedBinary_44_tmpxft_0000054d_00000000_6_gemm_fp32_cpp1_ii_5cd8620e'
collect2: error: ld returned 1 exit status
tools/gemm_test/CMakeFiles/gemm_fp32.dir/build.make:83: recipe for target 'bin/gemm_fp32' failed
make[2]: *** [bin/gemm_fp32] Error 1
CMakeFiles/Makefile2:148: recipe for target 'tools/gemm_test/CMakeFiles/gemm_fp32.dir/all' failed
make[1]: *** [tools/gemm_test/CMakeFiles/gemm_fp32.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2
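
A note on reading this error: an undefined reference to `__cudaRegisterLinkedBinary_...` usually means that CUDA objects were compiled as relocatable device code (`-rdc=true`) but reached the host linker without a device-link step. The snippet below is a minimal diagnostic sketch, not part of the project. It assumes the flag, if present, is passed through `CMAKE_CUDA_FLAGS`, which this thread does not confirm. It could be pasted temporarily near the end of the top-level CMakeLists.txt.

```cmake
# Diagnostic sketch (assumption: relocatable device code is requested via
# CMAKE_CUDA_FLAGS; the thread does not show the project's CMakeLists).
message(STATUS "CMAKE_CUDA_FLAGS = ${CMAKE_CUDA_FLAGS}")

if(CMAKE_CUDA_FLAGS MATCHES "-rdc=true")
  # Objects built this way need a device-link step before the host link.
  message(WARNING "-rdc=true is set globally; targets containing CUDA code "
                  "may need a device-link step (see CUDA_RESOLVE_DEVICE_SYMBOLS).")
endif()
```

If the warning does not appear, the flag may be set per target or per source file instead, and this check says nothing about that case.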

@IrishCoffee

It seems that there is something wrong with your environment. Could you please check your driver first?

@ganguagua

> It seems that there is something wrong with your environment. Could you please check your driver first?

My environment:
cmake version 3.15.0
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36)
CUDA 10.0
cuDNN 7
tensorflow-gpu 1.13.1

@jackkosaian

I also get this error on a V100. My driver version is 418.56.

@jackkosaian

I was able to resolve this by adding the following lines to the end of these CMakeLists.txt files:

tools/gemm_test/CMakeLists.txt

set_target_properties(gemm_fp32 PROPERTIES CUDA_RESOLVE_DEVICE_SYMBOLS ON)
set_target_properties(gemm_fp16 PROPERTIES CUDA_RESOLVE_DEVICE_SYMBOLS ON)

fastertransformer/cuda/CMakeLists.txt

set_target_properties(fastertransformer PROPERTIES CUDA_RESOLVE_DEVICE_SYMBOLS ON)

fastertransformer/tf_op/CMakeLists.txt

set_target_properties(tf_fastertransformer PROPERTIES CUDA_RESOLVE_DEVICE_SYMBOLS ON)

I found this solution here.

I'm not sure whether this is the "correct" solution to the problem, but I'm able to compile and run ./build/bin/gemm_fp16 with these changes.
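
The same change can also be sketched in one place instead of three files. This is a hedged variant, not the project's own build logic: it assumes the snippet is placed at the end of the top-level CMakeLists.txt, after every `add_subdirectory` call, so that the four target names listed above already exist. The `if(TARGET ...)` guard is there on the assumption that `tf_fastertransformer` is only defined when `-DBUILD_TF=ON` is passed.

```cmake
# Requires CMake 3.9 or newer, where CUDA_RESOLVE_DEVICE_SYMBOLS was introduced.
foreach(tgt gemm_fp32 gemm_fp16 fastertransformer tf_fastertransformer)
  if(TARGET ${tgt})
    # Ask CMake to run the CUDA device-link step for this target.
    set_target_properties(${tgt} PROPERTIES CUDA_RESOLVE_DEVICE_SYMBOLS ON)
  endif()
endforeach()
```

After changing any CMakeLists.txt, re-running cmake and make in a fresh build directory helps rule out stale object files.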

@ganguagua

> I was able to resolve this by adding the following lines to the end of various CMakeLists:
>
> tools/gemm_test/CMakeLists.txt
> set_target_properties(gemm_fp32 PROPERTIES CUDA_RESOLVE_DEVICE_SYMBOLS ON)
> set_target_properties(gemm_fp16 PROPERTIES CUDA_RESOLVE_DEVICE_SYMBOLS ON)
>
> fastertransformer/cuda/CMakeLists.txt
> set_target_properties(fastertransformer PROPERTIES CUDA_RESOLVE_DEVICE_SYMBOLS ON)
>
> fastertransformer/tf_op/CMakeLists.txt
> set_target_properties(tf_fastertransformer PROPERTIES CUDA_RESOLVE_DEVICE_SYMBOLS ON)
>
> I found this solution here.
> I'm not sure whether this is the "correct" solution to the problem, but I'm able to compile and run ./build/bin/gemm_fp16 with these changes.

It works, thanks very much!

nvpstr closed this as completed on Jul 30, 2019
byshiue transferred this issue from NVIDIA/DeepLearningExamples on Apr 5, 2021