
Failed to build from source on ROCm (with pytorch and xformers working correctly) #3067

Closed
nayn99 opened this issue Feb 28, 2024 · 8 comments
Labels
installation Installation problems rocm

Comments

@nayn99

nayn99 commented Feb 28, 2024

OS: Linux 6.6.17-1-lts
HW: AMD 4650G (Renoir), gfx90c
SW: torch==2.3.0.dev20240224+rocm5.7, xformers==0.0.23 (both confirmed working).

Description of the issue: Following the installation guide for ROCm to build from source, the build fails at the final link step:

Total number of replaced kernel launches: 21
running install
/home/toto/tmp/testenv/lib/python3.11/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
/home/toto/tmp/testenv/lib/python3.11/site-packages/setuptools/command/easy_install.py:144: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
running bdist_egg
running egg_info
writing vllm.egg-info/PKG-INFO
writing dependency_links to vllm.egg-info/dependency_links.txt
writing requirements to vllm.egg-info/requires.txt
writing top-level names to vllm.egg-info/top_level.txt
reading manifest file 'vllm.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
adding license file 'LICENSE'
writing manifest file 'vllm.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
building 'vllm._C' extension
Emitting ninja build file /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
g++ -shared -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -flto=auto -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -flto=auto /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/activation_kernels.o /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/attention/attention_kernels.o /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/cache_kernels.o /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/hip_utils_kernels.o /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/layernorm_kernels.o /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/moe_align_block_size_kernels.o /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/pos_encoding_kernels.o /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/pybind.o /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/quantization/gptq/q_gemm.o /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/quantization/squeezellm/quant_hip_kernel.o -L/home/toto/tmp/testenv/lib/python3.11/site-packages/torch/lib -L/opt/rocm/lib -L/opt/rocm/hip/lib -L/usr/lib -lc10 -ltorch -ltorch_cpu -ltorch_python -lamdhip64 -lc10_hip -ltorch_hip -o build/lib.linux-x86_64-cpython-311/vllm/_C.cpython-311-x86_64-linux-gnu.so
/usr/bin/ld: /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/cache_kernels.o: in function `__float2bfloat16(float)':
cache_kernels.hip:(.text+0x0): multiple definition of `__float2bfloat16(float)'; /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x0): first defined here
/usr/bin/ld: /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/cache_kernels.o: in function `__bfloat1622float2(__hip_bfloat162)':
cache_kernels.hip:(.text+0x40): multiple definition of `__bfloat1622float2(__hip_bfloat162)'; /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x40): first defined here
/usr/bin/ld: /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/cache_kernels.o: in function `__double2bfloat16(double)':
cache_kernels.hip:(.text+0x60): multiple definition of `__double2bfloat16(double)'; /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x60): first defined here
/usr/bin/ld: /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/cache_kernels.o: in function `__float22bfloat162_rn(HIP_vector_type<float, 2u>)':
cache_kernels.hip:(.text+0xa0): multiple definition of `__float22bfloat162_rn(HIP_vector_type<float, 2u>)'; /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0xa0): first defined here
/usr/bin/ld: /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/cache_kernels.o: in function `__high2float(__hip_bfloat162)':
cache_kernels.hip:(.text+0x110): multiple definition of `__high2float(__hip_bfloat162)'; /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x110): first defined here
/usr/bin/ld: /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/cache_kernels.o: in function `__low2float(__hip_bfloat162)':
cache_kernels.hip:(.text+0x120): multiple definition of `__low2float(__hip_bfloat162)'; /home/toto/tmp/vllm/build/temp.linux-x86_64-cpython-311/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x120): first defined here
collect2: error: ld returned 1 exit status
error: command '/usr/bin/g++' failed with exit code 1
@george-kuanli-peng

george-kuanli-peng commented Feb 29, 2024

I have the same problem building vllm from source on two platforms:

First:

  • vllm tag v0.3.2
  • rocm 6.0.2
  • PyTorch 2.1.2+git98a6632
  • xformers 0.0.23

Second:

  • vllm tag v0.3.2
  • rocm 5.7.0
  • PyTorch 2.0.1+git4c8bc42
  • xformers 0.0.23
g++ -pthread -B /opt/conda/envs/py_3.10/compiler_compat -shared -Wl,-rpath,/opt/conda/envs/py_3.10/lib -Wl,-rpath-link,/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib -Wl,-rpath,/opt/conda/envs/py_3.10/lib -Wl,-rpath-link,/opt/conda/envs/py_3.10/lib -L/opt/conda/envs/py_3.10/lib /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/activation_kernels.o /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/attention/attention_kernels.o /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/cache_kernels.o /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/hip_utils_kernels.o /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/layernorm_kernels.o /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/moe_align_block_size_kernels.o /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/pos_encoding_kernels.o /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/pybind.o /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/quantization/gptq/q_gemm.o /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/quantization/squeezellm/quant_hip_kernel.o -L/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib -L/opt/rocm/lib -L/opt/rocm/hip/lib -lc10 -ltorch -ltorch_cpu -ltorch_python -lamdhip64 -lc10_hip -ltorch_hip -o build/lib.linux-x86_64-cpython-310/vllm/_C.cpython-310-x86_64-linux-gnu.so
/opt/conda/envs/py_3.10/compiler_compat/ld: /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/cache_kernels.o: in function `__float2bfloat16(float)':
cache_kernels.hip:(.text+0x0): multiple definition of `__float2bfloat16(float)'; /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x0): first defined here
/opt/conda/envs/py_3.10/compiler_compat/ld: /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/cache_kernels.o: in function `__bfloat1622float2(__hip_bfloat162)':
cache_kernels.hip:(.text+0x40): multiple definition of `__bfloat1622float2(__hip_bfloat162)'; /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x40): first defined here
/opt/conda/envs/py_3.10/compiler_compat/ld: /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/cache_kernels.o: in function `__double2bfloat16(double)':
cache_kernels.hip:(.text+0x60): multiple definition of `__double2bfloat16(double)'; /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x60): first defined here
/opt/conda/envs/py_3.10/compiler_compat/ld: /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/cache_kernels.o: in function `__float22bfloat162_rn(HIP_vector_type<float, 2u>)':
cache_kernels.hip:(.text+0xa0): multiple definition of `__float22bfloat162_rn(HIP_vector_type<float, 2u>)'; /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0xa0): first defined here
/opt/conda/envs/py_3.10/compiler_compat/ld: /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/cache_kernels.o: in function `__high2float(__hip_bfloat162)':
cache_kernels.hip:(.text+0x110): multiple definition of `__high2float(__hip_bfloat162)'; /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x110): first defined here
/opt/conda/envs/py_3.10/compiler_compat/ld: /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/cache_kernels.o: in function `__low2float(__hip_bfloat162)':
cache_kernels.hip:(.text+0x120): multiple definition of `__low2float(__hip_bfloat162)'; /workspace/vllm/build/temp.linux-x86_64-cpython-310/csrc/attention/attention_kernels.o:attention_kernels.hip:(.text+0x120): first defined here
collect2: error: ld returned 1 exit status
error: command '/usr/bin/g++' failed with exit code 1

@george-kuanli-peng

Well, I can now build vllm from source on the first platform (ROCm 6.0.2) by appending static to the end of two lines in /opt/rocm/include/hip/amd_detail/amd_hip_bf16.h, as in ROCm/clr@77c581a

Ref: #2646 (comment)

However, I later hit another issue, as described in #3061
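For anyone applying the same workaround, here is a minimal sketch of the edit. It is demonstrated on a local copy of the two relevant #define lines rather than the installed header, since the exact header layout may differ across ROCm versions; back up the real file before touching it.

```shell
# Demonstrate the workaround on a local copy of the two #define lines.
# The real file lives at /opt/rocm/include/hip/amd_detail/amd_hip_bf16.h;
# verify the lines match your ROCm version before editing it.
cat > amd_hip_bf16_snippet.h <<'EOF'
#define __HOST_DEVICE__ __device__
#define __HOST_DEVICE__ __host__ __device__
EOF
# Append "static" / "static inline", mirroring ROCm/clr@77c581a
# (& in the replacement re-inserts the matched text).
sed -i \
  -e 's/__HOST_DEVICE__ __device__$/& static/' \
  -e 's/__HOST_DEVICE__ __host__ __device__$/& static inline/' \
  amd_hip_bf16_snippet.h
cat amd_hip_bf16_snippet.h
```

To patch the installed header itself, point the same sed invocation at the real path after making a backup copy.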

@cocoderss

I will give it a try. What is your system setup? Is it also an AMD iGPU?

@george-kuanli-peng

I am not using AMD integrated GPUs. They are MI210 and MI300X.

@hliuca
Contributor

hliuca commented Mar 7, 2024

This is caused by a header bug: the bf16 conversion helpers in amd_hip_bf16.h are declared without static or inline, so every translation unit that includes the header emits its own external definition and the link fails. The fix is to add static to the header:

--- amd_hip_bf16.h	2024-02-06 18:28:58.268699142 +0000
+++ amd_hip_bf16.h.new	2024-02-06 18:28:31.988647133 +0000
@@ -90,10 +90,10 @@
 #include "math_fwd.h" // ocml device functions
 
 #if defined(__HIPCC_RTC__)
-#define __HOST_DEVICE__ __device__
+#define __HOST_DEVICE__ __device__ static
 #else
 #include <climits>
-#define __HOST_DEVICE__ __host__ __device__
+#define __HOST_DEVICE__ __host__ __device__ static inline
 #endif

@fxmarty

fxmarty commented Apr 18, 2024

Same issue even with #2648

@fxmarty

fxmarty commented Apr 18, 2024

FYI, #2790 is needed, and with it this is fixed for me. This may be closed, imo.

@hongxiayang hongxiayang added rocm installation Installation problems labels Jul 13, 2024
@hongxiayang
Collaborator

@nayn99 Please update this issue or close this issue if your problem is resolved. Thanks.

6 participants