Dockerfile.ubi: use CUDA 12.1 #130

dtrifiro · 2024-08-13T21:31:54Z

Dockerfile.ubi: use CUDA 12.1 instead of 12.4

openshift-ci · 2024-08-13T21:31:57Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

openshift-ci · 2024-08-13T21:32:00Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dtrifiro

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [dtrifiro]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

dtrifiro · 2024-08-13T21:32:01Z

/test all

dtrifiro · 2024-08-13T23:00:58Z

@maxdebayser as you can see from CI, the build fails with CUDA 12.1. Here's the relevant message from the logs:

FAILED: CMakeFiles/_C.dir/csrc/quantization/cutlass_w8a8/scaled_mm_c2x.cu.o 
ccache /usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler -DPy_LIMITED_API=3 -DTORCH_EXTENSION_NAME=_C -DUSE_C10D_GLOO -DUSE_C10D_NCCL -DUSE_DISTRIBUTED -DUSE_RPC -DUSE_TENSORPIPE -D_C_EXPORTS -I/workspace/csrc -I/workspace/build/temp.linux-x86_64-cpython-311/_deps/cutlass-src/include -isystem /usr/include/python3.11 -isystem /opt/vllm/lib64/python3.11/site-packages/torch/include -isystem /opt/vllm/lib64/python3.11/site-packages/torch/include/torch/csrc/api/include -isystem /usr/local/cuda/include -DONNX_NAMESPACE=onnx_c2 -Xcudafe --diag_suppress=cc_clobber_ignored,--diag_suppress=field_without_dll_interface,--diag_suppress=base_class_has_different_dll_interface,--diag_suppress=dll_interface_conflict_none_assumed,--diag_suppress=dll_interface_conflict_dllexport_assumed,--diag_suppress=bad_friend_decl --expt-relaxed-constexpr --expt-extended-lambda -O3 -DNDEBUG -std=c++17 "--generate-code=arch=compute_70,code=[sm_70]" "--generate-code=arch=compute_75,code=[sm_75]" "--generate-code=arch=compute_80,code=[sm_80]" "--generate-code=arch=compute_86,code=[sm_86]" "--generate-code=arch=compute_89,code=[sm_89]" "--generate-code=arch=compute_90,code=[sm_90]" "--generate-code=arch=compute_90,code=[compute_90]" -Xcompiler=-fPIC --expt-relaxed-constexpr -DENABLE_FP8 --threads=2 -D_GLIBCXX_USE_CXX11_ABI=0 -MD -MT CMakeFiles/_C.dir/csrc/quantization/cutlass_w8a8/scaled_mm_c2x.cu.o -MF CMakeFiles/_C.dir/csrc/quantization/cutlass_w8a8/scaled_mm_c2x.cu.o.d -x cu -c /workspace/csrc/quantization/cutlass_w8a8/scaled_mm_c2x.cu -o CMakeFiles/_C.dir/csrc/quantization/cutlass_w8a8/scaled_mm_c2x.cu.o
[27/32] Building CUDA object CMakeFiles/_C.dir/csrc/quantization/cutlass_w8a8/scaled_mm_entry.cu.o
[28/32] Building CUDA object CMakeFiles/_C.dir/csrc/attention/attention_kernels.cu.o
[29/32] Building CUDA object CMakeFiles/_C.dir/csrc/quantization/gptq_marlin/gptq_marlin.cu.o
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/workspace/setup.py", line 456, in <module>
    setup(
  File "/opt/vllm/lib64/python3.11/site-packages/setuptools/__init__.py", line 87, in setup
    return distutils.core.setup(**attrs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/vllm/lib64/python3.11/site-packages/setuptools/_distutils/core.py", line 185, in setup
    return run_commands(dist)
           ^^^^^^^^^^^^^^^^^^
  File "/opt/vllm/lib64/python3.11/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
    dist.run_commands()
  File "/opt/vllm/lib64/python3.11/site-packages/setuptools/_distutils/dist.py", line 968, in run_commands
    self.run_command(cmd)
  File "/opt/vllm/lib64/python3.11/site-packages/setuptools/dist.py", line 1217, in run_command
    super().run_command(command)
  File "/opt/vllm/lib64/python3.11/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
    cmd_obj.run()
  File "/opt/vllm/lib64/python3.11/site-packages/wheel/_bdist_wheel.py", line 378, in run
    self.run_command("build")
  File "/opt/vllm/lib64/python3.11/site-packages/setuptools/_distutils/cmd.py", line 319, in run_command
    self.distribution.run_command(command)
  File "/opt/vllm/lib64/python3.11/site-packages/setuptools/dist.py", line 1217, in run_command
    super().run_command(command)
  File "/opt/vllm/lib64/python3.11/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
    cmd_obj.run()
  File "/opt/vllm/lib64/python3.11/site-packages/setuptools/_distutils/command/build.py", line 132, in run
    self.run_command(cmd_name)
  File "/opt/vllm/lib64/python3.11/site-packages/setuptools/_distutils/cmd.py", line 319, in run_command
    self.distribution.run_command(command)
  File "/opt/vllm/lib64/python3.11/site-packages/setuptools/dist.py", line 1217, in run_command
    super().run_command(command)
  File "/opt/vllm/lib64/python3.11/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
    cmd_obj.run()
  File "/opt/vllm/lib64/python3.11/site-packages/setuptools/command/build_ext.py", line 84, in run
    _build_ext.run(self)
  File "/opt/vllm/lib64/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 346, in run
    self.build_extensions()
  File "/workspace/setup.py", line 231, in build_extensions
    subprocess.check_call(["cmake", *build_args], cwd=self.build_temp)
  File "/usr/lib64/python3.11/subprocess.py", line 413, in check_call
    raise CalledProcessError(retcode, cmd)
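For anyone triaging a similar log: the excerpt above shows which object failed to build, but not the compiler diagnostic itself. A minimal sketch for pulling the failing targets out of a ninja log, assuming the usual `FAILED: <target>` line format seen above (the `failed_targets` helper and `sample_log` are hypothetical, written only for illustration):

```python
import re

# ninja prints "FAILED: <target>" for each build step whose command exited non-zero.
FAILED_RE = re.compile(r"^FAILED: (\S+)")


def failed_targets(log_text: str) -> list[str]:
    """Return the build targets that ninja reported as failed, in log order."""
    targets = []
    for line in log_text.splitlines():
        match = FAILED_RE.match(line)
        if match:
            targets.append(match.group(1))
    return targets


# Shortened copy of the log excerpt quoted in this comment.
sample_log = """\
FAILED: CMakeFiles/_C.dir/csrc/quantization/cutlass_w8a8/scaled_mm_c2x.cu.o
[27/32] Building CUDA object CMakeFiles/_C.dir/csrc/quantization/cutlass_w8a8/scaled_mm_entry.cu.o
ninja: build stopped: subcommand failed.
"""

print(failed_targets(sample_log))
```

On this excerpt the only reported target is the `scaled_mm_c2x.cu.o` object. The actual nvcc error would have to be read from the lines following the command in the full CI log.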

dtrifiro · 2024-08-14T15:34:40Z

As a note:
The upstream Dockerfile uses CUDA 12.4: https://github.com/opendatahub-io/vllm/blob/main/Dockerfile#L8
This is overridden for release wheels, though: https://github.com/opendatahub-io/vllm/blob/main/.buildkite/release-pipeline.yaml#L6

openshift-merge-robot · 2024-08-25T09:20:58Z

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

openshift-ci · 2024-09-03T22:13:02Z

@dtrifiro: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/smoke-test | `0703e45` | link | true | `/test smoke-test` |
| ci/prow/pr-image-mirror | `0703e45` | link | true | `/test pr-image-mirror` |
| ci/prow/images | `0703e45` | link | true | `/test images` |
| ci/prow/rocm-pr-image-mirror | `0703e45` | link | true | `/test rocm-pr-image-mirror` |
| ci/prow/cuda-pr-image-mirror | `0703e45` | link | true | `/test cuda-pr-image-mirror` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

dtrifiro · 2024-09-12T09:32:07Z

Closing as stale

* First version
* Revert error. While there, add missing finalize.
* Use the correct defaults for ROCm. Increase sampling area to capture crossover.
* Scope end_sync as well.
* Guard only volatile keyword for ifndef USE_ROCM
* Document crossover

@iotamudelta

* Per @iotamudelta suggestion until the deadlocks issue is better understood
  Revert "Make CAR ROCm 6.1 compatible. (opendatahub-io#137)"
  This reverts commit 4d2dda6.
* Per @iotamudelta suggestion until the deadlocks issue is better understood
  Revert "Optimize custom all reduce (opendatahub-io#130)"
  This reverts commit 636ff01.

dtrifiro added 5 commits August 12, 2024 18:31

deps: bump vllm-tgis-adapter to 0.2.4

a58d5f2

Dockerfile.ubi: force using python-installed cuda runtime libraries

6b47904

Dockerfile: use uv pip everywhere (it's faster)

2d71e49

Dockerfile.ubi: bump flashinfer to 0.1.2

d7862bd

smoke test: kill server on timeout

75f32d8

openshift-ci bot added the do-not-merge/work-in-progress label Aug 13, 2024

openshift-ci bot added the approved label Aug 13, 2024

Dockerfile.ubi: use cuda 12.1 instead of 12.4

0703e45

dtrifiro mentioned this pull request Aug 13, 2024

fix build for 0.5.4 #127

Merged

openshift-merge-robot added the needs-rebase label Aug 25, 2024

dtrifiro closed this Sep 12, 2024

dtrifiro deleted the set-cuda-to-12.1 branch September 12, 2024 09:32
