
[Hardware][Intel CPU] Update torch 2.4.0 for CPU backend #6931

Merged
merged 2 commits into from
Aug 2, 2024

Conversation

DamonFool
Contributor

The CPU targets of vLLM fail to build with the following error.

Re-run cmake no build system arguments
-- The CXX compiler identification is GNU 12.3.0
-- Detecting CXX compiler ABI info 
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features 
-- Detecting CXX compile features - done
-- Build type: Release
-- Target device: cpu
-- Found Python: /usr/bin/python3 (found version "3.10.12") found components: Interpreter Development.Module Development.SABIModule
-- Found python matching: /usr/bin/python3.
CUDA_TOOLKIT_ROOT_DIR not found or specified
-- Could NOT find CUDA (missing: CUDA_TOOLKIT_ROOT_DIR CUDA_NVCC_EXECUTABLE CUDA_INCLUDE_DIRS CUDA_CUDART_LIBRARY)
CMake Warning at /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Caffe2/public/cuda.cmake:31 (message):
  Caffe2: CUDA cannot be found.  Depending on whether you are building Caffe2
  or a Caffe2 dependent library, the next warning / error will give you more
  info.
Call Stack (most recent call first):
  /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
  /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
  CMakeLists.txt:67 (find_package)


CMake Error at /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:90 (message):
  Your installed Caffe2 version uses CUDA but I cannot find the CUDA
  libraries.  Please set the proper CUDA prefixes and / or install CUDA.
Call Stack (most recent call first):
  /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
  CMakeLists.txt:67 (find_package)

The reason is that the installed torch is not the cpu-only build.

torch                             2.4.0
torchvision                       0.19.0+cpu

I guess we should install torch from https://download.pytorch.org/whl/cpu as before.
After this patch, the cpu-only torch is installed and the build passes.

torch                             2.4.0+cpu
torchvision                       0.19.0+cpu
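For reference, a minimal sketch of installing cpu-only wheels directly from the PyTorch CPU index mentioned above (the exact pins and flags vLLM uses may differ; versions here are the ones shown in this PR):

```shell
# Force resolution against the PyTorch CPU wheel index so that the
# "+cpu" builds are picked instead of the default (CUDA-enabled) wheels.
pip install torch==2.4.0 torchvision==0.19.0 \
    --index-url https://download.pytorch.org/whl/cpu
```

Using `--index-url` (rather than `--extra-index-url` alongside PyPI) avoids the failure mode described above, where the resolver falls back to the default `torch 2.4.0` wheel whose TorchConfig.cmake expects CUDA.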

Please review it.
Thanks.


👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which consists of a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of the default ones by unblocking the steps in your fast-check build on the Buildkite UI.

Once the PR is approved and ready to go, please make sure to run the full CI, as it is required for merging (or just use auto-merge).

To run full CI, you can do one of these:

  • Comment /ready on the PR
  • Add ready label to the PR
  • Enable auto-merge.

🚀

@bigPYJ1151
Contributor

Hi @DamonFool , sorry for the delay. I think you can change the title to "Update torch 2.4.0 for CPU backend" and mark the PR as ready to trigger a full CI run. Thanks!

@DamonFool DamonFool changed the title [Hardware][Intel CPU] Fix build failure due to not cpu-only torch installed [Hardware][Intel CPU] Update torch 2.4.0 for CPU backend Aug 1, 2024
@DamonFool
Contributor Author

/ready

@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Aug 1, 2024
@DamonFool
Contributor Author

> Hi @DamonFool , sorry for the delay. I think you can change the title to "Update torch 2.4.0 for CPU backend" and mark the PR as ready to trigger a full CI run. Thanks!

Done.
Thanks @bigPYJ1151 .

@DamonFool
Contributor Author

Hi @bigPYJ1151 , the failing tests all run on GPUs, so they should not be affected by this change.

@DamonFool
Contributor Author

Hi @bigPYJ1151 , I would suggest fixing this bug as soon as possible, since it is a build failure.
Please let me know if you're fine with this change.
Thanks.

@bigPYJ1151
Contributor

Hi @mgoin , would you please help to merge this PR as it passed the CPU tests? Thanks!

@simon-mo simon-mo merged commit c16eaac into vllm-project:main Aug 2, 2024
80 of 85 checks passed
@DamonFool
Contributor Author

Thanks @simon-mo for your help.
