[Hardware][Metal] Apple Metal support #12640

skyzh · 2025-02-01T05:19:42Z

fix #2081

This patch makes some parts of vLLM run on Apple Metal by using the PyTorch MPS fallback mode (see build_and_run.sh): PyTorch operators that MPS supports run natively on MPS, while the remaining operations fall back to the CPU. The framework runs end to end and produces text, but the output is not sensible:

curl http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "Qwen/Qwen2.5-0.5B-Instruct",
        "prompt": "San Francisco is a",
        "max_tokens": 7,
        "temperature": 0
    }'
{"id":"cmpl-aba9771f627e400fab7be25d8b310fd5","object":"text_completion","created":1738386309,"model":"Qwen/Qwen2.5-0.5B-Instruct","choices":[{"index":0,"text":"!!!!!!!","logprobs":null,"finish_reason":"length","stop_reason":null,"prompt_logprobs":null}],"usage":{"prompt_tokens":4,"total_tokens":11,"completion_tokens":7,"prompt_tokens_details":null}}%

...which needs further debugging. Comments are welcome; I have no idea what's going on there.

For full Metal support, we would have to implement all current CUDA kernels in Metal, which will be a lot of work. So this patch is only a first step toward full Metal support.

In general, the patch treats the Metal platform as a CPU-based platform (i.e., it uses CPU workers) with the PyTorch MPS backend. This can be improved in the future, for example by using the GPU scheduler for Metal.
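For readers unfamiliar with the fallback mode mentioned above, here is a minimal sketch of how it is typically enabled from Python. It is not taken from this patch or from build_and_run.sh: `pick_device` is a hypothetical helper, and the sketch assumes the `PYTORCH_ENABLE_MPS_FALLBACK` environment variable should be set before `torch` is imported.

```python
import os

# Assumption: the fallback flag is set before torch is imported, so that
# operators without an MPS implementation run on the CPU instead of raising.
os.environ.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")


def pick_device() -> str:
    """Hypothetical helper: return "mps" when the MPS backend is usable, else "cpu"."""
    try:
        import torch
    except ImportError:
        # torch is not installed; nothing to accelerate.
        return "cpu"
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print(f"selected device: {pick_device()}")
```

On a machine without an MPS-capable PyTorch build, the helper simply reports "cpu", which matches the CPU-worker assumption described above.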

Signed-off-by: Alex Chi <iskyzh@gmail.com>

github-actions · 2025-02-01T05:19:53Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

Add ready label to the PR
Enable auto-merge.

🚀

skyzh added 2 commits January 31, 2025 19:04

initial support for metal

cd180dd

Signed-off-by: Alex Chi <iskyzh@gmail.com>

ensure all tensors are on the same device

40b1bb7

Signed-off-by: Alex Chi <iskyzh@gmail.com>

mergify bot added the ci/build label Feb 1, 2025

skyzh mentioned this pull request Feb 1, 2025

Inquiry Regarding vLLM Support for Mac Metal API #2081

Closed

mgoin self-requested a review February 1, 2025 17:20
