[CI/Build] build on empty device for better dev experience #4773
Conversation
Curious to hear your thoughts… currently the PR just enables building the platform-agnostic wheel. What do you think about adding this build to publish.yml so it will be built on every release?
requirements-cpu.txt
@@ -2,5 +2,4 @@
 -r requirements-common.txt
 
 # Dependencies for x86_64 CPUs
 torch == 2.3.0+cpu
-triton >= 2.2.0  # FIXME(woosuk): This is a hack to avoid import error.
Dealt with the triton import errors in code.
Interestingly, Triton recently added support for macOS with Apple Silicon: triton-lang/triton#3443
Yep. But AFAIU they still release wheels only for Linux, so it can't be pip-installed directly from PyPI. I guess building from source is the only option to get Triton installed on a Mac.
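For reference, one common way to "deal with triton import errors in code" is an availability probe like the following — a sketch of the general pattern, not necessarily the exact code this PR adds:

```python
from importlib.util import find_spec

# Probe for triton without importing it, so platforms with no triton
# wheel (e.g. macOS) don't crash at import time.
HAS_TRITON = find_spec("triton") is not None

if HAS_TRITON:
    import triton
else:
    # Callers must check HAS_TRITON before touching triton-backed kernels.
    triton = None
```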
This would be great!
➕ 1 for this feature. It would be really nice to enable local development for code completion and documentation references.
I think this feature is useful, but can you use the existing infra without introducing another env var, e.g. VLLM_TARGET_DEVICE?
@@ -7,5 +7,5 @@ nvidia-ml-py # for pynvml package
 torch == 2.4.0
 # These must be updated alongside torch
 torchvision == 0.19  # Required for phi3v processor. See https://github.com/pytorch/vision?tab=readme-ov-file#installation for corresponding version
-xformers == 0.0.27.post2  # Requires PyTorch 2.4.0
+xformers == 0.0.27.post2; platform_system == 'Linux' and platform_machine == 'x86_64'  # Requires PyTorch 2.4.0
 vllm-flash-attn == 2.6.1  # Requires PyTorch 2.4.0
This change is needed because we take the CUDA requirements for the "empty" device wheel, and xformers and vllm-flash-attn are available only on Linux.
platform_system == 'Linux' makes sense to me. Is platform_machine == 'x86_64' necessary?
vllm-flash-attn has published wheels only for x86_64, and no published tar.gz - https://pypi.org/project/vllm-flash-attn/#files. xformers also has wheels only for 64-bit machines. It does have a tar.gz, but from what I found online it can't be installed on 32-bit - https://pypi.org/project/xformers/#files. So I'm pretty sure it's needed for both.
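For anyone wanting to check how these PEP 508 environment markers resolve on their machine, a quick sketch using the packaging library (which pip itself depends on):

```python
from packaging.markers import Marker

marker = Marker("platform_system == 'Linux' and platform_machine == 'x86_64'")

# True on x86_64 Linux; False on e.g. Apple Silicon macOS or aarch64 Linux,
# in which case pip simply skips the requirement.
print(marker.evaluate())
```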
Thanks @youkaichao. Done now.
 # vLLM only supports Linux platform
-assert sys.platform.startswith(
-    "linux"), "vLLM only supports Linux platform (including WSL)."
+if not sys.platform.startswith("linux"):
This change actually makes it possible to install the published tar.gz on a Mac by setting VLLM_TARGET_DEVICE to "empty". It also logs a warning that vLLM won't actually be able to run.
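For context, a minimal sketch of what the new branch in setup.py could look like — the exact warning text and the surrounding variable handling are assumptions, not copied from the PR:

```python
import logging
import os
import sys

logger = logging.getLogger(__name__)

# "cuda" is assumed to be the default target; "empty" builds a device-less wheel.
VLLM_TARGET_DEVICE = os.getenv("VLLM_TARGET_DEVICE", "cuda")

# vLLM only supports Linux platform
if not sys.platform.startswith("linux"):
    logger.warning(
        "vLLM only supports Linux platform (including WSL). "
        "Building on %s, so vLLM may not be able to run correctly.",
        sys.platform)
    # Fall back to the "empty" target so the sdist still installs.
    VLLM_TARGET_DEVICE = "empty"
```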
@@ -350,7 +356,9 @@ def find_version(filepath: str) -> str:
 def get_vllm_version() -> str:
     version = find_version(get_path("vllm", "version.py"))
 
-    if _is_cuda():
+    if _no_device():
+        version += "+empty"
+    elif _is_cuda():
I was actually not sure if it is better to add "+empty" to the version or not… WDYT?
adding "+empty" looks good to me.
@youkaichao - now this PR actually does 2 things: it enables building a platform-agnostic wheel in CI, and it makes vLLM pip-installable (importable, but not runnable) on macOS.
IMO the second point is the nicer feature here. It really makes development using vLLM much easier on Macs. I'm not sure though if you'd want to include it, so if you think it's not a good idea, we can discard it and just leave the CI wheel build.
@tomeras91 this feature is only for dev and debugging. I don't think it makes any sense to publish to PyPI. Making the local install work should be enough.
I understand and agree.
Thanks for the great work! I manually verified that VLLM_TARGET_DEVICE="empty" pip install -vvv -e . works for macOS now 🎉
@tomeras91 can you add a follow-up PR for https://docs.vllm.ai/en/latest/getting_started/installation.html, to tell users how to use this feature?
Sure - here: #7403
This PR enables building a platform-agnostic wheel that is also installable on macOS. The idea is to improve the dev experience for projects that import and use vLLM.
Important: this wheel does not enable running vLLM on a Mac, but it does allow importing it.
The PR doesn't entirely fix issues #212, #695, #1397, #1921, but it's a step forward.
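To illustrate the dev experience this enables, a short sketch of what should work on a Mac after VLLM_TARGET_DEVICE="empty" pip install . (the version number is hypothetical):

```python
# On macOS with the "empty" wheel installed, imports resolve for IDEs,
# code completion, and static analysis — even though inference cannot run.
import vllm
print(vllm.__version__)  # e.g. "0.5.4+empty" (hypothetical version number)

from vllm import LLM, SamplingParams  # importable here, but not runnable
```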