vLLM's default multiprocessing method is incompatible with ROCm and Gaudi #2439
Comments
Additional comment by Russell:

IMO we should fix this in the ODH branches for the vLLM versions we are pulling in, and also in any container definitions. I don't have a problem with adding some code to InstructLab as a redundancy to handle plugging in different versions of vLLM.

@n1hility does ODH vLLM?

Looks like we need to wait for another bump. The branches were created in ODH for Intel and AMD, but they are not utilized yet, and we still need a patch here.
Describe the bug
vLLM defaults to `VLLM_WORKER_MULTIPROC_METHOD=fork` (https://docs.vllm.ai/en/v0.6.1/serving/env_vars.html). Forking is incompatible with ROCm and Gaudi.

To Reproduce
Run `ilab model serve` on a system with more than one AMD GPU.

Expected behavior
vLLM works.
Screenshots
Additional context
I recommend switching to "spawn". Python is moving from fork to spawn as the default start method on all platforms. The fork method has known issues; for example, it can lead to deadlocks when a process mixes threads and fork.
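For context, here is a minimal sketch of the two start methods in Python's standard `multiprocessing` module, which is what `VLLM_WORKER_MULTIPROC_METHOD` selects between for vLLM's worker processes. This is an illustration only, not vLLM's worker code:

```python
# Minimal illustration of fork vs. spawn start methods in Python's
# multiprocessing module. "spawn" starts a fresh interpreter, so no
# thread or lock state is inherited from the parent; "fork" copies the
# parent process, which can deadlock if the parent already holds locks
# in other threads.
import multiprocessing as mp

def worker(msg: str) -> None:
    print(f"worker started via {mp.get_start_method()}: {msg}")

if __name__ == "__main__":
    ctx = mp.get_context("spawn")  # request the spawn start method explicitly
    p = ctx.Process(target=worker, args=("hello",))
    p.start()
    p.join()
```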
I switched InstructLab to spawn a long time ago because it was causing trouble on Gaudi, see #956. InstructLab should set `VLLM_WORKER_MULTIPROC_METHOD=spawn` by default.