[core] Ray version change breaks SkyPilot cluster #2722
Comments
We should install SkyPilot's remote dependencies in an environment other than base to avoid this kind of issue. Another user encountered a similar issue after installing a package in the base environment.
To add a minimal repro
Inside
This breaks skypilot/sky/setup_files/setup.py (lines 190 to 194 at commit 51a831c) and results in job submission failures.
If the user's setup installs a newer version of ray than the one running on the SkyPilot remote cluster, SkyPilot will get stuck at streaming logs. Many packages (e.g., vllm==0.2.0) tend to install a newer version of ray. SkyPilot should:
Minimal repro:
Workaround
Install all dependencies in a new conda env.
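As a rough illustration of this workaround, the sketch below builds the shell commands that create a separate conda environment and install the user's packages there, so the base environment is left untouched. The helper name isolated_setup_commands, the environment name myenv, and the Python version are hypothetical choices for the example; only conda create and conda run are real conda subcommands.

```python
import shlex
from typing import List


def isolated_setup_commands(env_name: str, packages: List[str],
                            python_version: str = "3.10") -> List[str]:
    """Build shell commands that install `packages` into a fresh conda env.

    Every name and default here is an illustrative assumption, not
    something SkyPilot provides.
    """
    env = shlex.quote(env_name)
    pkgs = " ".join(shlex.quote(p) for p in packages)
    return [
        # Create a new environment instead of reusing base.
        f"conda create -y -n {env} python={shlex.quote(python_version)}",
        # Install the packages inside that environment only.
        f"conda run -n {env} pip install {pkgs}",
    ]


if __name__ == "__main__":
    for cmd in isolated_setup_commands("myenv", ["vllm==0.2.0"]):
        print(cmd)
```

The resulting commands could be placed in the task's setup step; later commands that need the packages would then also have to run inside the same environment.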
Debug
SSH into the cluster and run ray status to check the state of the Ray cluster.
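Because the failure comes from the installed ray version changing, a complementary check is to compare the version found in the environment against the one the cluster was launched with. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and the expected version is a placeholder the caller must supply, not a value taken from SkyPilot.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_drifted(expected: str, package: str = "ray") -> bool:
    """Return True if `package` is installed at a version other than `expected`.

    `expected` is whatever version the cluster was launched with; it is an
    input to this sketch, not something looked up automatically.
    """
    found = installed_version(package)
    return found is not None and found != expected
```

Running version_drifted with the originally installed version, in the same Python environment that ray runs in, would flag the case where a later pip install replaced ray.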