
[Bug]: vllm v0.6.2 crashes on multiple GPUs #9225

Closed · 1 task done
tonyaw opened this issue Oct 10, 2024 · 4 comments
Labels: bug (Something isn't working)

Comments


tonyaw commented Oct 10, 2024

Your current environment

(Output of `python collect_env.py` not provided.)

Model Input Dumps

No response

🐛 Describe the bug

INFO: Started server process [11912]
INFO: Waiting for application startup.
INFO: Application startup complete.
ERROR: [Errno 98] error while attempting to bind on address ('0.0.0.0', 8080): address already in use
INFO: Waiting for application shutdown.
INFO: Application shutdown complete.
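
For context, [Errno 98] just means another process already holds the TCP port. A minimal sketch reproducing it outside of vLLM (assuming python3 is on PATH):

python3 -m http.server 8080 &   # first process binds 0.0.0.0:8080
python3 -m http.server 8080     # second bind fails with OSError: [Errno 98] Address already in use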

My command to start vllm:

python3 -m vllm.entrypoints.openai.api_server --model hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4 \
        --host 0.0.0.0 --port 8080  --seed 42 --trust-remote-code --disable-frontend-multiprocessing \
        --enable-chunked-prefill --tensor-parallel-size 2 --max-model-len 98304 >> "$LOG_FILE" 2>&1 &

If I change --tensor-parallel-size from 2 to 1, there is no such issue.

The docker image in use is "vllm/vllm-openai:v0.6.2".
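
If this happens again, one way to check which process already holds the port (a diagnostic sketch, assuming the iproute2 `ss` tool is available inside the container):

ss -ltnp | grep ':8080'   # shows the PID of whatever is already listening on port 8080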

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
tonyaw added the bug label on Oct 10, 2024
DarkLight1337 (Member) commented Oct 10, 2024

ERROR: [Errno 98] error while attempting to bind on address ('0.0.0.0', 8080): address already in use

It should have been fixed by #8537. Can you try using the latest main branch of vLLM?

tonyaw (Author) commented Oct 10, 2024

I just tried without "--disable-frontend-multiprocessing", and it works on v0.6.2. Thanks!

I will try the following main-branch wheel later:
https://vllm-wheels.s3.us-west-2.amazonaws.com/nightly/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl
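
For reference, installing that wheel would look something like this (a sketch; the dev version embedded in the nightly URL may change over time):

pip install https://vllm-wheels.s3.us-west-2.amazonaws.com/nightly/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl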

nikhilcms commented

Hi @DarkLight1337,

For me this error is not related to the --disable-frontend-multiprocessing param. I'm using v0.6.2 without this param, and the pod restarts automatically with an "OSError: [Errno 98] Address already in use" error.

If I'm correct, the fix for this issue has already been merged, so it would be great if you could push a new image that includes it.

DarkLight1337 (Member) commented

You can download images for specific commits as described here.
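
As a hypothetical sketch of what that looks like (the <registry> and <commit-sha> placeholders are not real values; fill them in from the instructions linked above):

docker pull <registry>/vllm/vllm-openai:<commit-sha>   # per-commit image; see the vLLM docs for the actual registry and tag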
