
[Bug]: vllm v0.6.2 crashes on multiple GPUs #9225

Closed · 1 task done
tonyaw opened this issue Oct 10, 2024 · 4 comments
Labels: bug (Something isn't working)

Comments


tonyaw commented Oct 10, 2024

Your current environment

(Output of `python collect_env.py` not provided.)

Model Input Dumps

No response

🐛 Describe the bug

INFO: Started server process [11912]
INFO: Waiting for application startup.
INFO: Application startup complete.
ERROR: [Errno 98] error while attempting to bind on address ('0.0.0.0', 8080): address already in use
INFO: Waiting for application shutdown.
INFO: Application shutdown complete.
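
For context, [Errno 98] just means another process already holds the TCP port. A minimal sketch reproducing it outside of vLLM (assuming python3 is on PATH):

python3 -m http.server 8080 &   # first process binds 0.0.0.0:8080
python3 -m http.server 8080     # second bind fails with OSError: [Errno 98] Address already in use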

My command to start vllm:

python3 -m vllm.entrypoints.openai.api_server --model hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4 \
        --host 0.0.0.0 --port 8080  --seed 42 --trust-remote-code --disable-frontend-multiprocessing \
        --enable-chunked-prefill --tensor-parallel-size 2 --max-model-len 98304 >> "$LOG_FILE" 2>&1 &

If I change --tensor-parallel-size from 2 to 1, there is no such issue.

The docker image in use is "vllm/vllm-openai:v0.6.2".
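
If this happens again, one way to check which process already holds the port (a diagnostic sketch, assuming the iproute2 `ss` tool is available inside the container):

ss -ltnp | grep ':8080'   # shows the PID of whatever is already listening on port 8080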

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
tonyaw added the bug label on Oct 10, 2024
DarkLight1337 (Member) commented Oct 10, 2024

ERROR: [Errno 98] error while attempting to bind on address ('0.0.0.0', 8080): address already in use

It should have been fixed by #8537. Can you try using the latest main branch of vLLM?

tonyaw (Author) commented Oct 10, 2024

I just tried without "--disable-frontend-multiprocessing", and it works on v0.6.2. Thanks!

I will try the following main-branch wheel later:
https://vllm-wheels.s3.us-west-2.amazonaws.com/nightly/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl
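
For reference, installing that wheel would look something like this (a sketch; the dev version embedded in the nightly URL may change over time):

pip install https://vllm-wheels.s3.us-west-2.amazonaws.com/nightly/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl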

nikhilcms commented

Hi @DarkLight1337,

For me this error is not related to the --disable-frontend-multiprocessing param. I'm using v0.6.2 without this param, and the pod restarts automatically with an "OSError: [Errno 98] Address already in use" error.

If I'm correct, the fix for this issue has already been merged, so it would be great if you could push a new image that includes it.

DarkLight1337 (Member) commented

You can download images for specific commits as described here.
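
As a hypothetical sketch of what that looks like (the <registry> and <commit-sha> placeholders are not real values; fill them in from the instructions linked above):

docker pull <registry>/vllm/vllm-openai:<commit-sha>   # per-commit image; see the vLLM docs for the actual registry and tag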
