[Bug]: Error Running Qwen2.5-7B-Instruct on CPU #9175
Comments
I built the image from the main-branch Dockerfile.cpu with "docker build -f Dockerfile.cpu -t vllm-cpu-env ." and ran it, but requests fail and I don't know why.
request example:
debug logging:
INFO 10-09 02:47:14 logger.py:37] Received request chat-6e6b6292da0e43efafa4a55db7260b38: prompt: '<|im_start|>system\nYou are Qwen, created by Alibaba Cloud. You are a helpful assistant.\n\n# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within XML tags:\n\n{"type": "function", "function": {"name": "cInBMXzQGq58", "description": "系统状态方面", "parameters": {"type": "object", "properties": {}, "required": []}}}\n\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{"name": , "arguments": }\n</tool_call><|im_end|>\n<|im_start|>user\n系统状态<|im_end|>\n<|im_start|>assistant\n', params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.01, top_p=1.0, top_k=-1, min_p=0.0, seed=None, stop=[], stop_token_ids=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=8000, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None), guided_decoding=GuidedDecodingParams(json=None, regex=None, choice=None, grammar=None, json_object=None, backend=None, whitespace_pattern=None), prompt_token_ids: [151644, 8948, 198, 2610, 525, 1207, 16948, 11, 3465, 553, 54364, 14817, 13, 1446, 525, 264, 10950, 17847, 382, 2, 13852, 271, 2610, 1231, 1618, 825, 476, 803, 5746, 311, 7789, 448, 279, 1196, 3239, 382, 2610, 525, 3897, 448, 729, 32628, 2878, 366, 15918, 1472, 15918, 29, 11874, 9492, 510, 27, 15918, 397, 4913, 1313, 788, 330, 1688, 497, 330, 1688, 788, 5212, 606, 788, 330, 66, 641, 28942, 55, 89, 48, 38, 80, 20, 23, 497, 330, 4684, 788, 330, 72448, 44091, 99522, 497, 330, 13786, 788, 5212, 1313, 788, 330, 1700, 497, 330, 13193, 788, 16452, 330, 6279, 788, 3056, 3417, 532, 522, 15918, 1339, 2461, 1817, 729, 1618, 11, 470, 264, 2951, 1633, 448, 729, 829, 323, 5977, 2878, 220, 151657, 151658, 11874, 9492, 510, 151657, 198, 4913, 606, 788, 366, 1688, 11494, 8066, 330, 16370, 788, 366, 2116, 56080, 40432, 31296, 151658, 151645, 198, 151644, 872, 198, 72448, 44091, 151645, 198, 151644, 77091, 198], lora_request: None, prompt_adapter_request: None.
ERROR 10-09 02:47:48 client.py:250] RuntimeError('Engine loop has died')
INFO 10-09 02:47:49 metrics.py:351] Avg prompt throughput: 5.1 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.2%, CPU KV cache usage: 0.0%.
@K-Mistele can you help look into the second issue?
@njhill @robertgshaw2-neuralmagic the first issue is another case where the real exception cannot be seen from the stack trace.
@DarkLight1337 I'll try to take a look soon re the error suppression.
@njhill - took a quick look into these issues. Basically I think if we raise an error / exception anywhere outside the LLMEngine class, we are not propagating the full stack trace back (since we are just sending the exception itself over zmq)
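To illustrate the point about the lost stack trace, here is a minimal sketch, assuming the exception object is serialized with `pickle` before being sent between processes. It is not vLLM's actual code: `failing_step` and `capture` are hypothetical names, and no zmq socket is involved. It only shows that a pickled exception keeps its message but drops its traceback, and that formatting the traceback to a string on the raising side is one way to keep it.

```python
import pickle
import traceback


def failing_step():
    # Hypothetical stand-in for code that raises outside the engine class.
    raise ValueError("bad input")


def capture():
    """Return the exception, its pickled bytes, and its formatted traceback."""
    try:
        failing_step()
    except ValueError as exc:
        # Format the traceback while it is still attached to the exception.
        formatted = "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        )
        return exc, pickle.dumps(exc), formatted


original, payload, formatted_tb = capture()

# Simulates the receiving process rebuilding the exception from bytes.
restored = pickle.loads(payload)

has_tb_before = original.__traceback__ is not None
has_tb_after = restored.__traceback__ is not None

print("message survives:", str(restored))
print("traceback before pickling:", has_tb_before)
print("traceback after unpickling:", has_tb_after)
print(formatted_tb)
```

Sending the formatted string alongside the exception would let the receiving side log where the error was raised, at the cost of a slightly larger message.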
Your current environment
The output of `python collect_env.py`
Model Input Dumps
No response
🐛 Describe the bug
Installed vllm for CPU as mentioned in https://docs.vllm.ai/en/latest/getting_started/cpu-installation.html#
I run the container with `docker compose`:
Error:
Qwen2-7B-Instruct has the same issue. (The strange thing is that the first request to Qwen2.5-7B-Instruct succeeded and the second request failed, while Qwen2-7B-Instruct fails on the first request.)
I have tried #9044, but that pull request did not solve my problem.