Support Model #88

Open
duzw9311 opened this issue Jan 21, 2025 · 2 comments
@duzw9311

Is this framework suitable for Qwen2-VL?

@duzw9311
Author

After I finish the build process and test with an OpenAI-compatible request, I get the following error on the prefill node.

INFO: Started server process [7926]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8100 (Press CTRL+C to quit)
INFO 01-21 20:35:45 logger.py:37] Received request cmpl-34ec2cb5ef59408696b3bd5fd3e85858-0: prompt: 'San Francisco is a', params: SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=1.0, top_p=1.0, top_k=-1, min_p=0.0, seed=None, stop=[], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=1, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, guided_decoding=None), prompt_token_ids: [23729, 12879, 374, 264], lora_request: None, prompt_adapter_request: None.
INFO 01-21 20:35:45 engine.py:267] Added request cmpl-34ec2cb5ef59408696b3bd5fd3e85858-0.
INFO 01-21 20:35:45 model_runner_base.py:120] Writing input of failed execution to /tmp/err_execute_model_input_20250121-203545.pkl...
INFO 01-21 20:35:45 model_runner_base.py:149] Completed writing input of failed execution to /tmp/err_execute_model_input_20250121-203545.pkl.
CRITICAL 01-21 20:35:45 launcher.py:99] MQLLMEngine is already dead, terminating server process
INFO: 127.0.0.1:52348 - "POST /v1/completions HTTP/1.1" 500 Internal Server Error
ERROR 01-21 20:35:45 engine.py:135] AttributeError("Error in model execution (input dumped to /tmp/err_execute_model_input_20250121-203545.pkl): 'Qwen2VLForConditionalGeneration' object has no attribute 'model'")
ERROR 01-21 20:35:45 engine.py:135] Traceback (most recent call last):
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/worker/model_runner_base.py", line 116, in _wrapper
ERROR 01-21 20:35:45 engine.py:135] return func(*args, **kwargs)
ERROR 01-21 20:35:45 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/worker/model_runner.py", line 1708, in execute_model
ERROR 01-21 20:35:45 engine.py:135] get_kv_transfer_group().send_kv_caches_and_hidden_states(
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/distributed/kv_transfer/kv_transfer_agent.py", line 60, in send_kv_caches_and_hidden_states
ERROR 01-21 20:35:45 engine.py:135] self.connector.send_kv_caches_and_hidden_states(
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/distributed/kv_transfer/kv_connector/simple_connector.py", line 160, in send_kv_caches_and_hidden_states
ERROR 01-21 20:35:45 engine.py:135] start_layer = model_executable.model.start_layer
ERROR 01-21 20:35:45 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1931, in __getattr__
ERROR 01-21 20:35:45 engine.py:135] raise AttributeError(
ERROR 01-21 20:35:45 engine.py:135] AttributeError: 'Qwen2VLForConditionalGeneration' object has no attribute 'model'
ERROR 01-21 20:35:45 engine.py:135]
ERROR 01-21 20:35:45 engine.py:135] The above exception was the direct cause of the following exception:
ERROR 01-21 20:35:45 engine.py:135]
ERROR 01-21 20:35:45 engine.py:135] Traceback (most recent call last):
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/engine/multiprocessing/engine.py", line 133, in start
ERROR 01-21 20:35:45 engine.py:135] self.run_engine_loop()
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/engine/multiprocessing/engine.py", line 196, in run_engine_loop
ERROR 01-21 20:35:45 engine.py:135] request_outputs = self.engine_step()
ERROR 01-21 20:35:45 engine.py:135] ^^^^^^^^^^^^^^^^^^
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/engine/multiprocessing/engine.py", line 214, in engine_step
ERROR 01-21 20:35:45 engine.py:135] raise e
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/engine/multiprocessing/engine.py", line 205, in engine_step
ERROR 01-21 20:35:45 engine.py:135] return self.engine.step()
ERROR 01-21 20:35:45 engine.py:135] ^^^^^^^^^^^^^^^^^^
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/engine/llm_engine.py", line 1390, in step
ERROR 01-21 20:35:45 engine.py:135] outputs = self.model_executor.execute_model(
ERROR 01-21 20:35:45 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/executor/gpu_executor.py", line 88, in execute_model
ERROR 01-21 20:35:45 engine.py:135] output = self.driver_worker.execute_model(execute_model_req)
ERROR 01-21 20:35:45 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/worker/worker_base.py", line 343, in execute_model
ERROR 01-21 20:35:45 engine.py:135] output = self.model_runner.execute_model(
ERROR 01-21 20:35:45 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
ERROR 01-21 20:35:45 engine.py:135] return func(*args, **kwargs)
ERROR 01-21 20:35:45 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/worker/model_runner_base.py", line 152, in _wrapper
ERROR 01-21 20:35:45 engine.py:135] raise type(err)(
ERROR 01-21 20:35:45 engine.py:135] AttributeError: Error in model execution (input dumped to /tmp/err_execute_model_input_20250121-203545.pkl): 'Qwen2VLForConditionalGeneration' object has no attribute 'model'
INFO: Shutting down
INFO: Waiting for application shutdown.
INFO: Application shutdown complete.
INFO: Finished server process [7926]

@ShangmingCai
Collaborator

Is this framework suitable for Qwen2-VL?

Currently, VL models are not supported, since they have different architectures. We will follow the vLLM community and make the different features compatible with each other; this is on the roadmap.
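To illustrate the architecture mismatch behind the traceback above: the KV-transfer connector reads `model_executable.model.start_layer`, which assumes the executable wraps its transformer layers under a `.model` attribute. A minimal, self-contained sketch (the class names mirror the log; the internal attribute layout of the VL wrapper, here called `language_model`, is an assumption for illustration, not vLLM's actual code):

```python
# Hypothetical stand-ins to reproduce the attribute-access failure in isolation.

class Qwen2Model:
    """Stand-in for the text backbone, which exposes layer bounds."""
    start_layer = 0
    end_layer = 28


class Qwen2ForCausalLM:
    """Text-only wrapper: layers live under `.model`, as the connector expects."""
    def __init__(self):
        self.model = Qwen2Model()


class Qwen2VLForConditionalGeneration:
    """VL wrapper: the backbone hangs off a different attribute (assumed name)."""
    def __init__(self):
        self.language_model = Qwen2Model()  # no `.model` attribute exists


def get_start_layer(model_executable):
    # This mirrors the line in the traceback:
    #   start_layer = model_executable.model.start_layer
    # It works for text-only models but raises AttributeError for the VL wrapper.
    return model_executable.model.start_layer
```

Calling `get_start_layer(Qwen2ForCausalLM())` succeeds, while `get_start_layer(Qwen2VLForConditionalGeneration())` raises the same `AttributeError: ... object has no attribute 'model'` seen in the log, which is why VL support requires connector-side changes rather than a configuration fix.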
