Support Model #88

Open
duzw9311 opened this issue Jan 21, 2025 · 2 comments
@duzw9311

Is this framework suitable for Qwen2-VL?

@duzw9311
Author

After I finish the build process and test with an OpenAI-compatible request, I get the following error on the prefill node.

INFO: Started server process [7926]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8100 (Press CTRL+C to quit)
INFO 01-21 20:35:45 logger.py:37] Received request cmpl-34ec2cb5ef59408696b3bd5fd3e85858-0: prompt: 'San Francisco is a', params: SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=1.0, top_p=1.0, top_k=-1, min_p=0.0, seed=None, stop=[], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=1, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, guided_decoding=None), prompt_token_ids: [23729, 12879, 374, 264], lora_request: None, prompt_adapter_request: None.
INFO 01-21 20:35:45 engine.py:267] Added request cmpl-34ec2cb5ef59408696b3bd5fd3e85858-0.
INFO 01-21 20:35:45 model_runner_base.py:120] Writing input of failed execution to /tmp/err_execute_model_input_20250121-203545.pkl...
INFO 01-21 20:35:45 model_runner_base.py:149] Completed writing input of failed execution to /tmp/err_execute_model_input_20250121-203545.pkl.
CRITICAL 01-21 20:35:45 launcher.py:99] MQLLMEngine is already dead, terminating server process
INFO: 127.0.0.1:52348 - "POST /v1/completions HTTP/1.1" 500 Internal Server Error
ERROR 01-21 20:35:45 engine.py:135] AttributeError("Error in model execution (input dumped to /tmp/err_execute_model_input_20250121-203545.pkl): 'Qwen2VLForConditionalGeneration' object has no attribute 'model'")
ERROR 01-21 20:35:45 engine.py:135] Traceback (most recent call last):
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/worker/model_runner_base.py", line 116, in _wrapper
ERROR 01-21 20:35:45 engine.py:135] return func(*args, **kwargs)
ERROR 01-21 20:35:45 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/worker/model_runner.py", line 1708, in execute_model
ERROR 01-21 20:35:45 engine.py:135] get_kv_transfer_group().send_kv_caches_and_hidden_states(
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/distributed/kv_transfer/kv_transfer_agent.py", line 60, in send_kv_caches_and_hidden_states
ERROR 01-21 20:35:45 engine.py:135] self.connector.send_kv_caches_and_hidden_states(
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/distributed/kv_transfer/kv_connector/simple_connector.py", line 160, in send_kv_caches_and_hidden_states
ERROR 01-21 20:35:45 engine.py:135] start_layer = model_executable.model.start_layer
ERROR 01-21 20:35:45 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1931, in __getattr__
ERROR 01-21 20:35:45 engine.py:135] raise AttributeError(
ERROR 01-21 20:35:45 engine.py:135] AttributeError: 'Qwen2VLForConditionalGeneration' object has no attribute 'model'
ERROR 01-21 20:35:45 engine.py:135]
ERROR 01-21 20:35:45 engine.py:135] The above exception was the direct cause of the following exception:
ERROR 01-21 20:35:45 engine.py:135]
ERROR 01-21 20:35:45 engine.py:135] Traceback (most recent call last):
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/engine/multiprocessing/engine.py", line 133, in start
ERROR 01-21 20:35:45 engine.py:135] self.run_engine_loop()
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/engine/multiprocessing/engine.py", line 196, in run_engine_loop
ERROR 01-21 20:35:45 engine.py:135] request_outputs = self.engine_step()
ERROR 01-21 20:35:45 engine.py:135] ^^^^^^^^^^^^^^^^^^
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/engine/multiprocessing/engine.py", line 214, in engine_step
ERROR 01-21 20:35:45 engine.py:135] raise e
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/engine/multiprocessing/engine.py", line 205, in engine_step
ERROR 01-21 20:35:45 engine.py:135] return self.engine.step()
ERROR 01-21 20:35:45 engine.py:135] ^^^^^^^^^^^^^^^^^^
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/engine/llm_engine.py", line 1390, in step
ERROR 01-21 20:35:45 engine.py:135] outputs = self.model_executor.execute_model(
ERROR 01-21 20:35:45 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/executor/gpu_executor.py", line 88, in execute_model
ERROR 01-21 20:35:45 engine.py:135] output = self.driver_worker.execute_model(execute_model_req)
ERROR 01-21 20:35:45 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/worker/worker_base.py", line 343, in execute_model
ERROR 01-21 20:35:45 engine.py:135] output = self.model_runner.execute_model(
ERROR 01-21 20:35:45 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
ERROR 01-21 20:35:45 engine.py:135] return func(*args, **kwargs)
ERROR 01-21 20:35:45 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^
ERROR 01-21 20:35:45 engine.py:135] File "/root/miniconda3/lib/python3.12/site-packages/vllm/worker/model_runner_base.py", line 152, in _wrapper
ERROR 01-21 20:35:45 engine.py:135] raise type(err)(
ERROR 01-21 20:35:45 engine.py:135] AttributeError: Error in model execution (input dumped to /tmp/err_execute_model_input_20250121-203545.pkl): 'Qwen2VLForConditionalGeneration' object has no attribute 'model'
INFO: Shutting down
INFO: Waiting for application shutdown.
INFO: Application shutdown complete.
INFO: Finished server process [7926]

@ShangmingCai
Collaborator

Is this framework suitable for Qwen2-VL?

Currently, VL models are not supported, since they have different architectures. We will follow the vLLM community and make the different features compatible with each other; this is on the roadmap.
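To illustrate the architecture mismatch behind the traceback above: the KV-transfer connector reads `model_executable.model.start_layer`, which assumes the executable wraps its transformer layers under a `.model` attribute. A minimal, self-contained sketch (the class names mirror the log; the internal attribute layout of the VL wrapper, here called `language_model`, is an assumption for illustration, not vLLM's actual code):

```python
# Hypothetical stand-ins to reproduce the attribute-access failure in isolation.

class Qwen2Model:
    """Stand-in for the text backbone, which exposes layer bounds."""
    start_layer = 0
    end_layer = 28


class Qwen2ForCausalLM:
    """Text-only wrapper: layers live under `.model`, as the connector expects."""
    def __init__(self):
        self.model = Qwen2Model()


class Qwen2VLForConditionalGeneration:
    """VL wrapper: the backbone hangs off a different attribute (assumed name)."""
    def __init__(self):
        self.language_model = Qwen2Model()  # no `.model` attribute exists


def get_start_layer(model_executable):
    # This mirrors the line in the traceback:
    #   start_layer = model_executable.model.start_layer
    # It works for text-only models but raises AttributeError for the VL wrapper.
    return model_executable.model.start_layer
```

Calling `get_start_layer(Qwen2ForCausalLM())` succeeds, while `get_start_layer(Qwen2VLForConditionalGeneration())` raises the same `AttributeError: ... object has no attribute 'model'` seen in the log, which is why VL support requires connector-side changes rather than a configuration fix.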
