[Bug]: vLLM does not support EAGLE Spec Decode when deploying EAGLE-Qwen2-7B-Instruct model #8849
Closed
Labels: bug
Your current environment
The output of `python collect_env.py`
Model Input Dumps
No response
🐛 Describe the bug
I can successfully deploy Llama-3-8B-Instruct with EAGLE, but I run into a problem when deploying Qwen2-7B-Instruct with EAGLE.
I have converted the EAGLE-Qwen2-7B-Instruct model according to vllm/model_executor/models/eagle.py, line 126 (commit 8fae5ed).
I tried the Python code below.
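(The original snippet is not preserved in this report. As a rough sketch of how an EAGLE speculative-decoding run is typically launched with vLLM's offline LLM API around this release; the model paths and the `speculative_model` / `num_speculative_tokens` arguments here are my placeholders, not the author's exact script.)

```python
from vllm import LLM, SamplingParams

# Target model plus the converted EAGLE draft model; both paths are placeholders.
llm = LLM(
    model="Qwen/Qwen2-7B-Instruct",
    speculative_model="/path/to/EAGLE-Qwen2-7B-Instruct-vllm",
    num_speculative_tokens=5,
)

outputs = llm.generate(
    ["Hello, my name is"],
    SamplingParams(temperature=0.0, max_tokens=64),
)
print(outputs[0].outputs[0].text)
```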
I then encountered the following error:
AssertionError: Attempted to load weight (torch.Size([3584])) into parameter (torch.Size([3584, 7168]))
I looked at the code at vllm/model_executor/models/eagle.py, line 139 (commit 8fae5ed).
I think the code assumes that a weight name starting with 'fc.' can only be 'fc.weight', but the fc layer of EAGLE-Qwen2 has a bias, which means the name can also be 'fc.bias'.
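To illustrate, here is a minimal, self-contained sketch of the mismatch. It only mirrors the shapes from the error above and is not the actual vLLM loader code; the dispatch on the full parameter name at the end is what I would expect the fix to look like.

```python
import torch
import torch.nn as nn

# Qwen2-7B has hidden_size 3584; the EAGLE draft's fc layer maps 2*hidden -> hidden
# and carries a bias, so the checkpoint contains both fc.weight and fc.bias.
fc = nn.Linear(2 * 3584, 3584, bias=True)

checkpoint = {
    "fc.weight": torch.zeros(3584, 7168),
    "fc.bias": torch.zeros(3584),
}

for name, loaded_weight in checkpoint.items():
    # Routing every name that starts with "fc." into fc.weight would try to copy
    # the [3584] bias into the [3584, 7168] weight, which is exactly the
    # AssertionError reported above. Dispatching on the full name avoids that:
    if name == "fc.weight":
        fc.weight.data.copy_(loaded_weight)
    elif name == "fc.bias" and fc.bias is not None:
        fc.bias.data.copy_(loaded_weight)
```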
I hope you can fix this in an upcoming release!