
VLLM does not support EAGLE Spec Decode when deploying EAGLE-Qwen2-7B-Instruct model #135

crownz248 opened this issue Sep 25, 2024 · 1 comment

@crownz248
I can successfully deploy llama3-8b-instruct with EAGLE, but there is a problem when deploying qwen2-7b-instruct with EAGLE.

I have converted the EAGLE-Qwen2-7B-Instruct model according to vllm/model_executor/models/eagle.py:L126.

I then encountered the following error:

AssertionError: Attempted to load weight (torch.Size([3584])) into parameter (torch.Size([3584, 7168]))
I looked at the code in vllm/model_executor/models/eagle.py:L139, shown below:

def load_weights(self, weights: Iterable[Tuple[str, torch.Tensor]]):
            ...
            elif name.startswith("fc."):
                weight_loader = getattr(self.fc.weight, "weight_loader",
                                        default_weight_loader)
                weight_loader(self.fc.weight, loaded_weight)
            ...

I think this code assumes that any name starting with 'fc.' must be 'fc.weight', but the fc layer of EAGLE-Qwen2 also has a bias, so the name can be 'fc.bias'. In that case the bias vector (shape [3584]) gets loaded into fc.weight (shape [3584, 7168]), which triggers the assertion above.

Moreover, the qkv_proj layer of EAGLE-Qwen2-7B-Instruct also has a bias.
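
For illustration, a minimal sketch of how that branch could dispatch on the parameter name so both 'fc.weight' and 'fc.bias' are routed to the right parameter; the surrounding loop structure and the default_weight_loader fallback are assumed to match the existing load_weights code, and this is not necessarily how vLLM itself fixed it:

            elif name.startswith("fc."):
                # name is "fc.weight" or "fc.bias"; pick the matching
                # parameter on self.fc instead of always loading into
                # self.fc.weight.
                param = getattr(self.fc, name.split(".", 1)[1])
                weight_loader = getattr(param, "weight_loader",
                                        default_weight_loader)
                weight_loader(param, loaded_weight)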

I hope this can be fixed in an upcoming release!

@MMuzzammil1

I think this issue has been fixed in vLLM release v0.6.2. Please see vllm-project/vllm#8790.
