Does it support multi-modal LLM like LLaVa? #1751
LLaVA needs to be added with our lightweight model addition process. Contributions are welcome! https://docs.vllm.ai/en/latest/models/adding_model.html
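For context, a minimal sketch of what out-of-tree registration looks like with vLLM's `ModelRegistry`, roughly following the linked adding-a-model guide. The class `MyLlavaForConditionalGeneration` and its body are placeholders, not a working LLaVA port; the real interface the model must implement (forward pass, weight loading, etc.) is described in the docs.

```python
# Sketch of vLLM's lightweight model-addition flow (see the docs link above).
# The model class below is a placeholder, not a working LLaVA implementation.
import torch.nn as nn
from vllm import ModelRegistry


class MyLlavaForConditionalGeneration(nn.Module):
    """Placeholder: a real model must implement vLLM's model interface
    (e.g. forward() and load_weights()) as described in the adding-a-model guide."""
    ...


# Register the architecture name so vLLM can resolve it from the HF config.
ModelRegistry.register_model("MyLlavaForConditionalGeneration",
                             MyLlavaForConditionalGeneration)
```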
You don't generally see an issue in using the lightweight model addition process for the composed LLaVA model, with the CLIP vision encoder and the LLM?
It does turn out to be more complex. A working PR is here: #3042
Closing as #3042 adds support for LLaVA-1.5.
Is LLaVA-NeXT supported?
@iamsaurabhgupt Feel free to take a look at the example here: #4194 (comment). One limitation we have right now is that we don't support dynamic input image shapes, so the results might differ slightly from the Hugging Face implementation, but we're on track to support this very soon!
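For readers landing here later, here is a minimal sketch of LLaVA-1.5 inference with vLLM's offline `LLM` API on a more recent release. It is illustrative rather than authoritative: the prompt template, model name, and image path are assumptions, the exact multi-modal input format has changed between versions, and earlier releases (including the one in the linked example) required additional image-related engine arguments.

```python
from vllm import LLM, SamplingParams
from PIL import Image

# Load a LLaVA-1.5 checkpoint from the Hugging Face Hub.
llm = LLM(model="llava-hf/llava-1.5-7b-hf")

# LLaVA-1.5 expects an <image> placeholder token in the prompt.
prompt = "USER: <image>\nWhat is shown in this image?\nASSISTANT:"
image = Image.open("example.jpg")  # any local image file

outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": image}},
    SamplingParams(temperature=0.0, max_tokens=64),
)
print(outputs[0].outputs[0].text)
```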
The given example shows how to use vLLM with an LLM like Llama.
But how can I use it to accelerate a vision-language model like LLaVA?