[Model]: Llava-Next-Video support #6571
Comments
Do you plan to make a PR for this? FYI, support for multi-image input (which is essentially what video Llava is doing) is indeed in our Q3 roadmap, so it would be great if we could collaborate on the effort.
Yes, but I haven't finished yet. I am working on it.
I will make a PR this week. It will support a dynamic number of input frames, which is important but not supported by SGLang.
The model to consider.
LLaVA-NeXT-Video* (LlavaNextVideoForConditionalGeneration)
The closest model vLLM already supports.
Llava-Next (LlavaNextForConditionalGeneration)
What's your difficulty of supporting the model you want?