
[New Model]: Florence-2 #5934

Open
localbarrage opened this issue Jun 27, 2024 · 14 comments
Labels
new model Requests to new models

Comments

@localbarrage

The model to consider.

https://huggingface.co/microsoft/Florence-2-base

The closest model vllm already supports.

phi-3v, it's a VLM

What's your difficulty of supporting the model you want?

No response

@localbarrage localbarrage added the new model Requests to new models label Jun 27, 2024
@localbarrage localbarrage changed the title [New Model]: [New Model]: Florence-2 Jun 27, 2024
@chandeldivyam

@DarkLight1337 Anyone working on this?

@DarkLight1337
Member

No, but please wait for #5852 and #5276 to land first, as they involve significant API changes for devs. In the meantime, you can take a look at this guide to get an idea of how to implement a new model.
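For anyone picking this up, the registration step such a guide describes boils down to mapping the HF config's `architectures` entry to an implementation class. Below is a minimal illustration of that pattern in plain Python; the names (`register_model`, `resolve_model`, `Florence2Stub`) are hypothetical stand-ins, not vLLM's actual internals.

```python
# Illustrative sketch of the architecture-registry pattern: HF configs
# carry an "architectures" field, and the engine maps that name to a
# model class. All names here are hypothetical, not vLLM's real code.

_MODEL_REGISTRY: dict[str, type] = {}

def register_model(arch_name: str, model_cls: type) -> None:
    """Map a HF `architectures` entry to an implementation class."""
    _MODEL_REGISTRY[arch_name] = model_cls

def resolve_model(arch_names: list[str]) -> type:
    """Pick the first registered architecture, as an engine would."""
    for name in arch_names:
        if name in _MODEL_REGISTRY:
            return _MODEL_REGISTRY[name]
    raise ValueError(f"No registered implementation for {arch_names}")

class Florence2Stub:  # placeholder for a real model implementation
    pass

register_model("Florence2ForConditionalGeneration", Florence2Stub)
print(resolve_model(["Florence2ForConditionalGeneration"]).__name__)
```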

@chandeldivyam

Thanks, I'm checking the guide and the previous PRs that added phi3-vision, along with #5276.

@fcakyon

fcakyon commented Aug 16, 2024

Both #5852 and #5276 are merged. Do you still have plans to work on this @chandeldivyam?

@chandeldivyam

@fcakyon Thanks for the reminder, it actually slipped my mind. Yes, I need Florence-2 for a project I was working on. As an alternative for quick prototyping, I created a Flask server, but that is not the ideal solution. I will pick this up next week. Thanks!

Are you working on something that would need it?
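The kind of throwaway inference server described above can be sketched with only the standard library; `run_florence` here is a placeholder for a real transformers Florence-2 call, not actual model code.

```python
# Hedged sketch of a quick-prototype inference server (stdlib only).
# run_florence is a stub standing in for a real Florence-2 pipeline.
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

def run_florence(prompt: str, image_path: str) -> str:
    # Placeholder: a real server would invoke the HF model here.
    return f"caption for {image_path} with task {prompt}"

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read a JSON body like {"prompt": "<CAPTION>", "image": "cat.png"}.
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        payload = json.dumps(
            {"result": run_florence(body["prompt"], body["image"])}
        ).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):  # silence per-request logging
        pass

# To serve: ThreadingHTTPServer(("127.0.0.1", 8000), Handler).serve_forever()
```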

@fcakyon

fcakyon commented Aug 17, 2024

@chandeldivyam Yes, I also need such a solution for my work. I'm trying to utilize https://github.com/Lightning-AI/LitServe since I only have a little experience with the vllm-project.

@chandeldivyam

@fcakyon have you looked into any benchmarking for LitServe? Also, I think using vLLM would make sense if there are a ton of parallel requests, right?

@pseudotensor

@chandeldivyam Would be great to see florence-2 in vllm.

@bhavnicksm

Hey @chandeldivyam,
Is there a PR already to track the progress on Florence-2?
Would be great to have Florence-2 with vllm 😀

@SteveKo837

SteveKo837 commented Sep 6, 2024

Since there's been no update on this issue, this week I referred to the guide here and looked at how to add Phi3-vision to vLLM. I implemented the registry, but I ran into the following issue:

File "/app/vllm/entrypoints/llm.py", line 177, in __init__
    self.llm_engine = LLMEngine.from_engine_args(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/vllm/engine/llm_engine.py", line 541, in from_engine_args
    engine = cls(
             ^^^^
  File "/app/vllm/engine/llm_engine.py", line 302, in __init__
    self.model_executor = executor_class(
                          ^^^^^^^^^^^^^^^
  File "/app/vllm/executor/executor_base.py", line 47, in __init__
    self._init_executor()
  File "/app/vllm/executor/gpu_executor.py", line 38, in _init_executor
    self.driver_worker = self._create_worker()
                         ^^^^^^^^^^^^^^^^^^^^^
  File "/app/vllm/executor/gpu_executor.py", line 105, in _create_worker
    return create_worker(**self._get_create_worker_kwargs(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/vllm/executor/gpu_executor.py", line 24, in create_worker
    wrapper.init_worker(**kwargs)
  File "/app/vllm/worker/worker_base.py", line 449, in init_worker
    self.worker = worker_class(*args, **kwargs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/vllm/worker/worker.py", line 101, in __init__
    self.model_runner: GPUModelRunnerBase = ModelRunnerClass(
                                            ^^^^^^^^^^^^^^^^^
  File "/app/vllm/worker/enc_dec_model_runner.py", line 115, in __init__
    assert_enc_dec_mr_supported_scenario(self)
  File "/app/vllm/worker/utils.py", line 43, in assert_enc_dec_mr_supported_scenario
    raise NotImplementedError(
NotImplementedError: Multimodal is not currently supported with encoder/decoder models.

This error indicates that the Florence2 configuration has is_encoder_decoder: true, but the current EncoderDecoderModelRunner does not support multimodal inputs. I think finding a workaround will be difficult, but we really need this support. Can anyone give advice or suggest what to do next?
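The failing frame is a capability guard: before the runner is built, the engine asserts that the requested model configuration falls inside what the encoder/decoder path currently handles. A simplified sketch of that pattern follows; the class and function names are hypothetical, not vLLM's actual code.

```python
# Simplified sketch of a fail-fast capability guard like the one in
# the traceback above. Names are hypothetical, not vLLM internals.
from dataclasses import dataclass

@dataclass
class ModelConfig:
    is_encoder_decoder: bool
    is_multimodal: bool

def assert_supported_scenario(cfg: ModelConfig) -> None:
    """Reject feature combinations the runner cannot handle yet."""
    if cfg.is_encoder_decoder and cfg.is_multimodal:
        raise NotImplementedError(
            "Multimodal is not currently supported with encoder/decoder models."
        )

# A text-only encoder-decoder model passes the guard; Florence-2,
# being encoder-decoder *and* multimodal, would trip it.
assert_supported_scenario(ModelConfig(is_encoder_decoder=True, is_multimodal=False))
```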

@DarkLight1337
Member

> This error indicates that the Florence2 configuration has is_encoder_decoder:true, but the current EncoderDecoderModelRunner does not support multimodal. I think finding a workaround will be difficult since we really need this support. Can anyone give advice or suggest what to do next?

If only the language part of the model is using encoder-decoder (i.e. there is no cross-attention between text and visual features), then you can try implementing only the language part in vLLM first.
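One way to read this suggestion: have the language backbone operate on embeddings rather than raw token ids, so the text-only path can land first and projected vision features can be prepended later without touching the language code. A toy sketch, where every class is an illustrative stand-in rather than real model code:

```python
# Toy sketch of the "language part first" approach. The encoder takes
# embeddings, so vision features can be prepended in a later phase.
# All classes here are illustrative stand-ins, not real model code.

class Embedder:
    def __init__(self, dim: int):
        self.dim = dim
    def __call__(self, token_ids: list[int]) -> list[list[float]]:
        return [[float(t)] * self.dim for t in token_ids]

class LanguageEncoderDecoder:
    """Stand-in for a BART-style encoder-decoder language backbone."""
    def forward(self, encoder_embeds: list[list[float]]) -> int:
        # Placeholder for encode + autoregressive decode; returns the
        # encoder sequence length just so the flow is observable.
        return len(encoder_embeds)

def text_only_forward(model, embed, token_ids):
    # Phase 1: no vision features, pure language path.
    return model.forward(embed(token_ids))

def multimodal_forward(model, embed, token_ids, vision_embeds):
    # Phase 2 (later): prepend projected vision features to text embeds.
    return model.forward(vision_embeds + embed(token_ids))
```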

@SteveKo837

SteveKo837 commented Sep 6, 2024

> This error indicates that the Florence2 configuration has is_encoder_decoder:true, but the current EncoderDecoderModelRunner does not support multimodal. I think finding a workaround will be difficult since we really need this support. Can anyone give advice or suggest what to do next?

> If only the language part of the model is using encoder-decoder (i.e. there is no cross-attention between text and visual features), then you can try implementing only the language part in vLLM first.

@DarkLight1337, thanks for your comment. I think I understand, and it seems feasible. Since Florence2 only uses the encoder-decoder for the language part, specifically in the Florence2LanguageModel class, I can implement the language part and the vision part (DaViT) separately, then combine them later. I just need to organize the massive 2800 lines in the original modeling_florence.py file properly.
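The split described here could be organized roughly as follows, with a top-level class recombining the two halves. Everything below is a placeholder stub standing in for DaViT and the Florence2LanguageModel, not the real implementation.

```python
# Illustrative sketch of splitting the monolithic modeling file into
# separate vision and language modules that a top-level class combines.
# All names, shapes, and return values are placeholders.

class DaViTStub:
    """Stand-in for the DaViT vision tower: image -> feature vectors."""
    def __init__(self, out_dim: int):
        self.out_dim = out_dim
    def forward(self, image: list[list[float]]) -> list[list[float]]:
        # One feature vector per image row, widened to out_dim.
        return [[sum(row) / len(row)] * self.out_dim for row in image]

class Florence2LanguageStub:
    """Stand-in for the encoder-decoder language part."""
    def forward(self, embeds: list[list[float]]) -> int:
        return len(embeds)  # placeholder for decoded output

class Florence2Stub:
    """Top-level module recombining the two halves."""
    def __init__(self, dim: int):
        self.vision = DaViTStub(dim)
        self.language = Florence2LanguageStub()
    def forward(self, image, text_embeds):
        vision_embeds = self.vision.forward(image)
        # Vision features are prepended to the text embeddings before
        # the language encoder runs, as in the Florence-2 design.
        return self.language.forward(vision_embeds + text_embeds)
```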

@Akhilrajeevp

Hey, what's the update on this one? How do I run Florence-2 using vLLM?

@joaomsimoes

+1

9 participants