
Blip: get/set input embeddings correctly #34152

Merged: 13 commits into huggingface:main on Nov 1, 2024

Conversation

zucchini-nlp (Member)

What does this PR do?

Fixes #34109 by adding a get_input_embeddings method to the retrieval model. It also fixes the same methods in the BLIP model, where we should be working with the text embeddings: returning the vision embeddings makes it impossible to resize the vocabulary.

Tests were added as well; they had all been skipped before, which is why we never knew there was an issue.
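
As a minimal usage sketch of what this enables (the checkpoint name and the added token are illustrative assumptions, not taken from the PR): once get_input_embeddings points at the text embedding table rather than the vision embeddings, resizing the vocabulary works.

```python
from transformers import AutoProcessor, BlipForImageTextRetrieval

# Hypothetical checkpoint chosen for illustration; any BLIP ITM checkpoint should behave the same.
checkpoint = "Salesforce/blip-itm-base-coco"
model = BlipForImageTextRetrieval.from_pretrained(checkpoint)
processor = AutoProcessor.from_pretrained(checkpoint)

# get_input_embeddings now returns the text embedding layer, so resizing after
# adding tokens grows the text vocabulary as expected.
processor.tokenizer.add_tokens(["<new_token>"])
model.resize_token_embeddings(len(processor.tokenizer))
print(model.get_input_embeddings().num_embeddings)  # reflects the enlarged vocabulary
```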


ArthurZucker (Collaborator) left a comment

🤗 thanks

Comment on lines +798 to +802
def get_input_embeddings(self):
    return self.text_model.get_input_embeddings()

def set_input_embeddings(self, value):
    self.text_model.set_input_embeddings(value)
Collaborator:

I think that if there is a text_config, we could automatically deduce which submodule to call from the key, which here would be text_model (thinking about the general API!)

zucchini-nlp (Member, author):

Hmm, I see that in PreTrainedModel we try to get the method from base_model, so we could probably fall back to that by setting base_model_prefix.

I am not yet sure how the prefix is used when loading the model, so let me quickly check that the state dict is still loaded correctly.
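
For context, the fallback being referenced looks roughly like this (a paraphrased sketch of the PreTrainedModel behavior, not the exact transformers source): when base_model_prefix names a submodule, the default get_input_embeddings simply delegates to it.

```python
from torch import nn

class PreTrainedModelSketch(nn.Module):
    # e.g. base_model_prefix = "text_model" would make the lookup below resolve self.text_model
    base_model_prefix = ""

    def get_input_embeddings(self) -> nn.Module:
        base_model = getattr(self, self.base_model_prefix, self)
        if base_model is not self:
            # delegate to the submodule named by base_model_prefix
            return base_model.get_input_embeddings()
        raise NotImplementedError
```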

zucchini-nlp (Member, author):

Update: yes, the idea works, and loading happens the same way as it does without base_model_prefix. But some of the tests will fail because of the composite nature of BlipConfig (test_correct_missing_keys).

I will take note of this and add it to my TODO list, but I believe it would force us to refactor from_pretrained to work well with composite models.

Collaborator:

Okay

ArthurZucker (Collaborator) commented Oct 16, 2024:

Also: we might need/want to force return_dict to True, to avoid all the if/else branches.

ArthurZucker (Collaborator):

It would make it simpler!

ArthurZucker (Collaborator) left a comment

🤗

@@ -1771,11 +1771,12 @@ def forward(
             decoder_attention_mask=decoder_attention_mask,
             output_attentions=output_attentions,
             output_hidden_states=output_hidden_states,
-            return_dict=return_dict,
+            return_dict=True,  # toggle for easier access to loss/logits below
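
To illustrate what forcing return_dict=True on the inner call buys (a generic ModelOutput example, not the BLIP source): the submodule returns a ModelOutput, so the surrounding code can read fields by name instead of branching on tuple positions.

```python
import torch
from transformers.modeling_outputs import CausalLMOutputWithCrossAttentions

# With return_dict=True the submodule returns a ModelOutput...
outputs = CausalLMOutputWithCrossAttentions(
    loss=torch.tensor(0.42),
    logits=torch.randn(1, 4, 30524),
)
loss, logits = outputs.loss, outputs.logits  # ...so fields are read by name

# With return_dict=False the same call would return a plain tuple, forcing
# positional indexing and if/else branches downstream:
# loss, logits = outputs[0], outputs[1]
```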
Collaborator:

Sorry 😐 I realized this would break TorchScript or FX export compatibility, so maybe keep False by default? (I might be wrong, though; I don't think it's supported.)

zucchini-nlp (Member, author):

Yeah, TorchScript is not supported for BLIP AFAIK, and the tests are therefore disabled. I guess in that case we don't need it to be False.

Collaborator:

No, but you could script only the LM model and not the full model, no?

zucchini-nlp (Member, author):

I added TorchScript tests and they are currently passing. An FX test cannot be added because the model architecture is not in the supported list.

I don't think we should make False the default, as that would add more complexity than before, when we passed the actual return_dict through: we would have to manually wrap the tuple outputs into the correct ModelOutput class whenever return_dict is set. If you still think we should not set True by default, let's go back to the very first solution I proposed.
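
For reference, a rough sketch of the kind of check a TorchScript test performs on a model like this (the checkpoint, dummy shapes, and positional argument order are assumptions for the example, not the actual test code):

```python
import torch
from transformers import BlipForConditionalGeneration

# torchscript=True makes the forward return tuples, which tracing requires
model = BlipForConditionalGeneration.from_pretrained(
    "Salesforce/blip-image-captioning-base", torchscript=True
).eval()

pixel_values = torch.rand(1, 3, 384, 384)    # dummy image batch
input_ids = torch.tensor([[101, 102, 103]])  # dummy token ids

traced = torch.jit.trace(model, (pixel_values, input_ids))
torch.jit.save(traced, "blip_traced.pt")
```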

Collaborator:

Okay sounds good!

ArthurZucker (Collaborator) left a comment

Let's go!

zucchini-nlp merged commit 6beb3f1 into huggingface:main on Nov 1, 2024
26 checks passed
BernardZach pushed a commit to BernardZach/transformers that referenced this pull request Dec 5, 2024
* set-get embeds

* add tests

* fix tests

* remove

* return dict True

* fix tests

* why did i remove this

* enabel torchscript tests
Successfully merging this pull request may close these issues.

Not implement resize_token_embeddings in blip2-itm-vit-g