Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DINOv2 depth estimation #26092

Merged
merged 32 commits into from
Nov 13, 2023
Merged

Conversation

NielsRogge
Copy link
Contributor

@NielsRogge NielsRogge commented Sep 11, 2023

What does this PR do?

Fixes #26057

PR that implements a part of #25799. It extends the DPT framework to use the AutoBackbone class. Next it uses Dinov2Backbone to convert the DINOv2+DPT checkpoints released by the authors here.

To do:

  • add test in common backbone tests file to verify out_indices are saved properly => done in [AutoBackbone] Add test #26094
  • add support for transforms in DPTImageProcessor
  • add tests which test DPT with a backbone
  • add integration test and convert remaining checkpoints

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Sep 11, 2023

The documentation is not available anymore as the PR was closed or merged.

Copy link
Collaborator

@amyeroberts amyeroberts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this!

Main comment is about backwards compatibility with config arguments. Otherwise looks good and just a few small nits.

cc @rafaelpadilla

src/transformers/models/dpt/image_processing_dpt.py Outdated Show resolved Hide resolved
src/transformers/models/dpt/image_processing_dpt.py Outdated Show resolved Hide resolved
tests/models/dpt/test_image_processing_dpt.py Show resolved Hide resolved
src/transformers/models/dpt/modeling_dpt.py Outdated Show resolved Hide resolved
src/transformers/models/dpt/modeling_dpt.py Outdated Show resolved Hide resolved
src/transformers/models/dpt/modeling_dpt.py Outdated Show resolved Hide resolved
src/transformers/models/dpt/modeling_dpt.py Show resolved Hide resolved
tests/models/dinov2/test_modeling_dinov2.py Outdated Show resolved Hide resolved
src/transformers/models/dpt/modeling_dpt.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a mile for adding support to more models!
✅ for the changes to the image processor
✅ for the conversion scripts, if the tests are optional and in separate functions.
✅ for the changes to dino, if we make them BC
🟨 for the changes to DPT, that's a lot of changes. @amyeroberts did not seem against it, but wondering if it makes sense to have a new model? Are we stuck because DINOv2 uses it and we would have a lack of consistency?

src/transformers/models/dinov2/modeling_dinov2.py Outdated Show resolved Hide resolved
src/transformers/models/dinov2/modeling_dinov2.py Outdated Show resolved Hide resolved
src/transformers/models/dinov2/modeling_dinov2.py Outdated Show resolved Hide resolved
src/transformers/models/dinov2/modeling_dinov2.py Outdated Show resolved Hide resolved
src/transformers/models/dinov2/modeling_dinov2.py Outdated Show resolved Hide resolved
src/transformers/models/dpt/modeling_dpt.py Outdated Show resolved Hide resolved
src/transformers/models/dpt/modeling_dpt.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@amyeroberts amyeroberts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for iterating!

Just a few small things to address before merging

src/transformers/models/dinov2/modeling_dinov2.py Outdated Show resolved Hide resolved
src/transformers/models/dinov2/modeling_dinov2.py Outdated Show resolved Hide resolved
src/transformers/models/dinov2/modeling_dinov2.py Outdated Show resolved Hide resolved
tests/models/dinov2/test_modeling_dinov2.py Outdated Show resolved Hide resolved
src/transformers/models/dinov2/modeling_dinov2.py Outdated Show resolved Hide resolved
src/transformers/models/dpt/image_processing_dpt.py Outdated Show resolved Hide resolved
src/transformers/models/dpt/modeling_dpt.py Outdated Show resolved Hide resolved
src/transformers/models/dpt/modeling_dpt.py Outdated Show resolved Hide resolved
src/transformers/models/dpt/modeling_dpt.py Show resolved Hide resolved
src/transformers/models/dpt/modeling_dpt.py Outdated Show resolved Hide resolved
@NielsRogge NielsRogge force-pushed the add_dinov2_depth branch 2 times, most recently from 7b09882 to 8cd7e50 Compare October 30, 2023 19:06
@NielsRogge
Copy link
Contributor Author

@amyeroberts thanks for your review, I've addressed all comments. Feel free to merge :)

Copy link
Collaborator

@amyeroberts amyeroberts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for iterating!

A few final small comments / things to address.

General comment: I completely agree with @ArthurZucker's comments about the added complexity to the model. I'm not a fan of adding this if/else structure in the forward method. However, I don't see an easy way to change because of the backbone_out_indices argument and the weight loading already mapping to self.dpt.

src/transformers/models/dpt/convert_dinov2_depth_to_hf.py Outdated Show resolved Hide resolved
src/transformers/models/dpt/image_processing_dpt.py Outdated Show resolved Hide resolved
src/transformers/models/dpt/image_processing_dpt.py Outdated Show resolved Hide resolved
src/transformers/models/dpt/image_processing_dpt.py Outdated Show resolved Hide resolved
src/transformers/models/dpt/image_processing_dpt.py Outdated Show resolved Hide resolved
src/transformers/models/dpt/image_processing_dpt.py Outdated Show resolved Hide resolved
src/transformers/models/dpt/image_processing_dpt.py Outdated Show resolved Hide resolved
src/transformers/models/dpt/image_processing_dpt.py Outdated Show resolved Hide resolved
src/transformers/models/dpt/image_processing_dpt.py Outdated Show resolved Hide resolved
tests/models/dpt/test_modeling_dpt_auto_backbone.py Outdated Show resolved Hide resolved
@NielsRogge
Copy link
Contributor Author

I've addressed all your comments, feel free to merge. To do:

  • update all checkpoints on the hub to use size_divisor rather than size_divisbility in the preprocessor_config.json

Copy link
Collaborator

@amyeroberts amyeroberts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for iterating!

@amyeroberts amyeroberts merged commit 2422c38 into huggingface:main Nov 13, 2023
3 checks passed
@amyeroberts
Copy link
Collaborator

@NielsRogge Thanks again for this addition - merged! I'll let you handle the updates to the configs on the hub.

@Starlento
Copy link

Hello guys. Thank you for adding this. I have a silly question that is there any plan that facebook will push this to the huggingface repo? Like facebook/dpt-dinov2-large-nyu? And could you can kindly inform me who is actually operating the companies accounts... Seems it is not all done by their side based on my guess.

@NielsRogge
Copy link
Contributor Author

Hi @Starlento,

All models are on the hub: https://huggingface.co/models?pipeline_tag=depth-estimation&other=dinov2&sort=trending. I'll open a PR to make this more explicit in the docs

EduardoPach pushed a commit to EduardoPach/transformers that referenced this pull request Nov 19, 2023
* First draft

* Fix style

* More improvements

* Fix tests

* Fix tests

* Convert checkpoint

* Improve DPTImageProcessor

* Remove scripts, improve conversion script

* Remove print statements

* Fix test

* Improve docstring

* More improvements

* Fix style

* Fix image processor

* Add tests

* Address comments

* Address comments

* Make bias backwards compatible

* Address comment

* Address comment

* Address comment

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Address comments

* Add flag

* Add tests

* Make tests smaller

* Use regular BackboneOutput

* Fix all tests

* Update test

* Convert more checkpoints

* Convert giant checkpoints, add integration test

* Rename size_divisibility to size_divisor

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
@lightonthefloor
Copy link

lightonthefloor commented Nov 22, 2023

Hey, guys, I want to know if it is possible to release the training part, and if it is possible, when will it be released?

hidden_states = backbone_hidden_states

patch_height, patch_width = None, None
if self.config.backbone_config is not None and self.config.is_hybrid is False:
Copy link

@idol791 idol791 Jan 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why calculate patch_height and patch_width (the input image size as multiples of patch_size) under these conditions only? If backbone_config is None, self.config.patch_size can be used instead of self.config.backbone_config.patch_size.
@NielsRogge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dinov2 for depth estimation
7 participants