Timm Models integration to Optimum-intel #404
Conversation
main_input_name = "pixel_values"

class TimmModel(TimmPreTrainedModel):
I don't think we need such a model hierarchy, with TimmPreTrainedModel, TimmModel, etc. At least Optimum-Intel is not the place for such classes. What is required is just to put the logic of model creation and prefetch into the .from_timm() method of OVModelForTimm. Moreover, OVModelForTimm itself is not needed since there is no intent to expose it. It can be just a utility function or class that is not inherited from HF hierarchy at all.
I have removed the hierarchy as much as possible, leaving a small placeholder model to contain the timm models.
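The utility-style approach suggested above could look roughly like the following sketch. The function name and signature are illustrative, not taken from the PR; only the `timm.create_model` call is the library's real API:

```python
def create_timm_model(model_name: str, pretrained: bool = True):
    """Illustrative helper: fetch a Timm model without an HF-style class hierarchy."""
    # Deferred import so this sketch has no hard dependency on timm at definition time.
    import timm

    # timm.create_model handles prefetching pretrained weights from the hub.
    return timm.create_model(model_name, pretrained=pretrained)
```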
return_dict=return_dict,
)

loss = None
Loss is not needed at all in the context of further export to OpenVINO.
I have removed that section.
)

class TimmImageProcessor(BaseImageProcessor, ImageFeatureExtractionMixin):
I didn't expect this, but it's great that you created it. I just wonder how much of Timm's overall preprocessing options this class supports, and how we will validate it?
I also wonder. Though I have added and used the class, it would be better if we had a methodical way of validating it.
@AlexKoff88 I have removed a lot of unnecessary code and abstractions. Please take a look and let me know your feedback.
return cls.from_dict(config_dict, **kwargs)

class TimmOnnxConfig(ViTOnnxConfig):
You used ViTOnnxConfig as a base class here. Does this mean that only ViT models from Timm will work here?
I would inherit from VisionOnnxConfig to make it safer.
I have fixed that.
@@ -619,6 +623,27 @@ def test_pipeline(self, model_arch):
        self.assertTrue(isinstance(outputs[0]["label"], str))
        gc.collect()

    @parameterized.expand(TIMM_MODELS)
I think we should have a test which:
- loads a timm model from the hub
- saves it in IR locally (.save_pretrained)
- loads the IR and does inference (random data is ok)
A test test_timm_save_and_infer is added.
@fxmarty, @echarlaix, can you please take a look?
The documentation is not available anymore as the PR was closed or merged.
I left a few comments for tiny things to be fixed, but otherwise looks good to me, thank you!
It could be worth mentioning in the documentation as well that OVModelForImageClassification can handle timm models.
class TimmOnnxConfig(VisionOnnxConfig):
    DEFAULT_TIMM_ONNX_OPSET = 13
    outputs = OrderedDict([("logits", {0: "batch_size"})])
    NORMALIZED_CONFIG_CLASS = NormalizedVisionConfig
    MIN_TORCH_VERSION = version.parse("1.11")

    @property
    def inputs(self) -> Dict[str, Dict[int, str]]:
        return {"pixel_values": {0: "batch_size", 1: "num_channels", 2: "height", 3: "width"}}
Is this to support the ONNX export of timm models? Do you think it would be useful to move it (later) in optimum main repo?
Sounds good!
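For context on the snippet above: the `inputs`/`outputs` mappings follow the dynamic-axes convention used by `torch.onnx.export`, where each entry maps an axis index to a symbolic name. A minimal, dependency-free sketch of how such mappings combine:

```python
from collections import OrderedDict

# Same axis mappings as in TimmOnnxConfig above: named axes stay dynamic
# in the exported graph, unnamed ones are fixed to their traced size.
inputs = {"pixel_values": {0: "batch_size", 1: "num_channels", 2: "height", 3: "width"}}
outputs = OrderedDict([("logits", {0: "batch_size"})])

# torch.onnx.export(..., dynamic_axes=dynamic_axes) accepts a merged dict like this.
dynamic_axes = {**inputs, **outputs}
```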
@sawradip Would you like to make the doc more precise / add an example, or should I merge as is?
I will add some references to the doc, as well as some description to the PR.
@fxmarty, I see the following error in two PRs in a row: "cannot import name 'is_safetensors_available' from 'diffusers.utils'". Do you have any idea how to fix it quickly? Do we need to update the requirements?
@fxmarty, I have made some additions to the docs, as well as added a description to this PR. If everything seems alright, you can merge.
@fxmarty There was one error with style; I fixed it. Any feedback?
LGTM!
@@ -17,6 +17,7 @@
    "datasets>=1.4.0",
    "sentencepiece",
    "scipy",
    "timm",
I don't think timm should become a hard dependency.
This PR is part of the Google Summer of Code 2023 project "Showcase performance of PyTorch Image Models (Timm) with OpenVINO" under the OpenVINO toolkit.

Goal
The main goal of this PR is the seamless integration of the Timm library (hosting 1100+ PyTorch computer vision models as of this writing) into the OpenVINO ecosystem. As Timm is now part of the Hugging Face ecosystem, the integration is done through Optimum-Intel.
Project Details and Mentors
Contributor: Sawradip Saha
Organization: OpenVino-toolkit
Size of project: 175 hours
Mentors: Alexander Kozlov, Liubov Talamanova
Background
Timm is a collection of pre-trained and optimized models for deep learning in computer vision. By providing a wide array (1100+ as of now) of state-of-the-art models with ease of use, it encourages research and development in the field of computer vision, making cutting-edge technology accessible to both professionals and enthusiasts.
OpenVINO is a toolkit designed to fast-track the development of high-performance computer vision and deep learning inference. By offering optimization across multiple hardware types, including CPUs, GPUs, and FPGAs, OpenVINO allows for efficient deployment in various environments.
The integration of Timm with OpenVINO combines the robust and accessible models from Timm with the high-performance and flexibility of OpenVINO. This synergy enables enhanced performance and scalability, making it an ideal solution for various applications ranging from research to production deployment.
Implementation Details
As most Timm models are for image classification or feature extraction, the integration is done through OVModelForImageClassification, so the user can load these models like any other Hugging Face model from the Hugging Face Hub. Note that previously, attempting to load these models this way raised a number of errors and unexpected behaviours, similar to loading through AutoModelForImageClassification, some of which are mentioned here.

Additionally, TimmImageProcessor has been implemented, as transformers didn't have a dedicated feature extractor / image processor that handles preprocessing according to the provided model config.

Example:
With this PR, we can load and use Timm models from the Hub through OVModelForImageClassification.
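A minimal usage sketch (the exact snippets are not reproduced above; the model id is illustrative, and the import is deferred inside a helper so the sketch itself is self-contained):

```python
def load_timm_classifier(model_id: str = "timm/mobilenetv3_large_100.ra_in1k"):
    """Illustrative: load a Timm model from the Hub as an OpenVINO model."""
    # Deferred import: requires optimum-intel with this PR's timm support.
    from optimum.intel import OVModelForImageClassification

    # export=True converts the PyTorch (timm) model to OpenVINO IR on load.
    return OVModelForImageClassification.from_pretrained(model_id, export=True)
```

The TimmImageProcessor added in this PR would then prepare pixel_values from raw images before calling the returned model.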