-
Notifications
You must be signed in to change notification settings - Fork 462
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add target to resize transform for aspect ratio training (detection task) #823
Merged
Merged
Changes from 7 commits
Commits
Show all changes
17 commits
Select commit
Hold shift + click to select a range
f863474
feat: add target to resize for aspect ratio training
charlesmindee 935386f
fix: tests
charlesmindee 53130d9
fix: sorting
charlesmindee a439f7c
fix: typo
charlesmindee e236956
fix: typo
charlesmindee 8ee4d43
fix: symmetric padding case
charlesmindee 2ce8609
fix: unrotated case
charlesmindee f9e8f5c
fix: args of resize
charlesmindee d4b9350
fix: tests
charlesmindee c07fff1
Merge branch 'main' into ratio_det
charlesmindee 794fe0d
Merge branch 'main' into ratio_det
charlesmindee a029e47
fix: preprocessor files
charlesmindee de09f20
fix: validation set aspect ratio
charlesmindee 35c8b39
Merge branch 'main' into ratio_det
charlesmindee 4c7dc53
fix: requested changes
charlesmindee a40ca1c
fix: equested changes
charlesmindee b8dfb34
fix: style
charlesmindee File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about it, I think we should have:
And I'm not sure what would be the best, but probably having
Resize
(image only), andSampleResize
(image + target), would be relevant. What do you think?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes we can split the function with 1 for the targets and 1 for the image, but I am not sure it is relevant to have a
Resize
and aSampleResize
, because if you want to keep the aspect ratio of the image while resizing, you will almost always pad (except if you give only 1 target size), and this will modify the targets. TheResize
function won't be able to preserve the aspect ratio of the images, and this would require to differentiate the function we are using in the training scripts and in the preprocessor regarding the option selected by the user (preserve_aspect_ratio
or not). This can even be dangerous if someone tries to modify the aspect ratio inResize
without changing the targets I think, do yo agree ?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And for a differentiation of the target and image function, I think it will complexify the code because we use many attributes of the
Resize
class to resize both the image and the target (self.symmetric_pad
,self.preserve_aspect_ratio
,self.output_size
) which would be passed as arguments and we have also many shared computations for the image and the target that would need to be done twice (or added in the signature of the functions but I think it will make the code less understandable):offset
,raw_shape,
andimg.shape
is also used in the target computation, what do you think @fg-mindee ?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About splitting the two transforms, the reason behind this is that the target, depending on the task, will not necessarily be modified. So either we can add a lot of cases (for each target type) in the Resize transform, or we could have a Resize that only transforms the image, and
SampleResize
that inherits from it and transforms the target.I agree this would be dangerous for something to add this without changing the target, but transforms will only be played with by people willing to train models. So I would argue it's safe to assume they either use our default training script or have a good understanding of what is going on under the hood 🤷♂️
However, I fully agree about the duplication of symmetric pad / computation duplication for some aspects. I don't have an ideal suggestion for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the
SampleResize
transform, you want it to inherit fromResize
, but since they won't have the same signature so I don't really see why we do that ? @fg-mindeeThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can always refactor this later on anyway! Feel free to implement it the way you prefer (the optional target passing you suggested is probably the best one)