feat: add aspect ratio for ocr predictor #835
Thanks for the PR!
I added a few comments. Could you confirm whether this improves performance with the existing checkpoints? It isn't specified in the PR details, so I'd rather make sure.
If it does improve results for existing checkpoints (which were trained on stretched images), that would strongly indicate that we should update our resizing strategy for detection training asap :)
Yes @fg-mindee, it doesn't hurt performance for now, and I will retrain all detection models with the aspect ratio preserved to get a coherent pipeline, because not deforming the writing on documents is much better for model convergence!
Nice!
Codecov Report
@@            Coverage Diff             @@
##             main     #835      +/-   ##
==========================================
- Coverage   95.99%   95.73%   -0.27%
==========================================
  Files         131      133       +2
  Lines        5042     5131      +89
==========================================
+ Hits         4840     4912      +72
- Misses        202      219      +17
Thanks!
This PR adds the option to preserve the aspect ratio of pages throughout the whole OCRPredictor pipeline, which leads to much better results on the detection task.
Any feedback is welcome!
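
To make the idea concrete, here is a minimal sketch of the geometry behind an aspect-ratio-preserving resize: the page is scaled by a single factor so it fits inside the detection input size, and the leftover space is padded instead of stretching the image. This is not the code from this PR; the function name `letterbox_geometry` and the `symmetric_pad` parameter are hypothetical and only illustrate the principle.

```python
from typing import Tuple


def letterbox_geometry(
    height: int,
    width: int,
    target_height: int,
    target_width: int,
    symmetric_pad: bool = False,
) -> Tuple[int, int, Tuple[int, int, int, int]]:
    """Compute the resized shape and padding for an aspect-ratio-preserving resize.

    Hypothetical helper for illustration only (not part of the library).
    Returns (new_height, new_width, (top, bottom, left, right)).
    """
    if min(height, width, target_height, target_width) <= 0:
        raise ValueError("all dimensions must be positive")

    # A single scale factor for both axes keeps the text undistorted.
    scale = min(target_height / height, target_width / width)
    new_height = min(target_height, max(1, round(height * scale)))
    new_width = min(target_width, max(1, round(width * scale)))

    # Whatever space is left over is filled with padding.
    pad_h = target_height - new_height
    pad_w = target_width - new_width
    if symmetric_pad:
        top, left = pad_h // 2, pad_w // 2  # center the page
    else:
        top, left = 0, 0  # keep the page in the top-left corner
    return new_height, new_width, (top, pad_h - top, left, pad_w - left)


if __name__ == "__main__":
    # A portrait page of 2000x1000 pixels fitted into a square 1024x1024 input.
    print(letterbox_geometry(2000, 1000, 1024, 1024))
    print(letterbox_geometry(2000, 1000, 1024, 1024, symmetric_pad=True))
```

With a plain stretch, the same 2000x1000 page would be squeezed to 1024x1024 and its characters would appear twice as wide relative to their height, which is the deformation discussed above. Note that predicted box coordinates then have to be mapped back through the same scale and padding offsets.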