-
Notifications
You must be signed in to change notification settings - Fork 462
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rotate page #488
Rotate page #488
Conversation
# Conflicts: # doctr/models/core.py
# Conflicts: # doctr/models/_utils.py # doctr/models/core.py
Codecov Report
@@ Coverage Diff @@
## main #488 +/- ##
==========================================
+ Coverage 96.15% 96.32% +0.17%
==========================================
Files 114 117 +3
Lines 4447 4523 +76
==========================================
+ Hits 4276 4357 +81
+ Misses 171 166 -5
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot to jump back into this 🙏
I added a few comments but for me, there is only one major part missing:
- with this PR, we rotate the page firsthand before any DL model forward
- the process is almost identical, but the localizations of words have to be remapped into the original pages (meaning if we detect that the page has a 30° orientation, we forward its straightened version, if we detect a straight bounding box in this, we have to reproject it correctly so that the localization refers to the original page)
Let me know what you think :)
@Rob192 also, if you don't minde merging the |
|
Hello, Thanks for the review ! My objective is now to use geometry.rotate_image to rotate the documents before going into the predictor and then use the same function to rotate every box, compare to the center of the document previously used. However, I am feeling a bit confused regarding the import numpy as np
import matplotlib.pyplot as plt
import torch
import tensorflow as tf
from doctr.utils import geometry
from doctr.transforms.functional.tensorflow import rotate as rotate_tf
from doctr.transforms.functional.pytorch import rotate as rotate_to
_, axes = plt.subplots(3, 1)
img = np.ones((32, 64, 3), dtype=np.float32)
rotated = geometry.rotate_image(img, 45., expand=True)
print(rotated.shape)
axes[0].imshow(rotated)
input_t = torch.ones((3, 32, 64), dtype=torch.float32)
boxes = np.array([
[15, 20, 35, 30]
])
r_img, r_boxes = rotate_to(input_t, boxes, angle=45., expand=True)
print(r_img.shape)
axes[1].imshow(r_img.permute((1,2,0)))
input_t = tf.ones((32, 64, 3), dtype=tf.float32)
boxes = np.array([
[15, 20, 35, 30]
])
r_img, r_boxes = rotate_tf(input_t, boxes, angle=45., expand=True)
print(r_img.shape)
axes[2].imshow(r_img) Do you think I could modify |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the edits! Regarding your question about rotate_image
, it's still an ongoing implementation so we can add optional features in it. But to keep the PR short, perhaps this should be implemented in the predictor directly? (we can refactor in another PR once we get this working)
Hello @fg-mindee !
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again Rob!
I think there are still a few edges cases we are not covering but this is taking a very solid shape!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still have a few comments, but the major one being about the reprojection of rotated bounding boxes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is one merge conflict to resolve, minor tweaks in the unittest and we can merge!
Sorry about this 🙃
OK @fg-mindee ! Now I got it for the testing ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, thanks a lot @Rob192 ! @charlesmindee would you mind taking a look at it as well before we merge? 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry Robin, a minor thing I missed in my last review to adjust 😅
That shouldn't change much but especially on this feature it's important to have a good understanding of each argument influence on the final output!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seems to be a failing test: you might have to pass preserve_origin_shape=True
where I added the comment 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making that last argument non-optional and we're good!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good 🙌 Thanks a lot!
Yeah ! Thanks a lot for your support in reviewing this ! |
This is part of the solutions discussed in #225 and address the point "Integrate the feature in the predictor while being disabled by default" of the "Page-level orientation" part.
It implements the function
rotate_page
and tracks the 'page_angles'