-
Notifications
You must be signed in to change notification settings - Fork 462
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes needed to be able to use doctr on AWS Lambda #1017
Conversation
Hi @mtvch 👋, Some points:
https://github.com/mindee/doctr/blob/main/tests/common/test_utils_multithreading.py
About the casting to list in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @mtvch, thank you very much for opening this PR and to bring us this use case !
I would suggest some changes in the code and also I agree with @felixdittrich92 on the parts that he highlighted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR!
I agree with most of what has already been said! A few points:
- we need some unittests for those
- the docstring of each function should reflect any behaviour change (since now env var will modify/override the behaviour)
- the casting of the multiprocessing output is being handled by another PR, it's a dangerous change that should be taken care of outside of here :)
Cheers ✌️
Thank you all for responses! |
Hi @mtvch 👋 , there is currently an open PR to do this #931 |
Hello, @felixdittrich92! |
Codecov Report
@@ Coverage Diff @@
## main #1017 +/- ##
=======================================
Coverage 94.93% 94.94%
=======================================
Files 135 135
Lines 5590 5599 +9
=======================================
+ Hits 5307 5316 +9
Misses 283 283
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for applying the changes 👍
I have added some minor comments :)
for mypy : make quality
:)
@fharper can you review the minor docs change please ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the last commits! A few comments left :)
Hi @mtvch 👋 , |
Hello, @felixdittrich92! There are two things I didn't do as you suggested:
Did you mean these two? If not, please, could you tell me what suggestions I missed and I will implement them |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mtvch thanks for applying all the requested changes i have added some last minor points then we are good to go 🤗
@felixdittrich92 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the updates. LGTM 👍
I would like to use
doctr
onAWS Lambda
, but there is a restriction: it gives a function write access only under/tmp
folder.I have found two places where
doctr
claims write access outside this folder:ThreadPool
from python'smultiprocessing
package: it uses/dev/shm
folder for shared memory which is not present onAWS Lambda
.That's why I think it would be nice to add an option for user to decide where models should be cached and whether to use
multiprocessing
or not.