data transfer model hook (+ refactor) #1756
Conversation
Codecov Report
@@            Coverage Diff            @@
##           master    #1756    +/-   ##
=========================================
  Coverage      86%      86%
=========================================
  Files          74       75      +1
  Lines        4713     4705      -8
=========================================
- Hits         4070     4064      -6
+ Misses        643      641      -2
@justusschock @Borda I factored out the transfer function into a utility and wanted to ask you whether it is a good place to put it next to apply_to_collection.
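For reference, a minimal sketch of what such a utility could look like next to apply_to_collection; the name move_data_to_device and the exact module path are assumptions for illustration, not necessarily the final API:

```python
# Hypothetical sketch only: a device-transfer utility placed next to
# apply_to_collection, assuming it lives in pytorch_lightning.utilities.apply_func.
import torch
from pytorch_lightning.utilities.apply_func import apply_to_collection


def move_data_to_device(batch, device):
    """Recursively move every tensor in a (possibly nested) collection to `device`."""

    def to_device(tensor: torch.Tensor) -> torch.Tensor:
        return tensor.to(device, non_blocking=True)

    # apply_to_collection walks lists, tuples, and dicts and applies `function`
    # to every element matching `dtype` (here: torch.Tensor).
    return apply_to_collection(batch, dtype=torch.Tensor, function=to_device)
```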
LGTM, just some missing docstrings :)
LGTM 🚀
@williamFalcon True, that is confusing.
To confirm: this does not work for DDP, because in DDP we use the default scatter to move the tensors, correct? Is there a way to similarly customize this behavior for DDP?
@ZhaofengWu I did not know this. I thought the Trainer always called the same function to move data to the device. I searched the PyTorch docs for DP and DDP, but it seems it is not possible to override the scattering of custom batch objects. I guess the best we could do is add a note to our docs?
I don't know enough about this to know whether there is any workaround. It would be great if this override worked consistently in all scenarios, but I guess if it doesn't work, it doesn't work.
But yes, in any case, there should at least be a note.
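To make the limitation concrete, a hedged sketch of how a user might override such a batch-transfer hook for a custom batch object; the hook name transfer_batch_to_device and its signature are assumptions based on this discussion. As noted above, under DP/DDP the default scatter moves the tensors instead, so this override would not take effect there:

```python
import torch
import pytorch_lightning as pl


class CustomBatch:
    """A custom batch object that the default tensor-moving logic cannot handle."""

    def __init__(self, inputs: torch.Tensor, targets: torch.Tensor):
        self.inputs = inputs
        self.targets = targets


class MyModel(pl.LightningModule):
    # Hook name and signature assumed for illustration. When DP/DDP scatter
    # handles device placement, this hook is bypassed, which is the limitation
    # discussed in this thread.
    def transfer_batch_to_device(self, batch, device):
        if isinstance(batch, CustomBatch):
            batch.inputs = batch.inputs.to(device)
            batch.targets = batch.targets.to(device)
            return batch
        # Fall back to the default behavior for ordinary tensor collections.
        return super().transfer_batch_to_device(batch, device)
```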
@awaelchli Do you want me to open a separate issue?
Yes, good idea. Could you do that please?
* refactor and added hook variant a variant b add test revert rename add changelog docs
* resolve merge duplication
* overridden typo
* fix test
* tpu id
* raise if TPU not available
* re-use apply_to_collection function for parsing collections
* comment
* make utility function available to user
* documentation
* move changelog entry to top
* fix tpu transfer call
* fix call
* remove hardcoded string
* improve test
* call model hook by default
* Apply suggestions from code review
* rename utility function

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Before submitting
What does this PR do?
Part 1 of feature #1245: introduces a data transfer model hook.
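Roughly, the idea implied by the commit notes ("call model hook by default", plus the shared utility refactor) could look like the following sketch; all names here are assumptions for illustration, and move_data_to_device refers to the utility sketched earlier in this thread:

```python
# Hypothetical dispatch logic on the Trainer side; all names are assumptions.
from pytorch_lightning.utilities.apply_func import move_data_to_device


def _transfer_batch_to_device(model, batch, device):
    # Give the LightningModule hook the first chance, so user code can move
    # custom batch objects ("call model hook by default").
    hook = getattr(model, "transfer_batch_to_device", None)
    if callable(hook):
        return hook(batch, device)
    # Otherwise fall back to the generic utility for plain tensor collections.
    return move_data_to_device(batch, device)
```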
Open Questions:
Other PRs that are blocked by this: #1729, #1526
Link to a Colab test showing that TPU works with this branch:
https://colab.research.google.com/drive/1wy6sbl8Bh6S3QaHzBsJNbhQ6UX1IC864?usp=sharing
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃