Clear list states (i.e. delete their contents), not reassign the default [] #2493
…behaviour produced memory leak from list[Tensor] states
nice, can we also have a test to cover this edge-case?
…enced, and hence not automatically garbage collected). Fixed failing test (want to check list state, but assigned Tensor)
Absolutely! Added 1fa7077 to help illustrate the issue (and how the fix helps)
Thanks for your PR and that's certainly a great find.
That being said, I am actually not sure what's the desired behavior here.
I can see good reasons for both ways.
I.e. it might be very confusing for people who keep a reference to a state to find it suddenly empty, whereas by just instantiating a new list we do everything correctly on our side and still give users the possibility to retain a reference to the old state if they wish. So this would only lead to a memory leak in user code, not on our side.
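The two behaviours being weighed here can be sketched in plain Python. This is an illustration only: `ToyMetric`, `reset_by_reassign` and `reset_by_clear` are hypothetical names, not the torchmetrics API, and floats stand in for tensors.

```python
# Illustrative only: ToyMetric is a hypothetical stand-in, not torchmetrics.Metric.
class ToyMetric:
    def __init__(self):
        self.values = []  # a list state whose default is []

    def update(self, x):
        self.values.append(x)

    def reset_by_reassign(self):
        # Binds the attribute to a new list. The old list object (and its
        # elements) stays alive for as long as anything else references it.
        self.values = []

    def reset_by_clear(self):
        # Empties the same list object in place, so every alias sees it empty.
        self.values.clear()


m = ToyMetric()
m.update(1.0)
m.update(2.0)
kept = m.values                 # user code keeps a reference to the state
m.reset_by_reassign()
reassign_alias = list(kept)     # the alias still holds the old contents

m.update(3.0)
kept_again = m.values
m.reset_by_clear()
clear_alias = list(kept_again)  # the alias is now empty too
```

With reassignment the alias keeps the old elements alive (the leak described in this PR); with `.clear()` the alias is emptied, which is the surprise the reviewer is worried about.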
Interesting - I hadn't thought of that scenario being desirable... I would argue though, the existing behaviour is certainly unexpected (according to the current documentation). My reading of … In my specific use-case, I'm fitting a … If you're keen to support the ability to retain references to …
I think you have some valid arguments there. I'm fine making an opinionated move here. Could you just add that to the documentation and refer people to use copy/deepcopy if they want to retain the states?
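A minimal sketch of the copy/deepcopy suggestion, using nested Python lists as a stand-in for a list of tensors (an assumption made so the example runs without torch):

```python
import copy

state = [[1.0, 2.0], [3.0]]  # stand-in for a list[Tensor] state
alias = state                # plain reference: shares the list object
shallow = copy.copy(state)   # new outer list, but the elements are shared
deep = copy.deepcopy(state)  # fully independent copy

state[0].append(9.0)         # mutate an element in place
state.clear()                # in-place reset, as proposed in this PR

# alias is emptied by the reset; shallow keeps its elements but saw the
# in-place mutation; deep is unaffected by both.
```

Which copy is appropriate depends on whether the elements themselves may later be mutated in place; `deepcopy` is the safer default when in doubt.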
…t care must be taken when referencing them
Brilliant, thanks! I've updated the docstring/documentation - let me know if it needs rewording/expanding/etc... Unfortunately I couldn't get … Thanks again
@SkafteNicki thoughts?
@dominicgkerr there are too many failing tests, could you please have a look...
…om:dominicgkerr/torchmetrics into bugfix/2492-clear-list-states-not-reassign
Head branch was pushed to by a user without write access
@SkafteNicki Nice (5758977) - was just looking at the …
@dominicgkerr you are welcome :)
for more information, see https://pre-commit.ci
…t-reassign' into bugfix/2492-clear-list-states-not-reassign # Conflicts: # src/torchmetrics/metric.py
…void memory leakage
Codecov Report
Additional details and impacted files
@@           Coverage Diff           @@
##           master   #2493   +/-   ##
=======================================
  Coverage      69%     69%
=======================================
  Files         307     307
  Lines       17396   17404    +8
=======================================
+ Hits        11981   11989    +8
  Misses       5415    5415
…ces to avoid memory leakage" This reverts commit ef27215.
…checking .reset clears memory allocated during update (memory should be allowed to grow, as long as discarded safely)
I think I've got things working (better?) now... @SkafteNicki / 5758977 was a huge help (thanks!), as I hadn't spotted the caching behaviour inside …
Rather than simply using …
After some digging, I came to the conclusion that …
I claim memory should be allowed to increase here (as users might legitimately want to collect lots of observations during fitting, and combine them inside …).
I appreciate that changing (failing) tests to pass the CI is pretty suspect - hopefully the above explains why I did (let me know if not!)
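The idea that memory may grow during updates as long as a reset releases it can be checked in miniature with `weakref`. The sketch below is not the test added in this PR: `Blob` and `element_survives` are hypothetical names, and `outside` stands in for anything else (a cache, a user variable) that aliases the state list.

```python
import gc
import weakref


class Blob:
    """Hypothetical stand-in for a large Tensor held in a list state."""


def element_survives(reset_in_place: bool) -> bool:
    state = [Blob()]
    outside = state                # something else aliasing the state list
    probe = weakref.ref(state[0])  # observes the element without keeping it alive
    if reset_in_place:
        state.clear()              # drops the list's references to its elements
    else:
        state = []                 # rebinding only: `outside` still holds the Blob
    gc.collect()
    alive = probe() is not None
    del outside
    return alive


leaked_after_reassign = element_survives(reset_in_place=False)
leaked_after_clear = element_survives(reset_in_place=True)
```

Under CPython the element stays reachable after reassignment and is released after `.clear()`. A real test for tensors would more likely measure process or device memory instead of using weak references.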
Should the same be applied here as well? torchmetrics/src/torchmetrics/metric.py Line 530 in d528131
Rather than overwriting list states (with []), call .clear() to correctly (more robustly?) delete/free Tensor elements. Previous behaviour results in CPU (possibly GPU also) memory leak
What does this PR do?
Fixes #2492 (maybe #2481 also)
Before submitting
Did you make sure to update the docs?
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃
📚 Documentation preview 📚: https://torchmetrics--2493.org.readthedocs.build/en/2493/