-
Notifications
You must be signed in to change notification settings - Fork 19.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix save_model
and load_model
#19924
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for the fix.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #19924 +/- ##
===========================================
- Coverage 79.02% 67.21% -11.81%
===========================================
Files 499 499
Lines 46436 46448 +12
Branches 8548 8550 +2
===========================================
- Hits 36695 31222 -5473
- Misses 8015 13567 +5552
+ Partials 1726 1659 -67
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
What test can we add to prevent a similar issue in the future? |
The reduant |
I will try to add the tests for OSError tomorrow. (using |
Hi @james77777778 Could you check #19921 (comment) ? seemingly the fix doesn't work. |
Hey @WeichenXu123 The issue may stem from a race condition in |
Yes. It only occurs when multiple processes concurrently loading the model |
Thanks for the information. I have added a test to verify the concurrently loading in #19927 . |
Should fix #19921
The root cause is that #19852 will incorrectly duplicate
*.weights.h5
insave_model
.And we should consider falling back to the original behavior if the system is read-only in
load_model
.Sorry for the inconvenience!