[bugfix] Logging only on not should_accumulate() during training (#5417)
Conversation
Codecov Report

@@           Coverage Diff           @@
##           master   #5417   +/-  ##
=====================================
  Coverage      93%     93%
=====================================
  Files         134     134
  Lines        9996    9996
=====================================
  Hits         9313    9313
  Misses        683     683
Not familiar with W&B. Can you see the raw data for the epoch graph? Doesn't look right, does it?
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Great fix :)
LGTM, just check backward compatibility and the changelog.
@@ -158,7 +158,7 @@ def cache_training_step_metrics(self, opt_closure_result):
         self.logged_metrics.update(logged_metrics_tmp)
         self.cached_results.legacy_batch_log_metrics.update(logged_metrics_tmp)

-    def log_metrics(self, metrics, grad_norm_dic, step=None, log_train_step_metrics=False):
+    def log_metrics(self, metrics, grad_norm_dic, step=None):
It is used by users, right? Then let's keep backward compatibility in the API and add a deprecation warning...
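A common pattern for keeping a removed keyword in the signature while steering callers off it is to accept it and emit a `DeprecationWarning`. The sketch below is a generic standalone illustration, not the actual code from this PR; the parameter name `log_train_step_metrics` comes from the diff above, everything else is assumed:

```python
import warnings


def log_metrics(metrics, grad_norm_dic, step=None, log_train_step_metrics=None):
    """Log metrics; the old keyword is still accepted but has no effect."""
    if log_train_step_metrics is not None:
        warnings.warn(
            "`log_train_step_metrics` is deprecated and has no effect; "
            "metrics are now logged only when the optimizer steps.",
            DeprecationWarning,
        )
    # Hypothetical return value standing in for the real logging side effects.
    return {"metrics": metrics, "grad_norm_dic": grad_norm_dic, "step": step}
```

Callers passing the old keyword keep working, but see a warning pointing at the new behavior.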
It is internal, part of the LoggerConnector class.
No, it is not.
What does this PR do?
This PR fixes visual logging with accumulated_grad_batches > 1.
Solution: we can't assume the metrics can be averaged, so we log only when optimizer_step will be called.
Previously / Now: before-and-after visualizations (screenshots not preserved in this export).
Code used to generate the visualization
Fixes #5405 <- this links related issue to this PR
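The fix described above — skipping logging while gradients are still being accumulated, and logging only on batches where the optimizer actually steps — can be sketched as follows. This is a minimal standalone illustration under assumed names (the `TrainLoop` class and its attributes are invented for the example; only `should_accumulate` and `accumulate_grad_batches` appear in the PR itself):

```python
class TrainLoop:
    """Toy training loop showing when metrics get logged under gradient accumulation."""

    def __init__(self, accumulate_grad_batches=1):
        self.accumulate_grad_batches = accumulate_grad_batches
        self.batch_idx = 0
        self.logged_steps = []

    def should_accumulate(self):
        # True while we are still inside an accumulation window,
        # i.e. optimizer_step will NOT be called for this batch.
        return (self.batch_idx + 1) % self.accumulate_grad_batches != 0

    def run_batch(self):
        # ... forward pass, loss.backward() would happen here ...
        if not self.should_accumulate():
            # optimizer_step() fires on this batch, so logging raw
            # metrics here avoids any averaging assumption.
            self.logged_steps.append(self.batch_idx)
        self.batch_idx += 1


loop = TrainLoop(accumulate_grad_batches=4)
for _ in range(8):
    loop.run_batch()
print(loop.logged_steps)  # -> [3, 7]
```

With `accumulate_grad_batches=4`, only batches 3 and 7 trigger an optimizer step, so only those batches log — which is exactly why the accumulated-gradient epoch graphs stop looking wrong.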
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
Before you start reviewing make sure you have read Review guidelines. In short, see the following bullet-list:
Did you have fun?
Make sure you had fun coding 🙃