Add metrics to collect during training #3
wanchaol changed the title from "Add metrics to collective during training" to "Add metrics to collect during training" on Jan 23, 2024
I can work on this one, starting with GPU stats.
And also a FLOP counter.
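For illustration only, here is a minimal sketch of what per-step GPU stats plus a simple FLOP estimate could look like. It is not the code in train.py: the names `matmul_flops`, `gpu_memory_stats`, and `StepMetrics` are hypothetical, the choice of metrics is an assumption, and the FLOP count uses the usual 2·m·k·n convention for a matmul rather than a real FLOP counter. The GPU part relies on `torch.cuda.max_memory_allocated` and `torch.cuda.max_memory_reserved` and returns nothing when PyTorch or CUDA is unavailable.

```python
from dataclasses import dataclass, field


def matmul_flops(m: int, k: int, n: int) -> int:
    """Estimated FLOPs for an (m x k) @ (k x n) matmul (2 * m * k * n convention)."""
    return 2 * m * k * n


def gpu_memory_stats() -> dict:
    """Peak GPU memory in GiB; empty dict if PyTorch or CUDA is not available."""
    try:
        import torch
    except ImportError:
        return {}
    if not torch.cuda.is_available():
        return {}
    gib = 1024 ** 3
    return {
        "max_allocated_gib": torch.cuda.max_memory_allocated() / gib,
        "max_reserved_gib": torch.cuda.max_memory_reserved() / gib,
    }


@dataclass
class StepMetrics:
    """Hypothetical container that keeps one record per training step."""

    records: list = field(default_factory=list)

    def log(self, step: int, elapsed_s: float, flops: int) -> dict:
        # Achieved FLOP/s for this step; guard against a zero-length interval.
        flops_per_s = flops / elapsed_s if elapsed_s > 0 else 0.0
        record = {"step": step, "elapsed_s": elapsed_s, "flops_per_s": flops_per_s}
        record.update(gpu_memory_stats())
        self.records.append(record)
        return record
```

In a training loop, one would time each step (for example with `time.perf_counter()`), pass the elapsed time and an estimated FLOP count to `StepMetrics.log`, and print or export the returned record.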
This was linked to pull requests on Feb 16, 2024
gnadathur modified the milestones: torchtrain infrastructure building, First OSS release of Torchtrain on Mar 4, 2024
@lessw2020, @tianyu-l: can we close this issue?
Closing as per conversation with @lessw2020.
jinsun-yoo pushed a commit to jinsun-yoo/torchtitan that referenced this issue on Oct 30, 2024
see https://github.com/pytorch-labs/torchtrain/blob/main/train.py#L87
we should have the following metrics associated with the train steps: