[Tensorboard] Log text prediction in evaluation #163
@TevenLeScao also suggested that we make inference work in Meg-DS: a very simple greedy search. The motivation is that teacher forcing won't tell us much about the model (it's very similar to validation loss), whereas greedy search will show how the model actually infers. Personally I don't agree with the statement that teacher forcing won't tell us much, but I do agree that running actual inference in Meg-DS will probably allow us to notice bugs very quickly.
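For reference, the "very simple greedy search" described above can be sketched as a plain PyTorch loop. This is a minimal sketch, not the Meg-DS implementation: it assumes a `model` callable that maps token ids `[batch, seq]` to logits `[batch, seq, vocab]`, and the function name `greedy_generate` is my own.

```python
import torch


def greedy_generate(model, input_ids, max_new_tokens=32, eos_token_id=None):
    """Minimal greedy decoding: repeatedly append the argmax of the last-position logits.

    Assumes `model(input_ids)` returns logits of shape [batch, seq, vocab].
    """
    model.eval()
    with torch.no_grad():
        for _ in range(max_new_tokens):
            logits = model(input_ids)                              # [batch, seq, vocab]
            next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)  # [batch, 1]
            input_ids = torch.cat([input_ids, next_token], dim=-1)
            # Stop early once every sequence in the batch has emitted EOS.
            if eos_token_id is not None and (next_token == eos_token_id).all():
                break
    return input_ids
```

Unlike teacher forcing, each step here conditions on the model's own previous prediction, which is why decoding bugs (e.g. broken KV handling or position ids) surface quickly.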
Hey @thomasw21. Is this still needed? If so, I'd love to take it on.
Hey! We have finished training BLOOM, so the tensorboard integration might not be required anymore. However, I think having a generation engine in Meg-DS would be greatly appreciated, as we currently rely on our
I see. I'd like to help with that, then. Where would be the best place for this generation engine?
@KMFODA @thomasw21, see #328
IMO this issue is different: we want an inference mechanism within Meg-DS, without having to convert to
Sorry, I'm new to this repo. I meant to ask: where in the repo itself should this generation engine live?
Hmm, @thomasw21 |
@KMFODA, currently I am planning to create a standalone library. For now, I am adding to this repo itself.
I mean you can probably create a |
@thomasw21, I am not sure how this differs from the PR I pointed to above ^^. Can you explain?
If you don't have |
Oh, I think I understand the issue now.
@mayank31398 yup! Essentially this is what this issue is about. |
A very useful tool for understanding model performance beyond the loss: actually show what the predictions are.
It'd be very useful to be able to "see" the model's output during evaluation in text format. These outputs should be logged to TensorBoard. TensorBoard's text logging supports markdown, so the predictions can be put in bold.
Maybe we can log only the first batch, as it should already give us a good number of examples.
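A minimal sketch of what this logging could look like. It assumes a writer object with the `add_text(tag, text, global_step)` interface of `torch.utils.tensorboard.SummaryWriter`; the helper name `log_predictions` and the tag `"eval/predictions"` are illustrative, not existing Meg-DS names.

```python
def log_predictions(writer, prompts, predictions, step, tag="eval/predictions"):
    """Log prompt/prediction pairs as one markdown table.

    TensorBoard's `add_text` renders markdown, so predictions are bolded here.
    Intended to be called once per evaluation, on the first batch only.
    """
    lines = ["| prompt | prediction |", "| --- | --- |"]
    for prompt, pred in zip(prompts, predictions):
        lines.append(f"| {prompt} | **{pred}** |")
    writer.add_text(tag, "\n".join(lines), global_step=step)
```

In the evaluation loop, this would be guarded by something like `if batch_idx == 0:` so that only the first batch is logged, keeping the TensorBoard event files small.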