Add option to log validation generations to wandb #177
+69
−2
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Motivation
Often the summary of average/max/min reward is not enough information, and it's helpful to look at some real-world generations to see how the model's actual behavior is changing over time. This can be particularly helpful for debugging issues like the generation being cut off before reasoning finishes.
Change
This PR introduces a new
trainer.val_generations_to_log_to_wandb
config value, with a default of 0. If set to a number larger than 0, it logs that number of inputs/outputs/scores each time the validation set is generated and scored. It uses a wandb Table to do so, adding a single row for each validation set run.I choose to log the data in this format because it allows a user to easily see how the outputs for a given input change over time by looking down a column vertically.
Screenshot
Note: if there's already another way to accomplish this easily let me know! I was surprised not to find a way to see sample generations because I find that quite useful, so let me know if I'm missing something.