Add model-last saving mechanism to pretraining #12459
Conversation
I'll fix and add some unit tests. If we decide to implement this feature, do we want to mention this in the documentation?
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
And I would definitely add this to the docs.
The CI seems to have gotten into a weird state, let me close and reopen this.
Looks good to me.
The one thing I wonder about is whether this could influence existing workflows somehow, like users iterating over all `.bin` files in the output folder, assuming the names all look like `modelX.bin` with `X` an int, and that this could now fail when they encounter `model-last.bin`.
Could we make this opt-in for the upcoming bugfix release? Or at least provide an easy switch to turn this off if users don't want the additional duplicate file?
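To illustrate the concern above: a minimal sketch (hypothetical helper, not from the PR) of a user-side workflow that iterates over pretraining checkpoints and assumes `modelX.bin` names with an integer `X`. A naive `int()` parse would fail on `model-last.bin`, so this version filters to purely numeric suffixes first.

```python
# Hypothetical sketch of a robust checkpoint-listing helper.
# It tolerates a model-last.bin alongside model0.bin, model1.bin, ...
import re
from pathlib import Path


def numbered_checkpoints(output_dir):
    """Return sorted (epoch, path) pairs for modelX.bin files,
    skipping non-numeric names such as model-last.bin."""
    pattern = re.compile(r"model(\d+)\.bin")
    pairs = []
    for path in Path(output_dir).glob("model*.bin"):
        match = pattern.fullmatch(path.name)
        if match:  # only purely numeric suffixes
            pairs.append((int(match.group(1)), path))
    return sorted(pairs)
```

A workflow built like this would keep working whether or not the duplicate `model-last.bin` file is written.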
* Adjust pretrain command
* Change naming and add finally block
* Add unit test
* Add unit test assertions
* Update spacy/training/pretrain.py
* Change finally block
* Add to docs
* Update website/docs/usage/embeddings-transformers.mdx
* Add flag to skip saving model-last

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
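The commit list above mentions a `finally` block and an opt-out flag for skipping `model-last`. A minimal sketch of that idea (simplified stand-in names, not spaCy's actual implementation): save a numbered checkpoint each epoch, then always duplicate the final state under a stable name unless the flag is set.

```python
# Hypothetical sketch: pretrain_loop, model_bytes_per_epoch, and
# skip_last are illustrative names, not spaCy's real API.
from pathlib import Path


def pretrain_loop(model_bytes_per_epoch, output_dir, skip_last=False):
    output_dir = Path(output_dir)
    epoch = -1
    try:
        for epoch, model_bytes in enumerate(model_bytes_per_epoch):
            # Numbered checkpoint per epoch, as before
            (output_dir / f"model{epoch}.bin").write_bytes(model_bytes)
    finally:
        # Even if training is interrupted, duplicate the most recent
        # state under a stable name -- unless the user opted out.
        if not skip_last and epoch >= 0:
            (output_dir / "model-last.bin").write_bytes(model_bytes)
```

The `finally` block means a stable `model-last.bin` exists even after an interrupted run, while `skip_last=True` preserves the old behavior for workflows that expect only numbered files.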
Description
This PR takes inspiration from the training loop to save the last epoch of pretraining as `model-last.bin` instead of `model<last_epoch>.bin`. The PR aims to make it easier for users to work with pretrained weights in an automated workflow (e.g. spaCy projects) and to reduce manual adjustment based on the `max_epochs` set in the training config.

Types of change
Feature
Checklist