
Add model-last saving mechanism to pretraining #12459

Merged (9 commits) on Apr 3, 2023

Conversation

@thomashacker (Contributor) commented on Mar 23, 2023

Description

This PR takes inspiration from the training loop and saves the weights from the last epoch of pretraining as model-last.bin rather than only as model<last_epoch>.bin. The aim is to make it easier for users to work with pretrained weights in an automated workflow (e.g. spaCy projects) and to reduce manual adjustments based on the max_epochs value set in the training config.
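
For illustration, here is a minimal sketch of the proposed saving behaviour. This is not the actual spaCy implementation: `train_one_epoch`, `save_model`, and the serialization call are simplified placeholders.

```python
from pathlib import Path
from typing import Any, Iterable


def train_one_epoch(model: Any, corpus: Iterable[Any]) -> None:
    # Hypothetical stand-in for one pretraining epoch; the real logic lives
    # in spaCy's pretraining loop and is not shown in this PR thread.
    pass


def pretrain(model: Any, corpus: Iterable[Any], output_dir: Path, max_epochs: int) -> None:
    def save_model(epoch: int, is_last: bool = False) -> None:
        # Numbered per-epoch files are kept as before; the most recent weights
        # are additionally written under a fixed name, so downstream workflows
        # don't need to know max_epochs to find them.
        name = "model-last.bin" if is_last else f"model{epoch}.bin"
        (output_dir / name).write_bytes(model.to_bytes())  # placeholder serialization

    epoch = 0
    try:
        for epoch in range(max_epochs):
            train_one_epoch(model, corpus)
            save_model(epoch)
    finally:
        # Mirrors the "finally block" mentioned in the commit list: even if
        # pretraining is interrupted, the latest weights are saved as model-last.bin.
        save_model(epoch, is_last=True)
```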

Types of change

Feature

Checklist

  • I confirm that I have the right to submit this contribution under the project's MIT license.
  • I ran the tests, and all new and existing tests passed.
  • My changes don't require a change to the documentation, or if they do, I've added all required information.

@thomashacker added the labels enhancement (Feature requests and improvements) and feat / tok2vec (Token-to-vector layer and pretraining) on Mar 23, 2023
@thomashacker (Contributor, Author) commented on Mar 23, 2023

I'll fix this and add some unit tests. If we decide to implement this feature, do we want to mention it in the documentation?

@thomashacker changed the title from "Add model-latest saving mechanism to pretraining" to "Add model-last saving mechanism to pretraining" on Mar 27, 2023
thomashacker and others added 2 commits March 27, 2023 14:12
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
@adrianeboyd (Contributor) commented:

And I would definitely add this to the docs.

@adrianeboyd (Contributor) commented:

The CI seems to have gotten into a weird state, let me close and reopen this.

@adrianeboyd reopened this on Mar 29, 2023
@svlandeg (Member) left a comment


Looks good to me.

The one thing I wonder about is whether this could affect existing workflows: users might iterate over all .bin files in the output folder, assuming the names all look like modelX.bin with X an integer, and that could now fail when they encounter model-last.bin.

Could we make this opt-in for the upcoming bugfix release? Or at least provide an easy switch to turn this off if users don't want the additional duplicate file?
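
To make that concern concrete, here is a hypothetical downstream script of the kind described above; the folder name and parsing logic are illustrative, not taken from the PR. The first loop assumes every file matches model<int>.bin and would raise a ValueError once model-last.bin shows up; the second adds a small guard that skips the fixed-name file.

```python
from pathlib import Path

output_dir = Path("pretrain_output")  # hypothetical pretraining output folder

# Existing-workflow assumption: every file is named model<epoch>.bin.
for path in sorted(output_dir.glob("model*.bin")):
    epoch = int(path.stem.removeprefix("model"))  # raises ValueError on "model-last"
    print(f"epoch {epoch}: {path}")

# Defensive variant that tolerates the new fixed-name file.
for path in sorted(output_dir.glob("model*.bin")):
    suffix = path.stem.removeprefix("model")
    if suffix.isdigit():
        print(f"epoch {int(suffix)}: {path}")
```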

@adrianeboyd merged commit de32011 into explosion:master on Apr 3, 2023
adrianeboyd added a commit to adrianeboyd/spaCy that referenced this pull request Apr 3, 2023
* Adjust pretrain command

* change naming and add finally block

* Add unit test

* Add unit test assertions

* Update spacy/training/pretrain.py

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* change finally block

* Add to docs

* Update website/docs/usage/embeddings-transformers.mdx

* Add flag to skip saving model-last

---------

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>