v3.0.1 - Patch introducing new Trainer features, model card improvements and evaluator fixes
This patch release introduces some improvements for the SentenceTransformerTrainer, as well as some updates for the automatic model card generation. It also patches some minor evaluator bugs and a bug with `MatryoshkaLoss`. Lastly, every single Sentence Transformer model can now be saved and loaded with the safer `model.safetensors` files.
Install this version with
# Full installation:
pip install sentence-transformers[train]==3.0.1
# Inference only:
pip install sentence-transformers==3.0.1
SentenceTransformerTrainer improvements
- Implement gradient checkpointing for lower memory usage during training (#2717)
- Implement support for `push_to_hub=True` Training Argument, also implement `trainer.push_to_hub(...)` (#2718)
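As a rough illustration of the two Trainer items above, the sketch below collects the relevant training arguments in a plain dictionary and only builds the arguments object on request. The output directory and `hub_model_id` value are placeholders, and the helper name `build_training_args` is hypothetical; `gradient_checkpointing`, `push_to_hub` and `hub_model_id` are assumed to be accepted because `SentenceTransformerTrainingArguments` extends the `transformers` training arguments.

```python
# Keyword arguments for the training arguments object.
# The paths and the Hub model ID are placeholders, not real locations.
training_kwargs = {
    "output_dir": "models/my-embedding-model",          # hypothetical local path
    "gradient_checkpointing": True,                     # trade compute for lower memory use
    "push_to_hub": True,                                # upload checkpoints to the Hub
    "hub_model_id": "your-username/my-embedding-model", # hypothetical repository name
}


def build_training_args(kwargs=None):
    """Hypothetical helper: build the arguments object only when called.

    The import lives inside the function so that this snippet can be read
    and run without sentence-transformers installed.
    """
    from sentence_transformers import SentenceTransformerTrainingArguments

    return SentenceTransformerTrainingArguments(**(kwargs or training_kwargs))
```

With sentence-transformers installed, the result of `build_training_args()` would be passed as `args=` to the `SentenceTransformerTrainer`; after training, `trainer.push_to_hub(...)` can also be called manually.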
Model Cards
This patch release improves on the automatically generated model cards in several ways:
- Your training datasets are now automatically linked if they're on Hugging Face (#2711)
- A new `generated_from_trainer` tag is now also added (#2710)
- The automatically included widget examples are now improved, especially for question-answering. Previously, the widget could give examples of comparing two questions with each other (#2713)
- If you save a model locally, then load it again and upload it, it would previously still show
...
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
...
This now gets replaced with your new model ID on Hugging Face (#2714)
- The exact training dataset size is now included in the model metadata, rather than as a bucket of e.g. 1K<n<10K (#2728)
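To make the last item concrete, here is a minimal sketch of what "a bucket of e.g. 1K<n<10K" means. The function name `size_bucket` is hypothetical and the labels are assumed to follow the size-category naming used on the Hugging Face Hub; this is not the library's own implementation.

```python
def size_bucket(num_examples: int) -> str:
    """Map an exact dataset size to a coarse size-category label."""
    if num_examples < 1_000:
        return "n<1K"
    # (exclusive upper bound, label) pairs, smallest first.
    buckets = [
        (10_000, "1K<n<10K"),
        (100_000, "10K<n<100K"),
        (1_000_000, "100K<n<1M"),
        (10_000_000, "1M<n<10M"),
        (100_000_000, "10M<n<100M"),
        (1_000_000_000, "100M<n<1B"),
        (10_000_000_000, "1B<n<10B"),
        (100_000_000_000, "10B<n<100B"),
        (1_000_000_000_000, "100B<n<1T"),
    ]
    for upper_bound, label in buckets:
        if num_examples < upper_bound:
            return label
    return "n>1T"


exact_size = 5_432             # what the model card metadata now records
label = size_bucket(exact_size)  # what would have been recorded before
```

Recording the exact count keeps more information: the bucket can always be derived from the number, but not the other way around.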
Evaluators fixes
- The primary metric of evaluators in `SequentialEvaluator` would be ignored in the `scores` calculation (#2700)
- Fix confusing print statement in TranslationEvaluator when using `print_wrong_matches=True` (#1894)
- Fix bug that prevents you from customizing the `primary_metric` in `InformationRetrievalEvaluator` (#2701)
- Allow passing a list of evaluators to the STTrainer rather than a `SequentialEvaluator` (#2716)
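The first fix in this list can be pictured with a small stand-in. The sketch below is not the library's code: `DummyEvaluator` and `sequential_main_score` are hypothetical names, and the default of taking the last score as the main score is an assumption. It shows the intended behaviour, namely that an evaluator's `primary_metric`, when defined, decides which value enters the list of scores.

```python
class DummyEvaluator:
    """Hypothetical stand-in for an evaluator that returns a dict of metrics."""

    def __init__(self, metrics, primary_metric=None):
        self.metrics = metrics
        self.primary_metric = primary_metric

    def __call__(self):
        return dict(self.metrics)


def sequential_main_score(evaluators, main_score_function=lambda scores: scores[-1]):
    """Run evaluators in order and reduce their scores to a single main score."""
    scores = []
    for evaluator in evaluators:
        result = evaluator()
        if not isinstance(result, dict):
            # Plain numeric result: use it directly.
            scores.append(result)
        elif getattr(evaluator, "primary_metric", None) is not None:
            # Respect the evaluator's own primary metric.
            scores.append(result[evaluator.primary_metric])
        else:
            # Fall back to the first metric in the dict.
            scores.append(next(iter(result.values())))
    return main_score_function(scores)


first = DummyEvaluator({"accuracy": 0.9, "f1": 0.7}, primary_metric="f1")
second = DummyEvaluator({"map": 0.4, "ndcg": 0.6})

last_score = sequential_main_score([first, second])
best_score = sequential_main_score([first, second], main_score_function=max)
```

Here the first evaluator contributes its `f1` value rather than `accuracy`, because `primary_metric` is set.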
Losses fixes
- Fix `MatryoshkaLoss` crash if the first dimension is not the biggest (#2719)
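One way to make the order of the dimensions irrelevant is to sort them, together with their weights, from largest to smallest before use. The helper below is a hypothetical sketch of that idea with an assumed name and signature; it is not presented as the change made in #2719.

```python
def order_matryoshka_dims(dims, weights=None):
    """Return dims and their weights sorted from largest to smallest dim."""
    if not dims:
        raise ValueError("at least one dimension is required")
    if weights is None:
        # Assumption for this sketch: equal weighting when none is given.
        weights = [1.0] * len(dims)
    if len(dims) != len(weights):
        raise ValueError("dims and weights must have the same length")
    # Sort the (dim, weight) pairs together so each weight stays with its dim.
    pairs = sorted(zip(dims, weights), key=lambda pair: pair[0], reverse=True)
    sorted_dims, sorted_weights = zip(*pairs)
    return list(sorted_dims), list(sorted_weights)


dims, weights = order_matryoshka_dims([64, 768, 256], [0.5, 1.0, 0.75])
```

After this step the first entry is always the largest dimension, regardless of the order the caller used.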
Security
- Integrate safetensors with all modules, including Dense, LSTM, CNN, etc. to prevent needing pickled `pytorch_model.bin` anymore (#2722)
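For background on why avoiding pickled files is considered safer, the standard-library sketch below shows that unpickling can invoke a callable chosen by whoever produced the file. The `Payload` class is a hypothetical, deliberately harmless example that only calls the builtin `sorted`; it says nothing about how `pytorch_model.bin` or safetensors files are laid out internally.

```python
import pickle


class Payload:
    """Hypothetical object whose pickled form names a callable to run on load."""

    def __reduce__(self):
        # pickle stores "call sorted([3, 1, 2])" instead of the object's state.
        return (sorted, ([3, 1, 2],))


blob = pickle.dumps(Payload())

# Loading runs the stored call; the result is whatever the callable returns.
loaded = pickle.loads(blob)
```

The loaded value is the return value of `sorted`, not a `Payload` instance, which shows that the file's contents decided what code ran. A format that stores only tensor data and metadata does not have this property.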
All changes
- updating to evaluation_strategy by @higorsilvaa in #2686
- fix loss link by @Samoed in #2690
- Fix bug that restricts users from specifying custom primary_function in InformationRetrievalEvaluator by @hetulvp in #2701
- Fix a bug in SequentialEvaluator to use primary_metric if defined in evaluator. by @hetulvp in #2700
- [`fix`] Always override the originally saved version in the ST config by @tomaarsen in #2709
- [`model cards`] Also include HF datasets in the model card metadata by @tomaarsen in #2711
- Add "generated_from_trainer" tag to auto-generated model cards by @tomaarsen in #2710
- Fix confusing print statement in TranslationEvaluator by @NathanS-Git in #1894
- [`model cards`] Improve the widget example selection: not based on embeddings, better for QA by @tomaarsen in #2713
- [`model cards`] Replace 'sentence_transformers_model_id' from reused model if possible by @tomaarsen in #2714
- [`feat`] Allow passing a list of evaluators to the Trainer by @tomaarsen in #2716
- [`fix`] Fix gradient checkpointing to allow for much lower memory usage by @tomaarsen in #2717
- [`fix`] Implement `create_model_card` on the Trainer, allowing args.push_to_hub=True by @tomaarsen in #2718
- [`fix`] Fix `MatryoshkaLoss` crash if the first dimension is not the biggest by @tomaarsen in #2719
- Update models_en_sentence_embeddings.html by @saikartheekb in #2720
- [`typing`] Improve typing for many functions & add `py.typed` to satisfy `mypy` by @tomaarsen in #2724
- [`fix`] Fix edge case with evaluator being None by @tomaarsen in #2726
- [`simplify`] Set can_return_loss=True globally, instead of via the data collator by @tomaarsen in #2727
- [`feat`] Integrate safetensors with Dense, etc. modules too. by @tomaarsen in #2722
- [`model cards`] Specify the exact dataset size as a tag, will be bucketized by HF by @tomaarsen in #2728
New Contributors
- @higorsilvaa made their first contribution in #2686
- @hetulvp made their first contribution in #2701
- @NathanS-Git made their first contribution in #1894
- @saikartheekb made their first contribution in #2720
Full Changelog: v3.0.0...v3.0.1