What's New

New Tasks

added MuSR by @clefourrier in #375
Adds Global MMLU by @hynky1999 in #426
Add new Arabic benchmarks (5) and enhance existing tasks by @alielfilali01 in #372

New Features

Evaluate a model already loaded in memory for training / evaluation loop by @clefourrier in #390
Allowing a single prompt to use several formats for one eval by @clefourrier in #398
Autoscaling inference endpoints hardware by @clefourrier in #412
CLI new look and features (using typer) by @NathanHB in #407
Better-looking and more functional logging by @NathanHB in #415
Add litellm backend by @JoelNiklaus in #385

More Translation Literals by the Community

add Bashkir variants by @AigizK in #374
add Shan (shn) translation literals by @NoerNova in #376
Add Udmurt (udm) translation literals by @codemurt in #381
Add translation literals for the Belarusian language by @Kryuski in #382
added Tatar literals by @gaydmi in #383

New Docs

Add doc-builder doc-pr-upload GH Action by @albertvillanova in #411
Set up docs by @albertvillanova in #403
Add docstring docs by @albertvillanova in #413
Add missing models to docs by @albertvillanova in #419
Update docs about inference endpoints by @albertvillanova in #432
Upgrade deprecated GH Action cache@v2 by @albertvillanova in #456
Add EvaluationTracker to docs and fix its docstring by @albertvillanova in #464
Checkout PR merge commit for CI tests by @albertvillanova in #468

Bug Fixes and Refactoring

Allow AdapterModels to have custom tokens by @mapmeld in #306
Homogenize generation params by @clefourrier in #428
fix: cache directory variable by @NazimHAli in #378
Add trufflehog secrets detection by @albertvillanova in #429
greedy_until() fix by @vsabolcec in #344
Fixes a TypeError for generative metrics. by @JoelNiklaus in #386
Speed up Bootstrapping Computation by @JoelNiklaus in #409
Fix imports from model_config by @albertvillanova in #443
Fix wrong instructions and code for custom tasks by @albertvillanova in #450
Fix minor typos by @albertvillanova in #449
fix model parallel by @NathanHB in #481
add configs with their models by @clefourrier in #421
Fixes a TypeError in Sacrebleu. by @JoelNiklaus in #387
fix ukr/rus by @hynky1999 in #394
fix repeated cleanup by @anton-l in #399
Update instance type/size in endpoint model_config example by @albertvillanova in #401
Considering the case empty request list is given to base model by @sadra-barikbin in #250
Fix a tiny bug in PromptManager::FewShotSampler::_init_fewshot_sampling_random by @sadra-barikbin in #423
Fix splitting for generative tasks by @NathanHB in #400
Fixes an error with getting the golds from the formatted_docs. by @JoelNiklaus in #388
Fix ignored reuse_existing in config file by @albertvillanova in #431
Deprecate Obsolete Config Properties by @ParagEkbote in #433
fix: LightevalTaskConfig.stop_sequence attribute by @ryan-minato in #463
fix: scorer attribute initialization in ROUGE by @ryan-minato in #471
Delete endpoint on InferenceEndpointTimeoutError by @albertvillanova in #475
Remove unnecessary deepcopy in evaluation_tracker by @albertvillanova in #459
fix: CACHE_DIR Default Value in Accelerate Pipeline by @ryan-minato in #461
Fix warning about precedence of custom tasks over default ones in registry by @albertvillanova in #466
Implement TGI model config from path by @albertvillanova in #448

Significant community contributions

The following contributors have made significant changes to the library over the last release:

@clefourrier
- added MuSR (#375)
- Update README.md
- Use the programmatic interface with a model already loaded in memory (#390)
- Pr sadra (#393)
- Allowing a single prompt to use several formats for one eval (#398)
- Autoscaling inference endpoints (#412)
- add configs with their models (#421)
- Fix custom Arabic tasks (#440)
- Adds serverless endpoints back (#445)
- Homogenize generation params (#428)
@JoelNiklaus
- Fixes a TypeError for generative metrics. (#386)
- Fixes a TypeError in Sacrebleu. (#387)
- Fixes an error with getting the golds from the formatted_docs. (#388)
- Speed up Bootstrapping Computation (#409)
- Add litellm inference (#385)
@albertvillanova
- Update instance type/size in endpoint model_config example (#401)
- Typo in feature-request.md (#406)
- Add doc-builder doc-pr-upload GH Action (#411)
- Set up docs (#403)
- Add docstring docs (#413)
- Add missing models to docs (#419)
- Add trufflehog secrets detection (#429)
- Update docs about inference endpoints (#432)
- Fix ignored reuse_existing in config file (#431)
- Test inference endpoint model config parsing from path (#434)
- Fix imports from model_config (#443)
- Fix wrong instructions and code for custom tasks (#450)
- Fix minor typos (#449)
- Implement TGI model config from path (#448)
- Upgrade deprecated GH Action cache@v2 (#456)
- Add EvaluationTracker to docs and fix its docstring (#464)
- Remove unnecessary deepcopy in evaluation_tracker (#459)
- Fix warning about precedence of custom tasks over default ones in registry (#466)
- Checkout PR merge commit for CI tests (#468)
- Delete endpoint on InferenceEndpointTimeoutError (#475)
@NathanHB
- Fix splitting for generative tasks (#400)
- Nathan refacto cli (#407)
- redo logging (#415)
- option to list custom tasks (#425)
- fix model parallel (#481)
@ParagEkbote
- Deprecate Obsolete Config Properties (#433)
@alielfilali01
- Add new Arabic benchmarks (5) and enhance existing tasks (#372)
- Update arabic_evals.py: Fix custom Arabic tasks [2nd attempt] (#444)

v0.7.0
