
🚀 Composer v0.13.1

Introducing the composer PyPI package!

Composer v0.13.1 is released!

Composer can now also be installed via pip using the new composer PyPI package:

pip install composer==0.13.1

The legacy package name still works via pip:

pip install mosaicml==0.13.1

Note: The mosaicml==0.13.0 PyPI package was yanked due to some minor packaging issues discovered after release. The package was re-released as Composer v0.13.1, so these release notes cover both v0.13.0 and v0.13.1.
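To confirm which version is installed, a quick check from Python:

import composer

# Both package names provide the same composer library
print(composer.__version__)  # expected output: 0.13.1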

New Features

  1. 🤙 New and Updated Callbacks

    • New HealthChecker Callback (#2002)

      The callback logs a warning if the GPUs on a given node appear to be in poor health (low utilization). It can also be configured to send a Slack message, as sketched after the example below!

      from composer import Trainer
      from composer.callbacks import HealthChecker
      
      # Warn if GPU utilization difference drops below 10%
      health_checker = HealthChecker(
          threshold=10,
      )
      
      # Construct Trainer
      trainer = Trainer(
          ...,
          callbacks=health_checker,
      )
      
      # Train!
      trainer.fit()
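      To send Slack alerts, the callback can be pointed at an incoming webhook. A minimal sketch, assuming a slack_webhook_url argument and an environment-provided webhook URL (sending messages requires the slack_sdk package, see #2031):

      import os

      from composer.callbacks import HealthChecker

      # Assumed configuration: slack_webhook_url is taken to accept a Slack
      # incoming-webhook URL, posted to when poor GPU health is detected
      health_checker = HealthChecker(
          threshold=10,
          slack_webhook_url=os.environ.get('SLACK_WEBHOOK_URL'),
      )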
    • Updated MemoryMonitor to report memory in gigabytes (GB) (#1940)

    • New RuntimeEstimator Callback (#1991)

      Estimate the remaining runtime of your job! The callback approximates the time remaining by observing the current throughput and comparing it to the number of batches remaining.

      from composer import Trainer
      from composer.callbacks import RuntimeEstimator
      
      # Construct trainer with RuntimeEstimator callback
      trainer = Trainer(
          ...,
          callbacks=RuntimeEstimator(),
      )
      
      # Train!
      trainer.fit()
    • Updated SpeedMonitor throughput metrics (#1987)

      Expands the throughput metrics to track several different units, both in aggregate and per device:

      • throughput/batches_per_sec and throughput/device/batches_per_sec
      • throughput/tokens_per_sec and throughput/device/tokens_per_sec
      • throughput/flops_per_sec and throughput/device/flops_per_sec
      • throughput/device/samples_per_sec

      Also adds a throughput/device/mfu metric to compute per-device MFU. Simply enable the SpeedMonitor callback as usual to log these new metrics! Please see the SpeedMonitor documentation for more information.
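      For example, a minimal sketch; window_size (the number of batches in the moving-average window used to smooth throughput) is assumed from the callback's existing interface:

      from composer import Trainer
      from composer.callbacks import SpeedMonitor

      # Log the expanded throughput metrics, averaged over 100 batches
      trainer = Trainer(
          ...,
          callbacks=SpeedMonitor(window_size=100),
      )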

  2. ⣿ FSDP Sharded Checkpoints (#1902)

    Users can now specify the state_dict_type in the fsdp_config dictionary to enable sharded checkpoints. For example:

    from composer import Trainer
    
    fsdp_config = {
        'sharding_strategy': 'FULL_SHARD',
        'state_dict_type': 'local',
    }
    
    trainer = Trainer(
        ...,
        fsdp_config=fsdp_config,
        save_folder='checkpoints',
        save_filename='ba{batch}_rank{rank}.pt',
        save_interval='10ba',
    )

    Please see the PyTorch FSDP docs and Composer's Distributed Training notes for more information.
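    To resume from sharded checkpoints, each rank loads its own shard. A minimal sketch, assuming load_path supports the same {rank} format variable used in save_filename:

    from composer import Trainer

    trainer = Trainer(
        ...,
        fsdp_config=fsdp_config,
        # Assumption: {rank} is substituted per process, mirroring the
        # save_filename template above
        load_path='checkpoints/ba10_rank{rank}.pt',
    )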

  3. 🤗 HuggingFace Improvements

    • Update HuggingFaceModel class to support encoder-decoder batches without decoder_input_ids (#1950)
    • Allow evaluation metrics to be passed to HuggingFaceModel directly (#1971)
    • Add a utility function to load a Composer checkpoint of a HuggingFaceModel and write out the expected config.json and pytorch_model.bin in the HuggingFace pretrained folder (#1974), as sketched after this list
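    A minimal sketch of the new checkpoint utility from #1974; the exact import path and the file paths below should be treated as assumptions:

    from composer.models import write_huggingface_pretrained_from_composer_checkpoint

    # Extracts config.json and pytorch_model.bin from a Composer checkpoint
    # into a folder loadable with transformers' from_pretrained
    write_huggingface_pretrained_from_composer_checkpoint(
        'checkpoints/ep1.pt',  # hypothetical Composer checkpoint path
        'hf_pretrained',       # hypothetical output folder
    )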
  4. 🛟 Nvidia H100 Alpha Support - Added amp_fp8 data type

    In preparation for H100's arrival, we've added the amp_fp8 precision type. Currently, setting amp_fp8 enables a new precision context using transformer_engine.pytorch.fp8_autocast. For more details, please see NVIDIA's new Transformer Engine and the specific fp8 recipe we utilize.

    from composer import Trainer
    
    trainer = Trainer(
        ...,
        precision='amp_fp8',
    )

API changes

  • The torchmetrics package has been upgraded to 0.11.x.

    The torchmetrics.Accuracy metric now requires a task argument, which can take a value of binary, multiclass, or multilabel. Please see the Torchmetrics Accuracy docs for details.

    Additionally, since specifying task='multiclass' requires an additional num_classes field, we've updated ComposerClassifier to accept a num_classes argument. Please see PRs #2017 and #2025 for additional details.
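    A minimal sketch of the updated usage; the module and the num_classes value are illustrative:

    import torch.nn as nn
    from torchmetrics import Accuracy
    from composer.models import ComposerClassifier

    # torchmetrics 0.11.x requires task; 'multiclass' also requires num_classes
    metric = Accuracy(task='multiclass', num_classes=10)

    # ComposerClassifier now accepts the matching num_classes argument
    model = ComposerClassifier(module=nn.Linear(32, 10), num_classes=10)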

  • Surgery algorithms used in functional form now return None (#1543)
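    For example, a minimal sketch using apply_blurpool; the toy model is illustrative, and the same behavior applies to any surgery algorithm used functionally:

    import torch.nn as nn

    import composer.functional as cf

    model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.MaxPool2d(2))

    # The model is modified in place; the functional API now returns None
    ret = cf.apply_blurpool(model)
    assert ret is None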

Deprecations

  • Deprecate HFCrossEntropy and Perplexity (#1857)
  • Remove Jenkins CI (#1943, #1954)
  • Changed the DeprecationWarning to a Warning when specifying ProgressBarLogger or ConsoleLogger in loggers (#1846)

Bug Fixes

  • Fixed an issue introduced in 0.12.1 where HuggingFaceModel crashes if config.return_dict = False (#1948)
  • Refactor EMA to improve memory efficiency (#1941)
  • Make wandb checkpoint logging compatible with wandb model registry (#1973)
  • Fix ICL race conditions (#1978)
  • Update epoch metric name to trainer/epoch (#1986)
  • Reset scaler (#1999)
  • Sync the optimization logger across ranks (#1970)
  • Update Docker images to resolve vulnerability scan issues (#2007)
  • Fix eval duplicate logging issue (#2018)
  • Extend test and patch bug (#2028)
  • Protect against missing slack_sdk import (#2031)

Known Issues

  • Docker Image Security Vulnerability
    • CVE-2022-45907: The mosaicml/pytorch:1.12.1*, mosaicml/pytorch:1.11.0*, mosaicml/pytorch_vision:1.12.1*, and mosaicml/pytorch_vision:1.11.0* images are impacted; they remain available for legacy use cases. We recommend users upgrade to images with PyTorch >1.13. The affected images will be removed in the next Composer release.

Full Changelog: v0.12.1...v0.13.1