Signed-off-by: Wenqi Li <wenqil@nvidia.com>
Showing 4 changed files with 64 additions and 1 deletion.
```diff
@@ -6,5 +6,6 @@ What's New
 .. toctree::
    :maxdepth: 1

+   whatsnew_0_7.md
    whatsnew_0_6.md
    whatsnew_0_5.md
```

# What's new in 0.7 🎉🎉

- Performance enhancements with profiling and tuning guides
- Major usability improvements in `monai.transforms`
- Reimplementing state-of-the-art Kaggle solutions
- Vision-language multimodal transformer architectures

## Performance enhancements with profiling and tuning guides

Model training is often a time-consuming step during deep learning development,
especially for medical imaging applications. Even with powerful hardware (e.g.
CPU/GPU with large RAM), workflows often require careful profiling and tuning
to achieve high performance. MONAI has been focusing on performance
enhancements, and in this version [a fast model training
guide](https://github.com/Project-MONAI/tutorials/blob/master/acceleration/fast_model_training_guide.md)
is provided to help build highly performant workflows, with a comprehensive
overview of profiling tools and practical optimization strategies. The
following figure shows the use of [NVIDIA Nsight™
Systems](https://developer.nvidia.com/nsight-systems) for system-wide
performance analysis during a performance enhancement study.

![nsight_vis](../images/nsight_comparison.png)
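Before reaching for system-wide tools such as Nsight Systems, a framework-level pass can already reveal operator hotspots. The sketch below uses PyTorch's built-in `torch.profiler` as an illustrative starting point; it is not taken from the guide, and the tiny model is hypothetical:

```python
# Minimal operator-level profiling sketch (illustrative; the MONAI guide also
# covers Nsight Systems and other tools for system-wide analysis).
import torch
from torch.profiler import ProfilerActivity, profile

# A hypothetical tiny model and input, just to have something to profile.
model = torch.nn.Sequential(
    torch.nn.Conv2d(1, 8, kernel_size=3, padding=1),
    torch.nn.ReLU(),
)
x = torch.randn(2, 1, 64, 64)

with profile(activities=[ProfilerActivity.CPU], record_shapes=True) as prof:
    model(x)

# Summarize time spent per operator to locate hotspots.
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
```

On a CUDA machine, adding `ProfilerActivity.CUDA` to `activities` extends the same summary to GPU kernels.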

With these profiling techniques, several typical use cases were studied and
optimized to improve training efficiency. The following figure shows that fast
training using MONAI can be 20 times faster than a regular baseline ([learn
more](https://github.com/Project-MONAI/tutorials/blob/master/acceleration/fast_training_tutorial.ipynb)).

![fast_training](../images/fast_training.png)

## Major usability improvements in `monai.transforms` for NumPy/PyTorch inputs and backends

MONAI is starting to roll out major usability enhancements for the
`monai.transforms` module. Many transforms now support both NumPy arrays and
PyTorch tensors as input types and computational backends.
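The dual-backend idea can be sketched as follows. This is an illustrative pattern, not MONAI's actual implementation: one transform body that accepts either a NumPy array or a PyTorch tensor, computes with the matching backend, and returns the same type it was given.

```python
# Sketch of the NumPy/PyTorch dual-backend pattern (illustrative only; the
# function name and signature are hypothetical, not MONAI's API).
import numpy as np
import torch


def scale_intensity(img, minv=0.0, maxv=1.0):
    """Linearly rescale intensities to [minv, maxv]; backend follows the input type."""
    mina, maxa = img.min(), img.max()
    norm = (img - mina) / (maxa - mina)  # same expression works for both backends
    return norm * (maxv - minv) + minv


arr = scale_intensity(np.array([0.0, 5.0, 10.0]))        # NumPy in -> NumPy out
ten = scale_intensity(torch.tensor([0.0, 5.0, 10.0]))    # tensor in -> tensor out
print(arr, ten)
```

Because no conversion happens inside the transform, a tensor that already lives on the GPU stays there, which is what enables the GPU-side preprocessing described next.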

One benefit of these enhancements is that users can now better leverage GPUs
for preprocessing. By transferring the input data onto the GPU using
`ToTensor` or `EnsureType` and then applying GPU-based transforms to the data,
[the spleen segmentation
tutorial](https://github.com/Project-MONAI/tutorials/blob/master/acceleration/fast_training_tutorial.ipynb)
demonstrates the potential of these flexible modules for fast and efficient
training.
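The data flow described above can be sketched with plain PyTorch. This is not the tutorial's code; `torch.as_tensor` stands in for the role of `ToTensor`/`EnsureType`, and the transforms and array sizes are illustrative, with a CPU fallback so the sketch runs anywhere:

```python
# Illustrative sketch of GPU-side preprocessing: move the data to the device
# once, then run all subsequent tensor-based transforms there.
import numpy as np
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

img_np = np.random.rand(1, 96, 96, 96).astype(np.float32)  # e.g. a CT volume
img = torch.as_tensor(img_np, device=device)  # plays the role of ToTensor/EnsureType

# Subsequent transforms operate on `img` directly on `device`.
img = (img - img.mean()) / img.std()   # intensity normalization
img = torch.flip(img, dims=[1])        # a simple spatial augmentation

print(img.device, tuple(img.shape))
```

The key point is that the host-to-device copy happens once, up front, instead of once per transform.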

## Reimplementing state-of-the-art Kaggle solutions

With this release, we actively evaluate and enhance the quality and
flexibility of the MONAI core modules, using public Kaggle challenges as a
testbed. [A
reimplementation](https://github.com/Project-MONAI/tutorials/tree/master/kaggle/RANZCR/4th_place_solution)
of a state-of-the-art solution from the [Kaggle RANZCR CLiP - Catheter and
Line Position
Challenge](https://www.kaggle.com/c/ranzcr-clip-catheter-line-classification)
is made available in this version.

## Vision-language multimodal transformers

In this release, MONAI adds support for training multimodal (vision +
language) transformers that can handle both image and textual data. MONAI
introduces the `TransCheX` model, which consists of vision, language, and
mixed-modality transformer layers for processing chest X-rays and their
corresponding radiological reports within a unified framework. Beyond the
default `TransCheX` configuration, users have the flexibility to alter the
architecture by varying the number of vision, language, and mixed-modality
layers and by customizing the classification head. The model can also be
initialized from pre-trained BERT language models for fine-tuning.
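As a rough illustration of this layered design, the sketch below stacks separate vision, language, and mixed-modality encoder layers using stock PyTorch modules. All names, sizes, and layer counts here are hypothetical; this is a sketch of the general architecture idea, not MONAI's `TransCheX` implementation:

```python
# Hypothetical sketch of a vision + language + mixed-modality layer stack
# (not MONAI's TransCheX code; dimensions and layer counts are illustrative).
import torch
import torch.nn as nn


def make_layer(dim: int) -> nn.TransformerEncoderLayer:
    return nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)


class TinyMultimodalTransformer(nn.Module):
    def __init__(self, dim=32, n_vision=1, n_language=1, n_mixed=1, n_classes=2):
        super().__init__()
        self.vision = nn.ModuleList([make_layer(dim) for _ in range(n_vision)])
        self.language = nn.ModuleList([make_layer(dim) for _ in range(n_language)])
        self.mixed = nn.ModuleList([make_layer(dim) for _ in range(n_mixed)])
        self.head = nn.Linear(dim, n_classes)  # customizable classification head

    def forward(self, img_tokens, txt_tokens):
        for blk in self.vision:
            img_tokens = blk(img_tokens)        # image-only stream
        for blk in self.language:
            txt_tokens = blk(txt_tokens)        # text-only stream
        fused = torch.cat([img_tokens, txt_tokens], dim=1)
        for blk in self.mixed:
            fused = blk(fused)                  # cross-modality fusion
        return self.head(fused.mean(dim=1))     # pooled logits


model = TinyMultimodalTransformer()
logits = model(torch.randn(2, 16, 32), torch.randn(2, 8, 32))
print(logits.shape)  # torch.Size([2, 2])
```

Varying `n_vision`, `n_language`, and `n_mixed`, or swapping `head`, mirrors the kind of architectural flexibility described above; in the real model, the language stream would additionally be initialized from pre-trained BERT weights.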