adds whatsnew 07
Signed-off-by: Wenqi Li <wenqil@nvidia.com>
wyli committed Sep 23, 2021
1 parent 7d1cc11 commit 8700eca
Showing 4 changed files with 64 additions and 1 deletion.
Binary file added docs/images/nsight_comparison.png
1 change: 1 addition & 0 deletions docs/source/whatsnew.rst
@@ -6,5 +6,6 @@ What's New
.. toctree::
   :maxdepth: 1

   whatsnew_0_7.md
   whatsnew_0_6.md
   whatsnew_0_5.md
2 changes: 1 addition & 1 deletion docs/source/whatsnew_0_6.md
@@ -1,4 +1,4 @@
-# What's new in 0.6 🎉🎉
+# What's new in 0.6

- Decollating mini-batches as an essential post-processing step
- Pythonic APIs to load the pretrained models from Clara Train MMARs
62 changes: 62 additions & 0 deletions docs/source/whatsnew_0_7.md
@@ -0,0 +1,62 @@
# What's new in 0.7 🎉🎉

- Performance enhancements with profiling and tuning guides
- Major usability improvements in `monai.transforms`
- Reimplementing state-of-the-art Kaggle solutions
- Vision-language multimodal transformer architectures

## Performance enhancements with profiling and tuning guides

Model training is often a time-consuming step during deep learning development,
especially for medical imaging applications. Even with powerful hardware (e.g.,
CPUs/GPUs with large RAM), workflows often require careful profiling and tuning
to achieve high performance. MONAI has been focusing on performance
enhancements, and this version provides [a fast model training
guide](https://github.com/Project-MONAI/tutorials/blob/master/acceleration/fast_model_training_guide.md)
to help build highly performant workflows, with a comprehensive overview of
profiling tools and practical acceleration strategies. The following figure
shows [NVIDIA Nsight™ Systems](https://developer.nvidia.com/nsight-systems) being used for system-wide performance analysis
during a performance-enhancement study.
![nsight_vis](../images/nsight_comparison.png)
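
For a minimal flavor of how such timelines are produced, the sketch below
(assuming a CUDA-enabled PyTorch build; the loop body is a placeholder)
annotates a training loop with NVTX ranges via PyTorch's built-in bindings, so
each epoch and step shows up as a named region in the Nsight Systems timeline:

```python
import torch

# Mark loop phases with NVTX ranges so they appear as named regions
# in an Nsight Systems timeline (placeholder loop body).
for epoch in range(2):
    torch.cuda.nvtx.range_push(f"epoch_{epoch}")
    for step in range(10):
        torch.cuda.nvtx.range_push(f"step_{step}")
        # ... data loading, forward, backward, optimizer step ...
        torch.cuda.nvtx.range_pop()  # close the step range
    torch.cuda.nvtx.range_pop()  # close the epoch range
```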

Based on this profiling and tuning, several typical use cases were studied to
improve training efficiency. The following figure shows that fast
training with MONAI can be 20 times faster than a regular baseline ([learn
more](https://github.com/Project-MONAI/tutorials/blob/master/acceleration/fast_training_tutorial.ipynb)).
![fast_training](../images/fast_training.png)
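
For a flavor of the techniques involved, here is a minimal sketch (illustrative
only, not the tutorial itself) of two common levers it covers: caching
deterministic preprocessing results in RAM with `CacheDataset`, and
mixed-precision compute with PyTorch AMP:

```python
import torch
from monai.data import CacheDataset, DataLoader
from monai.transforms import EnsureTyped

# cache preprocessed samples in RAM so the transform chain runs only once
samples = [{"img": torch.rand(1, 64, 64, 64)} for _ in range(8)]
dataset = CacheDataset(data=samples, transform=EnsureTyped(keys="img"), cache_rate=1.0)
loader = DataLoader(dataset, batch_size=2)

# mixed precision reduces memory traffic and speeds up GPU math
use_amp = torch.cuda.is_available()
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)
for batch in loader:
    with torch.cuda.amp.autocast(enabled=use_amp):
        img = batch["img"]  # model forward and loss would go here
```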

## Major usability improvements in `monai.transforms` for NumPy/PyTorch inputs and backends

MONAI is rolling out major usability enhancements for the `monai.transforms`
module. Many transforms now support both NumPy arrays and PyTorch tensors as
input types and computational backends.
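
As a small illustration (assuming `ScaleIntensity` is among the transforms with
both backends), a single transform instance can consume either array type, and
the output type follows the input:

```python
import numpy as np
import torch
from monai.transforms import ScaleIntensity

# one transform instance handles both array libraries
scaler = ScaleIntensity(minv=0.0, maxv=1.0)
np_img = scaler(np.random.rand(1, 64, 64))  # NumPy in  -> NumPy out
pt_img = scaler(torch.rand(1, 64, 64))      # tensor in -> tensor out
```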

One benefit of these enhancements is that users can now better leverage GPUs
for preprocessing. By transferring the input data to the GPU with `ToTensor`
or `EnsureType` and then applying GPU-based transforms, [the spleen
segmentation tutorial](https://github.com/Project-MONAI/tutorials/blob/master/acceleration/fast_training_tutorial.ipynb)
shows the great potential of these flexible modules for fast and efficient
training.
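
A minimal sketch of this pattern (illustrative settings, with random data
standing in for a real image):

```python
import torch
from monai.transforms import EnsureType, RandGaussianNoise

device = "cuda:0" if torch.cuda.is_available() else "cpu"

# convert the input to a torch tensor, move it to the GPU, then apply
# randomized transforms directly on the device
img = EnsureType(data_type="tensor")(torch.rand(1, 96, 96, 96)).to(device)
img = RandGaussianNoise(prob=1.0, std=0.05)(img)  # computed on `device`
```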

## Reimplementing state-of-the-art Kaggle solutions

With this release, we actively evaluate and enhance the quality and flexibility
of the MONAI core modules, using public Kaggle challenges as a testbed. [A
reimplementation](https://github.com/Project-MONAI/tutorials/tree/master/kaggle/RANZCR/4th_place_solution)
of a state-of-the-art solution to the [Kaggle RANZCR CLiP - Catheter and Line
Position
Challenge](https://www.kaggle.com/c/ranzcr-clip-catheter-line-classification)
is made available in this version.

## Vision-language multimodal transformers

In this release, MONAI adds support for training multimodal (vision + language)
transformers that can handle both image and textual data. MONAI introduces the
`TransCheX` model, which consists of vision, language, and mixed-modality
transformer layers for processing chest X-rays and their corresponding
radiological reports within a unified framework. Users also have the
flexibility to alter the architecture by varying the number of vision,
language, and mixed-modality layers and by customizing the classification
head. In addition, the model can be initialized from pre-trained BERT language
models for fine-tuning.
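
As a rough sketch of this configurability (the class name and constructor
arguments below are assumptions for illustration, not a verified API), one
might instantiate the model along these lines:

```python
# hypothetical sketch -- the class name and arguments are assumptions,
# illustrating the configurable vision/language/mixed layer counts
from monai.networks.nets import Transchex

model = Transchex(
    in_channels=3,           # chest X-ray image channels
    img_size=(224, 224),
    patch_size=(32, 32),
    num_classes=2,
    num_language_layers=2,   # language transformer layers
    num_vision_layers=2,     # vision transformer layers
    num_mixed_layers=2,      # mixed-modality transformer layers
)
```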
