Merge back 1.2.1 RC1 & RC2 (#2086)
* Upgrade mmdeploy==0.14.0 from official PyPI (#2047)

* Bug fix: value of validation variable is changed after auto decrease batch size (#2053)

* Integrate new ignored loss in semantic segmentation (#2065)

* Remove unused modules in semantic segmentation (#2068)

* Add doc for fast data loading (#2069)

* Bug fix: set gpu_ids properly (#2071)

* Bug fix: Progress goes to 100% and back to 0% repeatedly during auto decrease batch size in Geti (#2074)

* Fix tiling 0 stride issue in parameter adapter (#2078)

* Update instance-segmentation tutorial documentation (#2082)

* Optimize YOLOX data pipeline and add unit test for get_subset of Datu… (#2075)

* Tiling Spatial Concatenation for OpenVINO IR (#2052)

* Add spatial concatenation to deployment demo (#2089)

---------

Signed-off-by: Songki Choi <songki.choi@intel.com>
Co-authored-by: Eunwoo Shin <eunwoo.shin@intel.com>
Co-authored-by: Soobee Lee <soobee.lee@intel.com>
Co-authored-by: Inhyuk Cho <andy.inhyuk.jo@intel.com>
Co-authored-by: Eugene Liu <eugene.liu@intel.com>
Co-authored-by: Harim Kang <harim.kang@intel.com>
Co-authored-by: Jaeguk Hyun <jaeguk.hyun@intel.com>
7 people authored May 3, 2023
1 parent e67ce53 commit a1f098d
Showing 27 changed files with 604 additions and 215 deletions.
14 changes: 14 additions & 0 deletions CHANGELOG.md
@@ -22,6 +22,20 @@ All notable changes to this project will be documented in this file.
- OpenVINO(==2022.3) IR inference is not working well on 2-stage models (e.g. Mask-RCNN) exported from torch==1.13.1
(working well up to torch==1.12.1) (<https://github.com/openvinotoolkit/training_extensions/issues/1906>)

## \[v1.2.1\]

### Enhancements

- Upgrade mmdeploy==0.14.0 from official PyPI (<https://github.com/openvinotoolkit/training_extensions/pull/2047>)
- Integrate new ignored loss in semantic segmentation (<https://github.com/openvinotoolkit/training_extensions/pull/2065>)
- Optimize YOLOX data pipeline (<https://github.com/openvinotoolkit/training_extensions/pull/2075>)
- Tiling Spatial Concatenation for OpenVINO IR (<https://github.com/openvinotoolkit/training_extensions/pull/2052>)

### Bug fixes

- Bug fix: value of validation variable is changed after auto decrease batch size (<https://github.com/openvinotoolkit/training_extensions/pull/2053>)
- Fix tiling 0 stride issue in parameter adapter (<https://github.com/openvinotoolkit/training_extensions/pull/2078>)

## \[v1.2.0\]

### New features
@@ -0,0 +1,73 @@
Fast Data Loading
=================

OpenVINO™ Training Extensions provides several ways to boost model training speed,
one of which is fast data loading.


===================
Faster Augmentation
===================


******
AugMix
******
AugMix [1]_ is a simple yet powerful augmentation technique
for improving the robustness and uncertainty estimates of image classification models.
OpenVINO™ Training Extensions implements it in `Cython <https://cython.org/>`_ for faster augmentation.
Users do not need to configure anything, as the Cythonized AugMix is used by default.
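
Below is a minimal NumPy sketch of the AugMix mixing scheme, for illustration only;
it is not the Cython implementation used internally, and ``operations`` stands for
an arbitrary list of image transforms.

.. code-block:: python

    import numpy as np

    def augmix(image, operations, width=3, depth=3, alpha=1.0):
        """Blend `width` randomly augmented chains, then mix with the original image."""
        chain_weights = np.random.dirichlet([alpha] * width)
        mix = np.zeros_like(image, dtype=np.float32)
        for weight in chain_weights:
            augmented = image.astype(np.float32)
            # Each chain applies between 1 and `depth` randomly chosen operations.
            for _ in range(np.random.randint(1, depth + 1)):
                op = operations[np.random.randint(len(operations))]
                augmented = op(augmented)
            mix += weight * augmented
        # Beta-sampled skip connection back to the clean image.
        m = np.random.beta(alpha, alpha)
        return ((1.0 - m) * image + m * mix).astype(image.dtype)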



=======
Caching
=======


*****************
In-Memory Caching
*****************
OpenVINO™ Training Extensions can cache decoded images in main memory.
If the batch size is large, as is common for classification tasks, or if the dataset
contains high-resolution images, image decoding can account for a non-negligible
overhead in data pre-processing.
In such cases, one can enable in-memory caching to maximize GPU utilization and
reduce model training time.


.. code-block::

    $ otx train --mem-cache-size=8GB ..

***************
Storage Caching
***************

OpenVINO™ Training Extensions uses `Datumaro <https://github.com/openvinotoolkit/datumaro>`_
under the hood for dataset management.
Since Datumaro `supports <https://openvinotoolkit.github.io/datumaro/latest/docs/explanation/formats/arrow.html>`_
`Apache Arrow <https://arrow.apache.org/overview/>`_, OpenVINO™ Training Extensions
can exploit fast data loading using a memory-mapped Arrow file at the expense of storage consumption.


.. code-block::

    $ otx train .. params --algo_backend.storage_cache_scheme JPEG/75

The cache is saved in ``$HOME/.cache/otx`` by default.
One can change the location by setting the ``OTX_CACHE`` environment variable.


.. code-block::

    $ OTX_CACHE=/path/to/cache otx train .. params --algo_backend.storage_cache_scheme JPEG/75

Please refer to the `Datumaro documentation <https://openvinotoolkit.github.io/datumaro/latest/docs/explanation/formats/arrow.html#export-to-arrow>`_
for the available schemes; we recommend ``JPEG/75`` for fast data loading.
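
For intuition on why the memory-mapped format is fast, below is a minimal ``pyarrow``
sketch, independent of OpenVINO™ Training Extensions (``dataset.arrow`` is a hypothetical
path): the file is opened without copying its contents into process memory.

.. code-block:: python

    import pyarrow as pa

    # Memory-map the Arrow IPC file: record batches are served from the
    # OS page cache instead of being copied into the Python process.
    with pa.memory_map("dataset.arrow", "r") as source:
        table = pa.ipc.open_file(source).read_all()

    print(table.num_rows, table.schema)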

.. [1] Dan Hendrycks, Norman Mu, Ekin D. Cubuk, Barret Zoph, Justin Gilmer, and Balaji Lakshminarayanan. "AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty." International Conference on Learning Representations, 2020.
@@ -11,3 +11,4 @@ Additional Features
auto_configuration
xai
noisy_label_detection
fast_data_loading
@@ -1,4 +1,4 @@
Noisy label detection
Noisy Label Detection
=====================

OpenVINO™ Training Extensions provide a feature for detecting noisy labels during model training.
12 changes: 12 additions & 0 deletions docs/source/guide/get_started/quick_start_guide/cli_commands.rst
@@ -273,6 +273,18 @@ For example, that is how you can change the learning rate and the batch size for
--learning_parameters.batch_size 16 \
--learning_parameters.learning_rate 0.001

You could also enable storage caching to boost data loading at the expense of storage:

.. code-block::

    (otx) ...$ otx train SSD --train-data-roots <path/to/train/root> \
                             --val-data-roots <path/to/val/root> \
                             params \
                             --algo_backend.storage_cache_scheme JPEG/75

.. note::

    Not all templates support the storage cache yet. We are working on extending the set of supported templates.


As can be seen from the parameters list, the model can be trained on multiple GPUs. To do so, simply specify a comma-separated list of GPU indices after the ``--gpus`` argument. Distributed data-parallel training will start on the GPUs you have specified.
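
For example, a sketch of a run on the first two GPUs (paths are placeholders):

.. code-block::

    (otx) ...$ otx train SSD --train-data-roots <path/to/train/root> \
                             --val-data-roots <path/to/val/root> \
                             --gpus 0,1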
