forked from openvinotoolkit/training_extensions
Add doc for fast data loading (openvinotoolkit#2069)
* docs: add fast data loading
* docs: add augmix
* docs: use reference
* docs: make first character of words capital
* docs: add simple example in cli command
1 parent 1608b9f
commit ea7fb12
Showing 4 changed files with 87 additions and 1 deletion.
73 changes: 73 additions & 0 deletions
docs/source/guide/explanation/additional_features/fast_data_loading.rst
@@ -0,0 +1,73 @@
Fast Data Loading
=================

OpenVINO™ Training Extensions provides several ways to boost model training speed,
one of which is fast data loading.

===================
Faster Augmentation
===================

******
AugMix
******
AugMix [1]_ is a simple yet powerful augmentation technique
that improves the robustness and uncertainty estimates of image classification tasks.
OpenVINO™ Training Extensions implements it in `Cython <https://cython.org/>`_ for faster augmentation.
Users do not need to configure anything, as the cythonized AugMix is used by default.

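For readers unfamiliar with the technique, the mixing step of AugMix can be sketched in plain Python as below. This is a simplified illustration of the idea from [1]_, not the Cython implementation shipped with OpenVINO™ Training Extensions: the function name ``augmix``, its parameters, and the three toy operations are hypothetical, and the consistency loss described in the paper is omitted.

```python
import random


def augmix(image, ops, rng, width=3, max_depth=3, alpha=1.0):
    """Mix `width` augmentation chains and blend the result with `image`.

    `image` is a list of rows of floats in [0, 255] (a toy grayscale image).
    """
    # Dirichlet(alpha, ..., alpha) weights, sampled as normalized Gamma draws.
    gammas = [rng.gammavariate(alpha, 1.0) for _ in range(width)]
    total = sum(gammas)
    chain_weights = [g / total for g in gammas]
    # Beta(alpha, alpha) factor that blends the mixture with the original.
    blend = rng.betavariate(alpha, alpha)

    mixed = [[0.0] * len(row) for row in image]
    for weight in chain_weights:
        augmented = image
        # Each chain applies between 1 and `max_depth` randomly chosen operations.
        for _ in range(rng.randint(1, max_depth)):
            augmented = rng.choice(ops)(augmented)
        for r, row in enumerate(augmented):
            for c, value in enumerate(row):
                mixed[r][c] += weight * value

    return [
        [(1.0 - blend) * orig + blend * mix for orig, mix in zip(orig_row, mix_row)]
        for orig_row, mix_row in zip(image, mixed)
    ]


# Toy stand-ins for real augmentation operations (hypothetical, for illustration).
def flip_horizontal(img):
    return [row[::-1] for row in img]


def invert(img):
    return [[255.0 - v for v in row] for row in img]


def shift_rows(img):
    return img[-1:] + img[:-1]


rng = random.Random(0)
image = [[float(rng.randint(0, 255)) for _ in range(4)] for _ in range(4)]
result = augmix(image, [flip_horizontal, invert, shift_rows], rng)
```

Because the chain weights sum to one and the final blend is convex, the output stays within the value range of the inputs as long as every operation does.
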
=======
Caching
=======

*****************
In-Memory Caching
*****************
OpenVINO™ Training Extensions provides in-memory caching, which keeps decoded images in main memory.
If the batch size is large, as is common for classification tasks, or if the dataset contains
high-resolution images, image decoding can account for a non-negligible overhead
in data pre-processing.
In those cases, one can enable in-memory caching to maximize GPU utilization and reduce model
training time.

.. code-block::

    $ otx train --mem-cache-size=8GB ..

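The idea behind a size-limited cache of decoded images can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual OpenVINO™ Training Extensions code: ``MemCache``, ``load_image``, and the stand-in decoder are hypothetical names, and the policy of simply refusing new entries once the byte budget is used up is an assumption made for the example.

```python
class MemCache:
    """Keep decoded images in memory until a byte budget is exhausted."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._items = {}

    def get(self, key):
        return self._items.get(key)

    def put(self, key, data):
        # Assumed policy: never evict, just stop caching when the budget is full.
        if key in self._items or self.used_bytes + len(data) > self.max_bytes:
            return False
        self._items[key] = data
        self.used_bytes += len(data)
        return True


def load_image(path, cache, decode):
    """Return decoded pixels, decoding only on a cache miss."""
    pixels = cache.get(path)
    if pixels is None:
        pixels = decode(path)
        cache.put(path, pixels)
    return pixels


decode_calls = []


def fake_decode(path):
    # Stand-in for an expensive image decoder; returns 4 bytes of "pixels".
    decode_calls.append(path)
    return bytes(4)


cache = MemCache(max_bytes=6)
load_image("a.jpg", cache, fake_decode)
load_image("a.jpg", cache, fake_decode)  # served from memory, no second decode
```

After the first epoch, every image that fits in the budget is returned without decoding, which is where the saving in pre-processing time comes from.
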
***************
Storage Caching
***************

OpenVINO™ Training Extensions uses `Datumaro <https://github.com/openvinotoolkit/datumaro>`_
under the hood for dataset management.
Since Datumaro `supports <https://openvinotoolkit.github.io/datumaro/latest/docs/explanation/formats/arrow.html>`_
`Apache Arrow <https://arrow.apache.org/overview/>`_, OpenVINO™ Training Extensions
can exploit fast data loading from memory-mapped Arrow files at the expense of storage consumption.

.. code-block::

    $ otx train .. params --algo_backend.storage_cache_scheme JPEG/75

The cache is saved in ``$HOME/.cache/otx`` by default.
One can change this location by setting the ``OTX_CACHE`` environment variable.

.. code-block::

    $ OTX_CACHE=/path/to/cache otx train .. params --algo_backend.storage_cache_scheme JPEG/75

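The lookup order described above (the ``OTX_CACHE`` environment variable first, then ``$HOME/.cache/otx``) can be expressed as a small helper. The function ``resolve_cache_dir`` is hypothetical and only mirrors the documented behavior; it is not taken from the OpenVINO™ Training Extensions source.

```python
import os
from pathlib import Path


def resolve_cache_dir(environ=None):
    """Return the cache directory: OTX_CACHE if set, else ~/.cache/otx."""
    environ = os.environ if environ is None else environ
    override = environ.get("OTX_CACHE")
    if override:
        return Path(override)
    return Path.home() / ".cache" / "otx"


custom_dir = resolve_cache_dir({"OTX_CACHE": "/path/to/cache"})
default_dir = resolve_cache_dir({})
```
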
Please refer to the `Datumaro document <https://openvinotoolkit.github.io/datumaro/latest/docs/explanation/formats/arrow.html#export-to-arrow>`_
for the available schemes, but we recommend ``JPEG/75`` for fast data loading.

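To give a feel for why a memory-mapped cache file loads quickly, the sketch below reads one record out of a file through Python's standard ``mmap`` module without reading the whole file into memory. It does not use the Arrow format or Datumaro; the fixed-size record layout and the helper names ``write_records`` and ``read_record`` are assumptions made purely for illustration.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-size record: four little-endian float32 values.
RECORD = struct.Struct("<4f")


def write_records(path, records):
    """Write records back to back so record i starts at i * RECORD.size."""
    with open(path, "wb") as f:
        for record in records:
            f.write(RECORD.pack(*record))


def read_record(path, index):
    """Map the file and decode only the requested record."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return RECORD.unpack_from(mapped, index * RECORD.size)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cache.bin")
    write_records(path, [(0.0, 1.0, 2.0, 3.0), (4.0, 5.0, 6.0, 7.0)])
    second = read_record(path, 1)
```

The operating system pages in only the parts of the file that are touched, so random access to individual samples avoids both a full file read and a separate copy of the data.
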
.. [1] Dan Hendrycks, Norman Mu, Ekin D. Cubuk, Barret Zoph, Justin Gilmer, and Balaji Lakshminarayanan. "AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty." International Conference on Learning Representations, 2020.
@@ -11,3 +11,4 @@ Additional Features
auto_configuration
xai
noisy_label_detection
fast_data_loading
2 changes: 1 addition & 1 deletion
docs/source/guide/explanation/additional_features/noisy_label_detection.rst