Add tiling documentation #2204

Closed
25 commits
eca1336
add tiling doc
eugene123tw May 30, 2023
10c63f1
add tiling strategy
eugene123tw May 30, 2023
e15e59c
update tiling doc
eugene123tw May 30, 2023
d4d305d
Merge branch 'develop' into eugene/tiling-doc
eugene123tw Jun 12, 2023
73e72f2
update doc
eugene123tw Jun 12, 2023
1de5686
update
eugene123tw Jun 12, 2023
a87efc2
update
eugene123tw Jun 12, 2023
d4c5a1d
update
eugene123tw Jun 12, 2023
1d1595c
update
eugene123tw Jun 12, 2023
c13a79d
Update base.txt
yunchu Jul 7, 2023
228a8ed
Update __init__.py
yunchu Jul 7, 2023
5dfd5fd
Update requirements.txt
yunchu Jul 7, 2023
e67d261
Temporarily skip visual prompting openvino integration test (#2323)
sungchul2 Jul 10, 2023
344f526
Fix import dm.DatasetSubset (#2324)
vinnamkim Jul 10, 2023
cfd7706
Fix semantic segmentation soft prediction dtype (#2322)
negvet Jul 10, 2023
c3dd4aa
Contrain yapf verison lesser than 0.40.0 (#2328)
eunwoosh Jul 10, 2023
2c56a82
Merge branch 'develop' into eugene/tiling-doc
eugene123tw Jul 10, 2023
a2e744b
update tiling doc
eugene123tw Jul 10, 2023
508c1c3
Merge branch 'develop' into eugene/tiling-doc
eugene123tw Jul 10, 2023
9fd8a05
add extra info
eugene123tw Jul 10, 2023
19477a3
Fix detection e2e tests (#2327)
jaegukhyun Jul 10, 2023
6b51ba0
Mergeback: Label addtion/deletion 1.2.4 --> 1.4.0 (#2326)
sungmanc Jul 11, 2023
b47ebf3
Bump datumaro up to 1.4.0rc2 (#2332)
yunchu Jul 11, 2023
63b8c50
Merge branch 'releases/1.4.0' into eugene/tiling-doc
eugene123tw Jul 11, 2023
976e2fb
Merge branch 'releases/1.4.0' into eugene/tiling-doc
eugene123tw Jul 11, 2023
@@ -12,3 +12,4 @@ Additional Features
 xai
 noisy_label_detection
 fast_data_loading
+tiling
190 changes: 190 additions & 0 deletions docs/source/guide/explanation/additional_features/tiling.rst
@@ -0,0 +1,190 @@
Improve Small Object Detection with Image Tiling
*************************************************

The OpenVINO Training Extensions introduces the concept of image tiling to enhance the accuracy of detection algorithms and instance segmentation algorithms, particularly for small and densely packed objects in high-resolution images.

Image tiling involves dividing the original full-resolution image into multiple smaller tiles or patches. This division allows objects within the tiles to appear larger in relation to the tile size, effectively addressing the challenge of objects becoming nearly invisible in deeper layers of feature maps due to downsampling operations. Image tiling proves especially beneficial for datasets where objects can be as small as 20 by 20 pixels in a 4K image.

However, image tiling comes with a trade-off. Dividing a single image into several tiles increases the number of samples for training, evaluation, and testing, which slows execution because more images must be processed. To balance tile size against computational cost, the OpenVINO Training Extensions provides tile dataset sampling and adaptive tiling parameter optimization. These features tune the tile size and other tiling-related parameters so that execution stays efficient without compromising accuracy.

With image tiling, detection and instance segmentation algorithms in the OpenVINO Training Extensions can detect and localize small and crowded objects in large-resolution images more accurately.

Tiling Strategies
=================
Below is an example of tiling applied to one of the images from the `DOTA <https://captain-whu.github.io/DOTA/dataset.html>`_ dataset.

.. image:: ../../../../utils/images/dota_tiling_example.jpg
:width: 800
:alt: this image uploaded from this `source <https://captain-whu.github.io/DOTA/dataset.html>`_


In this example, the full image is cropped into 9 tiles. During training, only the tiles with annotations (bounding boxes or masks) are used for training.
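
The cropping and filtering described above can be sketched in a few lines of Python. This is only an illustration, not the ``ImageTilingDataset`` implementation: the helper names (``axis_starts``, ``tile_boxes``, ``annotated_tiles``) are hypothetical, and it assumes that a tile counts as annotated when any ground-truth box intersects it.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in full-image pixels


def axis_starts(length: int, tile_size: int, stride: int) -> List[int]:
    """Start offsets along one axis so that the tiles cover [0, length)."""
    if length <= tile_size:
        return [0]
    starts = list(range(0, length - tile_size, stride))
    starts.append(length - tile_size)  # last tile sits flush with the border
    return starts


def tile_boxes(width: int, height: int, tile_size: int, overlap: float = 0.0) -> List[Box]:
    """Split a width x height image into square tiles with fractional overlap."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0.0, 1.0)")
    stride = max(1, int(tile_size * (1.0 - overlap)))
    return [
        (x, y, min(x + tile_size, width), min(y + tile_size, height))
        for y in axis_starts(height, tile_size, stride)
        for x in axis_starts(width, tile_size, stride)
    ]


def intersects(tile: Box, box: Box) -> bool:
    """True when the two rectangles share a non-empty area."""
    return box[0] < tile[2] and box[2] > tile[0] and box[1] < tile[3] and box[3] > tile[1]


def annotated_tiles(tiles: Sequence[Box], annotations: Sequence[Box]) -> List[Box]:
    """Keep only the tiles that contain at least part of an annotation."""
    return [t for t in tiles if any(intersects(t, a) for a in annotations)]


# A 1500x1500 image with 500-pixel tiles and no overlap yields a 3x3 grid.
tiles = tile_boxes(1500, 1500, tile_size=500)
# Hypothetical ground-truth boxes: only two of the nine tiles contain objects.
train_tiles = annotated_tiles(tiles, [(10, 10, 30, 30), (1200, 1300, 1230, 1320)])
```

Here ``tiles`` holds nine rectangles and ``train_tiles`` holds the two that would be used for training under the stated assumption.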

During evaluation in training, only the tiles with annotations are used for evaluation, and evaluation is performed at the tile level.

During testing, each tile is processed and predicted separately. The tiles are then stitched back together to form the full image, and the tile predictions are merged to form the full image prediction.
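
A rough sketch of the merge step follows. The source only says that tile predictions are merged, so the greedy non-maximum suppression used here is an assumption, as are the function names and the 0.5 IoU threshold. The sketch also ignores class labels and masks for brevity.

```python
from typing import Dict, List, Tuple

Detection = Tuple[float, float, float, float, float]  # (x1, y1, x2, y2, score)


def to_full_image(origin: Tuple[int, int], detections: List[Detection]) -> List[Detection]:
    """Shift tile-local boxes by the tile's top-left corner."""
    ox, oy = origin
    return [(x1 + ox, y1 + oy, x2 + ox, y2 + oy, s) for x1, y1, x2, y2, s in detections]


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def merge_detections(detections: List[Detection], iou_threshold: float = 0.5) -> List[Detection]:
    """Greedy NMS: keep the highest-scoring box among overlapping duplicates."""
    kept: List[Detection] = []
    for det in sorted(detections, key=lambda d: d[4], reverse=True):
        if all(iou(det, k) < iou_threshold for k in kept):
            kept.append(det)
    return kept


# Two overlapping tiles see the same object; the second tile also sees another one.
tile_predictions: Dict[Tuple[int, int], List[Detection]] = {
    (0, 0): [(450.0, 100.0, 490.0, 140.0, 0.9)],
    (400, 0): [(50.0, 100.0, 90.0, 140.0, 0.8), (300.0, 300.0, 340.0, 340.0, 0.7)],
}
full = [d for origin, dets in tile_predictions.items() for d in to_full_image(origin, dets)]
merged = merge_detections(full)
```

After shifting, the first two detections land on the same full-image box, so only the higher-scoring one survives alongside the unrelated third detection.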

The tiling strategy is implemented in the OpenVINO Training Extensions through the following steps:

.. code-block::

* Training: Create an ImageTilingDataset with annotated tiles -> Train with annotated tile images -> Evaluate on annotated tiles
* Testing: Create an ImageTilingDataset including all tiles -> Test with all tile images -> Stitching -> Merge tile-level predictions -> Full Image Prediction

.. note::

When running ``otx eval`` on models trained with tiling enabled, evaluation is performed on all tiles, and this process includes merging all the tile-level predictions.
The following output is shown during evaluation:

.. code-block::

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 650/650, 17.2 task/s, elapsed: 38s, ETA: 0s
==== merge: 7.326097726821899 sec ====


Enable Tiling via OTX Training CLI
==================================

Currently, tiling is supported for both detection and instance segmentation models. Please refer to :doc:`../algorithms/object_detection/object_detection` and :doc:`../algorithms/segmentation/instance_segmentation` for more details.

To enable tiling in OTX training, set the ``tiling_parameters.enable_tiling`` parameter to 1. Here's an example of enabling tiling for the SSD model template:

.. code-block::

otx train Custom_Object_Detection_Gen3_SSD --train-data-roots tests/assets/small_objects --val-data-roots tests/assets/small_objects params --tiling_parameters.enable_tiling 1

.. note::

To learn how to deploy the trained model and run the exported demo, refer to :doc:`../../tutorials/base/deploy`.

To learn how to run the demo in CLI and visualize results, refer to :doc:`../../tutorials/base/demo`.

Enable Tiling via OTX Build
===========================
Here's another way of enabling tiling for the SSD model template using the workspace:

.. code-block::

otx build Custom_Object_Detection_Gen3_SSD --train-data-roots tests/assets/small_objects --val-data-roots tests/assets/small_objects

The above command will create a workspace folder with the necessary files for training under ``otx-workspace-DETECTION``.

You can then train the model with tiling enabled using the following command without specifying any data-related paths:

.. code-block::

cd otx-workspace-DETECTION
otx train params --tiling_parameters.enable_tiling 1

Alternatively, you can edit ``tiling_parameters`` in the ``configuration.yaml`` file in the workspace folder:

.. code-block::

hyper_parameters:
  parameter_overrides:
    tiling_parameters:
      enable_tiling:
        default_value: true

And then train the model with tiling enabled using the following command:

.. code-block::

otx train


Tile Size and Tile Overlap Optimization
-----------------------------------------
By default, the OpenVINO Training Extensions automatically optimizes the tile size and tile overlap through adaptive tiling parameter optimization, which balances tile size against computational cost so that execution stays efficient without compromising accuracy.

Adaptive tiling parameter optimization works by finding the average object size in the training dataset and using it to determine the tile size. Currently, the ratio of average object area to tile area is set to 3%. For example, if the average object size is 100x100 pixels, the tile size will be around 577x577 pixels.

The tile edge length is computed by dividing the average object area by the desired object size ratio (default: 3%) and then taking the square root. This ensures that the objects are large enough to be detected by the model. The ratio can also be configured with the ``tiling_parameters.object_tile_ratio`` parameter.
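
As a worked version of the arithmetic above, here is a minimal sketch. The function name ``adaptive_tile_size`` is hypothetical, and truncating the result to an integer is an assumption; only the divide-then-square-root rule and the 3% default come from the text.

```python
import math


def adaptive_tile_size(avg_object_area: float, object_tile_ratio: float = 0.03) -> int:
    """Tile edge length such that an average object covers `object_tile_ratio` of a tile."""
    if not 0.0 < object_tile_ratio <= 1.0:
        raise ValueError("object_tile_ratio must be in (0.0, 1.0]")
    # tile_area = avg_object_area / ratio, and the tile is square.
    return int(math.sqrt(avg_object_area / object_tile_ratio))


# Average object of 100x100 pixels with the default 3% ratio.
tile_size = adaptive_tile_size(100 * 100)
# Raising the ratio to 5% makes each object fill more of a (smaller) tile.
smaller_tile_size = adaptive_tile_size(100 * 100, object_tile_ratio=0.05)
```

With the default ratio, ``tile_size`` comes out at 577, matching the example in the text. A larger ratio produces smaller tiles and therefore more tiles per image.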

Here's an example of setting the object size ratio to 5%:

.. code-block::

# enable_tiling 1: enable tiling
# enable_adaptive_params 1: enable automatic tiling parameter optimization
# object_tile_ratio 0.05: set the object size ratio to 5%
otx train Custom_Object_Detection_Gen3_SSD \
    --train-data-roots tests/assets/small_objects \
    --val-data-roots tests/assets/small_objects \
    params \
    --tiling_parameters.enable_tiling 1 \
    --tiling_parameters.enable_adaptive_params 1 \
    --tiling_parameters.object_tile_ratio 0.05

After determining the tile size, the tile overlap is computed by dividing the largest object size in the training dataset by the adaptive tile size.
This calculation ensures that the largest object on the border of a tile is not split into two tiles and is covered by adjacent tiles.
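
The overlap rule can be illustrated the same way. This is a sketch under assumptions: the function name is hypothetical, and the source does not say how a ratio of 1.0 or more (an object larger than a tile) is handled, so the sketch simply returns the raw ratio.

```python
def adaptive_tile_overlap(max_object_size: float, tile_size: int) -> float:
    """Overlap fraction: largest object edge length divided by the tile edge length."""
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    return max_object_size / tile_size


# Hypothetical dataset whose largest object is 150 pixels wide, with 600-pixel tiles.
overlap = adaptive_tile_overlap(max_object_size=150, tile_size=600)
# Adjacent tiles then share this many pixels along each axis.
shared_pixels = overlap * 600
```

Because adjacent tiles share a strip as wide as the largest object, an object cut by the border of one tile lies fully inside the neighbouring tile.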

You can also manually configure the tile overlap using the ``tiling_parameters.tile_overlap`` parameter. For more details, please refer to the section on `Manual Tiling Parameter Configuration`_ .


Tiling Sampling Strategy
------------------------
To accelerate the training process, the OpenVINO Training Extensions introduces a tile sampling strategy. This strategy involves randomly sampling a percentage of tile images from the dataset to be used for training.

Since training and validation on all tiles from a high-resolution image dataset can be time-consuming, sampling the tile dataset can significantly reduce the training and validation time.

It's important to note that sampling is applied to the training and validation datasets, not the test dataset.

The sampling ratio can be configured with the ``tiling_parameters.tile_sampling_ratio`` parameter. Here's an example of setting the tile sampling ratio to 50%:

.. code-block::

# enable_tiling 1: enable tiling
# enable_adaptive_params 1: enable automatic tiling parameter optimization
# tile_sampling_ratio 0.5: set the tile sampling ratio to 50%
otx train Custom_Object_Detection_Gen3_SSD \
    --train-data-roots tests/assets/small_objects \
    --val-data-roots tests/assets/small_objects \
    params \
    --tiling_parameters.enable_tiling 1 \
    --tiling_parameters.enable_adaptive_params 1 \
    --tiling_parameters.tile_sampling_ratio 0.5
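
The sampling step described in this section can be pictured with the sketch below. It is not the OTX implementation: ``sample_tiles``, the fixed seed, and the choice to keep at least one tile are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sample_tiles(tiles: Sequence[T], sampling_ratio: float, seed: int = 0) -> List[T]:
    """Randomly keep a fraction of the tiles, without replacement."""
    if not 0.0 < sampling_ratio <= 1.0:
        raise ValueError("sampling_ratio must be in (0.0, 1.0]")
    if not tiles:
        return []
    count = max(1, int(len(tiles) * sampling_ratio))
    # A private Random instance keeps the result reproducible for a given seed.
    return random.Random(seed).sample(list(tiles), count)


# Hypothetical training split made of 100 tiles; keep half of them.
train_tiles = [f"tile_{i:03d}" for i in range(100)]
subset = sample_tiles(train_tiles, sampling_ratio=0.5)
```

The test split is left untouched in this picture, consistent with the note that sampling applies only to the training and validation datasets.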


Manual Tiling Parameter Configuration
-------------------------------------

Users can disable adaptive tiling and customize the tiling process by setting the following parameters:

.. code-block::

# enable_tiling 1: enable tiling
# enable_adaptive_params 0: disable automatic tiling parameter optimization
# tile_size 512: tile size configured to 512x512
# tile_overlap 0.1: 10% overlap between tiles
otx train Custom_Object_Detection_Gen3_SSD \
    --train-data-roots tests/assets/small_objects \
    --val-data-roots tests/assets/small_objects \
    params \
    --tiling_parameters.enable_tiling 1 \
    --tiling_parameters.enable_adaptive_params 0 \
    --tiling_parameters.tile_size 512 \
    --tiling_parameters.tile_overlap 0.1

By specifying these parameters, automatic tiling parameter optimization is disabled, and the tile size is configured to 512x512 pixels with a 10% overlap between tiles.

The following parameters can be configured to customize the tiling process:

- ``tiling_parameters.enable_tiling``: Enable or disable tiling (0 or 1)
- ``tiling_parameters.enable_adaptive_params``: Enable or disable adaptive tiling parameter optimization (0 or 1)
- ``tiling_parameters.object_tile_ratio``: Ratio of average object size to tile size (float between 0.0 and 1.0)
- ``tiling_parameters.tile_size``: Tile edge length in pixels (integer between 100 and 4096)
- ``tiling_parameters.tile_overlap``: The overlap between adjacent tiles as a fraction of the tile size (float between 0.0 and 1.0)
- ``tiling_parameters.tile_sampling_ratio``: The fraction of tiles to sample from the dataset (float between 0.0 and 1.0)
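
To make the ranges in the list above concrete, the sketch below validates a set of values and renders them as the ``params`` arguments used in the commands on this page. ``TilingParams`` is a hypothetical helper, not an OTX class. Apart from the 3% object ratio, the defaults are illustrative values borrowed from the examples, not documented defaults.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TilingParams:
    """Hypothetical container; the ranges follow the parameter list above."""

    enable_tiling: bool = True
    enable_adaptive_params: bool = True
    object_tile_ratio: float = 0.03
    tile_size: int = 512  # illustrative value, taken from the manual example
    tile_overlap: float = 0.1  # illustrative value, taken from the manual example
    tile_sampling_ratio: float = 1.0  # assumption: use all tiles by default

    def __post_init__(self) -> None:
        for name in ("object_tile_ratio", "tile_overlap", "tile_sampling_ratio"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
        if not 100 <= self.tile_size <= 4096:
            raise ValueError(f"tile_size must be between 100 and 4096, got {self.tile_size}")

    def to_cli_args(self) -> List[str]:
        """Render the values as `params --tiling_parameters.<name> <value>` arguments."""
        args = ["params"]
        for name, value in vars(self).items():
            if isinstance(value, bool):
                value = int(value)  # the CLI examples use 0 and 1 for flags
            args += [f"--tiling_parameters.{name}", str(value)]
        return args


manual = TilingParams(enable_adaptive_params=False, tile_size=512, tile_overlap=0.1)
cli_args = manual.to_cli_args()
```

Joining ``cli_args`` with spaces gives the tail of an ``otx train`` command like the manual configuration example. An out-of-range value such as ``tile_size=50`` raises ``ValueError`` before any command is built.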


Run Tiling on OpenVINO Exported Model
======================================

After training a model with tiling enabled, you can export the model to OpenVINO IR format using the following command:

.. code-block::

otx export Custom_Object_Detection_Gen3_SSD --load-weights <path_to_trained_model>/weights.pth --output <path_to_exported_model>


After exporting the model, you can run inference on the exported model using the following command:

.. code-block::

otx eval Custom_Object_Detection_Gen3_SSD --test-data-roots tests/assets/small_objects --load-weights <path_to_exported_model>/openvino.xml

.. warning::
When tiling is enabled, there is a trade-off between speed and accuracy because tiling increases the number of images to be processed.
As a result, longer training and inference times are expected. If you encounter GPU out-of-memory errors,
you can mitigate the issue by reducing the batch size through the command-line interface (CLI) or
by adjusting the batch size value in the ``template.yaml`` file located in the workspace.
Binary file added docs/utils/images/dota_tiling_example.jpg
1 change: 1 addition & 0 deletions requirements/action.txt
@@ -4,3 +4,4 @@ mmcv-full==1.7.0
 mmaction2==0.24.1
 mmdet==2.28.1
 mmdeploy==0.14.0
+yapf<0.40.0 # it should be removed after https://github.com/google/yapf/issues/1118 is solved
2 changes: 1 addition & 1 deletion requirements/base.txt
@@ -4,7 +4,7 @@ natsort>=6.0.0
 prettytable
 protobuf>=3.20.0
 pyyaml
-datumaro@ git+https://github.com/openvinotoolkit/datumaro@3e77b3138d063db68a4efba3c03a6bac7df086b1#egg=datumaro
+datumaro==1.4.0rc2
 psutil
 scipy>=1.8
 bayesian-optimization>=1.2.0
1 change: 1 addition & 0 deletions requirements/classification.txt
@@ -5,3 +5,4 @@ mmcls==0.25.0
 timm==0.6.12
 mmdeploy==0.14.0
 pytorchcv
+yapf<0.40.0 # it should be removed after https://github.com/google/yapf/issues/1118 is solved
1 change: 1 addition & 0 deletions requirements/detection.txt
@@ -8,3 +8,4 @@ timm==0.6.12
 mmdeploy==0.14.0
 mmengine==0.7.4
 scikit-image
+yapf<0.40.0 # it should be removed after https://github.com/google/yapf/issues/1118 is solved
1 change: 1 addition & 0 deletions requirements/segmentation.txt
@@ -7,3 +7,4 @@ mmdeploy==0.14.0
 timm==0.6.12
 pytorchcv
 einops==0.6.1
+yapf<0.40.0 # it should be removed after https://github.com/google/yapf/issues/1118 is solved
2 changes: 1 addition & 1 deletion src/otx/__init__.py
@@ -3,5 +3,5 @@
 # Copyright (C) 2021-2023 Intel Corporation
 # SPDX-License-Identifier: Apache-2.0

-__version__ = "1.4.0rc0"
+__version__ = "1.4.0rc1"
 # NOTE: Sync w/ src/otx/api/usecases/exportable_code/demo/requirements.txt on release
@@ -416,7 +416,10 @@ def evaluate(
 )

 eval_results["MHAcc"] = total_acc
-eval_results["avgClsAcc"] = total_acc_sl / self.hierarchical_info["num_multiclass_heads"]
+if self.hierarchical_info["num_multiclass_heads"] > 0:
+    eval_results["avgClsAcc"] = total_acc_sl / self.hierarchical_info["num_multiclass_heads"]
+else:
+    eval_results["avgClsAcc"] = total_acc_sl
 eval_results["mAP"] = mAP_value
 eval_results["accuracy"] = total_acc

@@ -16,6 +16,14 @@
 logger = get_logger()


+def is_hierarchical_chkpt(chkpt: dict):
+    """Detect whether previous checkpoint is hierarchical or not."""
+    for k, v in chkpt.items():
+        if "fc" in k:
+            return True
+    return False
+
+
 @CLASSIFIERS.register_module()
 class SAMImageClassifier(SAMClassifierMixin, ClsLossDynamicsTrackingMixin, ImageClassifier):
     """SAM-enabled ImageClassifier."""
@@ -193,11 +201,19 @@ def load_state_dict_pre_hook(module, state_dict, prefix, *args, **kwargs): # no
 def load_state_dict_mixing_hook(
     model, model_classes, chkpt_classes, chkpt_dict, prefix, *args, **kwargs
 ):  # pylint: disable=unused-argument, too-many-branches, too-many-locals
-    """Modify input state_dict according to class name matching before weight loading."""
+    """Modify input state_dict according to class name matching before weight loading.
+
+    If previous training is hierarchical training,
+    then the current training should be hierarchical training. vice versa.
+
+    """
     backbone_type = type(model.backbone).__name__
     if backbone_type not in ["OTXMobileNetV3", "OTXEfficientNet", "OTXEfficientNetV2"]:
         return

+    if model.hierarchical != is_hierarchical_chkpt(chkpt_dict):
+        return
+
     # Dst to src mapping index
     model_classes = list(model_classes)
     chkpt_classes = list(chkpt_classes)
@@ -249,13 +265,15 @@ def load_state_dict_mixing_hook(
     continue

 # Mix weights
-chkpt_param = chkpt_dict[chkpt_name]
-for module, c in enumerate(model2chkpt):
-    if c >= 0:
-        model_param[module].copy_(chkpt_param[c])
+# NOTE: Label mix is not supported for H-label classification.
+if not model.hierarchical:
+    chkpt_param = chkpt_dict[chkpt_name]
+    for module, c in enumerate(model2chkpt):
+        if c >= 0:
+            model_param[module].copy_(chkpt_param[c])

-# Replace checkpoint weight by mixed weights
-chkpt_dict[chkpt_name] = model_param
+    # Replace checkpoint weight by mixed weights
+    chkpt_dict[chkpt_name] = model_param

def extract_feat(self, img):
"""Directly extract features from the backbone + neck.
22 changes: 15 additions & 7 deletions src/otx/algorithms/classification/task.py
@@ -47,6 +47,8 @@
 from otx.api.entities.inference_parameters import (
     default_progress_callback as default_infer_progress_callback,
 )
+from otx.api.entities.label import LabelEntity
+from otx.api.entities.label_schema import LabelGroup
 from otx.api.entities.metadata import FloatMetadata, FloatType
 from otx.api.entities.metrics import (
     CurveMetric,
@@ -125,16 +127,22 @@ def __init__(self, task_environment: TaskEnvironment, output_path: Optional[str]
     if self._task_environment.model is not None:
         self._load_model()

+def _is_multi_label(self, label_groups: List[LabelGroup], all_labels: List[LabelEntity]):
+    """Check whether the current training mode is multi-label or not."""
+    # NOTE: In the current Geti, multi-label should have `___` symbol for all group names.
+    find_multilabel_symbol = ["___" in getattr(i, "name", "") for i in label_groups]
+    return (
+        (len(label_groups) > 1) and (len(label_groups) == len(all_labels)) and (False not in find_multilabel_symbol)
+    )
+
 def _set_train_mode(self):
-    self._multilabel = len(self._task_environment.label_schema.get_groups(False)) > 1 and len(
-        self._task_environment.label_schema.get_groups(False)
-    ) == len(
-        self._task_environment.get_labels(include_empty=False)
-    )  # noqa:E127
+    label_groups = self._task_environment.label_schema.get_groups(include_empty=False)
+    all_labels = self._task_environment.label_schema.get_labels(include_empty=False)
+
+    self._multilabel = self._is_multi_label(label_groups, all_labels)
     if self._multilabel:
         logger.info("Classification mode: multilabel")
-
-    if not self._multilabel and len(self._task_environment.label_schema.get_groups(False)) > 1:
+    elif len(label_groups) > 1:
         logger.info("Classification mode: hierarchical")
         self._hierarchical = True
         self._hierarchical_info = get_hierarchical_info(self._task_environment.label_schema)
2 changes: 2 additions & 0 deletions src/otx/algorithms/segmentation/adapters/openvino/task.py
@@ -240,6 +240,8 @@ def add_prediction(
 current_label_soft_prediction = soft_prediction[:, :, label_index]
 if process_soft_prediction:
     current_label_soft_prediction = get_activation_map(current_label_soft_prediction)
+else:
+    current_label_soft_prediction = (current_label_soft_prediction * 255).astype(np.uint8)
 result_media = ResultMediaEntity(
     name=label.name,
     type="soft_prediction",
2 changes: 2 additions & 0 deletions src/otx/algorithms/segmentation/task.py
@@ -290,6 +290,8 @@ def _add_predictions_to_dataset(self, prediction_results, dataset, dump_soft_pre
 current_label_soft_prediction = soft_prediction[:, :, label_index]
 if process_soft_prediction:
     current_label_soft_prediction = get_activation_map(current_label_soft_prediction)
+else:
+    current_label_soft_prediction = (current_label_soft_prediction * 255).astype(np.uint8)
 result_media = ResultMediaEntity(
     name=label.name,
     type="soft_prediction",
@@ -1,4 +1,4 @@
 openvino==2023.0
 openvino-model-api==0.1.2
-otx @ git+https://github.com/openvinotoolkit/training_extensions/@77b635f4fba0a8acca221ec7e8b1fadd734358da#egg=otx
+otx==1.4.0rc1
 numpy>=1.21.0,<=1.23.5 # np.bool was removed in 1.24.0 which was used in openvino runtime