Update semantic segmentation transforms to use OTX's instead of torchvision #3724

Merged 35 commits on Jul 15, 2024
Changes from 34 commits
Commits
bda46b6
Refine data parts in recipes
sungchul2 Jul 8, 2024
19f5f51
Remove `torchvision_base.yaml` dependency
sungchul2 Jul 8, 2024
2c3062c
Update
sungchul2 Jul 8, 2024
7288349
Refine iseg recipes
sungchul2 Jul 8, 2024
a510f18
Fix unit test
sungchul2 Jul 8, 2024
e43cf37
Update iseg ov model
sungchul2 Jul 8, 2024
d4f48a6
Refine rotated det recipes
sungchul2 Jul 8, 2024
b095694
Refine sseg recipes
sungchul2 Jul 8, 2024
403d35a
Refine vpm recipes
sungchul2 Jul 8, 2024
7f66645
Fix indent
sungchul2 Jul 8, 2024
d90b5d4
Refine zsl recipes
sungchul2 Jul 8, 2024
502163b
Add model recipe and update det recipes
sungchul2 Jul 8, 2024
22d648b
Add base engine recipe and update recipes for det
sungchul2 Jul 8, 2024
2c03cb0
Fix unit test and make recipes more readable
sungchul2 Jul 9, 2024
6cc2a64
Update other tasks' recipes following det
sungchul2 Jul 9, 2024
2ea2389
Remove model components that are redundantly written
sungchul2 Jul 9, 2024
207a0cc
pre-commit
sungchul2 Jul 9, 2024
6157217
Revert `to_tv_image` to True
sungchul2 Jul 10, 2024
2ea6ee8
Revert maskrcnn_r50_tv due to overflow issue
sungchul2 Jul 10, 2024
de8fc91
Merge branch 'develop' into CVS-146127-refine-recipes
sungchul2 Jul 10, 2024
9428ce8
pre-commit
sungchul2 Jul 10, 2024
71b3cab
Revert "Revert maskrcnn_r50_tv due to overflow issue"
sungchul2 Jul 10, 2024
e5104b0
Revert engine
sungchul2 Jul 10, 2024
6135260
Revert model and update dino_v2_seg
sungchul2 Jul 10, 2024
7849186
pre-commit
sungchul2 Jul 10, 2024
0eab3c4
Update transforms for sseg and edit entity name of mask to `masks` fo…
sungchul2 Jul 11, 2024
c17cfb5
Update recipe to use OTX's
sungchul2 Jul 11, 2024
381cf02
Merge branch 'develop' into update-sseg-tv-to-otx
sungchul2 Jul 11, 2024
b8e4bf3
Update `dino_v2` data config
sungchul2 Jul 11, 2024
cdff341
Update
sungchul2 Jul 11, 2024
4711dbb
Fix unit test
sungchul2 Jul 11, 2024
61e8550
Update `RandomResizedCrop` for masks
sungchul2 Jul 12, 2024
2e32819
Update recipes to use `RandomResizedCrop` instead of `RandomResize` a…
sungchul2 Jul 12, 2024
b8307a6
Remove unused argument in recipes
sungchul2 Jul 12, 2024
a7fdfe0
Update type annotation
sungchul2 Jul 12, 2024
7 changes: 3 additions & 4 deletions src/otx/core/data/dataset/segmentation.py
@@ -213,11 +213,10 @@ def _get_item_impl(self, index: int) -> SegDataEntity | None:
                 image_color_channel=self.image_color_channel,
                 ignored_labels=ignored_labels,
             ),
-            gt_seg_map=tv_tensors.Mask(
-                mask,
-            ),
+            masks=tv_tensors.Mask(mask[None]),
         )
-        return self._apply_transforms(entity)
+        transformed_entity = self._apply_transforms(entity)
+        return transformed_entity.wrap(masks=transformed_entity.masks[0]) if transformed_entity else None

     @property
     def collate_fn(self) -> Callable:
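The hunk above wraps the segmentation map as `mask[None]` before the transforms run and takes `masks[0]` afterwards. A minimal sketch of that round trip, using NumPy arrays as a stand-in for `tv_tensors.Mask`. The `flip_masks` helper is hypothetical and only illustrates a transform that expects a stack of masks shaped (N, H, W), which is presumably why the leading axis is added:

```python
import numpy as np


def flip_masks(masks: np.ndarray) -> np.ndarray:
    """Hypothetical transform: horizontally flip a stack of masks shaped (N, H, W)."""
    if masks.ndim != 3:
        raise ValueError(f"expected (N, H, W), got shape {masks.shape}")
    return masks[:, :, ::-1]


# A 2x3 semantic segmentation map holding class ids.
seg_map = np.array([[0, 1, 2],
                    [3, 4, 5]], dtype=np.uint8)

stacked = seg_map[None]        # (H, W) -> (1, H, W), as in `mask[None]`
flipped = flip_masks(stacked)  # the transform sees a stack with one mask
restored = flipped[0]          # (1, H, W) -> (H, W), as in `masks[0]`
```

The dataset still returns a single (H, W) map per sample; the extra axis exists only while the transforms run.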
6 changes: 3 additions & 3 deletions src/otx/core/data/entity/segmentation.py
@@ -24,15 +24,15 @@
 class SegDataEntity(OTXDataEntity):
     """Data entity for segmentation task.

-    :param gt_seg_map: mask annotations
+    :param masks: mask annotations
     """

     @property
     def task(self) -> OTXTaskType:
         """OTX Task type definition."""
         return OTXTaskType.SEMANTIC_SEGMENTATION

-    gt_seg_map: tv_tensors.Mask
+    masks: tv_tensors.Mask


 @dataclass
@@ -66,7 +66,7 @@ def collate_fn(
             batch_size=batch_data.batch_size,
             images=batch_data.images,
             imgs_info=batch_data.imgs_info,
-            masks=[entity.gt_seg_map for entity in entities],
+            masks=[entity.masks for entity in entities],
         )

     def pin_memory(self) -> SegBatchDataEntity:
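After the rename, `collate_fn` reads `entity.masks` from each sample and passes the results on as a plain Python list. A rough sketch of that gathering step, with hypothetical `Sample` and `Batch` dataclasses standing in for `SegDataEntity` and `SegBatchDataEntity` (NumPy arrays replace `tv_tensors.Mask`):

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Sample:
    """Hypothetical stand-in for SegDataEntity."""

    image: np.ndarray
    masks: np.ndarray  # (H, W) class-id map


@dataclass
class Batch:
    """Hypothetical stand-in for SegBatchDataEntity."""

    batch_size: int
    images: list
    masks: list


def collate(entities: list) -> Batch:
    # Masks are collected per entity into a list, mirroring
    # `masks=[entity.masks for entity in entities]` in the diff.
    return Batch(
        batch_size=len(entities),
        images=[e.image for e in entities],
        masks=[e.masks for e in entities],
    )


samples = [
    Sample(image=np.zeros((3, 4, 4)), masks=np.zeros((4, 4), dtype=np.uint8)),
    Sample(image=np.zeros((3, 4, 4)), masks=np.ones((4, 4), dtype=np.uint8)),
]
batch = collate(samples)
```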
4 changes: 2 additions & 2 deletions src/otx/core/data/transform_libs/mmseg.py
@@ -37,7 +37,7 @@ def transform(self, results: dict) -> dict:
             msg = "__otx__ key should be passed from the previous pipeline (LoadImageFromFile)"
             raise RuntimeError(msg)
         if isinstance(otx_data_entity, SegDataEntity):
-            gt_masks = otx_data_entity.gt_seg_map.numpy()
+            gt_masks = otx_data_entity.masks.numpy()
             results["gt_seg_map"] = gt_masks
             # we need this to properly handle seg maps during transforms
             results["seg_fields"] = ["gt_seg_map"]
@@ -69,7 +69,7 @@ def transform(self, results: dict) -> SegDataEntity:
         return SegDataEntity(
             image=image,
             img_info=image_info,
-            gt_seg_map=masks,
+            masks=masks,
         )
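Only the OTX entity field is renamed in this file; the key handed to the mmseg pipeline remains `gt_seg_map`. A small sketch of that two-way mapping, assuming plain dicts and NumPy arrays. `to_mmseg_results` and `from_mmseg_results` are illustrative names, not functions from the PR:

```python
import numpy as np


def to_mmseg_results(masks: np.ndarray) -> dict:
    """Illustrative: expose an OTX-style `masks` array under the mmseg key."""
    return {
        "gt_seg_map": masks,
        # Lists which keys hold segmentation maps, as in the diff.
        "seg_fields": ["gt_seg_map"],
    }


def from_mmseg_results(results: dict) -> dict:
    """Illustrative: map the mmseg key back to the OTX-style field name."""
    return {"masks": results["gt_seg_map"]}


masks = np.array([[0, 1], [1, 0]], dtype=np.uint8)
results = to_mmseg_results(masks)
entity_fields = from_mmseg_results(results)
```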
165 changes: 99 additions & 66 deletions src/otx/core/data/transform_libs/torchvision.py

Large diffs are not rendered by default.

2 changes: 2 additions & 0 deletions src/otx/recipe/_base_/data/instance_segmentation.yaml
@@ -16,6 +16,7 @@ train_subset:
   num_workers: 2
   sampler:
     class_path: torch.utils.data.RandomSampler
+
 val_subset:
   subset_name: val
   transform_lib_type: TORCHVISION
@@ -26,6 +27,7 @@ val_subset:
   num_workers: 2
   sampler:
     class_path: torch.utils.data.RandomSampler
+
 test_subset:
   subset_name: test
   transform_lib_type: TORCHVISION
2 changes: 2 additions & 0 deletions src/otx/recipe/_base_/data/rotated_detection.yaml
@@ -16,6 +16,7 @@ train_subset:
   num_workers: 2
   sampler:
     class_path: torch.utils.data.RandomSampler
+
 val_subset:
   subset_name: val
   transform_lib_type: TORCHVISION
@@ -26,6 +27,7 @@ val_subset:
   num_workers: 2
   sampler:
     class_path: torch.utils.data.RandomSampler
+
 test_subset:
   subset_name: test
   transform_lib_type: TORCHVISION
27 changes: 17 additions & 10 deletions src/otx/recipe/_base_/data/semantic_segmentation.yaml
@@ -13,22 +13,23 @@ train_subset:
   transform_lib_type: TORCHVISION
   to_tv_image: true
   transforms:
-    - class_path: torchvision.transforms.v2.RandomResizedCrop
+    - class_path: otx.core.data.transform_libs.torchvision.RandomResizedCrop
       init_args:
-        size:
+        scale:
           - 512
           - 512
-        scale:
+        crop_ratio_range:
           - 0.2
           - 1.0
-        ratio:
+        aspect_ratio_range:
           - 0.5
           - 2.0
-        antialias: true
+        transform_mask: true
     - class_path: otx.core.data.transform_libs.torchvision.PhotoMetricDistortion
+    - class_path: otx.core.data.transform_libs.torchvision.RandomFlip
       init_args:
+        prob: 0.5
         is_numpy_to_tvtensor: true
-    - class_path: torchvision.transforms.v2.RandomHorizontalFlip
     - class_path: torchvision.transforms.v2.ToDtype
       init_args:
         dtype: ${as_torch_dtype:torch.float32}
@@ -38,18 +39,21 @@ train_subset:
         std: [58.395, 57.12, 57.375]
   sampler:
     class_path: torch.utils.data.RandomSampler
+
 val_subset:
   subset_name: val
   batch_size: 8
   num_workers: 4
   transform_lib_type: TORCHVISION
   to_tv_image: true
   transforms:
-    - class_path: torchvision.transforms.v2.Resize
+    - class_path: otx.core.data.transform_libs.torchvision.Resize
       init_args:
-        size:
+        scale:
           - 512
           - 512
+        transform_mask: true
+        is_numpy_to_tvtensor: true
     - class_path: torchvision.transforms.v2.ToDtype
       init_args:
         dtype: ${as_torch_dtype:torch.float32}
@@ -59,18 +63,21 @@ val_subset:
         std: [58.395, 57.12, 57.375]
   sampler:
     class_path: torch.utils.data.RandomSampler
+
 test_subset:
   subset_name: test
   num_workers: 4
   batch_size: 8
   transform_lib_type: TORCHVISION
   to_tv_image: true
   transforms:
-    - class_path: torchvision.transforms.v2.Resize
+    - class_path: otx.core.data.transform_libs.torchvision.Resize
       init_args:
-        size:
+        scale:
           - 512
           - 512
+        transform_mask: true
+        is_numpy_to_tvtensor: true
     - class_path: torchvision.transforms.v2.ToDtype
       init_args:
         dtype: ${as_torch_dtype:torch.float32}
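Reading the recipe hunks above, the torchvision `RandomResizedCrop` arguments appear to map onto the OTX transform as `size` → `scale`, `scale` → `crop_ratio_range`, and `ratio` → `aspect_ratio_range`, with `antialias` dropped and `transform_mask: true` added. A sketch of that key translation on a plain dict. `convert_init_args` is a hypothetical helper for illustration, not part of OTX, and the mapping is inferred only from this diff:

```python
# Key renames inferred from the recipe diff (an assumption, not an OTX API).
TV_TO_OTX_KEYS = {
    "size": "scale",
    "scale": "crop_ratio_range",
    "ratio": "aspect_ratio_range",
}


def convert_init_args(tv_args: dict) -> dict:
    """Hypothetical helper: rewrite torchvision-style init_args to OTX-style."""
    otx_args = {
        TV_TO_OTX_KEYS[key]: value
        for key, value in tv_args.items()
        if key in TV_TO_OTX_KEYS  # keys such as `antialias` are dropped
    }
    # The updated recipes also ask the transform to process the mask.
    otx_args["transform_mask"] = True
    return otx_args


tv_args = {
    "size": [512, 512],
    "scale": [0.2, 1.0],
    "ratio": [0.5, 2.0],
    "antialias": True,
}
otx_args = convert_init_args(tv_args)
```

Note that `scale` changes meaning between the two libraries: in the old recipe it is the crop-area range, while in the new one it is the output size.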
1 change: 0 additions & 1 deletion src/otx/recipe/classification/h_label_cls/deit_tiny.yaml
@@ -44,7 +44,6 @@ overrides:
         - class_path: otx.core.data.transform_libs.torchvision.RandomResizedCrop
           init_args:
             scale: 224
-            backend: cv2
         - class_path: otx.core.data.transform_libs.torchvision.RandomFlip
           init_args:
             prob: 0.5

@@ -43,7 +43,6 @@ overrides:
         - class_path: otx.core.data.transform_libs.torchvision.EfficientNetRandomCrop
           init_args:
             scale: 224
-            backend: cv2
         - class_path: otx.core.data.transform_libs.torchvision.RandomFlip
           init_args:
             prob: 0.5

@@ -43,7 +43,6 @@ overrides:
         - class_path: otx.core.data.transform_libs.torchvision.EfficientNetRandomCrop
           init_args:
             scale: 224
-            backend: cv2
         - class_path: otx.core.data.transform_libs.torchvision.RandomFlip
           init_args:
             prob: 0.5

@@ -49,7 +49,6 @@ overrides:
         - class_path: otx.core.data.transform_libs.torchvision.RandomResizedCrop
           init_args:
             scale: 224
-            backend: cv2
         - class_path: otx.core.data.transform_libs.torchvision.RandomFlip
           init_args:
             prob: 0.5

@@ -45,7 +45,6 @@ overrides:
         - class_path: otx.core.data.transform_libs.torchvision.EfficientNetRandomCrop
           init_args:
             scale: 224
-            backend: cv2
         - class_path: otx.core.data.transform_libs.torchvision.RandomFlip
           init_args:
             prob: 0.5

@@ -45,7 +45,6 @@ overrides:
         - class_path: otx.core.data.transform_libs.torchvision.EfficientNetRandomCrop
           init_args:
             scale: 224
-            backend: cv2
         - class_path: otx.core.data.transform_libs.torchvision.RandomFlip
           init_args:
             prob: 0.5

@@ -45,7 +45,6 @@ overrides:
         - class_path: otx.core.data.transform_libs.torchvision.RandomResizedCrop
           init_args:
             scale: 224
-            backend: cv2
         - class_path: otx.core.data.transform_libs.torchvision.RandomFlip
           init_args:
             prob: 0.5

@@ -43,7 +43,6 @@ overrides:
         - class_path: otx.core.data.transform_libs.torchvision.RandomResizedCrop
           init_args:
             scale: 224
-            backend: cv2
         - class_path: otx.core.data.transform_libs.torchvision.RandomFlip
           init_args:
             prob: 0.5

@@ -50,7 +50,6 @@ overrides:
         - class_path: otx.core.data.transform_libs.torchvision.RandomResizedCrop
           init_args:
             scale: 224
-            backend: cv2
             is_numpy_to_tvtensor: true
       sampler:
         class_path: otx.algo.samplers.balanced_sampler.BalancedSampler

@@ -44,7 +44,6 @@ overrides:
         - class_path: otx.core.data.transform_libs.torchvision.EfficientNetRandomCrop
           init_args:
             scale: 224
-            backend: cv2
         - class_path: otx.core.data.transform_libs.torchvision.RandomFlip
           init_args:
             prob: 0.5

@@ -43,7 +43,6 @@ overrides:
         - class_path: otx.core.data.transform_libs.torchvision.EfficientNetRandomCrop
           init_args:
             scale: 224
-            backend: cv2
         - class_path: otx.core.data.transform_libs.torchvision.RandomFlip
           init_args:
             prob: 0.5

@@ -48,7 +48,6 @@ overrides:
         - class_path: otx.core.data.transform_libs.torchvision.RandomResizedCrop
           init_args:
             scale: 224
-            backend: cv2
         - class_path: otx.core.data.transform_libs.torchvision.RandomFlip
           init_args:
             prob: 0.5

@@ -49,7 +49,6 @@ overrides:
         - class_path: otx.core.data.transform_libs.torchvision.RandomResizedCrop
           init_args:
             scale: 224
-            backend: cv2
         - class_path: otx.core.data.transform_libs.torchvision.RandomFlip
           init_args:
             prob: 0.5

@@ -46,7 +46,6 @@ overrides:
         - class_path: otx.core.data.transform_libs.torchvision.EfficientNetRandomCrop
           init_args:
             scale: 224
-            backend: cv2
         - class_path: otx.core.data.transform_libs.torchvision.RandomFlip
           init_args:
             prob: 0.5

@@ -49,7 +49,6 @@ overrides:
         - class_path: otx.core.data.transform_libs.torchvision.EfficientNetRandomCrop
           init_args:
             scale: 224
-            backend: cv2
         - class_path: otx.core.data.transform_libs.torchvision.RandomFlip
           init_args:
             prob: 0.5

@@ -50,7 +50,6 @@ overrides:
         - class_path: otx.core.data.transform_libs.torchvision.RandomResizedCrop
           init_args:
             scale: 224
-            backend: cv2
         - class_path: otx.core.data.transform_libs.torchvision.RandomFlip
           init_args:
             prob: 0.5

@@ -45,7 +45,6 @@ overrides:
         - class_path: otx.core.data.transform_libs.torchvision.EfficientNetRandomCrop
           init_args:
             scale: 224
-            backend: cv2
         - class_path: otx.core.data.transform_libs.torchvision.RandomFlip
           init_args:
             prob: 0.5

@@ -49,7 +49,6 @@ overrides:
         - class_path: otx.core.data.transform_libs.torchvision.EfficientNetRandomCrop
           init_args:
             scale: 224
-            backend: cv2
         - class_path: otx.core.data.transform_libs.torchvision.RandomFlip
           init_args:
             prob: 0.5

@@ -45,7 +45,6 @@ overrides:
         - class_path: otx.core.data.transform_libs.torchvision.RandomResizedCrop
           init_args:
             scale: 224
-            backend: cv2
         - class_path: otx.core.data.transform_libs.torchvision.RandomFlip
           init_args:
             prob: 0.5
25 changes: 15 additions & 10 deletions src/otx/recipe/semantic_segmentation/dino_v2.yaml
@@ -42,22 +42,23 @@ overrides:
   data:
     train_subset:
       transforms:
-        - class_path: torchvision.transforms.v2.RandomResizedCrop
+        - class_path: otx.core.data.transform_libs.torchvision.RandomResizedCrop
           init_args:
-            size:
+            scale:
               - 560
               - 560
-            scale:
+            crop_ratio_range:
               - 0.2
               - 1.0
-            ratio:
+            aspect_ratio_range:
               - 0.5
               - 2.0
-            antialias: true
+            transform_mask: true
         - class_path: otx.core.data.transform_libs.torchvision.PhotoMetricDistortion
+        - class_path: otx.core.data.transform_libs.torchvision.RandomFlip
           init_args:
+            prob: 0.5
             is_numpy_to_tvtensor: true
-        - class_path: torchvision.transforms.v2.RandomHorizontalFlip
         - class_path: torchvision.transforms.v2.ToDtype
           init_args:
             dtype: ${as_torch_dtype:torch.float32}
@@ -68,11 +69,13 @@ overrides:

     val_subset:
       transforms:
-        - class_path: torchvision.transforms.v2.Resize
+        - class_path: otx.core.data.transform_libs.torchvision.Resize
           init_args:
-            size:
+            scale:
               - 560
               - 560
+            transform_mask: true
+            is_numpy_to_tvtensor: true
         - class_path: torchvision.transforms.v2.ToDtype
           init_args:
             dtype: ${as_torch_dtype:torch.float32}
@@ -83,11 +86,13 @@ overrides:

     test_subset:
       transforms:
-        - class_path: torchvision.transforms.v2.Resize
+        - class_path: otx.core.data.transform_libs.torchvision.Resize
           init_args:
-            size:
+            scale:
               - 560
               - 560
+            transform_mask: true
+            is_numpy_to_tvtensor: true
         - class_path: torchvision.transforms.v2.ToDtype
           init_args:
             dtype: ${as_torch_dtype:torch.float32}
4 changes: 2 additions & 2 deletions tests/unit/core/data/test_dataset.py
@@ -102,8 +102,8 @@ def test_ignore_index(self, fxt_mock_dm_subset):
         # The mask is np.eye(10) with label_id = 0,
         # so that the diagonal is filled with zero
         # and others are filled with ignore_index.
-        gt_seg_map = next(iter(dataset)).gt_seg_map
-        assert gt_seg_map.sum() == (10 * 10 - 10) * 100
+        masks = next(iter(dataset)).masks
+        assert masks.sum() == (10 * 10 - 10) * 100

     def test_overflown_ignore_index(self, fxt_mock_dm_subset):
         dataset = OTXSegmentationDataset(
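The expected sum in `test_ignore_index` follows from the comment: a 10×10 identity matrix has 10 diagonal pixels (labelled 0) and 90 off-diagonal pixels that are set to the ignore index. A quick NumPy check of that arithmetic, assuming an ignore index of 100 (inferred from the `* 100` factor in the assertion, not stated in the excerpt):

```python
import numpy as np

IGNORE_INDEX = 100  # assumption inferred from the `* 100` in the assertion
LABEL_ID = 0

# Annotated pixels are where np.eye(10) is 1 (the diagonal).
annotated = np.eye(10, dtype=bool)

# Annotated pixels get the label id; everything else gets the ignore index.
seg_map = np.where(annotated, LABEL_ID, IGNORE_INDEX)

# 100 pixels in total, 10 of them labelled, so 90 carry the ignore index.
expected = (10 * 10 - 10) * IGNORE_INDEX
```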