Support DOTA data format for oriented object detection task (#1475)

### Summary


### How to test

### Checklist
- [x] I have added unit tests to cover my changes.
- [ ] I have added integration tests to cover my changes.
- [x] I have added the description of my changes into
[CHANGELOG](https://github.com/openvinotoolkit/datumaro/blob/develop/CHANGELOG.md).
- [x] I have updated the
[documentation](https://github.com/openvinotoolkit/datumaro/tree/develop/docs)
accordingly.

### License

- [x] I submit _my code changes_ under the same [MIT
License](https://github.com/openvinotoolkit/datumaro/blob/develop/LICENSE)
that covers the project.
  Feel free to contact the maintainers if that's a concern.
- [x] I have updated the license header for each file (see an example
below).

```python
# Copyright (C) 2024 Intel Corporation
#
# SPDX-License-Identifier: MIT
```
wonjuleee authored Apr 26, 2024
1 parent f9a25f5 commit 711d6b5
Showing 14 changed files with 510 additions and 3 deletions.
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -11,6 +11,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
(<https://github.com/openvinotoolkit/datumaro/pull/1422>)
- Add AnnotationType.rotated_bbox for oriented object detection
(<https://github.com/openvinotoolkit/datumaro/pull/1459>)
- Add DOTA data format for oriented object detection task
(<https://github.com/openvinotoolkit/datumaro/pull/1475>)

### Enhancements
- Fix ambiguous COCO format detector
47 changes: 47 additions & 0 deletions docs/source/docs/data-formats/formats/dota.md
@@ -0,0 +1,47 @@
# DOTA
## Format specification
[DOTA](https://captain-whu.github.io/DOTA/dataset.html) is a collection of 2K aerial images for the rotated (oriented) object detection task.
Each object is annotated with the four corner points of an oriented bounding box (eight coordinate values), a label from one of 15 classes (baseball-diamond, basketball-court, bridge, ground-track-field, harbor, helicopter, large-vehicle, plane, roundabout, ship, small-vehicle, soccer-ball-field, storage-tank, swimming-pool, tennis-court), and a difficulty value.
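
To make the geometry concrete, here is a minimal sketch of how four ordered corner points could be reduced to a center, size, and rotation angle. The function name `corners_to_rotated_box` and the angle convention (degrees, measured along the first edge) are assumptions for illustration only; Datumaro's own `RotatedBbox.from_rectangle` may use a different convention.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def corners_to_rotated_box(corners: List[Point]) -> Tuple[float, float, float, float, float]:
    """Convert 4 ordered rectangle corners to (cx, cy, w, h, angle_degrees).

    Hypothetical helper: assumes the corners form a rectangle and are listed
    in consecutive order, as in a DOTA annotation line.
    """
    if len(corners) != 4:
        raise ValueError("expected exactly 4 corner points")

    (x0, y0), (x1, y1), (x2, y2), _last = corners

    # The center of a rectangle is the mean of its corners.
    cx = sum(x for x, _y in corners) / 4.0
    cy = sum(y for _x, y in corners) / 4.0

    # Width follows the first edge, height the second edge.
    w = math.hypot(x1 - x0, y1 - y0)
    h = math.hypot(x2 - x1, y2 - y1)

    # Angle of the first edge relative to the x-axis (assumed convention).
    angle = math.degrees(math.atan2(y1 - y0, x1 - x0))
    return cx, cy, w, h, angle
```

For the corners `(1, 1), (5, 1), (5, 4), (1, 4)` this returns the center `(3.0, 2.5)`, a size of `4.0` by `3.0`, and an angle of `0.0`.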

## Import DOTA dataset
A Datumaro project with a DOTA source can be created in the following way:

```bash
datum project create
datum project import --format dota <path/to/dataset>
```

Or, using the Python API:

```python
import datumaro as dm

dataset = dm.Dataset.import_from('<path/to/dataset>', 'dota')
```

## Directory structure
<!--lint disable fenced-code-flag-->
```
dota/
├── train # Subset directory
│   ├── images
│   │   ├── img1.jpg # Image file
│   │   ├── img2.jpg # Image file
│   │   └── ...
│   ├── labelTxt
│   │   ├── img1.txt # Annotation file
│   │   ├── img2.txt # Annotation file
│   │   └── ...
├── val # Subset directory
│   ├── images
│   │   ├── img3.jpg # Image file
│   │   ├── img4.jpg # Image file
│   │   └── ...
│   ├── labelTxt
│   │   ├── img3.txt # Annotation file
│   │   ├── img4.txt # Annotation file
│   │   └── ...
└── ...
```
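
The layout above pairs each image with an annotation file of the same base name. As a rough sketch using only the standard library, the pairing for one subset could be checked like this. The helper `pair_subset_files` and the `IMAGE_SUFFIXES` set are hypothetical and not part of Datumaro, which uses its own `IMAGE_EXTENSIONS` list.

```python
from pathlib import Path
from typing import Dict, Tuple

# Assumption: only a few common image extensions are considered here.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def pair_subset_files(subset_dir: str) -> Dict[str, Tuple[Path, Path]]:
    """Map each item id to its (image, annotation) paths within one subset.

    Only annotation files that have an image with the same base name are kept.
    """
    root = Path(subset_dir)
    images = {
        p.stem: p
        for p in (root / "images").iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    }

    pairs: Dict[str, Tuple[Path, Path]] = {}
    for ann in sorted((root / "labelTxt").glob("*.txt")):
        if ann.stem in images:
            pairs[ann.stem] = (images[ann.stem], ann)
    return pairs
```

Calling `pair_subset_files("dota/train")` on the tree above would yield entries for `img1` and `img2`; annotation files without a matching image are skipped rather than reported.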
## Annotation Txt file
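An example of `<image_id>.txt` is given in the [DOTA annotation format](https://captain-whu.github.io/DOTA/dataset.html). Each line describes one object with 10 whitespace-separated fields: eight corner coordinates (`x1 y1 x2 y2 x3 y3 x4 y4`), the category name, and the difficulty.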
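
As an illustration of the line layout, the following sketch parses one annotation line and writes it back. It mirrors the field handling in this commit's importer and exporter (lines without exactly 10 fields are skipped, coordinates are written with two decimals), but `DotaObject`, `parse_dota_line`, and `format_dota_line` are hypothetical names, not Datumaro API.

```python
from typing import List, NamedTuple, Optional, Tuple


class DotaObject(NamedTuple):
    corners: List[Tuple[float, float]]  # four (x, y) corner points
    label: str
    difficulty: int


def parse_dota_line(line: str) -> Optional[DotaObject]:
    """Parse one annotation line; return None if it does not have 10 fields."""
    parts = line.split()
    if len(parts) != 10:
        return None
    values = [float(v) for v in parts[:8]]
    # Pair up x and y values: (x1, y1), (x2, y2), (x3, y3), (x4, y4).
    corners = list(zip(values[0::2], values[1::2]))
    return DotaObject(corners, parts[8], int(parts[9]))


def format_dota_line(obj: DotaObject) -> str:
    """Serialize an object back to a single annotation line."""
    coords = " ".join("%.2f %.2f" % (x, y) for x, y in obj.corners)
    return "%s %s %d" % (coords, obj.label, obj.difficulty)
```

For the test asset line `1 1 1 5 5 5 5 1 label_1 0`, parsing and formatting again gives `1.00 1.00 1.00 5.00 5.00 5.00 5.00 1.00 label_1 0`.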
5 changes: 5 additions & 0 deletions docs/source/docs/data-formats/formats/index.rst
@@ -21,6 +21,7 @@ Supported Data Formats
cvat
datumaro_binary
datumaro
dota
icdar
image_zip
imagenet
@@ -104,6 +105,10 @@ Supported Data Formats
* `Format specification <https://opencv.github.io/cvat/docs/manual/advanced/xml_format>`_
* `Dataset example <https://github.com/openvinotoolkit/datumaro/tree/develop/tests/assets/cvat_dataset>`_
* `Format documentation <cvat.md>`_
* DOTA (``detection_rotated``)
* `Format specification <https://captain-whu.github.io/DOTA/dataset.html>`_
* `Dataset example <https://github.com/openvinotoolkit/datumaro/tree/develop/tests/assets/dota_dataset>`_
* `Format documentation <dota.md>`_
* ICDAR13/15 (``word recognition``, ``text localization``, ``text segmentation``)
* `Format specification <https://rrc.cvc.uab.es/?ch=2>`_
* `Dataset example <https://github.com/openvinotoolkit/datumaro/tree/develop/tests/assets/icdar_dataset>`_
260 changes: 260 additions & 0 deletions src/datumaro/plugins/data_formats/dota.py
@@ -0,0 +1,260 @@
# Copyright (C) 2024 Intel Corporation
#
# SPDX-License-Identifier: MIT

from __future__ import annotations

import logging as log
import os
import os.path as osp
from typing import Any, Dict, List, Optional, Type, TypeVar

from datumaro.components.annotation import Annotation, AnnotationType, LabelCategories, RotatedBbox
from datumaro.components.dataset_base import DEFAULT_SUBSET_NAME, DatasetItem, SubsetBase
from datumaro.components.errors import (
    DatasetExportError,
    DatasetImportError,
    InvalidAnnotationError,
    MediaTypeError,
)
from datumaro.components.exporter import Exporter
from datumaro.components.format_detection import FormatDetectionConfidence, FormatDetectionContext
from datumaro.components.importer import ImportContext, Importer
from datumaro.components.media import Image
from datumaro.components.task import TaskAnnotationMapping
from datumaro.util.image import IMAGE_EXTENSIONS
from datumaro.util.os_util import find_files

T = TypeVar("T")


class DotaFormat:
    ANNOTATION_DIR = "labelTxt"
    IMAGE_DIR = "images"


class DotaBase(SubsetBase):
    def __init__(
        self,
        path: str,
        *,
        img_path: Optional[str] = None,
        subset: Optional[str] = None,
        ctx: Optional[ImportContext] = None,
    ) -> None:
        super().__init__(subset=subset, ctx=ctx)

        # `path` is the annotation directory of one subset (e.g. `<subset>/labelTxt`).
        if not osp.isdir(path):
            raise DatasetImportError(f"Can't find annotation directory {path}")

        self._path = path

        self._img_files = self._load_img_files(img_path)
        self._label_categories = self._load_categories(path)
        self._categories = {AnnotationType.label: self._label_categories}

        self._items = self._load_items(path)
        self._task_type = TaskAnnotationMapping().get_task(self._ann_types)

    def _load_img_files(self, rootpath: str) -> Dict[str, str]:
        return {
            self._get_fname(img_file): img_file
            for img_file in find_files(rootpath, IMAGE_EXTENSIONS, recursive=True, max_depth=2)
        }

    def _load_categories(self, path):
        label_names = []
        for ann_file in os.listdir(path):
            label_names.extend(
                self._parse_annotations(
                    ann_file=osp.join(self._path, ann_file), only_label_names=True
                )
            )

        label_categories = LabelCategories()
        for label_name in sorted(set(label_names)):
            label_categories.add(label_name)

        return label_categories

    def _load_items(self, path):
        items = []
        for ann_file in os.listdir(path):
            fname = osp.splitext(ann_file)[0]
            img = Image.from_file(path=self._img_files[fname])
            anns = self._parse_annotations(
                ann_file=osp.join(self._path, ann_file), only_label_names=False
            )
            items.append(DatasetItem(id=fname, subset=self._subset, media=img, annotations=anns))
        return items

    def _get_fname(self, fpath: str) -> str:
        return osp.splitext(osp.basename(fpath))[0]

    def _parse_annotations(
        self,
        ann_file: str,
        only_label_names: bool,
    ) -> List[Annotation]:
        lines = []
        with open(ann_file, "r", encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if line:
                    lines.append(line)

        annotations = []
        for line in lines:
            parts = line.split()
            if len(parts) != 10:
                log.debug(
                    f"Unexpected field count {len(parts)} in the bbox description. "
                    "Expected 10 fields (8 coordinates for rectangle, category, and difficulty)."
                )
                continue

            label_name = self._parse_field(parts[-2], str, "label_name")

            if only_label_names:
                annotations.append(label_name)
                continue

            label_id, _ = self._label_categories.find(label_name)
            coords = [
                (
                    self._parse_field(parts[i], float, "coords"),
                    self._parse_field(parts[i + 1], float, "coords"),
                )
                for i in range(0, 8, 2)
            ]
            difficulty = self._parse_field(parts[-1], int, "difficulty")

            annotations.append(
                RotatedBbox.from_rectangle(
                    coords, label=label_id, attributes={"difficulty": difficulty}
                )
            )
            self._ann_types.add(AnnotationType.rotated_bbox)

        return annotations

    @staticmethod
    def _parse_field(value: str, desired_type: Type[T], field_name: str) -> T:
        try:
            return desired_type(value)
        except Exception as e:
            raise InvalidAnnotationError(
                f"Can't parse {field_name} from '{value}'. Expected {desired_type}"
            ) from e


class DotaImporter(Importer):
    _ANNO_EXT = ".txt"

    @classmethod
    def detect(cls, context: FormatDetectionContext) -> FormatDetectionConfidence:
        context.require_file("**/" + DotaFormat.ANNOTATION_DIR + "/*" + cls._ANNO_EXT)
        return FormatDetectionConfidence.MEDIUM

    @classmethod
    def find_sources(cls, path: str) -> List[Dict[str, Any]]:
        sources = cls._find_sources_recursive(
            path=path,
            ext=cls._ANNO_EXT,
            dirname=DotaFormat.ANNOTATION_DIR,
            extractor_name="dota",
        )

        data_paths = set()
        for source in sources:
            url = osp.dirname(source["url"])
            subset_name = osp.relpath(source["url"], path).split(osp.sep)[0]
            data_paths.add((subset_name, url))

        return [
            {
                "url": ann_dir,
                "format": "dota",
                "options": {
                    "subset": subset,
                    "img_path": osp.join(path, subset, DotaFormat.IMAGE_DIR),
                },
            }
            for subset, ann_dir in data_paths
        ]

    @classmethod
    def get_file_extensions(cls) -> List[str]:
        return [cls._ANNO_EXT]


class DotaExporter(Exporter):
    DEFAULT_IMAGE_EXT = ".png"

    def _apply_impl(self):
        extractor = self._extractor
        save_dir = self._save_dir

        if self._extractor.media_type() and not issubclass(self._extractor.media_type(), Image):
            raise MediaTypeError("Media type is not an image")

        os.makedirs(save_dir, exist_ok=True)

        label_categories = extractor.categories()[AnnotationType.label]

        subsets = self._extractor.subsets()
        for subset_name, subset in subsets.items():
            if not subset_name or subset_name == DEFAULT_SUBSET_NAME:
                subset_name = DEFAULT_SUBSET_NAME

            subset_dir = osp.join(save_dir, subset_name)
            os.makedirs(subset_dir, exist_ok=True)

            for item in subset:
                try:
                    self._export_media(item, subset_dir)
                    self._export_item_annotation(item, subset_dir, label_categories)

                except Exception as e:
                    self._ctx.error_policy.report_item_error(e, item_id=(item.id, item.subset))

    def _export_media(self, item: DatasetItem, subset_dir: str) -> None:
        try:
            if not item.media or not (item.media.has_data or item.media.has_size):
                raise DatasetExportError(
                    "Failed to export item '%s': item has no image info" % item.id
                )

            image_name = self._make_image_filename(item)
            image_fpath = osp.join(subset_dir, DotaFormat.IMAGE_DIR, image_name)

            if self._save_media:
                self._save_image(item, image_fpath)

        except Exception as e:
            self._ctx.error_policy.report_item_error(e, item_id=(item.id, item.subset))

    def _export_item_annotation(
        self, item: DatasetItem, subset_dir: str, label_categories: LabelCategories
    ) -> None:
        try:
            annotations = ""
            for bbox in item.annotations:
                # Only labeled rotated boxes can be written in the DOTA format.
                if not isinstance(bbox, RotatedBbox) or bbox.label is None:
                    continue
                coords = bbox.as_polygon()
                coords = " ".join("%.2f %.2f" % (x, y) for x, y in coords)
                label_name = label_categories[bbox.label].name
                difficulty = bbox.attributes.get("difficulty", 0)
                annotations += "%s %s %s\n" % (coords, label_name, difficulty)

            annotation_path = osp.join(subset_dir, DotaFormat.ANNOTATION_DIR, "%s.txt" % item.id)
            os.makedirs(osp.dirname(annotation_path), exist_ok=True)

            with open(annotation_path, "w", encoding="utf-8") as f:
                f.write(annotations)

        except Exception as e:
            self._ctx.error_policy.report_item_error(e, item_id=(item.id, item.subset))
16 changes: 16 additions & 0 deletions src/datumaro/plugins/data_formats/roboflow/importer.py
@@ -4,6 +4,7 @@

import os
import os.path as osp
import warnings
from collections import defaultdict
from glob import glob
from io import TextIOWrapper
@@ -194,6 +195,21 @@ class RoboflowYoloObbImporter(RoboflowYoloImporter):
    FORMAT_EXT = ".txt"
    ANN_DIR_NAME = "labelTxt/"

    @classmethod
    def detect(cls, context: FormatDetectionContext) -> FormatDetectionConfidence:
        warnings.warn(
            f"FormatDetectionConfidence of '{cls.FORMAT}' is lowered because of 'dota' format support. "
            "It will be deprecated in datumaro==1.8.0.",
            DeprecationWarning,
        )
        with context.require_any():
            with context.alternative():
                cls._check_ann_file(
                    context.require_file("**/" + cls.ANN_DIR_NAME + "*" + cls.FORMAT_EXT), context
                )

        return FormatDetectionConfidence.LOW

    @classmethod
    def _check_ann_file_impl(cls, fp: TextIOWrapper) -> bool:
        for line in fp:
20 changes: 20 additions & 0 deletions src/datumaro/plugins/specs.json
@@ -459,6 +459,26 @@
]
}
},
{
"import_path": "datumaro.plugins.data_formats.dota.DotaBase",
"plugin_name": "dota",
"plugin_type": "DatasetBase"
},
{
"import_path": "datumaro.plugins.data_formats.dota.DotaExporter",
"plugin_name": "dota",
"plugin_type": "Exporter"
},
{
"import_path": "datumaro.plugins.data_formats.dota.DotaImporter",
"plugin_name": "dota",
"plugin_type": "Importer",
"metadata": {
"file_extensions": [
".txt"
]
}
},
{
"import_path": "datumaro.plugins.data_formats.icdar.base.IcdarTextLocalizationBase",
"plugin_name": "icdar_text_localization",
Binary file added tests/assets/dota_dataset/train/images/train_001.jpg
Binary file added tests/assets/dota_dataset/train/images/train_002.jpg
1 change: 1 addition & 0 deletions tests/assets/dota_dataset/train/labelTxt/train_001.txt
@@ -0,0 +1 @@
0 0 0 2 2 2 2 0 label_0 0
1 change: 1 addition & 0 deletions tests/assets/dota_dataset/train/labelTxt/train_002.txt
@@ -0,0 +1 @@
1 1 1 5 5 5 5 1 label_1 0
Binary file added tests/assets/dota_dataset/val/images/val_001.jpg