masks_to_bounding_boxes op #4290

Merged: 40 commits merged on Sep 21, 2021

Changes from 7 commits

Commits (40 commits)
cf51379  ops.masks_to_bounding_boxes (0x00b1, Aug 17, 2021)
c67e035  test fixtures (0x00b1, Aug 18, 2021)
3830dd1  unit test (0x00b1, Aug 18, 2021)
926d444  Merge branch 'master' into issues/3960 (0x00b1, Aug 18, 2021)
f777416  ignore lint e201 and e202 for in-lined matrix (0x00b1, Aug 18, 2021)
cd46aa7  ignore e121 and e241 linting rules for in-lined matrix (0x00b1, Aug 18, 2021)
712131e  draft gallery example text (0x00b1, Aug 18, 2021)
b6f5c42  removed type annotations from pytest fixtures (0x00b1, Aug 31, 2021)
b555c68  inlined fixture (0x00b1, Aug 31, 2021)
fc26f3a  renamed masks_to_bounding_boxes to masks_to_boxes (0x00b1, Aug 31, 2021)
c4d3045  reformat inline array (0x00b1, Aug 31, 2021)
4589951  import cleanup (0x00b1, Aug 31, 2021)
6b19d67  moved masks_to_boxes into boxes module (0x00b1, Sep 1, 2021)
c6c89ec  docstring cleanup (0x00b1, Sep 1, 2021)
16a99a9  updated docstring (0x00b1, Sep 15, 2021)
7115320  fix formatting issue (0x00b1, Sep 15, 2021)
f4796d2  Merge branch 'main' into issues/3960 (datumbox, Sep 15, 2021)
a070133  Merge branch 'master' of https://github.com/pytorch/vision into issue… (0x00b1, Sep 15, 2021)
0131db3  Merge branch 'issues/3960' of https://github.com/0x00b1/vision into i… (0x00b1, Sep 15, 2021)
0a23bcf  gallery example (0x00b1, Sep 17, 2021)
db8fb7b  use torch (0x00b1, Sep 17, 2021)
f7a2c1e  use torch (0x00b1, Sep 17, 2021)
c7dfcdf  use torch (0x00b1, Sep 17, 2021)
5e6198a  use torch (0x00b1, Sep 17, 2021)
7c78271  updated docs and test (0x00b1, Sep 17, 2021)
b9055c2  cleanup (0x00b1, Sep 17, 2021)
6c630c5  Merge branch 'main' into issues/3960 (0x00b1, Sep 17, 2021)
540c6a1  updated import (0x00b1, Sep 17, 2021)
8e4fc2f  Merge branch 'main' into issues/3960 (0x00b1, Sep 17, 2021)
4c78297  use torch (0x00b1, Sep 20, 2021)
140e429  Update gallery/plot_repurposing_annotations.py (0x00b1, Sep 20, 2021)
8f2cd4a  Update gallery/plot_repurposing_annotations.py (0x00b1, Sep 20, 2021)
7252723  Update gallery/plot_repurposing_annotations.py (0x00b1, Sep 20, 2021)
26f68af  Merge branch 'main' into issues/3960 (0x00b1, Sep 20, 2021)
2c2d5dd  Autodoc (0x00b1, Sep 21, 2021)
3a91957  use torch instead of numpy in tests (0x00b1, Sep 21, 2021)
e24805c  fix build_docs failure (0x00b1, Sep 21, 2021)
65404e9  Merge branch 'main' into issues/3960 (0x00b1, Sep 21, 2021)
6c89be7  Closing quotes. (datumbox, Sep 21, 2021)
b2a907c  Merge branch 'main' into issues/3960 (datumbox, Sep 21, 2021)
73 changes: 73 additions & 0 deletions gallery/plot_repurposing_annotations.py
@@ -0,0 +1,73 @@
"""
=======================
Repurposing annotations
=======================

The following example illustrates the operations available in :ref:`the torchvision.ops module <ops>` for repurposing
Contributor @oke-aditya commented on Sep 20, 2021:

After some debugging I found the reason for the build_docs CI failure. The problem is that torchvision.ops does not have an index entry on the right-hand side (basically an HTML anchor for #ops, like transforms has), which causes the CI failure.

We need to remove the ref and it will work fine. This is a slightly hacky fix, but it works. I tried it locally and could build the gallery example; it looks nice.

Suggested change:
- The following example illustrates the operations available in :ref:`the torchvision.ops module <ops>` for repurposing
+ The following example illustrates the operations available in the torchvision.ops module for repurposing

Contributor (author) @0x00b1 replied:

Nice! I appreciate the debugging.

object localization annotations for different tasks (e.g. transforming masks used by instance and panoptic
segmentation methods into bounding boxes used by object detection methods).
"""

from PIL import Image
from pathlib import Path
import matplotlib.pyplot as plt
import numpy as np

import torch
import torchvision.transforms as T


plt.rcParams["savefig.bbox"] = 'tight'
orig_img = Image.open(Path('assets') / 'astronaut.jpg')
# if you change the seed, make sure that the randomly-applied transforms
# properly show that the image can be both transformed and *not* transformed!
torch.manual_seed(0)


def plot(imgs, with_orig=True, row_title=None, **imshow_kwargs):
if not isinstance(imgs[0], list):
# Make a 2d grid even if there's just 1 row
imgs = [imgs]

num_rows = len(imgs)
num_cols = len(imgs[0]) + with_orig
fig, axs = plt.subplots(nrows=num_rows, ncols=num_cols, squeeze=False)
for row_idx, row in enumerate(imgs):
row = [orig_img] + row if with_orig else row
for col_idx, img in enumerate(row):
ax = axs[row_idx, col_idx]
ax.imshow(np.asarray(img), **imshow_kwargs)
ax.set(xticklabels=[], yticklabels=[], xticks=[], yticks=[])

if with_orig:
axs[0, 0].set(title='Original image')
axs[0, 0].title.set_size(8)
if row_title is not None:
for row_idx in range(num_rows):
axs[row_idx, 0].set(ylabel=row_title[row_idx])

plt.tight_layout()

####################################
# Masks
# --------------------------------------
# In tasks like instance and panoptic segmentation, masks are commonly defined, and are defined by this package,
# as a multi-dimensional array (e.g. a NumPy array or a PyTorch tensor) with the following shape:
#
# (objects, height, width)
#
# Where objects is the number of annotated objects in the image. Each (height, width) slice corresponds to exactly
# one object. For example, if your input image has dimensions 224 x 224 and has four annotated objects, your masks
# annotation has the following shape:
#
# (4, 224, 224).
#
# A nice property of masks is that they can be easily repurposed to be used in methods to solve a variety of object
# localization tasks.
#
# Masks to bounding boxes
# ~~~~~~~~~~~~~~~~~~~~~~~
# For example, the masks_to_bounding_boxes operation can be used to transform masks into bounding boxes that can be
# used in methods like Faster R-CNN and YOLO.
padded_imgs = [T.Pad(padding=padding)(orig_img) for padding in (3, 10, 30, 50)]
plot(padded_imgs)
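At this point in the review (changes from 7 commits), the code above still shows the padding demo that appears to be carried over from the transforms gallery draft rather than the new op; the later "gallery example" and "Update gallery/plot_repurposing_annotations.py" commits replace it. Below is a minimal editor's sketch of what the conversion itself could look like, assuming the masks_to_bounding_boxes signature added in this diff (the synthetic mask values are invented for illustration, and a later commit renames the op to masks_to_boxes):

import torch
import torchvision.ops

# Two binary masks stacked into an (objects, height, width) tensor.
masks = torch.zeros((2, 224, 224))
masks[0, 10:60, 20:80] = 1    # first object: rows 10-59, columns 20-79
masks[1, 100:150, 40:90] = 1  # second object: rows 100-149, columns 40-89

# Each mask becomes one (x_min, y_min, x_max, y_max) box.
boxes = torchvision.ops.masks_to_bounding_boxes(masks)
print(boxes)
# tensor([[ 20.,  10.,  79.,  59.],
#         [ 40., 100.,  89., 149.]])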
Binary file added test/assets/labeled_image.png (binary file not shown)
Binary file added test/assets/masks.tiff (binary file not shown)
43 changes: 43 additions & 0 deletions test/test_masks_to_bounding_boxes.py
@@ -0,0 +1,43 @@
import os.path

import PIL.Image
import numpy
import pytest
import torch

import torchvision.ops

ASSETS_DIRECTORY = os.path.join(os.path.dirname(os.path.abspath(__file__)), "assets")


@pytest.fixture
def labeled_image() -> torch.Tensor:
with PIL.Image.open(os.path.join(ASSETS_DIRECTORY, "labeled_image.png")) as image:
return torch.tensor(numpy.array(image, numpy.int64))


@pytest.fixture
def masks() -> torch.Tensor:
with PIL.Image.open(os.path.join(ASSETS_DIRECTORY, "masks.tiff")) as image:
Member commented:

Do you think it would be possible to write a test without the need for new images and hard-coded coordinates? Ideally, we could generate random masks and have a super simple version of masks_to_boxes that we could use as the reference implementation.

Contributor (author) @0x00b1 replied:

Yep, I wrote about this elsewhere in the thread. I'd love to add a generator for various outputs, similar to the function @goldsborough and I wrote for scikit-image (skimage.draw.random_shapes). However, would you mind if I did this in a follow-up commit?

Contributor (author) @0x00b1:

@NicolasHug a friendly bump

frames = numpy.zeros((image.n_frames, image.height, image.width), numpy.int64)

for index in range(image.n_frames):
image.seek(index)

frames[index] = numpy.array(image)

return torch.tensor(frames)


def test_masks_to_bounding_boxes(masks):
expected = torch.tensor(
[[ 127., 2., 165., 40. ], # noqa: E121, E201, E202, E241
[ 4., 100., 88., 184. ], # noqa: E201, E202, E241
[ 168., 189., 294., 300. ], # noqa: E201, E202, E241
[ 556., 272., 700., 416. ], # noqa: E201, E202, E241
[ 800., 560., 990., 725. ], # noqa: E201, E202, E241
[ 294., 828., 594., 1092. ], # noqa: E201, E202, E241
[ 756., 1036., 1064., 1491. ]] # noqa: E201, E202, E241
)

torch.testing.assert_close(torchvision.ops.masks_to_bounding_boxes(masks), expected)
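The randomized reference-implementation test suggested in the review thread above could look roughly like the following editor's sketch; it is not part of the PR, and masks_to_boxes_reference is a hypothetical helper name:

import torch
import torchvision.ops


def masks_to_boxes_reference(masks):
    # Naive per-mask loop used only as a reference for the vectorized op.
    boxes = torch.zeros((masks.shape[0], 4))
    for index, mask in enumerate(masks):
        ys, xs = torch.where(mask != 0)
        boxes[index] = torch.stack([xs.min(), ys.min(), xs.max(), ys.max()]).float()
    return boxes


def test_masks_to_bounding_boxes_random():
    torch.manual_seed(0)
    masks = torch.zeros((8, 64, 64))
    for index in range(masks.shape[0]):
        # One random axis-aligned rectangle per mask, so every mask is non-empty.
        x0 = int(torch.randint(0, 32, (1,)))
        y0 = int(torch.randint(0, 32, (1,)))
        x1 = x0 + int(torch.randint(1, 32, (1,)))
        y1 = y0 + int(torch.randint(1, 32, (1,)))
        masks[index, y0:y1, x0:x1] = 1

    expected = masks_to_boxes_reference(masks)
    torch.testing.assert_close(torchvision.ops.masks_to_bounding_boxes(masks), expected)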
3 changes: 2 additions & 1 deletion torchvision/ops/__init__.py
@@ -8,6 +8,7 @@
from .poolers import MultiScaleRoIAlign
from .feature_pyramid_network import FeaturePyramidNetwork
from .focal_loss import sigmoid_focal_loss
from ._masks_to_bounding_boxes import masks_to_bounding_boxes

from ._register_onnx_ops import _register_custom_op

@@ -20,5 +21,5 @@
'box_area', 'box_iou', 'generalized_box_iou', 'roi_align', 'RoIAlign', 'roi_pool',
'RoIPool', 'ps_roi_align', 'PSRoIAlign', 'ps_roi_pool',
'PSRoIPool', 'MultiScaleRoIAlign', 'FeaturePyramidNetwork',
'sigmoid_focal_loss'
'sigmoid_focal_loss', 'masks_to_bounding_boxes'
]
26 changes: 26 additions & 0 deletions torchvision/ops/_masks_to_bounding_boxes.py
@@ -0,0 +1,26 @@
import torch


def masks_to_bounding_boxes(masks: torch.Tensor) -> torch.Tensor:
"""Compute the bounding boxes around the provided masks
The masks should be in format [N, H, W] where N is the number of masks, (H, W) are the spatial dimensions.
Returns a [N, 4] tensors, with the boxes in xyxy format
0x00b1 marked this conversation as resolved.
Show resolved Hide resolved
"""
if masks.numel() == 0:
return torch.zeros((0, 4), device=masks.device)

h, w = masks.shape[-2:]

y = torch.arange(0, h, dtype=torch.float)
x = torch.arange(0, w, dtype=torch.float)
y, x = torch.meshgrid(y, x)

x_mask = masks * x.unsqueeze(0)
x_max = x_mask.flatten(1).max(-1)[0]
x_min = x_mask.masked_fill(~(masks.bool()), 1e8).flatten(1).min(-1)[0]

y_mask = masks * y.unsqueeze(0)
y_max = y_mask.flatten(1).max(-1)[0]
y_min = y_mask.masked_fill(~(masks.bool()), 1e8).flatten(1).min(-1)[0]

return torch.stack([x_min, y_min, x_max, y_max], 1)
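As a quick sanity check of the min/max arithmetic above, an editor's illustration (not part of the diff): a single 5 x 5 mask whose object occupies rows 1-2 and columns 2-4 yields those extremes in xyxy order.

masks = torch.zeros((1, 5, 5))
masks[0, 1:3, 2:5] = 1
print(masks_to_bounding_boxes(masks))
# tensor([[2., 1., 4., 2.]])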