This repository was archived by the owner on Oct 31, 2023. It is now read-only.

Run coco panoptic dataset #337

Open
YuShen1116 opened this issue Jan 13, 2019 · 4 comments
@YuShen1116

Hi,

I'm trying to do some experiments with panoptic segmentation on the COCO panoptic dataset. Which files should I change so that the data loading function can distinguish object and stuff classes?

Thank you!


fmassa commented Jan 14, 2019

Hi,

This will require a few changes. Here is what I would do:

Create a Target class

Currently, we only have target classes which correspond to boxes or regions, and they are all tied to a BoxList. For panoptic segmentation, we would instead have data that is not tied to regions.
My take on this is that we should create a Target class in https://github.com/facebookresearch/maskrcnn-benchmark/tree/master/maskrcnn_benchmark/structures, which will hold an arbitrary number of objects internally (boxes and semantic segmentation for now). It should implement the crop, rotate and transpose methods by simply dispatching them to all the underlying objects it holds.

This will also require writing a Segmentation (better names welcome) class which will simply hold the segmentation mask for the whole image, and will implement the three methods above. This is just for convenience, and to have everything following the same API.
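To make the idea concrete, here is a minimal sketch of what such a pair of classes could look like. All names, method signatures, and the choice of crop/resize/transpose as the dispatched methods are my assumptions, not existing maskrcnn-benchmark API:

```python
import torch
import torch.nn.functional


class Segmentation(object):
    # Holds the full-image semantic segmentation mask (a sketch).
    def __init__(self, mask):
        self.mask = mask  # (H, W) tensor of per-pixel labels

    def crop(self, box):
        x1, y1, x2, y2 = box
        return Segmentation(self.mask[y1:y2, x1:x2])

    def resize(self, size):
        # Nearest-neighbor keeps labels discrete; size is (H, W).
        m = self.mask[None, None].float()
        m = torch.nn.functional.interpolate(m, size=size, mode="nearest")
        return Segmentation(m[0, 0].long())

    def transpose(self, method):
        # method 0 standing in for a horizontal flip (FLIP_LEFT_RIGHT)
        return Segmentation(self.mask.flip(-1)) if method == 0 else self


class Target(object):
    # Container that dispatches each transform to every object it holds.
    def __init__(self, **fields):
        self._fields = fields  # e.g. region=BoxList, segmentation=Segmentation

    def _dispatch(self, name, *args):
        return Target(**{k: getattr(v, name)(*args)
                         for k, v in self._fields.items()})

    def crop(self, box):
        return self._dispatch("crop", box)

    def resize(self, size):
        return self._dispatch("resize", size)

    def transpose(self, method):
        return self._dispatch("transpose", method)

    def __getattr__(self, name):
        # Convenience access, e.g. target.region returns the wrapped BoxList.
        try:
            return self._fields[name]
        except KeyError:
            raise AttributeError(name)
```

With this shape, any object that implements the three methods (a BoxList, a SegmentationMask, the Segmentation above) can be added to a Target without the transforms code knowing about it.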

Modify the implementations to get the box from the target

The easiest way to accomplish this would be to change

proposals, proposal_losses = self.rpn(images, features, targets)
if self.roi_heads:
    x, result, detector_losses = self.roi_heads(features, proposals, targets)

and instead have something like

region_targets = targets.regions  # targets is a Target instance; get the BoxList from it
proposals, proposal_losses = self.rpn(images, features, region_targets)
if self.roi_heads:
    x, result, detector_losses = self.roi_heads(features, proposals, region_targets)

Make datasets return a Target

This should be fairly easy, and consists of adding after

target = target.clip_to_image(remove_empty=True)

something like

target = Target(region=target)

I believe this is pretty much it for the changes to the library.

Add your own modifications to handle panoptic task

The rest is actually up to you on how you'll handle it, as it's still an open research question.

Let me know what you think, I'd be more than happy to merge a PR which implements what I've mentioned just above!

fmassa added the enhancement and contributions welcome labels Jan 14, 2019

YuShen1116 commented Jan 14, 2019

Thank you for your response!

I guess I also need to modify the

https://github.com/facebookresearch/maskrcnn-benchmark/blob/master/maskrcnn_benchmark/data/datasets/coco.py

to load the panoptic annotations, since its JSON file format is different from the COCO detection one, right?


fmassa commented Jan 14, 2019

Yeah, definitely. You need to have the full segmentation mask there, as well as add extra branches to your model, etc.

I was explaining only the changes necessary for the current detection approach to remain valid, while enabling users to build on top of it to make panoptic segmentation work.

@karanchahal

Hello @fmassa

I've been trying to get semantic segmentation to work, but I'm running into trouble while computing the loss function. Assume my image is 800 by 800 in height and width, the number of semantic segmentation classes is 2, and the batch size is 4.

My predicted mask is of size (4, 2, 800, 800):

mask = tensor((4, 2, 800, 800))

My ground truth is of size (4, 800, 800), where each cell denotes the label of that pixel; in this case it is 0 or 1:

labels = tensor((4, 800, 800))

I'm using the cross entropy loss function to calculate the loss of the predicted mask. However, the Panoptic FPN paper says to compute the loss only over positive labels (the ones having label 1 in this case).

I'm having trouble with this step: how should I compute this loss in PyTorch code?
The torch.nonzero(labels > 0).squeeze(1) line is giving me indexing errors.
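For concreteness, this is the closest I've gotten, flattening everything first so the indices are one-dimensional (a sketch with random stand-in tensors of the shapes above; whether this matches the paper's intent is exactly what I'm unsure about):

```python
import torch
import torch.nn.functional as F

mask = torch.randn(4, 2, 800, 800)           # predicted logits (N, C, H, W)
labels = torch.randint(0, 2, (4, 800, 800))  # per-pixel labels (N, H, W)

# Flatten to (N*H*W, C) and (N*H*W,) so pixels can be indexed directly.
logits = mask.permute(0, 2, 3, 1).reshape(-1, 2)
flat = labels.reshape(-1)

# On the flattened 1-D tensor, torch.nonzero returns shape (K, 1), so
# squeeze(1) works; on the original 3-D labels it returns (K, 3), which I
# suspect is where my indexing error comes from.
pos = torch.nonzero(flat > 0).squeeze(1)

# Cross entropy restricted to the positive pixels only.
loss = F.cross_entropy(logits[pos], flat[pos])
```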

Any help would be really appreciated!

Regards,
Karan

3 participants