Extended semantic segmentation to image segmentation #27039
Conversation
The documentation is not available anymore as the PR was closed or merged.

```py
results = panoptic_segmentation(Image.open(image))
results
```

As you can see below, every pixel gets classified and there are multiple instances for car again.
How can we see in this output that every pixel is classified?
I fixed this sentence.
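For illustration, here is a minimal sketch of one way to check that claim, assuming `results` is the list of dicts returned by the Transformers image-segmentation pipeline, each entry holding a `label` and a binary PIL `mask` (names taken from the snippet above, the check itself is hypothetical):

```py
import numpy as np

# Stack all segment masks and verify that every pixel is claimed by some segment
masks = np.stack([np.array(r["mask"]) for r in results])  # shape (num_segments, H, W)
covered = (masks > 0).any(axis=0)
print(f"{covered.mean():.1%} of pixels belong to some segment")

# Several entries sharing the "car" label correspond to separate car instances
print([r["label"] for r in results])
```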
<div class="flex justify-center">
  <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/segmentation-comparison.png" alt="Segmentation Maps Compared"/>
</div>
I'd maybe use the same order you used in the exposition: Reference, Semantic Segmentation, Instance Segmentation, Panoptic Segmentation.
The Instance Segmentation Output appears to contain more classes than "car" and "person", but the model output above didn't. Perhaps we could make it consistent?
Surprisingly, that building is classified as a car, and this is one of the best (maybe even the best) instance segmentation models on the Hub (Mask2Former). I'd rather not modify it?
Thanks for adding this!
+1 to all of @pcuenca's comments
I agree with the comments from @pcuenca and @amyeroberts. We probably should also add a couple of headers.
Right now, the right-side navigation looks like this:
## Load SceneParse150 dataset
## Preprocess
## Evaluate
## Train
## Inference
I would suggest the following structure:
## Types of segmentation
## Fine-tune a semantic segmentation model
### Load SceneParse150 dataset
### Preprocess
### Evaluate
### Train
### Inference
Also, in the inference example at the end of the fine-tuning section, we can probably leave only the example of doing inference manually, since we already show inference examples with a pipeline at the beginning of the doc.
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
I addressed all the comments. Sorry, I deprioritized it for a bit.
Thanks! Just one small nit
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
## Types of Segmentation

Semantic segmentation assigns a label or class to every single pixel in an image. Let's take a look at a semantic segmentation model output. It will assign the same class to every instance of an object it comes across in an image, for example, all cats will be labeled as "cat" instead of "cat-1", "cat-2".

We can use transformers' image segmentation pipeline to quickly infer a semantic segmentation model. Let's take a look at the example image.
Suggested change:
- We can use transformers' image segmentation pipeline to quickly infer a semantic segmentation model. Let's take a look at the example image.
+ We can use Transformers' image segmentation pipeline to quickly infer with a semantic segmentation model called [SegFormer](model_doc/segformer). Let's take a look at the example image.
Not sure the link here will work
I think it would
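To make the pipeline example discussed above concrete, here is a minimal sketch; the checkpoint and image URL are illustrative placeholders rather than the exact ones used in the guide:

```py
from transformers import pipeline
from PIL import Image
import requests

# Illustrative image URL -- substitute the example image from the guide
url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/segmentation_input.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# SegFormer checkpoint chosen for illustration; any semantic segmentation checkpoint works
semantic_segmentation = pipeline(
    "image-segmentation", "nvidia/segformer-b1-finetuned-cityscapes-1024-1024"
)
results = semantic_segmentation(image)

# Every instance of a class shares one mask, e.g. a single "car" entry covering all cars
print([r["label"] for r in results])
```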
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/semantic_segmentation_output.png" alt="Semantic Segmentation Output"/>
</div>

In instance segmentation, the goal is not to classify every pixel, but to predict a mask for **every instance of an object** in a given image. We will use [facebook/mask2former-swin-large-cityscapes-instance](https://huggingface.co/facebook/mask2former-swin-large-cityscapes-instance) for this.
Suggested change:
- In instance segmentation, the goal is not to classify every pixel, but to predict a mask for **every instance of an object** in a given image. We will use [facebook/mask2former-swin-large-cityscapes-instance](https://huggingface.co/facebook/mask2former-swin-large-cityscapes-instance) for this.
+ In instance segmentation, the goal is not to classify every pixel, but to predict a mask for **every instance of a class** in a given image. We will use [facebook/mask2former-swin-large-cityscapes-instance](https://huggingface.co/facebook/mask2former-swin-large-cityscapes-instance) for this.
I would add here that instance segmentation is very similar to object detection: you want to get a set of instances out of your image. The only difference is that object detection predicts a bounding box per instance, whereas instance segmentation predicts a binary mask per instance.
That's a really nice way to build intuition for instance segmentation!
Thanks! I addressed it
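Building on that intuition, here is a short sketch of the instance path, using the checkpoint named in the paragraph above and assuming `image` has been loaded as in the earlier semantic segmentation sketch:

```py
from transformers import pipeline

instance_segmentation = pipeline(
    "image-segmentation", "facebook/mask2former-swin-large-cityscapes-instance"
)
results = instance_segmentation(image)

# Like object detection, we get one entry per detected instance, but each entry
# carries a binary mask instead of a bounding box, so "car" may appear several times
print([(r["label"], r["score"]) for r in results])
```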
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/instance_segmentation_output.png" alt="Semantic Segmentation Output"/>
</div>

Panoptic segmentation combines semantic segmentation and instance segmentation, where every pixel is classified, and there are multiple masks for each instance of a class. We can use [facebook/mask2former-swin-large-cityscapes-panoptic](https://huggingface.co/facebook/mask2former-swin-large-cityscapes-panoptic) for this.
Panoptic segmentation combines semantic segmentation and instance segmentation, where every pixel is classified, and there are multiple masks for each instance of a class. We can use [facebook/mask2former-swin-large-cityscapes-panoptic](https://huggingface.co/facebook/mask2former-swin-large-cityscapes-panoptic) for this.
Panoptic segmentation technically assigns 2 labels per pixel: a semantic category and an instance ID.
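As a hedged sketch of how that shows up in practice, using the panoptic checkpoint mentioned in the paragraph above and assuming the same pipeline API and `image` as before:

```py
from collections import Counter
from transformers import pipeline

panoptic_segmentation = pipeline(
    "image-segmentation", "facebook/mask2former-swin-large-cityscapes-panoptic"
)
results = panoptic_segmentation(image)

# Each segment has a semantic category, and categories such as "car" repeat once per
# instance -- roughly the "semantic category + instance ID" pairing described above
print(Counter(r["label"] for r in results))
```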
Thanks for iterating on the guide! LGTM, only one minor nit: the notebook login and pip install section appears twice in the guide.
Before you begin, make sure you have all the necessary libraries installed:

```bash
pip install -q datasets transformers evaluate
```

We encourage you to log in to your Hugging Face account so you can upload and share your model with the community. When prompted, enter your token to log in:

```py
>>> from huggingface_hub import notebook_login

>>> notebook_login()
```
This paragraph is repeated later in the fine-tuning section. It's probably best to have this information only once. I would suggest leaving the library installation instructions here (as we need the libraries installed for the inference examples to work), but the notebook login makes more sense in the fine-tuning section.
Thanks a lot for letting me know, addressed this!
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
@MKhalusova can you merge if you approve? 👉 👈 🥹
Maybe rename file/URL to `image_segmentation.md`, for consistency with the contents. (Also in the yaml, of course) :)
Done!
We need green CI checks before we can merge. It looks like the
@MKhalusova it's my fault, I suggested renaming the file for consistency. I opened this PR to @merveenoyan's repo, which fixes the problem locally. It also fixes an issue with I also saw a
It's not a big deal, we can just go back to @merveenoyan's version before the rename if that's simpler.
Regarding redirects. For example,
@MKhalusova according to Mishig's response, we need to merge before it turns red, and then it will be green, so maybe you can make the call in this case.
There are two different things here.
Given the increased complexity and that @MKhalusova said we generally try to avoid renames, I'd suggest we remove the rename and keep the same filename it had before. Sorry for introducing noise!
Can this be merged by someone with write access?
I can merge it :)
Thanks for iterating!
This PR extends the semantic segmentation guide to cover two other segmentation types (except for the big fine-tuning part) and compares them. cc @NielsRogge as discussed