Zhaoyun Yin, Pichao Wang, Fan Wang, Xianzhe Xu, Hanling Zhang, Hao Li, Rong Jin
[Preprint]
News: TransFGU has been accepted for an oral presentation at ECCV 2022!
Create the environment
# create conda env
conda create -n TransFGU python=3.8
# activate conda env
conda activate TransFGU
# install pytorch
conda install pytorch=1.8 torchvision cudatoolkit=10.1
# install other dependencies
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.8.0/index.html
pip install -r requirements.txt
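As a quick sanity check after installation, a small stdlib-only snippet like the one below (not part of the repo; the package list just mirrors the core dependencies installed above) can confirm that the key packages resolve inside the activated environment:

```python
import importlib.util

def find_missing(packages):
    """Return the packages that cannot be found in the current environment."""
    return [name for name in packages if importlib.util.find_spec(name) is None]

# Core dependencies installed by the commands above.
required = ["torch", "torchvision", "mmcv"]
missing = find_missing(required)
print("missing packages:", ", ".join(missing) if missing else "none")
```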
- MS-COCO Dataset: download the train set, val set, annotations, and the JSON files, and place the extracted files into root/data/MSCOCO.
- PascalVOC Dataset: download the training/validation data and place the extracted files into root/data/PascalVOC.
- Cityscapes Dataset: download leftImg8bit_trainvaltest.zip and gtFine_trainvaltest.zip, and place the extracted files into root/data/Cityscapes.
- LIP Dataset: download TrainVal_images.zip and TrainVal_parsing_annotations.zip, and place the extracted files into root/data/LIP.
The structure of the dataset folders should be as follows:
data/
├── MSCOCO/
│   ├── images/
│   │   ├── train2017/
│   │   └── val2017/
│   └── annotations/
│       ├── train2017/
│       ├── val2017/
│       ├── instances_train2017.json
│       └── instances_val2017.json
├── Cityscapes/
│   ├── leftImg8bit/
│   │   ├── train/
│   │   │   ├── aachen
│   │   │   └── ...
│   │   └── val/
│   │       ├── frankfurt
│   │       └── ...
│   └── gtFine/
│       ├── train/
│       │   ├── aachen
│       │   └── ...
│       └── val/
│           ├── frankfurt
│           └── ...
├── PascalVOC/
│   ├── JPEGImages/
│   ├── SegmentationClass/
│   └── ImageSets/
│       └── Segmentation/
│           ├── train.txt
│           └── val.txt
└── LIP/
    ├── train_images/
    ├── train_segmentations/
    ├── val_images/
    ├── val_segmentations/
    ├── train_id.txt
    └── val_id.txt
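Before training, it can save time to verify that the layout matches the tree above. The following is a minimal, hypothetical pre-flight check (not part of the official codebase; the EXPECTED mapping covers only the sub-folders shown in the tree):

```python
from pathlib import Path

# Expected sub-folders under root/data, per the tree above.
EXPECTED = {
    "MSCOCO": ["images/train2017", "images/val2017", "annotations"],
    "Cityscapes": ["leftImg8bit/train", "leftImg8bit/val",
                   "gtFine/train", "gtFine/val"],
    "PascalVOC": ["JPEGImages", "SegmentationClass", "ImageSets/Segmentation"],
    "LIP": ["train_images", "train_segmentations",
            "val_images", "val_segmentations"],
}

def check_datasets(data_root):
    """Return the expected dataset sub-folders that are missing."""
    root = Path(data_root)
    return [f"{ds}/{sub}" for ds, subs in EXPECTED.items()
            for sub in subs if not (root / ds / sub).is_dir()]

if __name__ == "__main__":
    problems = check_datasets("data")
    print("missing:", problems if problems else "none")
```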
- Please download the pretrained DINO model (DeiT-Small, patch size 8), then place it into
root/weight/dino/
- Download the trained models from Google Drive or Baidu Netdisk (code: 1118), then place them into
root/weight/trained/
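If the weight folders do not exist yet, they can be created up front (folder names taken from the paths above; the exact checkpoint filenames depend on the files you download):

```shell
# Create the expected weight folders under the repo root.
mkdir -p weight/dino weight/trained
# After downloading, the layout should look roughly like:
#   weight/dino/<dino deit-small 8x8 checkpoint>.pth
#   weight/trained/<per-dataset checkpoints>.pth
ls -R weight
```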
| Name | mIoU (%) | Pixel Accuracy (%) | Model |
|---|---|---|---|
| COCOStuff-27 | 16.19 | 44.52 | Google Drive |
| COCOStuff-171 | 11.93 | 34.32 | Google Drive |
| COCO-80 | 12.69 | 64.31 | Google Drive |
| Cityscapes | 16.83 | 77.92 | Google Drive |
| Pascal-VOC | 37.15 | 83.59 | Google Drive |
| LIP-5 | 25.16 | 65.76 | Google Drive |
| LIP-16 | 15.49 | 60.08 | Google Drive |
| LIP-19 | 12.24 | 42.52 | Google Drive |
To train and evaluate our method on the different datasets at the desired granularity level, please follow the instructions here.
If you find our work useful in your research, please consider citing:
@inproceedings{yin2022transfgu,
title = {TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation},
author = {Yin, Zhaoyun and Wang, Pichao and Wang, Fan and Xu, Xianzhe and Zhang, Hanling and Li, Hao and Jin, Rong},
booktitle = {European Conference on Computer Vision},
pages = {73--89},
year = {2022},
organization = {Springer}
}
The code is released under the MIT license.
Copyright (C) 2010-2021 Alibaba Group Holding Limited.