This repository implements Semantic Instance Segmentation with a Discriminative Loss Function, with some enhancements:
- The reference paper does not predict a semantic segmentation mask; instead, it uses the ground-truth semantic segmentation mask. This code predicts a semantic segmentation mask, similar to Towards End-to-End Lane Detection: an Instance Segmentation Approach.
- The reference paper predicts the number of instances implicitly: it predicts embeddings for instances and obtains the number of instances as a result of clustering. Instead, this code predicts the number of instances as an output of the network.
- This code uses spectral clustering, whereas the reference paper uses "a fast variant of the mean-shift algorithm".
- The reference paper uses a segmentation network based on ResNet-38. Instead, this code uses ReSeg with skip connections, based on the first seven convolutional layers of VGG16, as its segmentation network.
At prediction time, the network takes an image and outputs a semantic segmentation mask, the number of instances, and embeddings for all pixels in the image. Foreground embeddings (those belonging to instances) are then selected using the semantic segmentation mask and clustered into "the number of instances" groups using spectral clustering.
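The clustering step can be illustrated with a minimal sketch using scikit-learn's spectral clustering. Names and shapes below are illustrative assumptions, not the repository's actual API (the real prediction code lives in `lib/prediction.py`):

```python
# A minimal sketch of the clustering step, assuming the network already
# produced `embeddings` (H, W, E), a binary `semantic_mask` (H, W) and
# `n_instances`; names and shapes are illustrative, not the repo's API.
import numpy as np
from sklearn.cluster import SpectralClustering

def cluster_instances(embeddings, semantic_mask, n_instances):
    """Assign an instance id to every foreground pixel."""
    fg = semantic_mask.astype(bool)
    fg_embeddings = embeddings[fg]          # (N_fg, E) foreground embeddings
    labels = SpectralClustering(
        n_clusters=n_instances,
        affinity="nearest_neighbors").fit_predict(fg_embeddings)
    instance_mask = np.zeros(semantic_mask.shape, dtype=np.int32)
    instance_mask[fg] = labels + 1          # 0 is reserved for background
    return instance_mask
```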
- Clone this repository:

```
git clone --recursive https://github.com/Wizaron/instance-segmentation-pytorch.git
```
- Download and install Anaconda or Miniconda
- Create a conda environment:

```
conda env create -f instance-segmentation-pytorch/code/conda_environment.yml
```
- Download the CVPPP dataset and extract the downloaded zip file (`CVPPP2017_LSC_training.zip`) to `instance-segmentation-pytorch/data/raw/CVPPP/`.
- This work uses the A1 subset of the dataset.
- Download the Cityscapes dataset and extract the downloaded zip files (`gtFine_trainvaltest.zip` and `leftImg8bit_trainvaltest.zip`) to `instance-segmentation-pytorch/data/raw/cityscapes/`.
- code: Code for training and evaluation.
  - lib
    - lib/cvppp_arch.py: Defines the network architecture for the CVPPP dataset.
    - lib/cityscapes_arch.py: Defines the network architecture for the Cityscapes dataset.
    - lib/model.py: Defines the model (optimization, criterion, fit, predict, test, etc.).
    - lib/dataset.py: Data loading, augmentation and minibatching procedures.
    - lib/preprocess.py, lib/utils: Augmentation methods.
    - lib/prediction.py: Prediction module.
    - lib/losses/dice.py: Dice loss for foreground semantic segmentation.
    - lib/losses/discriminative.py: Discriminative loss for instance segmentation (see the sketch after this directory listing).
  - settings
    - settings/CVPPP/data_settings.py, settings/cityscapes/data_settings.py: Defines the data settings.
    - settings/CVPPP/model_settings.py, settings/cityscapes/model_settings.py: Defines the model settings (hyper-parameters).
    - settings/CVPPP/training_settings.py, settings/cityscapes/training_settings.py: Defines the training settings (optimization method, weight decay, augmentation, etc.).
  - train.py: Training script.
  - pred.py: Prediction script for a single image.
  - pred_list.py: Prediction script for a list of images.
  - evaluate.py: Evaluation script. Calculates SBD (symmetric best dice), |DiC| (absolute difference in count) and Foreground Dice (Dice score for semantic segmentation) as defined in the paper.
- data: Stores data and scripts to prepare the dataset for training and evaluation.
  - metadata/CVPPP, metadata/cityscapes: Stores metadata such as training, validation and test splits, image shapes, etc.
  - processed/CVPPP, processed/cityscapes: Stores the processed form of the data.
  - raw/CVPPP, raw/cityscapes: Stores the raw form of the data.
  - scripts: Stores scripts to prepare the dataset.
    - scripts/CVPPP: For the CVPPP dataset.
      - scripts/CVPPP/1-create_annotations.py: Saves annotations as numpy arrays to `processed/CVPPP/semantic-annotations/` and `processed/CVPPP/instance-annotations`.
      - scripts/CVPPP/1-remove_alpha.sh: Removes alpha channels from images (requires `imagemagick` to be installed).
      - scripts/CVPPP/2-get_image_means-stds.py: Calculates and prints channel-wise means and standard deviations of the training subset.
      - scripts/CVPPP/2-get_image_shapes.py: Saves image shapes to `metadata/CVPPP/image_shapes.txt`.
      - scripts/CVPPP/2-get_number_of_instances.py: Saves the number of instances in each image to `metadata/CVPPP/number_of_instances.txt`.
      - scripts/CVPPP/2-get_image_paths.py: Saves image paths to `metadata/CVPPP/training_image_paths.txt` and `metadata/CVPPP/validation_image_paths.txt`.
      - scripts/CVPPP/3-create_dataset.py: Creates an lmdb dataset at `processed/CVPPP/lmdb/`.
    - scripts/cityscapes: For the Cityscapes dataset.
      - scripts/cityscapes/1-create-annotations.py: Saves annotations as numpy arrays to `processed/cityscapes/semantic-annotations` and `processed/cityscapes/instance-annotations`, saves the number of instances in each image to `metadata/cityscapes/number_of_instances.txt`, and saves subset lists to `metadata/cityscapes/training.lst` and `metadata/cityscapes/validation.lst`.
      - scripts/cityscapes/2-get_image_paths.py: Saves image paths to `metadata/cityscapes/training_image_paths.txt` and `metadata/cityscapes/validation_image_paths.txt`.
      - scripts/cityscapes/3-create_dataset.py: Creates an lmdb dataset at `processed/cityscapes/lmdb/`.
- models/CVPPP, models/cityscapes: Stores checkpoints of the trained models.
- outputs/CVPPP, outputs/cityscapes: Stores predictions of the trained models.
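As referenced in the listing above, `lib/losses/discriminative.py` implements the discriminative loss of the reference paper: a variance term pulls pixel embeddings toward their instance mean, a distance term pushes different instance means apart, and a regularization term keeps the means near the origin. Below is a minimal PyTorch sketch of that formulation; the margins `delta_v`, `delta_d` and the term weights are illustrative hyper-parameters, and this is not the repository's exact code:

```python
# Sketch of the discriminative loss (variance + distance + regularization).
import torch

def discriminative_loss(emb, labels, delta_v=0.5, delta_d=1.5,
                        alpha=1.0, beta=1.0, gamma=0.001):
    """emb: (N, E) pixel embeddings; labels: (N,) instance ids (> 0)."""
    ids = labels.unique()
    means = torch.stack([emb[labels == i].mean(dim=0) for i in ids])
    # Variance term: pull each embedding within delta_v of its instance mean.
    l_var = torch.stack([
        (torch.clamp((emb[labels == i] - means[k]).norm(dim=1) - delta_v,
                     min=0) ** 2).mean()
        for k, i in enumerate(ids)]).mean()
    # Distance term: push instance means at least 2 * delta_d apart.
    l_dist = emb.new_zeros(())
    if len(ids) > 1:
        d = torch.cdist(means, means)
        off_diag = ~torch.eye(len(ids), dtype=torch.bool, device=emb.device)
        l_dist = (torch.clamp(2 * delta_d - d[off_diag], min=0) ** 2).mean()
    # Regularization term: keep instance means close to the origin.
    l_reg = means.norm(dim=1).mean()
    return alpha * l_var + beta * l_dist + gamma * l_reg
```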
Data should be prepared prior to training and evaluation.
- Activate the previously created conda environment: `source activate ins-seg-pytorch`
- Place the extracted dataset in `instance-segmentation-pytorch/data/raw/CVPPP/`. Hence, the raw dataset should be found at `instance-segmentation-pytorch/data/raw/CVPPP/CVPPP2017_LSC_training/`.
- In order to prepare the data, go to `instance-segmentation-pytorch/data/CVPPP/scripts` and run:

```
python 1-create_annotations.py
sh 1-remove_alpha.sh
python 2-get_image_paths.py
python 3-create_dataset.py
```
- Place the extracted datasets in `instance-segmentation-pytorch/data/raw/cityscapes/`. Hence, the raw dataset should be found at `instance-segmentation-pytorch/data/raw/cityscapes/gtFine/` and `instance-segmentation-pytorch/data/raw/cityscapes/leftImg8bit/`.
- In order to prepare the data, go to `instance-segmentation-pytorch/data/cityscapes/scripts` and run:

```
python 1-create_annotations.py
python 2-get_image_paths.py
python 3-create_dataset.py
```
Start a Visdom server in a `screen` or `tmux` session.

- Activate the previously created conda environment: `source activate ins-seg-pytorch`
- Start the visdom server: `python -m visdom.server`
- Access the visdom server at `http://localhost:8097`
- Activate the previously created conda environment: `source activate ins-seg-pytorch`
- Go to `instance-segmentation-pytorch/code/` and run `train.py`.
```
usage: train.py [-h] [--model MODEL] [--usegpu] [--nepochs NEPOCHS]
                [--batchsize BATCHSIZE] [--debug] [--nworkers NWORKERS]
                --dataset DATASET

optional arguments:
  -h, --help            show this help message and exit
  --model MODEL         Filepath of trained model (to continue training)
                        [Default: '']
  --usegpu              Enables cuda to train on gpu [Default: False]
  --nepochs NEPOCHS     Number of epochs to train for [Default: 600]
  --batchsize BATCHSIZE
                        Batch size [Default: 2]
  --debug               Activates debug mode [Default: False]
  --nworkers NWORKERS   Number of workers for data loading (0 to do it using
                        main process) [Default: 2]
  --dataset DATASET     Name of the dataset: "cityscapes" or "CVPPP"
```
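For example, to train on CVPPP with a GPU (an illustrative invocation using only the flags documented above):

```
python train.py --usegpu --dataset CVPPP
```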
Debug mode plots pixel embeddings to visdom by reducing them to two dimensions with t-SNE; this slows down training.
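What debug mode does can be sketched as follows; function and argument names are illustrative, not the repository's internals, and a running visdom server plus scikit-learn are assumed:

```python
# Rough sketch of the debug-mode visualization: project pixel embeddings
# to 2-D with t-SNE and scatter-plot them in visdom, colored by instance.
import visdom
from sklearn.manifold import TSNE

vis = visdom.Visdom()  # assumes the server started above at localhost:8097

def plot_embeddings(embeddings, instance_labels):
    """embeddings: (N, E) array; instance_labels: (N,) 1-based int labels."""
    projected = TSNE(n_components=2).fit_transform(embeddings)
    vis.scatter(projected, instance_labels,
                opts={"title": "Pixel embeddings (t-SNE)"})
```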
As training continues, models will be saved to `instance-segmentation-pytorch/models/CVPPP` or `instance-segmentation-pytorch/models/cityscapes`, according to the dataset argument.
After training is complete, we can make predictions.
- Activate the previously created conda environment: `source activate ins-seg-pytorch`
- Go to `instance-segmentation-pytorch/code/`.
- Run `pred_list.py`.
```
usage: pred_list.py [-h] --lst LST --model MODEL [--usegpu]
                    [--n_workers N_WORKERS] --dataset DATASET

optional arguments:
  -h, --help            show this help message and exit
  --lst LST             Text file that contains image paths
  --model MODEL         Path of the model
  --usegpu              Enables cuda to predict on gpu
  --n_workers N_WORKERS
                        Number of workers for clustering
  --dataset DATASET     Name of the dataset: "cityscapes" or "CVPPP"
```
For example:

```
python pred_list.py --lst ../data/metadata/CVPPP/validation_image_paths.txt --model ../models/CVPPP/2018-3-4_16-15_jcmaxwell_29-937494/model_155_0.123682662845.pth --usegpu --n_workers 4 --dataset CVPPP
```
- After prediction is complete, we can run `evaluate.py`. It prints the metrics to stdout.
```
usage: evaluate.py [-h] --pred_dir PRED_DIR --dataset DATASET

optional arguments:
  -h, --help           show this help message and exit
  --pred_dir PRED_DIR  Prediction directory
  --dataset DATASET    Name of the dataset: "cityscapes" or "CVPPP"
```
For example:

```
python evaluate.py --pred_dir ../outputs/CVPPP/2018-3-4_16-15_jcmaxwell_29-937494-model_155_0.123682662845/validation/ --dataset CVPPP
```
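For reference, the three reported metrics can be sketched as follows. This is a simplified reimplementation of their standard definitions from the CVPPP leaf-segmentation literature (instances given as lists of boolean masks), not the code of `evaluate.py`:

```python
# Sketch of the evaluation metrics: Dice, SBD and |DiC| for one image.
import numpy as np

def dice(a, b):
    """Dice score between two boolean masks."""
    inter = np.logical_and(a, b).sum()
    denom = a.sum() + b.sum()
    return 2.0 * inter / denom if denom > 0 else 1.0

def best_dice(instances_a, instances_b):
    """Average, over instances_a, of the best dice against instances_b."""
    return np.mean([max(dice(a, b) for b in instances_b)
                    for a in instances_a])

def symmetric_best_dice(gt_instances, pred_instances):
    """SBD: the minimum of the two directed best-dice scores."""
    return min(best_dice(gt_instances, pred_instances),
               best_dice(pred_instances, gt_instances))

def abs_dic(gt_instances, pred_instances):
    """|DiC|: absolute difference in instance counts."""
    return abs(len(gt_instances) - len(pred_instances))
```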
After training is complete, we can also use `pred.py` to make a prediction for a single image.
- Activate the previously created conda environment: `source activate ins-seg-pytorch`
- Go to `instance-segmentation-pytorch/code/`.
- Run `pred.py`.
```
usage: pred.py [-h] --image IMAGE --model MODEL [--usegpu] --output OUTPUT
               [--n_workers N_WORKERS] --dataset DATASET

optional arguments:
  -h, --help            show this help message and exit
  --image IMAGE         Path of the image
  --model MODEL         Path of the model
  --usegpu              Enables cuda to predict on gpu
  --output OUTPUT       Path of the output directory
  --n_workers N_WORKERS
                        Number of workers for clustering
  --dataset DATASET     Name of the dataset: "cityscapes" or "CVPPP"
```
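For example (the image and output paths are illustrative; the checkpoint reuses the one from the `pred_list.py` example above):

```
python pred.py --image /path/to/image.png --model ../models/CVPPP/2018-3-4_16-15_jcmaxwell_29-937494/model_155_0.123682662845.pth --output ../outputs/CVPPP/single-image/ --usegpu --dataset CVPPP
```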
| SBD | \|DiC\| | Foreground Dice |
| --- | --- | --- |
| 87.4 | 0.6 | 96.8 |
- PEP-8 style coding.
- Support batch predictions.
- Train a model.
- Update evaluation script to support cityscapes.
- Improve data creation scripts.
- Add results.