Skip to content

Latest commit

 

History

History
184 lines (151 loc) · 8.81 KB

README.md

File metadata and controls

184 lines (151 loc) · 8.81 KB

Problem description

In this repository, you can train models for the xView2 challenge to create an accurate and efficient model for building localization and damage assessment based on satellite imagery. For building localization, predicted pixel values must be either 0 (no building) or 1 (building), whereas for building damage classification: 1 (undamaged building), 2 (minor damaged building), 3 (major damaged building), 4 (destroyed building)

IMAGE LABEL PREDICTION

Methods

The following options can be used to train U-Net models:

  • U-Net encoders:
  • Loss functions:
  • U-Net variants for damage assessment:
    • Siamese - share weights for pre and post disaster images; two variants - with shared encoder only or encoder and decoder
    • Fused - use two U-Nets with fused blocks to aggregate context from pre and post disaster images; two variants - with fused encoder only or encoder and decoder
    • Parallel - use two U-Nets for pre and post images separately; two variants - with parallel encoder only or encoder and decoder
    • Concatenated - use 6-channel input i.e. concatenation of pre and post images
  • Deep Supervision
  • Attention
  • Pyramid Parsing Module
  • Atrous Spatial Pyramid Pooling Module
  • Test time augmentation
  • Supported optimizers: SGD, Adam, RAdam, Adabelief, Adabound, Adamp, Novograd
  • Supported learning rate scheduler: Noam

In the usage section, you can find the full list of available options whereas in examples you can find a few commands for launching training and evaluation.

Dataset

The dataset used in the contests is called xBD and contains 22,068 images each of 1024x1024 size with RGB colors (see the xBD paper for more details). The data is available for download from the xView2 challenge website (registration required).

This repository assumes the following data layout:

/data
 ├── train
 │      ├── images
 │      │      └── <image_id>.png
 │      │      └── ...
 │      └── targets
 │             └── <image_id>.png
 │             └── ...
 ├── test
 │      ├── images
 │      │      └── <image_id>.png
 │      │      └── ...
 │      └── targets
 │             └── <image_id>.png
 │             └── ...
 └── holdout
        ├── images
        │      └── <image_id>.png
        │      └── ...
        └── targets
               └── <image_id>.png
               └── ...

For example to convert json files within DATA_PATH/train directory, use:

python utils/convert2png.py --data DATA_PATH/train

Installation

The repository contains Dockerfile which handles all required dependencies. Here are the steps to prepare the environment:

  1. Clone repository: git clone https://github.com/michal2409/xview2 && cd xview2

  2. Build docker image: docker build -t xview2 .

  3. Run docker container:

docker run -it --rm --gpus all --shm-size=8g --ulimit memlock=-1 --ulimit stack=67108864 -v RESULTS_PATH:/results -v DATA_PATH:/data xview2 bash

where

  • DATA_PATH is the path to xBD directory with layout as described in the dataset section
  • RESULTS_PATH is the path to the directory for artifacts like checkpoints, log output or predictions

Usage

Here are the options for the main.py script:

usage: python main.py [--optional arguments] 

optional arguments:
  -h, --help            show this help message and exit
  --exec_mode {train,eval}
                        Execution mode of main script
  --data                Path to the data directory
  --results             Path to the results directory
  --gpus                Number of gpus to use
  --num_workers         Number of subprocesses to use for data loading
  --batch_size          Training batch size
  --val_batch_size      Evaluation batch size
  --precision {16,32}   Numerical precision
  --epochs              Max number of epochs      
  --patience            Early stopping patience
  --ckpt                Path to pretrained checkpoint
  --logname             Name of logging file
  --ckpt_pre            Path to pretrained checkpoint of localization model used to initialize network for damage assesment
  --type {pre,post}     Type of task to run; pre - localization, post - damage assesment
  --seed
  --optimizer {sgd,adam,adamw,radam,adabelief,adabound,adamp,novograd}
  --dmg_model {siamese,siameseEnc,fused,fusedEnc,parallel,parallelEnc,diff,cat}
                        U-Net variant for damage assessment task
  --encoder {resnest50,resnest101,resnest200,resnest269,resnet50,resnet101,resnet152}
                        U-Net encoder
  --loss_str            String used for creation of loss function, e.g focal+dice creates the loss function as sum of focal and dice.
                        Available functions: dice, focal, ce, ohem, mse, coral
  --use_scheduler       Enable Noam learning rate scheduler
  --warmup              Warmup epochs for Noam learning rate scheduler
  --init_lr             Initial learning rate for Noam scheduler
  --final_lr            Final learning rate for Noam scheduler
  --lr                  Learning rate, or a target learning rate for Noam scheduler
  --weight_decay        Weight decay (L2 penalty)
  --momentum            Momentum for SGD optimizer
  --dilation {1,2,4}    Dilation rate for a encoder, e.g dilation=2 uses dilation instead of stride in the last encoder block
  --tta                 Enable test time augmentation
  --ppm                 Use pyramid pooling module
  --aspp                Use atrous spatial pyramid pooling
  --no_skip             Disable skip connections in UNet
  --deep_supervision    Enable deep supervision
  --attention           Enable attention module at the decoder
  --autoaugment         Use imageNet autoaugment pipeline
  --interpolate         Interpolate feature map from encoder without a decoder
  --dec_interp          Use interpolation instead of transposed convolution in a decoder

Examples

To train the building localization task with the resnest200 encoder, cross entropy + dice loss function, deep supervision, attention and test time augmentation with 1 gpu and batch size 16 for training and 8 for evaluation, launch:

python main.py --type pre --encoder resnest200 --loss_str ce+dice --deep_supervision --attention --tta --gpus 1 --batch_size 16 --val_batch_size 8 --gpus 1

To train the building damage assessment task with the siamese version of U-Net, resnest200 encoder, focal + dice loss function, deep supervision, attention and test time augmentation with 8 gpus and batch size 16 for training and 8 GPUs for evaluation, launch:

python main.py --type post --dmg_model siamese --encoder resnest200 --loss_str focal+dice --attention --deep_supervision --tta --gpus 8 --batch_size 16 --val_batch_size 8 

To run inference with batch size 8 on the test set, launch:

python main.py --exec_mode eval --type {pre,post} --ckpt <path/to/checkpoint> --gpus 1 --val_batch_size 8

To post process the saved predictions during inference, launch:

python utils/post_process.py

To get the final score, launch:

python utils/xview2_metrics.py /results/predictions /results/targets /results/score.json && cat /results/score.json

References