# xview2 1st place solution
1st place solution for the "xView2: Assess Building Damage" challenge: https://www.xview2.org

# Introduction to Solution

The solution was developed with the following environment:
- Python 3 (based on an Anaconda installation)
- PyTorch 1.1.0+ and torchvision 0.3.0+
- Nvidia apex: https://github.com/NVIDIA/apex
- opencv-python: https://github.com/skvark/opencv-python
- imgaug: https://github.com/aleju/imgaug

Hardware:
The current training batch size requires at least 2 GPUs with 12 GB of memory each (the models were initially trained on Titan V GPUs). For a single GPU, the batch size and learning rate have to be reduced and tuned in practice.

The "train", "tier3" and "test" folders from the competition dataset should be placed in the current folder.

Use the "train.sh" script to train all the models (~7 days on 2 GPUs).
To generate the predictions/submission file, use "predict.sh".
The "evalution-docker-container" folder contains the code for the Docker container used for the final evaluation on the hold-out set (CPU version).

# Trained models
Trained model weights are available here: https://vdurnov.s3.amazonaws.com/xview2_1st_weights.zip

(Please note: the code was developed during the competition and designed to run separate experiments on different models, so it is published as is, without the additional refactoring that full training reproducibility would require.)

# Data Cleaning Techniques

The dataset for this competition is well prepared, and I have not found any problems with it.
Training masks are generated from the JSON label files, with the "un-classified" type treated as "no-damage" (create_masks.py). "masks" folders will be created inside the "train" and "tier3" folders.

The problem of different nadirs and small shifts between "pre" and "post" images is solved at the model level:
- First, localization models are trained using only "pre" images, so that the additional noise from "post" images is ignored. Simple UNet-like Encoder-Decoder segmentation architectures are used here.
- Then, the already pretrained localization models are converted into Siamese classification networks: the "pre" and "post" images share the weights of the localization model, and the features from the last decoder layer are concatenated to predict the damage level for each pixel. This lets the network look at "pre" and "post" separately but in the same way, and helps it ignore the shifts and differing nadirs (see the sketch after this list).
- Morphological dilation with a 5x5 kernel is applied to the classification masks. Dilated masks make the predictions more "bold", which improves accuracy on borders and also helps with shifts and nadirs.
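
As an illustration of this conversion (a minimal sketch, not the repository's exact code; `loc_model` semantics, `dec_channels` and a `forward` that returns last-decoder features are assumptions):

```python
import torch
import torch.nn as nn


class SiameseDamageClassifier(nn.Module):
    """Sketch: one pretrained localization network is shared between the
    "pre" and "post" images; per-pixel damage is classified from the
    concatenated last-decoder features."""

    def __init__(self, loc_model, dec_channels, n_damage_classes=5):
        super().__init__()
        self.loc_model = loc_model  # pretrained UNet-like encoder-decoder, weights shared
        self.head = nn.Conv2d(dec_channels * 2, n_damage_classes, kernel_size=1)

    def forward(self, pre_img, post_img):
        f_pre = self.loc_model(pre_img)    # last-decoder features for the "pre" image
        f_post = self.loc_model(post_img)  # the same weights applied to the "post" image
        return self.head(torch.cat([f_pre, f_post], dim=1))
```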

# Data Processing Techniques

Models are trained on different crop sizes, from (448, 448) for the heaviest encoder to (736, 736) for the lightest one.
Augmentations used for training:
- Flip (often)
- Rotation (often)
- Scale (often)
- Color shifts (rare)
- CLAHE / Blur / Noise (rare)
- Saturation / Brightness / Contrast (rare)
- ElasticTransformation (rare)

Inference runs on the full image size (1024, 1024) with 4 simple test-time augmentations (original, flip left-right, flip up-down, rotation by 180 degrees), sketched below.
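
A sketch of this test-time augmentation (assuming a `model` that returns per-pixel logits; illustrative, not the exact inference code):

```python
import torch


def predict_tta(model, img):
    """Average predictions over original, left-right flip, up-down flip and
    180-degree rotation, inverting each transform before averaging."""
    preds = []
    for dims in [None, [3], [2], [2, 3]]:  # NCHW: [3]=left-right, [2]=up-down, [2,3]=rot180
        x = torch.flip(img, dims) if dims else img
        p = model(x)
        p = torch.flip(p, dims) if dims else p  # undo the flip on the prediction
        preds.append(torch.sigmoid(p))
    return torch.stack(preds).mean(dim=0)
```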

# Details on Modeling Tools and Techniques

All models are trained with a random 90%/10% train/validation split with fixed seeds (3 folds). Only the checkpoints from the epochs with the best validation score are used.

For the localization models, 4 different pretrained encoders are used:
from torchvision.models:
- ResNet34
from https://github.com/Cadene/pretrained-models.pytorch:
- se_resnext50_32x4d
- senet154
- dpn92

Localization models are trained on "pre" images; "post" images are used only in very rare cases as additional augmentation.

Localization training parameters:
Loss: Dice + Focal (a sketch of such a combined loss is shown below)
Validation metric: Dice
Optimizer: AdamW
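
A sketch of a combined Dice + Focal loss for binary localization masks (the weighting coefficients below are placeholders, not the values used in training):

```python
import torch
import torch.nn.functional as F


def dice_focal_loss(logits, targets, dice_w=1.0, focal_w=1.0, gamma=2.0, eps=1e-6):
    probs = torch.sigmoid(logits)
    # Soft Dice loss
    inter = (probs * targets).sum()
    dice = 1.0 - (2.0 * inter + eps) / (probs.sum() + targets.sum() + eps)
    # Binary focal loss: down-weight easy, confidently classified pixels
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction='none')
    pt = torch.exp(-bce)  # model's probability for the true class
    focal = ((1.0 - pt) ** gamma * bce).mean()
    return dice_w * dice + focal_w * focal
```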

Classification models are initialized with the weights of the corresponding localization model and fold. They are Siamese networks in which the whole localization model is shared between the "pre" and "post" input images, and the features from the last decoder layer are combined for classification. The pretrained weights are not frozen.
Using pretrained weights from the localization models allowed the classification models to train much faster and reach better accuracy. The features from the "pre" and "post" images are connected only at the very end of the decoder, in the bottleneck part; this helps avoid overfitting and yields a better-generalizing model.

Classification training parameters:
Loss: Dice + Focal + CrossEntropyLoss, with a larger coefficient for CrossEntropyLoss and for damage classes 2-4.
Validation metric: the competition metric
Optimizer: AdamW
Sampling: images containing classes 2-4 are sampled 2 times more often to give these classes more attention (see the sketch below).
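
A hedged sketch of the class weighting and oversampling (the weight values are placeholders; `contains_damage` stands in for a precomputed per-image flag):

```python
import torch
from torch.utils.data import WeightedRandomSampler

# Larger CE coefficients for damage classes 2-4 (placeholder values)
class_weights = torch.tensor([0.5, 1.0, 2.0, 2.0, 2.0])
ce_loss = torch.nn.CrossEntropyLoss(weight=class_weights)

# Draw images whose masks contain classes 2-4 twice as often
contains_damage = [True, False, True, False]  # toy example; one flag per image
weights = [2.0 if d else 1.0 for d in contains_damage]
sampler = WeightedRandomSampler(weights, num_samples=len(weights), replacement=True)
# pass `sampler=sampler` to the training DataLoader
```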

Almost all checkpoints are finally fine-tuned on the full training data for a few epochs, using a low learning rate and fewer augmentations.

Predictions are averaged with equal coefficients, separately for the localization and classification models.

Different localization thresholds are used for damaged and undamaged classes (lower for damaged).

# Conclusion and Acknowledgments

Thanks to the xView2 team for creating and releasing this amazing dataset and for the opportunity to invent a solution that can help respond to global natural disasters faster. I really hope it will be useful and that the idea will be improved further.
# References
- Competition and Dataset: https://www.xview2.org
- UNet: https://arxiv.org/pdf/1505.04597.pdf
- Pretrained models for PyTorch: https://github.com/Cadene/pretrained-models.pytorch
- My 1st place solution from the "SpaceNet 4: Off-Nadir Building Footprint Detection Challenge" (some ideas came from there): https://github.com/SpaceNetChallenge/SpaceNet_Off_Nadir_Solutions/tree/master/cannab

AdamW optimizer implementation (used for all models above):
```python
# Based on https://github.com/pytorch/pytorch/pull/3740
import torch
import math


class AdamW(torch.optim.Optimizer):
    """Implements the AdamW algorithm.

    It has been proposed in `Fixing Weight Decay Regularization in Adam`_.

    Arguments:
        params (iterable): iterable of parameters to optimize or dicts defining
            parameter groups
        lr (float, optional): learning rate (default: 1e-3)
        betas (Tuple[float, float], optional): coefficients used for computing
            running averages of the gradient and its square (default: (0.9, 0.999))
        eps (float, optional): term added to the denominator to improve
            numerical stability (default: 1e-8)
        weight_decay (float, optional): weight decay coefficient, applied
            decoupled from the gradient as in the AdamW paper (default: 0)

    .. _Fixing Weight Decay Regularization in Adam:
        https://arxiv.org/abs/1711.05101
    """

    def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8,
                 weight_decay=0):
        defaults = dict(lr=lr, betas=betas, eps=eps,
                        weight_decay=weight_decay)
        super(AdamW, self).__init__(params, defaults)

    def step(self, closure=None):
        """Performs a single optimization step.

        Arguments:
            closure (callable, optional): A closure that reevaluates the model
                and returns the loss.
        """
        loss = None
        if closure is not None:
            loss = closure()

        for group in self.param_groups:
            for p in group['params']:
                if p.grad is None:
                    continue
                grad = p.grad.data
                if grad.is_sparse:
                    raise RuntimeError('AdamW does not support sparse gradients, please consider SparseAdam instead')

                state = self.state[p]

                # State initialization
                if len(state) == 0:
                    state['step'] = 0
                    # Exponential moving average of gradient values
                    state['exp_avg'] = torch.zeros_like(p.data)
                    # Exponential moving average of squared gradient values
                    state['exp_avg_sq'] = torch.zeros_like(p.data)

                exp_avg, exp_avg_sq = state['exp_avg'], state['exp_avg_sq']
                beta1, beta2 = group['betas']

                state['step'] += 1

                # According to the paper, the decay penalty should come after the
                # bias correction, so it is NOT added to the gradient here:
                # if group['weight_decay'] != 0:
                #     grad = grad.add(group['weight_decay'], p.data)

                # Decay the first and second moment running average coefficient
                exp_avg.mul_(beta1).add_(1 - beta1, grad)
                exp_avg_sq.mul_(beta2).addcmul_(1 - beta2, grad, grad)

                denom = exp_avg_sq.sqrt().add_(group['eps'])

                bias_correction1 = 1 - beta1 ** state['step']
                bias_correction2 = 1 - beta2 ** state['step']
                step_size = group['lr'] * math.sqrt(bias_correction2) / bias_correction1

                # w = w - wd * lr * w
                if group['weight_decay'] != 0:
                    p.data.add_(-group['weight_decay'] * group['lr'], p.data)

                # w = w - lr * w.grad
                p.data.addcdiv_(-step_size, exp_avg, denom)

                # Net effect: w = w - wd * lr * w - lr * w.grad
                # See http://www.fast.ai/2018/07/02/adam-weight-decay/

        return loss
```
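
Typical usage of this optimizer (illustrative values; note that `weight_decay` is applied directly to the weights, decoupled from the gradient):

```python
import torch.nn as nn

model = nn.Conv2d(3, 1, kernel_size=3)  # stand-in for the real network
optimizer = AdamW(model.parameters(), lr=1e-4, weight_decay=1e-2)
```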

create_masks.py (generates the training masks from the label JSON files, as described in the README):
```python
import os
os.environ["MKL_NUM_THREADS"] = "1"
os.environ["NUMEXPR_NUM_THREADS"] = "1"
os.environ["OMP_NUM_THREADS"] = "1"

import numpy as np
np.random.seed(1)
import random
random.seed(1)
import cv2
import timeit
from os import path, makedirs, listdir
import sys
sys.setrecursionlimit(10000)
from multiprocessing import Pool

from shapely.wkt import loads

import json

masks_dir = 'masks'

train_dirs = ['train', 'tier3']


def mask_for_polygon(poly, im_size=(1024, 1024)):
    # Rasterize a shapely polygon into a binary mask: fill the exterior
    # ring with 1 and punch out the interior rings (holes) with 0
    img_mask = np.zeros(im_size, np.uint8)
    int_coords = lambda x: np.array(x).round().astype(np.int32)
    exteriors = [int_coords(poly.exterior.coords)]
    interiors = [int_coords(pi.coords) for pi in poly.interiors]
    cv2.fillPoly(img_mask, exteriors, 1)
    cv2.fillPoly(img_mask, interiors, 0)
    return img_mask


damage_dict = {
    "no-damage": 1,
    "minor-damage": 2,
    "major-damage": 3,
    "destroyed": 4,
    "un-classified": 1  # "un-classified" is treated as "no-damage" (see README)
}


def process_image(json_file):
    js1 = json.load(open(json_file))
    js2 = json.load(open(json_file.replace('_pre_disaster', '_post_disaster')))

    msk = np.zeros((1024, 1024), dtype='uint8')
    msk_damage = np.zeros((1024, 1024), dtype='uint8')

    # Localization mask: 255 for every building footprint in the "pre" labels
    for feat in js1['features']['xy']:
        poly = loads(feat['wkt'])
        _msk = mask_for_polygon(poly)
        msk[_msk > 0] = 255

    # Damage mask: per-pixel class 1-4 from the "post" labels
    for feat in js2['features']['xy']:
        poly = loads(feat['wkt'])
        subtype = feat['properties']['subtype']
        _msk = mask_for_polygon(poly)
        msk_damage[_msk > 0] = damage_dict[subtype]

    cv2.imwrite(json_file.replace('/labels/', '/masks/').replace('_pre_disaster.json', '_pre_disaster.png'), msk, [cv2.IMWRITE_PNG_COMPRESSION, 9])
    cv2.imwrite(json_file.replace('/labels/', '/masks/').replace('_pre_disaster.json', '_post_disaster.png'), msk_damage, [cv2.IMWRITE_PNG_COMPRESSION, 9])


if __name__ == '__main__':
    t0 = timeit.default_timer()

    all_files = []
    for d in train_dirs:
        makedirs(path.join(d, masks_dir), exist_ok=True)
        for f in sorted(listdir(path.join(d, 'images'))):
            if '_pre_disaster.png' in f:
                all_files.append(path.join(d, 'labels', f.replace('_pre_disaster.png', '_pre_disaster.json')))

    with Pool() as pool:
        _ = pool.map(process_image, all_files)

    elapsed = timeit.default_timer() - t0
    print('Time: {:.3f} min'.format(elapsed / 60))
```
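
A quick sanity check on the generated masks (the file name below is illustrative):

```python
import cv2
import numpy as np

msk = cv2.imread('train/masks/some_event_00000000_post_disaster.png', cv2.IMREAD_UNCHANGED)
print(np.unique(msk))  # expect a subset of {0, 1, 2, 3, 4}: background plus damage classes 1-4
```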

Submission generation script (averages the per-model predictions and writes the final localization and damage masks):
```python
import os

from os import path, makedirs, listdir
import sys
sys.setrecursionlimit(10000)
from multiprocessing import Pool
import numpy as np
np.random.seed(1)
import random
random.seed(1)

from tqdm import tqdm
import timeit
import cv2

from skimage.morphology import square, dilation

cv2.setNumThreads(0)
cv2.ocl.setUseOpenCL(False)

test_dir = 'test/images'
# 4 encoders x 3 folds of classification predictions, ensembled with equal weights
pred_folders = ['dpn92cls_cce_0_tuned', 'dpn92cls_cce_1_tuned', 'dpn92cls_cce_2_tuned'] + ['res34cls2_0_tuned', 'res34cls2_1_tuned', 'res34cls2_2_tuned'] + ['res50cls_cce_0_tuned', 'res50cls_cce_1_tuned', 'res50cls_cce_2_tuned'] + ['se154cls_0_tuned', 'se154cls_1_tuned', 'se154cls_2_tuned']
pred_coefs = [1.0] * 12
loc_folders = ['pred50_loc_tuned', 'pred92_loc_tuned', 'pred34_loc', 'pred154_loc']
loc_coefs = [1.0] * 4

sub_folder = 'submission'

# Localization thresholds: a higher one for undamaged pixels and lower ones
# for pixels predicted as damaged (see the README)
_thr = [0.38, 0.13, 0.14]


def process_image(f):
    # Average the damage-classification predictions over all models/folds;
    # each prediction is stored as two 3-channel PNGs, merged into 5 channels
    preds = []
    for _i, d in enumerate(pred_folders):
        msk1 = cv2.imread(path.join(d, f), cv2.IMREAD_UNCHANGED)
        msk2 = cv2.imread(path.join(d, f.replace('_part1', '_part2')), cv2.IMREAD_UNCHANGED)
        msk = np.concatenate([msk1, msk2[..., 1:]], axis=2)
        preds.append(msk * pred_coefs[_i])
    preds = np.asarray(preds).astype('float').sum(axis=0) / np.sum(pred_coefs) / 255

    # Average the localization predictions over all models
    loc_preds = []
    for _i, d in enumerate(loc_folders):
        msk = cv2.imread(path.join(d, f), cv2.IMREAD_UNCHANGED)
        loc_preds.append(msk * loc_coefs[_i])
    loc_preds = np.asarray(loc_preds).astype('float').sum(axis=0) / np.sum(loc_coefs) / 255

    # Per-pixel damage class (1-4) from the classification channels
    msk_dmg = preds[..., 1:].argmax(axis=2) + 1
    # A pixel counts as building if it clears the main localization threshold,
    # or a lower threshold when it is also predicted as damaged
    msk_loc = (1 * ((loc_preds > _thr[0]) | ((loc_preds > _thr[1]) & (msk_dmg > 1) & (msk_dmg < 4)) | ((loc_preds > _thr[2]) & (msk_dmg > 1)))).astype('uint8')

    msk_dmg = msk_dmg * msk_loc
    # Dilate "minor damage" (class 2) regions, overwriting only "no damage" pixels
    _msk = (msk_dmg == 2)
    if _msk.sum() > 0:
        _msk = dilation(_msk, square(5))
        msk_dmg[_msk & (msk_dmg == 1)] = 2

    msk_dmg = msk_dmg.astype('uint8')
    cv2.imwrite(path.join(sub_folder, f.replace('_pre_', '_localization_').replace('_part1.png', '_prediction.png')), msk_loc, [cv2.IMWRITE_PNG_COMPRESSION, 9])
    cv2.imwrite(path.join(sub_folder, f.replace('_pre_', '_damage_').replace('_part1.png', '_prediction.png')), msk_dmg, [cv2.IMWRITE_PNG_COMPRESSION, 9])


if __name__ == '__main__':
    t0 = timeit.default_timer()

    makedirs(sub_folder, exist_ok=True)

    all_files = []
    for f in tqdm(sorted(listdir(pred_folders[0]))):
        if '_part1.png' in f:
            all_files.append(f)

    with Pool() as pool:
        _ = pool.map(process_image, all_files)

    elapsed = timeit.default_timer() - t0
    print('Time: {:.3f} min'.format(elapsed / 60))
```