This code was written with reference to this paper by Hongyu Wang at Beihang University, and it is built upon a fork of Deformable Convolutional Networks and Faster RCNN for DOTA.
@InProceedings{Xia_2018_CVPR,
author = {Xia, Gui-Song and Bai, Xiang and Ding, Jian and Zhu, Zhen and Belongie, Serge and Luo, Jiebo and Datcu, Mihai and Pelillo, Marcello and Zhang, Liangpei},
title = {DOTA: A Large-Scale Dataset for Object Detection in Aerial Images},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2018}
}
In my code, I have tried different approaches, mainly on the following points:
- Replace Faster_RCNN with PANet.
- Try a new loss function, Focal Loss.
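For readers unfamiliar with Focal Loss, the following is a minimal sketch of its standard binary form, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). It is not the implementation used in this repository; the function name `focal_loss` and the defaults `alpha=0.25` and `gamma=2.0` are assumptions for illustration (they are the commonly cited defaults, not values taken from this code).

```python
import math


def focal_loss(p, target, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss for a single prediction (illustrative sketch).

    p      -- predicted probability of the positive class, in [0, 1]
    target -- ground-truth label, 1 (positive) or 0 (negative)
    alpha  -- class-balancing weight for positives (assumed default)
    gamma  -- focusing parameter; gamma = 0 gives weighted cross-entropy
    """
    # Clamp to avoid log(0).
    p = min(max(p, eps), 1.0 - eps)
    # p_t is the probability assigned to the true class.
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma down-weights examples that are already easy.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

With `gamma=0.0` the modulating factor disappears and the result is an alpha-weighted cross-entropy, which makes the effect of `gamma` easy to compare.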
- MXNet from the official repository. We tested our code on MXNet@(commit 62ecb60). Due to the rapid development of MXNet, it is recommended to check out this version if you encounter any issues.
- Python 2.7. We recommend using Anaconda2 to manage the environments and packages.
- Some Python packages: cython, opencv-python >= 3.2.0, easydict. If pip is set up on your system, these packages can be fetched and installed by running:
pip install Cython
pip install opencv-python==3.2.0.6
pip install easydict==1.6
- For Windows users, Visual Studio 2015 is needed to compile the Cython module.
Any NVIDIA GPU with at least 4 GB of memory should be sufficient.
Run cmd .\init.bat on Windows, or sh ./init.sh on Linux. The scripts will build the Cython module automatically and create some folders.
- Please download the DOTA dataset and use the DOTA_devkit to split the data into patches. Make sure the split data is laid out like this:
./path-to-dota-split/images
./path-to-dota-split/labelTxt
./path-to-dota-split/test.txt
./path-to-dota-split/train.txt
The train.txt and test.txt files list the names of the subimages (without file extension) used for training and testing, respectively.
- Please download the ImageNet-pretrained ResNet-v1-101 model manually from OneDrive, BaiduYun, or Google Drive, and put it under the folder ./model. Make sure it looks like this:
./model/pretrained_model/resnet_v1_101-0000.params
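The data and model layout described above can be sanity-checked before training. The helper below is a hypothetical sketch and not part of this repository: the name `check_dota_layout` and the `.png` image extension are assumptions, so adjust them to match your split output.

```python
import os


def check_dota_layout(split_root, model_path, image_ext=".png"):
    """Return a list of problems found; an empty list means the layout looks OK.

    split_root -- folder containing images/, labelTxt/, train.txt and test.txt
    model_path -- path to the pretrained .params file
    image_ext  -- extension of the split images (assumed, not from the repo)
    """
    problems = []
    for sub in ("images", "labelTxt"):
        if not os.path.isdir(os.path.join(split_root, sub)):
            problems.append("missing folder: " + sub)
    for list_name in ("train.txt", "test.txt"):
        list_path = os.path.join(split_root, list_name)
        if not os.path.isfile(list_path):
            problems.append("missing file: " + list_name)
            continue
        with open(list_path) as f:
            # Each non-empty line is a subimage name without extension.
            names = [line.strip() for line in f if line.strip()]
        for name in names:
            image_path = os.path.join(split_root, "images", name + image_ext)
            if not os.path.isfile(image_path):
                problems.append("%s: no image for %s" % (list_name, name))
    if not os.path.isfile(model_path):
        problems.append("missing pretrained model: " + model_path)
    return problems
```

For example, `check_dota_layout("./path-to-dota-split", "./model/pretrained_model/resnet_v1_101-0000.params")` returns an empty list when every listed subimage and the model file are in place.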
- All of our experiment settings (GPU #, dataset, etc.) are kept in yaml config files in the folder ./experiments/faster_rcnn/cfgs.
- Set the "dataset_path" and "root_path" in DOTA.yaml and DOTA_quadrangle.yaml. The "dataset_path" should be the parent folder of "images" and "labelTxt". The "root_path" is the path where you want to save the cache data.
- Set the scales and aspect ratios as you wish in DOTA.yaml and DOTA_quadrangle.yaml.
- To conduct experiments, run the Python scripts with the corresponding config file as input. For example, to train and test on quadrangles in an end-to-end manner, run:
python experiments/faster_rcnn/rcnn_dota_quadrangle_e2e.py --cfg experiments/faster_rcnn/cfgs/DOTA_quadrangle.yaml
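When choosing the scales and aspect ratios mentioned above, it can help to see which anchor boxes a given setting produces. The sketch below is a simplified illustration of how Faster R-CNN style anchors are usually derived, not the code in this repository: the function name `make_anchors`, the base size of 16, and the default scales and ratios are assumptions, and the rounding used by real implementations is omitted.

```python
import math


def make_anchors(base_size=16, scales=(8, 16, 32), ratios=(0.5, 1.0, 2.0)):
    """Return anchors as (x1, y1, x2, y2) tuples centered at the origin.

    Each ratio is height / width. Every anchor keeps the area
    (base_size * scale) ** 2 while its shape follows the ratio.
    All default values are assumed for illustration.
    """
    anchors = []
    for ratio in ratios:
        for scale in scales:
            side = base_size * scale
            width = side / math.sqrt(ratio)
            height = side * math.sqrt(ratio)
            anchors.append((-width / 2.0, -height / 2.0,
                            width / 2.0, height / 2.0))
    return anchors


if __name__ == "__main__":
    # Print width and height of each anchor for a quick visual check.
    for x1, y1, x2, y2 in make_anchors():
        print("w=%.1f h=%.1f" % (x2 - x1, y2 - y1))
```

Aerial images often contain very small and very elongated objects, so comparing the printed widths and heights against typical object sizes in your patches is one way to judge a setting before starting a long training run.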