This code is modified from the PyTorch ImageNet classification example. We support additional models such as efficientNet-b7, resnext101, and models with Squeeze-and-Excitation attention. We also add several borrowed regularization tricks, such as mixup and label smoothing.
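For reference, mixup trains on convex combinations of input pairs and their labels (Zhang et al., 2018). A minimal sketch of the common formulation, not necessarily this repo's exact implementation:

```python
import numpy as np
import torch

def mixup_batch(x, y, alpha=0.2):
    """Blend a batch with a shuffled copy of itself.

    Returns the mixed inputs, both label sets, and the mixing weight,
    so the loss can be computed as
    lam * criterion(pred, y_a) + (1 - lam) * criterion(pred, y_b).
    """
    lam = np.random.beta(alpha, alpha) if alpha > 0 else 1.0
    index = torch.randperm(x.size(0))
    mixed_x = lam * x + (1 - lam) * x[index]
    return mixed_x, y, y[index], lam
```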
- Python 3
- PyTorch
- TensorBoard
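If needed, these can be installed with pip; torchvision is presumably required as well, given the torchvision architectures listed under Usage. Treat this as a suggestion, not the repo's pinned setup:

```
pip install torch torchvision tensorboard
```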
- Put data into the `./data` directory
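The expected layout is not spelled out here; if the code follows the upstream ImageNet example's `ImageFolder` convention, it would look roughly like this (class and file names are placeholders):

```
./data/
  train/
    class_a/img1.jpeg
    class_b/img2.jpeg
  val/
    class_a/img3.jpeg
    class_b/img4.jpeg
```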
Run

```
python ./main.py --arch [ARCHITECTURE] --model-dir [DIR_TO_SAVE] [--OPTIONAL_ARGS]
```
For example:

```
python ./main.py --arch efficientNet-b7 --model-dir efficientNet_mixup --lr 0.07
```
The commands below follow this example; please refer to Usage below for additional options. Note that you need to specify `--model-dir`, which is where all outputs are saved.
- After training, predictions are automatically generated for you in `./output/your-model-dir/results.csv`
- If you want to run prediction yourself, you need to specify which checkpoint to use. For example, run

```
python ./main.py --arch efficientNet-b7 --model-dir efficientNet_mixup --resume best-model --evaluate
```

to make predictions using the best model in the `efficientNet_mixup` directory.
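If you prefer to load a checkpoint from your own script instead, the usual PyTorch pattern is sketched below. The checkpoint path and the `state_dict` key are assumptions modeled on the upstream ImageNet example; check what this repo actually saves.

```python
import torch
from torchvision import models

# Rebuild the network first; the architecture must match the checkpoint.
# resnet18 is only a stand-in for whatever --arch you trained with.
model = models.resnet18()

# Checkpoint path and the 'state_dict' key are assumptions based on the
# upstream ImageNet example's save format; adjust to this repo's output.
checkpoint = torch.load('./output/your-model-dir/best-model.pth',
                        map_location='cpu')
model.load_state_dict(checkpoint['state_dict'])
model.eval()  # switch off dropout/batch-norm updates for inference
```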
```
usage: main.py [-h] [--data DIR] [--arch ARCH] [-j N] [--epochs N]
[--start-epoch N] [-b N] [--lr LR] [--momentum m] [--wd W]
[--mixup MIXUP] [--alpha ALPHA] [--augment AUGMENT]
[--label-smoothing LABEL_SMOOTHING] [--warmup-epoch E]
[--warmup-multiplier E] [-e] [-x] [-p N] [--save-freq S]
[--model-dir PATH] [--resume PATH] [--pretrained] [--seed SEED]
[--using-AdaBoost USING_ADABOOST]
PyTorch ImageNet Training
optional arguments:
-h, --help show this help message and exit
--data DIR path to dataset
--arch ARCH model architecture: alexnet | densenet121 |
densenet161 | densenet169 | densenet201 | googlenet |
inception_v3 | mnasnet0_5 | mnasnet0_75 | mnasnet1_0 |
mnasnet1_3 | mobilenet_v2 | resnet101 | resnet152 |
resnet18 | resnet34 | resnet50 | resnext101_32x8d |
resnext50_32x4d | shufflenet_v2_x0_5 |
shufflenet_v2_x1_0 | shufflenet_v2_x1_5 |
shufflenet_v2_x2_0 | squeezenet1_0 | squeezenet1_1 |
vgg11 | vgg11_bn | vgg13 | vgg13_bn | vgg16 | vgg16_bn
| vgg19 | vgg19_bn | wide_resnet101_2 |
wide_resnet50_2 | resnext101 | efficientNet-b7 |
se_resnet101 | se_resnext101 | wide_se_resnext101
(default: resnet18)
-j N, --workers N number of data loading workers (default: 4)
--epochs N number of total epochs to run
--start-epoch N manual epoch number (useful on restarts)
-b N, --batch-size N mini-batch size (default: 256), this is the total
batch size of all GPUs on the current node when using
Data Parallel or Distributed Data Parallel
--lr LR, --learning-rate LR
initial learning rate
--momentum m momentum
--wd W, --weight-decay W
weight decay (default: 1e-4)
--mixup MIXUP whether to use mixup
--alpha ALPHA alpha used for mix up
--augment AUGMENT whether to use data augment
--label-smoothing LABEL_SMOOTHING
label smoothing ratio
--warmup-epoch E warmup epoch (default: 20)
--warmup-multiplier E
warmup multiplier (default: 16)
-e, --evaluate evaluate model on validation set
-x, --extract-features
extract features on train set
-p N, --print-freq N print frequency (default: 10)
--save-freq S save frequency (default: 10)
--model-dir PATH path to save and log models
--resume PATH checkpoint / number or best_model
--pretrained use pre-trained model
--seed SEED seed for initializing training.
--using-AdaBoost USING_ADABOOST
using AdaBoost to manage training data
```
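As a reference for the `--label-smoothing` option: the standard formulation gives the true class probability 1 - ε and spreads ε uniformly over the remaining classes. A minimal sketch of that loss (this repo's implementation may differ):

```python
import torch
import torch.nn.functional as F

def label_smoothing_loss(logits, target, smoothing=0.1):
    """Cross-entropy against smoothed one-hot targets: the true class
    gets 1 - smoothing; the other classes share `smoothing` equally."""
    n_classes = logits.size(-1)
    log_probs = F.log_softmax(logits, dim=-1)
    smoothed = torch.full_like(log_probs, smoothing / (n_classes - 1))
    smoothed.scatter_(1, target.unsqueeze(1), 1.0 - smoothing)
    return (-smoothed * log_probs).sum(dim=-1).mean()
```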