
This code is modified from the PyTorch ImageNet classification example. It supports additional models such as efficientNet-b7 and resnext101, as well as models with Squeeze-and-Excitation attention. It also adds regularization techniques such as mixup and label smoothing.
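To make the two regularization techniques named above concrete, here is a minimal sketch in plain Python. It is not taken from this repository: the function names `mixup_pair` and `smooth_labels` are hypothetical, and the formulas are the commonly used ones (a Beta(alpha, alpha) mixing weight for mixup, and a uniform redistribution of the smoothing mass for label smoothing). The repository's own implementation may differ.

```python
import random


def mixup_pair(x_a, x_b, y_a, y_b, alpha=1.0, rng=random):
    """Blend two examples and their one-hot labels with one shared weight.

    Assumption: the weight is drawn from Beta(alpha, alpha), the usual
    mixup formulation. alpha <= 0 disables mixing in this sketch.
    """
    lam = rng.betavariate(alpha, alpha) if alpha > 0 else 1.0
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam


def smooth_labels(target, num_classes, smoothing=0.1):
    """Turn an integer class index into a smoothed target distribution.

    Every class receives smoothing / num_classes, and the true class
    additionally receives the remaining 1 - smoothing.
    """
    dist = [smoothing / num_classes] * num_classes
    dist[target] += 1.0 - smoothing
    return dist


if __name__ == "__main__":
    x, y, lam = mixup_pair([1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0], alpha=0.2)
    print("mixing weight:", lam)
    print("mixed input:", x, "mixed label:", y)
    print("smoothed target:", smooth_labels(2, num_classes=4, smoothing=0.1))
```

In a training loop the same idea is normally applied to whole batches of tensors; the list-based version here only illustrates the arithmetic.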


Getting started

  • Put the data into the ./data directory


Run python ./ --arch [ARCHITECTURE] --model-dir [DIRTOSAVE] [--OPTIONARG]

For example, run python ./ --arch efficientNet-b7 --model-dir efficientNet_mixup --lr 0.07
The commands below follow this example; refer to the usage text below for additional options. Note that you must specify --model-dir, which is where all outputs are saved.


  • After training, predictions are generated automatically in ./output/your-model-dir/results.csv
  • If you want to make predictions yourself, you need to specify which checkpoint to use. For example, run python ./ --arch efficientNet-b7 --model-dir efficientNet_mixup --resume best-model --evaluate to make predictions using the best model in the efficientNet_mixup directory.
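The two example commands above can be sketched as argument parsing. This is a reconstruction under stated assumptions, not the repository's script: `build_parser` and `results_path` are hypothetical names, only a subset of the documented flags is mirrored, and the defaults for `--lr` and `--resume` are placeholders because the README does not state them. The output location follows the ./output/your-model-dir/results.csv pattern described above.

```python
import argparse
from pathlib import Path


def build_parser():
    """Mirror a subset of the documented flags (assumed types and defaults)."""
    parser = argparse.ArgumentParser(description="PyTorch ImageNet Training")
    parser.add_argument("--arch", default="resnet18", metavar="ARCH")
    parser.add_argument("--model-dir", metavar="PATH")
    # The README gives no default learning rate, so None is a placeholder.
    parser.add_argument("--lr", "--learning-rate", dest="lr", type=float,
                        default=None, metavar="LR")
    parser.add_argument("--resume", default="", metavar="PATH")
    parser.add_argument("-e", "--evaluate", action="store_true")
    return parser


def results_path(args):
    """Location of the generated predictions for a given --model-dir."""
    return Path("./output") / args.model_dir / "results.csv"


# Training command from the example.
train_args = build_parser().parse_args(
    ["--arch", "efficientNet-b7", "--model-dir", "efficientNet_mixup", "--lr", "0.07"]
)

# Evaluation command that reuses the best checkpoint.
eval_args = build_parser().parse_args(
    ["--arch", "efficientNet-b7", "--model-dir", "efficientNet_mixup",
     "--resume", "best-model", "--evaluate"]
)

if __name__ == "__main__":
    print(train_args)
    print(eval_args)
    print(results_path(eval_args).as_posix())
```

Both commands share the same `--model-dir`, so in this sketch the evaluation run resolves to the same output directory as the training run.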


usage: [-h] [--data DIR] [--arch ARCH] [-j N] [--epochs N]
               [--start-epoch N] [-b N] [--lr LR] [--momentum m] [--wd W]
               [--mixup MIXUP] [--alpha ALPHA] [--augment AUGMENT]
               [--label-smoothing LABEL_SMOOTHING] [--warmup-epoch E]
               [--warmup-multiplier E] [-e] [-x] [-p N] [--save-freq S]
               [--model-dir PATH] [--resume PATH] [--pretrained] [--seed SEED]
               [--using-AdaBoost USING_ADABOOST]

PyTorch ImageNet Training

optional arguments:
  -h, --help            show this help message and exit
  --data DIR            path to dataset
  --arch ARCH           model architecture: alexnet | densenet121 |
                        densenet161 | densenet169 | densenet201 | googlenet |
                        inception_v3 | mnasnet0_5 | mnasnet0_75 | mnasnet1_0 |
                        mnasnet1_3 | mobilenet_v2 | resnet101 | resnet152 |
                        resnet18 | resnet34 | resnet50 | resnext101_32x8d |
                        resnext50_32x4d | shufflenet_v2_x0_5 |
                        shufflenet_v2_x1_0 | shufflenet_v2_x1_5 |
                        shufflenet_v2_x2_0 | squeezenet1_0 | squeezenet1_1 |
                        vgg11 | vgg11_bn | vgg13 | vgg13_bn | vgg16 | vgg16_bn
                        | vgg19 | vgg19_bn | wide_resnet101_2 |
                        wide_resnet50_2 | resnext101 | efficientNet-b7 |
                        se_resnet101 | se_resnext101 | wide_se_resnext101
                        (default: resnet18)
  -j N, --workers N     number of data loading workers (default: 4)
  --epochs N            number of total epochs to run
  --start-epoch N       manual epoch number (useful on restarts)
  -b N, --batch-size N  mini-batch size (default: 256), this is the total
                        batch size of all GPUs on the current node when using
                        Data Parallel or Distributed Data Parallel
  --lr LR, --learning-rate LR
                        initial learning rate
  --momentum m          momentum
  --wd W, --weight-decay W
                        weight decay (default: 1e-4)
  --mixup MIXUP         whether to use mixup
  --alpha ALPHA         alpha used for mix up
  --augment AUGMENT     whether to use data augment
  --label-smoothing LABEL_SMOOTHING
                        label smoothing ratio
  --warmup-epoch E      warmup epoch (default: 20)
  --warmup-multiplier E
                        warmup multiplier (default: 16)
  -e, --evaluate        evaluate model on validation set
  -x, --extract-features
                        extract features on train set
  -p N, --print-freq N  print frequency (default: 10)
  --save-freq S         save frequency (default: 10)
  --model-dir PATH      path to save and log models
  --resume PATH         checkpoint / number or best_model
  --pretrained          use pre-trained model
  --seed SEED           seed for initializing training.
  --using-AdaBoost USING_ADABOOST
                        using AdaBoost to manage training data
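The help text lists a warmup epoch count (default: 20) and a warmup multiplier (default: 16) but does not say how they combine. One common scheme, assumed here purely for illustration, is gradual warmup: the learning rate ramps linearly from the value given by `--lr` up to that value times the multiplier over the warmup epochs. The function name `warmup_lr` is hypothetical, and the repository's actual schedule may differ.

```python
def warmup_lr(base_lr, epoch, warmup_epochs=20, multiplier=16.0):
    """Learning rate at a given epoch under linear (gradual) warmup.

    Assumption: the rate grows from base_lr at epoch 0 to
    base_lr * multiplier at epoch warmup_epochs. This sketch then holds
    the peak value; any later decay schedule is not modeled here.
    """
    if warmup_epochs <= 0 or epoch >= warmup_epochs:
        return base_lr * multiplier
    return base_lr * (1.0 + (multiplier - 1.0) * epoch / warmup_epochs)


if __name__ == "__main__":
    for epoch in (0, 5, 10, 20, 30):
        print(epoch, warmup_lr(0.07, epoch))
```

Under this assumption, a large multiplier means the value passed to `--lr` acts as the starting rate rather than the peak rate, which is worth keeping in mind when choosing it.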


Tiny ImageNet Challenge





