This is the code for the ICML'20 paper "Rethinking Bias-Variance Trade-off for Generalization of Neural Networks".
Risk, bias, and variance for ResNet34 on the CIFAR10 dataset (25,000 training samples).
- Python
- PyTorch (1.3.1)
- CUDA
- NumPy
There are 4 folders: `cifar10`, `cifar100`, `fmnist`, and `mnist`. First `cd` into the corresponding directory. Then run
python train.py --trial 2 --arch resnet34 --width 10 --num-epoch 500 --lr-decay 200 --outdir part1
- `trial`: how many splits, i.e., if `trial=2` on `cifar10`, then the training sample size is `50000/2 = 25000`.
- `arch`: network architecture.
- `width`: width of the network.
- `num-epoch`: how many epochs for training.
- `lr-decay`: after how many epochs to decay the learning rate.
- `outdir`: the name of the folder for saving logs and checkpoints.
The results (including bias and variance) will be saved in `'log_width{}.txt'.format(args.width)`, in the folder `'{}_{}_trial{}_mse{}'.format(args.dataset, args.arch, args.trial, args.outdir)`.
The log file includes the following columns:

| trial | train loss | train acc | test loss | test acc | bias | variance |
| --- | --- | --- | --- | --- | --- | --- |
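The bias and variance in the log follow the classical MSE decomposition over the models trained on the disjoint splits. The estimator can be sketched as follows (the function name, array names, and shapes are illustrative assumptions, not the repo's actual API):

```python
import numpy as np

def mse_bias_variance(predictions, targets):
    """Estimate the classical MSE bias-variance decomposition from an
    ensemble of independently trained models (illustrative sketch).

    predictions: (num_models, num_test, num_classes) model outputs on a
        shared test set.
    targets: (num_test, num_classes) one-hot labels.
    """
    mean_pred = predictions.mean(axis=0)                       # f_bar(x)
    # bias^2: squared distance from the mean prediction to the label
    bias_sq = ((mean_pred - targets) ** 2).sum(axis=1).mean()
    # variance: expected squared distance from each model to the mean
    variance = ((predictions - mean_pred) ** 2).sum(axis=2).mean()
    return bias_sq, variance
```

By construction, bias² plus variance equals the average test MSE across models.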
How to train models on CIFAR10 dataset with label noise?
First `cd` into the `cifar10` folder. Then run
python train_labelnoise.py --trial 5 --arch resnet34 --width 10 --noise-size 1000
- `noise-size`: the number of noisy labels for each split of the dataset. For example, here `trial=5`, so the training sample size is `10000`; with a label noise size of `1000`, the label noise percentage is `1000/10000 = 10%`.
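Such an injection step can be sketched as randomly relabeling `noise-size` samples of a split (a hypothetical helper for illustration; the repo's exact noise model may differ, e.g. it may force the new label to differ from the original):

```python
import numpy as np

def inject_label_noise(labels, noise_size, num_classes=10, seed=0):
    """Replace the labels of `noise_size` randomly chosen samples with
    labels drawn uniformly from all classes (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    noisy = labels.copy()
    idx = rng.choice(len(labels), size=noise_size, replace=False)
    noisy[idx] = rng.integers(0, num_classes, size=noise_size)
    return noisy
```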
The results (including bias and variance) will be saved in `'log_width{}.txt'.format(args.width)`, in the folder `'{}_{}_trial{}_labelnoise{}_mse{}'.format(args.dataset, args.arch, args.trial, args.noise_size, args.outdir)`.
How to train models on CIFAR10 datasets with cross-entropy loss (CE loss & generalized Bregman divergence bias-variance decomposition)?
First `cd` into the `cifar10` folder. Then run
python train_ce.py --trial 5 --arch resnet34 --width 10
The results (including bias and variance) will be saved in `'log_width{}.txt'.format(args.width)`, in the folder `'{}_{}_trial{}_ce{}'.format(args.dataset, args.arch, args.trial, args.outdir)`.
The bias and variance saved in `'log_width{}.txt'.format(args.width)` follow the classical MSE bias-variance decomposition. To compute the generalized Bregman divergence bias-variance decomposition, run
python evaluate_bv_ce.py --model-dir-list cifar10_resnet34_trial5_cepart1 cifar10_resnet34_trial5_cepart2 --outdir ce_bv_results --width 10
- `model-dir-list`: the folders to evaluate. For example, here the bias and variance are computed from the models (`width=10`) saved in `cifar10_resnet34_trial5_cepart1` and `cifar10_resnet34_trial5_cepart2`. The total number of models is `5 * 2 = 10`.
- `outdir`: the folder name for saving the computed results.
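For CE loss, the generalized Bregman decomposition is with respect to the KL divergence: the "mean" prediction is the normalized geometric mean of the models' predicted probabilities, the bias is the KL divergence from the label distribution to that mean, and the variance is the expected KL divergence from the mean to each model. A sketch under those definitions (array shapes and the `eps` smoothing are assumptions for illustration):

```python
import numpy as np

def ce_bias_variance(probs, onehot, eps=1e-12):
    """Sketch of the generalized (KL) Bregman bias-variance decomposition
    for cross-entropy loss (illustrative, not the repo's exact code).

    probs: (num_models, num_test, num_classes) predicted probabilities.
    onehot: (num_test, num_classes) one-hot labels.
    """
    # "mean" prediction: normalized geometric mean of the probabilities
    pi_bar = np.exp(np.log(probs + eps).mean(axis=0))
    pi_bar /= pi_bar.sum(axis=1, keepdims=True)
    # bias: KL(label || pi_bar); with one-hot labels this is -log pi_bar[y]
    bias = -(onehot * np.log(pi_bar + eps)).sum(axis=1).mean()
    # variance: E_model KL(pi_bar || pi)
    kl = (pi_bar[None] * (np.log(pi_bar + eps)[None]
                          - np.log(probs + eps))).sum(axis=2)
    variance = kl.mean()
    return bias, variance
```

With this choice of mean, bias plus variance equals the average cross-entropy test loss across models.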
Bias and variance under different label noise percentages. Increasing label noise leads to the double-descent phenomenon. ResNet34 with MSE loss on the CIFAR10 dataset with 10,000 training samples.
Training error and test error under different label noise percentages. Increasing label noise leads to the double-descent phenomenon. ResNet34 with MSE loss on the CIFAR10 dataset with 10,000 training samples.
How to evaluate bias and variance on the CIFAR10-C (out-of-distribution) dataset (MSE bias-variance decomposition)?
First, `cd` into the `cifar10` folder, then download the CIFAR-10-C dataset:
mkdir -p ./data/cifar
curl -O https://zenodo.org/record/2535967/files/CIFAR-10-C.tar
curl -O https://zenodo.org/record/3555552/files/CIFAR-100-C.tar
tar -xvf CIFAR-100-C.tar -C data/cifar/
tar -xvf CIFAR-10-C.tar -C data/cifar/
Next, run
python evaluate_bv_mse_ood.py --modeldir cifar10_resnet34_trial2_mse --outdir ood_bv_results --width 10
The results (including bias and variance) will be saved in `'log_width{}.txt'.format(args.width)`, in the folder `ood_bv_results`.
- `modeldir`: the folder to evaluate.
- `outdir`: the folder name for saving the computed results.
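CIFAR-10-C stores, for each corruption type, a single `.npy` array stacking 10,000 test images per severity level (1-5), plus a shared `labels.npy`. A small helper for slicing out one severity level (the file paths in the usage comments are assumptions about the extracted archive layout):

```python
import numpy as np

def severity_slice(arr, severity, per_level=10000):
    """Return the slice of a CIFAR-10-C corruption array for one
    severity level (1-5); levels are stacked in order of severity."""
    assert 1 <= severity <= 5
    return arr[(severity - 1) * per_level : severity * per_level]

# Usage (hypothetical paths, after extracting CIFAR-10-C.tar):
# images = np.load("data/cifar/CIFAR-10-C/gaussian_noise.npy")  # (50000, 32, 32, 3)
# labels = np.load("data/cifar/CIFAR-10-C/labels.npy")
# x_sev3 = severity_slice(images, 3)
```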
For more experimental and technical details, please check our paper.
@inproceedings{yang2020rethinking,
author = {Zitong Yang and Yaodong Yu and Chong You and Jacob Steinhardt and Yi Ma},
booktitle = {International Conference on Machine Learning (ICML)},
title = {Rethinking bias-variance trade-off for generalization of neural networks},
year = {2020},
}