
# SWAT: Sparse Weight Activation Training

Official implementation of the NeurIPS 2020 paper "Sparse Weight Activation Training".

Md Aamir Raihan, Tor Aamodt

This repository contains the code for the CNN experiments presented in the NeurIPS 2020 paper, along with some additional functionality.

## Table of Contents

- [Experiment Setup](#experiment-setup)
- [Basic Usage](#basic-usage)
- [Pretrained Models](#pretrained-models)
- [Training/Inference FLOP Count](#traininginference-flop-count)
- [Citation](#citation)

## Experiment Setup

All experiments can be reproduced either by running the Docker image or by manually installing CUDA/cuDNN with the proper dependencies.

### Manual Setup

```bash
# GCC version: 5.5.0 20171010

# Install CUDA 10.0.130
wget https://developer.nvidia.com/compute/cuda/10.0/Prod/local_installers/cuda_10.0.130_410.48_linux
chmod +x cuda_10.0.130_410.48_linux
./cuda_10.0.130_410.48_linux

# Install cuDNN 7.6.4.38
wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/7.6.4.38/Production/10.0_20190923/cudnn-10.0-linux-x64-v7.6.4.38.tgz
# Extract cudnn-10.0-linux-x64-v7.6.4.38.tgz, then copy the library and header:
cp libcudnn.so cuda-10/lib64/
cp cudnn.h cuda-10/include/

# Install Anaconda with Python 3.7.4
wget https://repo.anaconda.com/archive/Anaconda3-2019.10-Linux-x86_64.sh
chmod +x Anaconda3-2019.10-Linux-x86_64.sh
./Anaconda3-2019.10-Linux-x86_64.sh

# Install PyTorch 1.1.0.
# Do not use any other version: the PyTorch C++ interface used by the cuDNN
# wrapper has changed in later releases, so other versions are not compatible
# with this code.
conda install pytorch==1.1.0 torchvision==0.3.0 cudatoolkit=10.0 -c pytorch

# Clone the SWAT repository
git clone https://github.com/AamirRaihan/SWAT.git

# Install the cuDNN C++ wrapper for the custom convolution layer
cd SWAT/SWAT-code/mypkg
python setup.py install
```
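After installation, a quick sanity check (a minimal sketch; the expected outputs assume the exact versions pinned above) confirms the toolchain is wired up correctly:

```bash
# Verify the toolchain (assumes the Anaconda environment is active).
nvcc --version                                               # should report CUDA 10.0
python -c "import torch; print(torch.__version__)"           # should print 1.1.0
python -c "import torch; print(torch.cuda.is_available())"   # should print True
```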

### Docker Setup

Alternatively, you can pull a pre-built Docker image from Docker Hub and run it with Docker v19.03+:

```bash
sudo docker run --gpus all --ipc=host -it --rm \
    -v /PATH/TO/IMAGENET/DATASET:/workspace/datasets/ \
    swaticml/custom-swat-pytorch:v1
```

More information on installing Docker is available here.
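To check that the container can actually see your GPUs, you can run `nvidia-smi` inside the image as a one-off command (a sketch; it assumes the image allows overriding its default command, as stock PyTorch-based images do):

```bash
# List the visible GPUs from inside the container, then exit.
sudo docker run --gpus all --rm swaticml/custom-swat-pytorch:v1 nvidia-smi
```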

## Basic Usage

### Unstructured SWAT

#### CIFAR-10

Running inference:

```bash
# $1: schedule file, $2: checkpoint path
python main.py -model "ResNet18" -dataset="Cifar10" --schedule-file $1 --inference 1 --checkpoint $2
```

Schedule files are in the run_configurations directory.
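For example, to evaluate a ResNet-18 checkpoint trained with the 90% unstructured schedule (the checkpoint filename below is hypothetical; substitute the file you downloaded):

```bash
# The schedule file is from this repository; the checkpoint path is a placeholder.
python main.py -model "ResNet18" -dataset="Cifar10" \
    --schedule-file run_configurations/unstructured_constant_resnet18_schedule_90.yaml \
    --inference 1 \
    --checkpoint checkpoints/resnet18_unstructured_90.pth
```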

Running training:

```bash
./run_script/resnet_cifar10.sh run_configurations/unstructured_constant_resnet18_schedule_90.yaml
```

| Network | Method | Weight Sparsity (%) | Activation Sparsity (%) | Top-1 Accuracy (%) | Training FLOP ⬇ (%) | Checkpoint |
|---|---|---|---|---|---|---|
| VGG-16 | SWAT-U | 90.0 | 90.0 | 91.95±0.06 | 89.7 | here |
| VGG-16 | SWAT-ERK | 95.0 | 82.0 | 92.50±0.07 | 89.5 | here |
| VGG-16 | SWAT-M | 95.0 | 65.0 | 93.41±0.05 | 64.0 | here |
| WRN-16-8 | SWAT-U | 90.0 | 90.0 | 95.13±0.11 | 90.0 | here |
| WRN-16-8 | SWAT-ERK | 95.0 | 84.0 | 95.00±0.12 | 91.4 | here |
| WRN-16-8 | SWAT-M | 95.0 | 78.0 | 94.97±0.04 | 86.3 | here |
| DenseNet-121 | SWAT-U | 90.0 | 90.0 | 94.48±0.06 | 89.8 | here |
| DenseNet-121 | SWAT-ERK | 90.0 | 88.0 | 94.14±0.11 | 89.7 | here |
| DenseNet-121 | SWAT-M | 90.0 | 86.0 | 94.29±0.11 | 84.2 | here |

Note: more checkpoints are available here, covering additional sparsity levels, and ResNet-18 results are included as well. The same directory contains the data needed to plot the train/test accuracy and loss curves for each individual training run.

#### CIFAR-100

Running inference:

```bash
# $1: schedule file, $2: checkpoint path
python main.py -model "ResNet18" -dataset="Cifar100" --schedule-file $1 --inference 1 --checkpoint $2
```

Schedule files are in the run_configurations directory.

Running training:

```bash
./run_script/resnet_cifar100.sh run_configurations/unstructured_constant_resnet18_schedule_90.yaml
```

| Network | Method | Weight Sparsity (%) | Activation Sparsity (%) | Top-1 Accuracy (%) | Training FLOP ⬇ (%) | Checkpoint |
|---|---|---|---|---|---|---|
| VGG-16 | SWAT-U | 90.0 | 90.0 | 91.95±0.08 | - | here |
| VGG-16 | SWAT-ERK | 90.0 | 69.6 | 92.50±0.11 | - | here |
| VGG-16 | SWAT-M | 90.0 | 59.9 | 93.41±0.23 | - | here |
| WRN-16-8 | SWAT-U | 90.0 | 90.0 | 95.13±0.13 | - | here |
| WRN-16-8 | SWAT-ERK | 90.0 | 77.6 | 95.00±0.07 | - | here |
| WRN-16-8 | SWAT-M | 90.0 | 73.3 | 94.97±0.11 | - | here |
| DenseNet-121 | SWAT-U | 90.0 | 90.0 | 94.48±0.06 | - | here |
| DenseNet-121 | SWAT-ERK | 90.0 | 90.0 | 94.14±0.03 | - | here |
| DenseNet-121 | SWAT-M | 90.0 | 84.2 | 94.29±0.13 | - | here |

#### ImageNet

| Network | Method | Weight Sparsity (%) | Activation Sparsity (%) | Top-1 Accuracy (%) | Training FLOP ⬇ (%) |
|---|---|---|---|---|---|
| ResNet-50 | SWAT-U | 80.0 | 80.0 | 75.2±0.06 | 76.1 |
| ResNet-50 | SWAT-U | 90.0 | 90.0 | 72.1±0.03 | 85.6 |
| ResNet-50 | SWAT-ERK | 80.0 | 52.0 | 76.0±0.16 | 60.0 |
| ResNet-50 | SWAT-ERK | 90.0 | 64.0 | 73.8±0.23 | 79.0 |
| ResNet-50 | SWAT-M | 80.0 | 49.0 | 74.6±0.10 | 45.9 |
| ResNet-50 | SWAT-M | 90.0 | 57.0 | 74.0±0.18 | 65.4 |
| WRN-50-2 | SWAT-U | 80.0 | 80.0 | 76.4±0.11 | 78.6 |
| WRN-50-2 | SWAT-U | 90.0 | 90.0 | 74.7±0.27 | 88.4 |

### Structured SWAT

#### CIFAR-10

Running inference:

```bash
# $1: schedule file, $2: checkpoint path
python main.py -model "ResNet18" -dataset="Cifar10" --schedule-file $1 --inference 1 --checkpoint $2
```

Schedule files are in the run_configurations directory.
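As above, a concrete invocation pairs the structured schedule with a downloaded checkpoint (the checkpoint filename below is hypothetical):

```bash
# The schedule file is from this repository; the checkpoint path is a placeholder.
python main.py -model "ResNet18" -dataset="Cifar10" \
    --schedule-file run_configurations/structured_constant_resnet18_schedule_70.yaml \
    --inference 1 \
    --checkpoint checkpoints/resnet18_structured_70.pth
```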

Running training:

```bash
./run_script/resnet_cifar10.sh run_configurations/structured_constant_resnet18_schedule_70.yaml
```

| Network | Method | Weight Sparsity (%) | Activation Sparsity (%) | Channel Pruned (%) | Top-1 Accuracy (%) | Training FLOP ⬇ (%) | Checkpoint |
|---|---|---|---|---|---|---|---|
| ResNet-18 | SWAT-U | 50.0 | 50.0 | 50.0 | 94.73±0.06 | 49.9 | here |
| ResNet-18 | SWAT-U | 60.0 | 60.0 | 60.0 | 94.68±0.03 | 59.8 | here |
| ResNet-18 | SWAT-U | 70.0 | 70.0 | 70.0 | 94.65±0.19 | 69.8 | here |
| DenseNet-121 | SWAT-U | 50.0 | 50.0 | 50.0 | 95.04±0.26 | 49.9 | here |
| DenseNet-121 | SWAT-U | 60.0 | 60.0 | 60.0 | 94.82±0.11 | 59.9 | here |
| DenseNet-121 | SWAT-U | 70.0 | 70.0 | 70.0 | 94.81±0.20 | 69.9 | here |

#### CIFAR-100

Running inference:

```bash
# $1: schedule file, $2: checkpoint path
python main.py -model "ResNet18" -dataset="Cifar100" --schedule-file $1 --inference 1 --checkpoint $2
```

Schedule files are in the run_configurations directory.

Running training:

```bash
./run_script/resnet_cifar100.sh run_configurations/structured_constant_resnet18_schedule_70.yaml
```

| Network | Method | Weight Sparsity (%) | Activation Sparsity (%) | Channel Pruned (%) | Top-1 Accuracy (%) | Checkpoint |
|---|---|---|---|---|---|---|
| ResNet-18 | SWAT-U | 50.0 | 50.0 | 50.0 | 76.4±0.05 | here |
| ResNet-18 | SWAT-U | 60.0 | 60.0 | 60.0 | 76.2±0.11 | here |
| ResNet-18 | SWAT-U | 70.0 | 70.0 | 70.0 | 75.6±0.09 | here |
| DenseNet-121 | SWAT-U | 50.0 | 50.0 | 50.0 | 78.7±0.03 | here |
| DenseNet-121 | SWAT-U | 60.0 | 60.0 | 60.0 | 78.5±0.08 | here |
| DenseNet-121 | SWAT-U | 70.0 | 70.0 | 70.0 | 78.1±0.12 | here |

#### ImageNet

| Network | Method | Weight Sparsity (%) | Activation Sparsity (%) | Channel Pruned (%) | Top-1 Accuracy (%) | Training FLOP ⬇ (%) |
|---|---|---|---|---|---|---|
| ResNet-50 | SWAT-U | 50.0 | 50.0 | 50.0 | 76.51±0.30 | 47.6 |
| ResNet-50 | SWAT-U | 60.0 | 60.0 | 60.0 | 76.35±0.06 | 57.1 |
| ResNet-50 | SWAT-U | 70.0 | 70.0 | 70.0 | 75.67±0.06 | 66.6 |
| WRN-50-2 | SWAT-U | 50.0 | 50.0 | 50.0 | 78.08±0.20 | 49.1 |
| WRN-50-2 | SWAT-U | 60.0 | 60.0 | 60.0 | 77.55±0.07 | 58.9 |
| WRN-50-2 | SWAT-U | 70.0 | 70.0 | 70.0 | 77.19±0.11 | 68.7 |

## Pretrained Models

1. Unstructured SWAT on the CIFAR-10 dataset: here
2. Unstructured SWAT on the CIFAR-100 dataset: here
3. Structured SWAT on the CIFAR-10 dataset: here
4. Structured SWAT on the CIFAR-100 dataset: here

## Training/Inference FLOP Count

## Citation

If you find this project useful in your research, please consider citing:

```bibtex
@inproceedings{RaihanNeurips2020,
  author    = {Raihan, Md Aamir and Aamodt, Tor M.},
  booktitle = {Advances in Neural Information Processing Systems},
  title     = {Sparse Weight Activation Training},
  url       = {https://proceedings.neurips.cc/paper/2020/file/b44182379bf9fae976e6ae5996e13cd8-Paper.pdf},
  month     = {December},
  year      = {2020},
}
```
