This repository contains the code for the CNN experiments presented in the NeurIPS 2020 paper *Sparse Weight Activation Training*, along with some additional functionality.
All experiments can be reproduced either by running the Docker image or by manually installing CUDA/cuDNN and the proper dependencies.
```bash
# GCC version 5.5.0 20171010

# Install CUDA 10.0.130
wget https://developer.nvidia.com/compute/cuda/10.0/Prod/local_installers/cuda_10.0.130_410.48_linux
chmod +x cuda_10.0.130_410.48_linux
./cuda_10.0.130_410.48_linux

# Install cuDNN 7.6.4.38
wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/7.6.4.38/Production/10.0_20190923/cudnn-10.0-linux-x64-v7.6.4.38.tgz
tar -xzvf cudnn-10.0-linux-x64-v7.6.4.38.tgz  # extracts into ./cuda/
cp cuda/lib64/libcudnn* cuda-10/lib64/
cp cuda/include/cudnn.h cuda-10/include/
```
```bash
# Install Anaconda with Python 3.7.4
wget https://repo.anaconda.com/archive/Anaconda3-2019.10-Linux-x86_64.sh
chmod +x Anaconda3-2019.10-Linux-x86_64.sh
./Anaconda3-2019.10-Linux-x86_64.sh

# Install PyTorch. Use exactly these versions: the PyTorch C++ interface used by
# the cuDNN wrapper changed in later releases, so other versions are not
# compatible with this code.
conda install pytorch==1.1.0 torchvision==0.3.0 cudatoolkit=10.0 -c pytorch
```
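After installation, a quick sanity check that the versions line up (standard PyTorch introspection calls; the expected values follow from the setup above):

```python
import torch

# Expect PyTorch 1.1.0 built against CUDA 10.0 and cuDNN 7.6.x.
print("PyTorch:", torch.__version__)             # e.g. 1.1.0
print("CUDA:", torch.version.cuda)               # e.g. 10.0.130
print("cuDNN:", torch.backends.cudnn.version())  # e.g. 7604 for 7.6.4
print("GPU available:", torch.cuda.is_available())
```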
```bash
# Clone the SWAT repository
git clone https://github.com/AamirRaihan/SWAT.git

# Install the cuDNN C++ wrapper for the custom convolution layer
cd SWAT/SWAT-code/mypkg
python setup.py install
```
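To verify the wrapper built correctly, try importing it. The module name `mypkg` below is an assumption taken from the directory name; check `setup.py` for the name it actually registers:

```python
# Hypothetical smoke test: the module name "mypkg" is assumed from the
# directory name above; consult setup.py if this import fails.
import mypkg
print("custom cuDNN wrapper loaded from:", mypkg.__file__)
```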
You can also pull a pre-built Docker image from Docker Hub and run it with Docker v19.03+:
```bash
sudo docker run --gpus all --ipc=host -it --rm -v /PATH/TO/IMAGENET/DATASET:/workspace/datasets/ swaticml/custom-swat-pytorch:v1
```
More information on installing Docker is available here.
Running Inference:
```bash
python main.py -model "ResNet18" -dataset="Cifar10" --schedule-file $1 --inference 1 --checkpoint $2
```
Schedule files are in the run_configurations directory.
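Before running inference, it can be useful to confirm that a downloaded checkpoint really has the advertised weight sparsity. A minimal sketch using only standard PyTorch calls; the checkpoint layout (a state_dict, possibly nested under a "state_dict" key) is an assumption about how main.py saves it:

```python
import torch

# Load a checkpoint on CPU and measure the fraction of exactly-zero weights.
# ASSUMPTION: the file holds a state_dict, possibly under a "state_dict" key.
ckpt = torch.load("checkpoint.pth", map_location="cpu")
state = ckpt.get("state_dict", ckpt)

total, zeros = 0, 0
for name, w in state.items():
    # Count only conv/linear weight tensors, not biases or BN parameters.
    if isinstance(w, torch.Tensor) and w.dim() > 1 and "weight" in name:
        total += w.numel()
        zeros += (w == 0).sum().item()
print(f"overall weight sparsity: {100.0 * zeros / total:.1f}%")
```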
Running Training:
```bash
./run_script/resnet_cifar10.sh run_configurations/unstructured_constant_resnet18_schedule_90.yaml
```
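For intuition about what training does: SWAT keeps only the largest-magnitude weights and activations in each iteration and zeroes the rest. The sketch below shows plain top-K magnitude masking in PyTorch; it illustrates the idea, not the repo's custom cuDNN kernels.

```python
import torch

def topk_magnitude_mask(x: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Keep the top-(1 - sparsity) fraction of entries by magnitude, zero the rest.

    Illustrative only: SWAT's real implementation performs this inside the
    custom cuDNN convolution wrapper, on both weights and activations.
    """
    n = x.numel()
    k = max(1, int(round((1.0 - sparsity) * n)))
    # Threshold at the k-th largest magnitude (= the (n - k + 1)-th smallest).
    threshold, _ = x.abs().flatten().kthvalue(n - k + 1)
    return x * (x.abs() >= threshold).to(x.dtype)

w = torch.randn(64, 3, 3, 3)
w_sparse = topk_magnitude_mask(w, sparsity=0.9)
print((w_sparse == 0).float().mean().item())  # ~0.9
```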
Network | Method | Weight Sparsity (%) | Activation Sparsity (%) | Top-1 Accuracy (%) | Training FLOP ⬇ % | Checkpoint |
---|---|---|---|---|---|---|
VGG-16 | SWAT-U | 90.0 | 90.0 | 91.95±0.06 | 89.7 | here |
VGG-16 | SWAT-ERK | 95.0 | 82.0 | 92.50±0.07 | 89.5 | here |
VGG-16 | SWAT-M | 95.0 | 65.0 | 93.41±0.05 | 64.0 | here |
WRN-16-8 | SWAT-U | 90.0 | 90.0 | 95.13±0.11 | 90.0 | here |
WRN-16-8 | SWAT-ERK | 95.0 | 84.0 | 95.00±0.12 | 91.4 | here |
WRN-16-8 | SWAT-M | 95.0 | 78.0 | 94.97±0.04 | 86.3 | here |
DenseNet-121 | SWAT-U | 90.0 | 90.0 | 94.48±0.06 | 89.8 | here |
DenseNet-121 | SWAT-ERK | 90.0 | 88.0 | 94.14±0.11 | 89.7 | here |
DenseNet-121 | SWAT-M | 90.0 | 86.0 | 94.29±0.11 | 84.2 | here |
Note: checkpoints for additional sparsity levels are available here, along with ResNet-18 results. The data in that directory can also be used to plot the train/test accuracy and loss curves for each individual training run.
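If the released logs can be read as simple columns, the curves are quick to plot. The file name and column layout below are hypothetical; adapt them to the files actually present in the directory:

```python
# HYPOTHETICAL log format: CSV columns "epoch,train_acc,test_acc".
# Adjust the file name and columns to match the released data.
import numpy as np
import matplotlib.pyplot as plt

epoch, train_acc, test_acc = np.loadtxt("run_log.csv", delimiter=",", unpack=True)

plt.plot(epoch, train_acc, label="train")
plt.plot(epoch, test_acc, label="test")
plt.xlabel("epoch")
plt.ylabel("top-1 accuracy (%)")
plt.legend()
plt.savefig("curves.png")
```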
Running Inference:
```bash
python main.py -model "ResNet18" -dataset="Cifar100" --schedule-file $1 --inference 1 --checkpoint $2
```
Schedule files are in the run_configurations directory.
Running Training:
```bash
./run_script/resnet_cifar100.sh run_configurations/unstructured_constant_resnet18_schedule_90.yaml
```
Network | Method | Weight Sparsity (%) | Activation Sparsity (%) | Top-1 Accuracy (%) | Training FLOP ⬇ % | Checkpoint |
---|---|---|---|---|---|---|
VGG-16 | SWAT-U | 90.0 | 90.0 | 91.95±0.08 | - | here |
VGG-16 | SWAT-ERK | 90.0 | 69.6 | 92.50±0.11 | - | here |
VGG-16 | SWAT-M | 90.0 | 59.9 | 93.41±0.23 | - | here |
WRN-16-8 | SWAT-U | 90.0 | 90.0 | 95.13±0.13 | - | here |
WRN-16-8 | SWAT-ERK | 90.0 | 77.6 | 95.00±0.07 | - | here |
WRN-16-8 | SWAT-M | 90.0 | 73.3 | 94.97±0.11 | - | here |
DenseNet-121 | SWAT-U | 90.0 | 90.0 | 94.48±0.06 | - | here |
DenseNet-121 | SWAT-ERK | 90.0 | 90.0 | 94.14±0.03 | - | here |
DenseNet-121 | SWAT-M | 90.0 | 84.2 | 94.29±0.13 | - | here |
ImageNet results (unstructured SWAT):
Network | Method | Weight Sparsity (%) | Activation Sparsity (%) | Top-1 Accuracy (%) | Training FLOP ⬇ % |
---|---|---|---|---|---|
ResNet-50 | SWAT-U | 80.0 | 80.0 | 75.2±0.06 | 76.1 |
ResNet-50 | SWAT-U | 90.0 | 90.0 | 72.1±0.03 | 85.6 |
ResNet-50 | SWAT-ERK | 80.0 | 52.0 | 76.0±0.16 | 60.0 |
ResNet-50 | SWAT-ERK | 90.0 | 64.0 | 73.8±0.23 | 79.0 |
ResNet-50 | SWAT-M | 80.0 | 49.0 | 74.6±0.10 | 45.9 |
ResNet-50 | SWAT-M | 90.0 | 57.0 | 74.0±0.18 | 65.4 |
WRN-50-2 | SWAT-U | 80.0 | 80.0 | 76.4±0.11 | 78.6 |
WRN-50-2 | SWAT-U | 90.0 | 90.0 | 74.7±0.27 | 88.4 |
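SWAT-U applies the same sparsity to every layer, whereas SWAT-ERK spreads a global budget non-uniformly using the Erdős–Rényi-Kernel rule from the sparse-training literature, giving small or thin layers a higher density. A sketch of that allocation (a reimplementation for intuition, not the repo's code; the corner case where a layer's density clips above 1 is ignored):

```python
def erk_densities(shapes, global_density):
    """Per-layer ERK densities for conv weights shaped (out, in, kh, kw).

    Each layer's raw density is proportional to
    (out + in + kh + kw) / (out * in * kh * kw), rescaled so the
    parameter-weighted average density equals global_density.
    """
    raw = [sum(s) / (s[0] * s[1] * s[2] * s[3]) for s in shapes]
    params = [s[0] * s[1] * s[2] * s[3] for s in shapes]
    scale = global_density * sum(params) / sum(r * p for r, p in zip(raw, params))
    return [min(1.0, scale * r) for r in raw]

shapes = [(64, 3, 3, 3), (128, 64, 3, 3), (256, 128, 3, 3)]
for s, d in zip(shapes, erk_densities(shapes, global_density=0.1)):
    print(s, f"density = {d:.3f}")  # early, small layers stay denser
```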
Running Inference:
```bash
python main.py -model "ResNet18" -dataset="Cifar10" --schedule-file $1 --inference 1 --checkpoint $2
```
Schedule files are in the run_configurations directory.
Running Training:
```bash
./run_script/resnet_cifar10.sh run_configurations/structured_constant_resnet18_schedule_70.yaml
```
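In structured SWAT, sparsity is imposed on whole output channels rather than individual weights, which is what the "Channel Pruned" column in the table below reports. For intuition, here is a sketch that ranks conv output channels by L2 norm and zeroes the smallest ones; the ranking criterion inside the repo's custom layer may differ:

```python
import torch

def channel_mask(w: torch.Tensor, prune_frac: float) -> torch.Tensor:
    """Zero the lowest-norm output channels of a conv weight (out, in, kh, kw).

    Illustrative structured-sparsity sketch, not the repo's implementation.
    """
    norms = w.flatten(1).norm(dim=1)      # one L2 norm per output channel
    n_drop = int(prune_frac * w.size(0))  # number of channels to zero
    mask = torch.ones(w.size(0), dtype=w.dtype)
    mask[norms.argsort()[:n_drop]] = 0.0  # drop the smallest-norm channels
    return w * mask.view(-1, 1, 1, 1)

w = torch.randn(64, 32, 3, 3)
pruned = channel_mask(w, prune_frac=0.7)
print((pruned.flatten(1).norm(dim=1) == 0).sum().item())  # 44 of 64 channels zeroed
```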
Network | Method | Weight Sparsity (%) | Activation Sparsity (%) | Channel Pruned (%) | Top-1 Accuracy (%) | Training FLOP ⬇ % | Checkpoint |
---|---|---|---|---|---|---|---|
ResNet-18 | SWAT-U | 50.0 | 50.0 | 50.0 | 94.73±0.06 | 49.9 | here |
ResNet-18 | SWAT-U | 60.0 | 60.0 | 60.0 | 94.68±0.03 | 59.8 | here |
ResNet-18 | SWAT-U | 70.0 | 70.0 | 70.0 | 94.65±0.19 | 69.8 | here |
DenseNet-121 | SWAT-U | 50.0 | 50.0 | 50.0 | 95.04±0.26 | 49.9 | here |
DenseNet-121 | SWAT-U | 60.0 | 60.0 | 60.0 | 94.82±0.11 | 59.9 | here |
DenseNet-121 | SWAT-U | 70.0 | 70.0 | 70.0 | 94.81±0.20 | 69.9 | here |
Running Inference:
```bash
python main.py -model "ResNet18" -dataset="Cifar100" --schedule-file $1 --inference 1 --checkpoint $2
```
Schedule files are in the run_configurations directory.
Running Training:
```bash
./run_script/resnet_cifar100.sh run_configurations/structured_constant_resnet18_schedule_70.yaml
```
Network | Method | Weight Sparsity (%) | Activation Sparsity (%) | Channel Pruned (%) | Top-1 Accuracy (%) | Checkpoint |
---|---|---|---|---|---|---|
ResNet-18 | SWAT-U | 50.0 | 50.0 | 50.0 | 76.4±0.05 | here |
ResNet-18 | SWAT-U | 60.0 | 60.0 | 60.0 | 76.2±0.11 | here |
ResNet-18 | SWAT-U | 70.0 | 70.0 | 70.0 | 75.6±0.09 | here |
DenseNet-121 | SWAT-U | 50.0 | 50.0 | 50.0 | 78.7±0.03 | here |
DenseNet-121 | SWAT-U | 60.0 | 60.0 | 60.0 | 78.5±0.08 | here |
DenseNet-121 | SWAT-U | 70.0 | 70.0 | 70.0 | 78.1±0.12 | here |
ImageNet results (structured SWAT):
Network | Method | Weight Sparsity (%) | Activation Sparsity (%) | Channel Pruned (%) | Top-1 Accuracy (%) | Training FLOP ⬇ % |
---|---|---|---|---|---|---|
ResNet-50 | SWAT-U | 50.0 | 50.0 | 50.0 | 76.51±0.30 | 47.6 |
ResNet-50 | SWAT-U | 60.0 | 60.0 | 60.0 | 76.35±0.06 | 57.1 |
ResNet-50 | SWAT-U | 70.0 | 70.0 | 70.0 | 75.67±0.06 | 66.6 |
WRN-50-2 | SWAT-U | 50.0 | 50.0 | 50.0 | 78.08±0.20 | 49.1 |
WRN-50-2 | SWAT-U | 60.0 | 60.0 | 60.0 | 77.55±0.07 | 58.9 |
WRN-50-2 | SWAT-U | 70.0 | 70.0 | 70.0 | 77.19±0.11 | 68.7 |
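The "Training FLOP ⬇" columns above track how sparsity turns into compute savings. A back-of-envelope model (my simplification, not the paper's exact accounting): a training step costs roughly one forward pass plus two backward passes of similar dense cost, where the forward and input-gradient passes use the sparse weights and the weight-gradient pass uses the sparse activations:

```python
def training_flop_reduction(weight_sparsity, activation_sparsity):
    """Rough % of dense training FLOPs saved; a simplification, not the
    paper's exact FLOP accounting."""
    wd = 1.0 - weight_sparsity      # weight density
    ad = 1.0 - activation_sparsity  # activation density
    # Forward + input-gradient passes scale with weight density;
    # the weight-gradient pass scales with activation density.
    cost = (wd + wd + ad) / 3.0
    return 100.0 * (1.0 - cost)

# SWAT-U at 90%/90% lands near the ~90% reductions reported in the tables.
print(f"{training_flop_reduction(0.9, 0.9):.1f}% FLOPs saved")  # 90.0%
```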
- Unstructured-SWAT on CIFAR-10 Dataset: here
- Unstructured-SWAT on CIFAR-100 Dataset: here
- Structured-SWAT on CIFAR-10 Dataset: here
- Structured-SWAT on CIFAR-100 Dataset: here
If you find this project useful in your research, please consider citing:
```
@inproceedings{RaihanNeurips2020,
author = {Raihan, Md Aamir and Aamodt, Tor M},
booktitle = {Advances in Neural Information Processing Systems},
title = {Sparse Weight Activation Training},
url = {https://proceedings.neurips.cc/paper/2020/file/b44182379bf9fae976e6ae5996e13cd8-Paper.pdf},
month = {December},
year = {2020},
}
```