Sparse Flow (Node) paper and v2.2.0 release
Lucas Liebenwein committed Nov 16, 2022
1 parent 2e59069 commit 14b392c
Showing 211 changed files with 16,859 additions and 3,126 deletions.
48 changes: 33 additions & 15 deletions README.md
@@ -1,10 +1,9 @@
# Neural Network Pruning
[Lucas Liebenwein](https://people.csail.mit.edu/lucasl/),
[Cenk Baykal](http://www.mit.edu/~baykal/),
[Alaa Maalouf](https://www.linkedin.com/in/alaa-maalouf/),
[Igor Gilitschenski](https://www.gilitschenski.org/igor/),
[Dan Feldman](http://people.csail.mit.edu/dannyf/),
[Daniela Rus](http://danielarus.csail.mit.edu/)
# torchprune
Main contributors of this code base:
[Lucas Liebenwein](http://www.mit.edu/~lucasl/),
[Cenk Baykal](http://www.mit.edu/~baykal/).

Please check individual paper folders for authors of each paper.

<p align="center">
<img src="./misc/imgs/pruning_pipeline.png" width="100%">
@@ -15,10 +14,11 @@ This repository contains code to reproduce the results from the following
papers:
| Paper | Venue | Title & Link |
| :---: | :---: | :--- |
| **Node** | NeurIPS 2021 | [Sparse Flows: Pruning Continuous-depth Models](https://proceedings.neurips.cc/paper/2021/hash/bf1b2f4b901c21a1d8645018ea9aeb05-Abstract.html) |
| **ALDS** | NeurIPS 2021 | [Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition](https://arxiv.org/abs/2107.11442) |
| **Lost** | MLSys 2021 | [Lost in Pruning: The Effects of Pruning Neural Networks beyond Test Accuracy](https://proceedings.mlsys.org/paper/2021/hash/2a79ea27c279e471f4d180b08d62b00a-Abstract.html) |
| **PFP** | ICLR 2020 | [Provable Filter Pruning for Efficient Neural Networks](https://openreview.net/forum?id=BJxkOlSYDH) |
| **SiPP** | arXiv | [SiPPing Neural Networks: Sensitivity-informed Provable Pruning of Neural Networks](https://arxiv.org/abs/1910.05422) |
| **SiPP** | SIAM 2022 | [SiPPing Neural Networks: Sensitivity-informed Provable Pruning of Neural Networks](https://doi.org/10.1137/20M1383239) |

### Packages
In addition, the repo contains two stand-alone python packages that
@@ -35,6 +35,7 @@ about the paper and scripts and parameter configuration to reproduce the exact
results from the paper.
| Paper | Location |
| :---: | :---: |
| **Node** | [paper/node](./paper/node) |
| **ALDS** | [paper/alds](./paper/alds) |
| **Lost** | [paper/lost](./paper/lost) |
| **PFP** | [paper/pfp](./paper/pfp) |
@@ -98,14 +99,27 @@ using the codebase.
| --- | --- |
| [src/torchprune/README.md](./src/torchprune) | more details on how to prune neural networks, how to use and set up the data sets, how to implement custom pruning methods, and how to add your own data sets and networks. |
| [src/experiment/README.md](./src/experiment) | more details on how to configure and run your own experiments, and more information on how to reproduce the results. |
| [paper/node/README.md](./paper/node) | check out for more information on the [Node](https://proceedings.neurips.cc/paper/2021/hash/bf1b2f4b901c21a1d8645018ea9aeb05-Abstract.html) paper. |
| [paper/alds/README.md](./paper/alds) | check out for more information on the [ALDS](https://arxiv.org/abs/2107.11442) paper. |
| [paper/lost/README.md](./paper/lost) | check out for more information on the [Lost](https://proceedings.mlsys.org/paper/2021/hash/2a79ea27c279e471f4d180b08d62b00a-Abstract.html) paper. |
| [paper/pfp/README.md](./paper/pfp) | check out for more information on the [PFP](https://openreview.net/forum?id=BJxkOlSYDH) paper. |
| [paper/sipp/README.md](./paper/sipp) | check out for more information on the [SiPP](https://arxiv.org/abs/1910.05422) paper. |
| [paper/sipp/README.md](./paper/sipp) | check out for more information on the [SiPP](https://doi.org/10.1137/20M1383239) paper. |

## Citations
Please cite the respective papers when using our work.

### [Sparse flows: Pruning continuous-depth models](https://proceedings.neurips.cc/paper/2021/hash/bf1b2f4b901c21a1d8645018ea9aeb05-Abstract.html)
```
@article{liebenwein2021sparse,
title={Sparse flows: Pruning continuous-depth models},
author={Liebenwein, Lucas and Hasani, Ramin and Amini, Alexander and Rus, Daniela},
journal={Advances in Neural Information Processing Systems},
volume={34},
pages={22628--22642},
year={2021}
}
```

### [Towards Determining the Optimal Layer-wise Decomposition](https://arxiv.org/abs/2107.11442)
```
@inproceedings{liebenwein2021alds,
@@ -140,12 +154,16 @@ url={https://openreview.net/forum?id=BJxkOlSYDH}
}
```

### [SiPPing Neural Networks](https://arxiv.org/abs/1910.05422)
### [SiPPing Neural Networks](https://doi.org/10.1137/20M1383239) (Weight Pruning)
```
@article{baykal2019sipping,
title={SiPPing Neural Networks: Sensitivity-informed Provable Pruning of Neural Networks},
author={Baykal, Cenk and Liebenwein, Lucas and Gilitschenski, Igor and Feldman, Dan and Rus, Daniela},
journal={arXiv preprint arXiv:1910.05422},
year={2019}
@article{baykal2022sensitivity,
title={Sensitivity-informed provable pruning of neural networks},
author={Baykal, Cenk and Liebenwein, Lucas and Gilitschenski, Igor and Feldman, Dan and Rus, Daniela},
journal={SIAM Journal on Mathematics of Data Science},
volume={4},
number={1},
pages={26--45},
year={2022},
publisher={SIAM}
}
```
Binary file added misc/imgs/node_overview.png
8 changes: 4 additions & 4 deletions misc/requirements.txt
@@ -3,10 +3,10 @@
-e ./src/experiment

# We need those with special tags unfortunately...
-f https://download.pytorch.org/whl/torch_stable.html
torch==1.7.1+cu110
torchvision==0.8.2+cu110
torchaudio===0.7.2
-f https://download.pytorch.org/whl/lts/1.8/torch_lts.html
torch==1.8.2+cu111
torchvision==0.9.2+cu111
torchaudio==0.8.2

# Some extra requirements for the code base
jupyter
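The pins now point at the PyTorch 1.8 LTS wheel index instead of the 1.7.1/CUDA 11.0 builds. A minimal sketch of installing them, assuming the command is run from the repository root (the `-f` wheel index and the editable package installs are already declared inside the file):

```bash
pip install -r misc/requirements.txt
```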
File renamed without changes.
File renamed without changes.
6 changes: 5 additions & 1 deletion paper/alds/script/results_viewer.py
@@ -38,7 +38,7 @@
TABLE_BOLD_THRESHOLD = 0.005

# auto-discover files from folder without "common.yaml"
FILES = glob.glob(os.path.join(FOLDER, "[!common]*.yaml"))
FILES = glob.glob(os.path.join(FOLDER, "*[!common]*.yaml"))


def key_files(item):
@@ -52,6 +52,7 @@ def key_files(item):
"resnet18",
"resnet101",
"wide_resnet50_2",
"mobilenet_v2",
"deeplabv3_resnet50",
]

@@ -127,6 +128,8 @@ def get_results(file, logger, legend_on):
elif "imagenet/prune" in file:
graphers[0]._figure.gca().set_xlim([0, 87])
graphers[0]._figure.gca().set_ylim([-87, 5])
elif "imagenet/retrain/mobilenet_v2" in file:
graphers[0]._figure.gca().set_ylim([-5, 0.5])
elif "imagenet/retrain/" in file:
graphers[0]._figure.gca().set_ylim([-3.5, 1.5])
elif "imagenet/retraincascade" in file:
@@ -317,6 +320,7 @@ def generate_table_entries(
"resnet18": "ResNet18",
"resnet101": "ResNet101",
"wide_resnet50_2": "WRN50-2",
"mobilenet_v2": "MobileNetV2",
"deeplabv3_resnet50": "DeeplabV3-ResNet50",
}

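A note on the glob change at the top of this file's diff: `[!common]` is an fnmatch character class (any single character other than `c`, `o`, `m`, or `n`), not a literal exclusion of the name `common`. The old pattern therefore skipped every YAML file whose name starts with one of those letters, whereas the new pattern only requires some character outside that set to appear before the `.yaml` suffix. A quick illustrative check with hypothetical file names:

```python
from fnmatch import fnmatch

# hypothetical result-file names
names = ["common.yaml", "resnet18.yaml", "mobilenet_v2.yaml"]

for name in names:
    old = fnmatch(name, "[!common]*.yaml")   # pattern before this commit
    new = fnmatch(name, "*[!common]*.yaml")  # pattern after this commit
    print(name, "old:", old, "new:", new)

# common.yaml old: False new: False
# resnet18.yaml old: True new: True
# mobilenet_v2.yaml old: False new: True
```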
68 changes: 68 additions & 0 deletions paper/node/README.md
@@ -0,0 +1,68 @@
# Sparse flows: Pruning continuous-depth models
[Lucas Liebenwein*](https://people.csail.mit.edu/lucasl/),
[Ramin Hasani*](http://www.raminhasani.com),
[Alexander Amini](https://www.mit.edu/~amini/),
[Daniela Rus](http://danielarus.csail.mit.edu/)

***Equal contribution**

<p align="center">
<img align="center" src="../../misc/imgs/node_overview.png" width="80%">
</p>
<!-- <br clear="left"/> -->

Continuous deep learning architectures enable learning of flexible
probabilistic models for predictive modeling as neural ordinary differential
equations (ODEs), and for generative modeling as continuous normalizing flows.
In this work, we design a framework to decipher the internal dynamics of these
continuous depth models by pruning their network architectures. Our empirical
results suggest that pruning improves generalization for neural ODEs in
generative modeling. We empirically show that the improvement is because
pruning helps avoid mode collapse and flatten the loss surface. Moreover,
pruning finds efficient neural ODE representations with up to 98% fewer
parameters compared to the original network, without loss of accuracy. We hope
our results will invigorate further research into the performance-size
trade-offs of modern continuous-depth models.

## Setup
Check out the main [README.md](../../README.md) and the respective packages for
more information on the code base.

## Overview

### Run compression experiments
The experiment configurations are located [here](./param). To reproduce the
experiments for a specific configuration, run:
```bash
python -m experiment.main param/toy/ffjord/spirals/vanilla_l4_h64.yaml
```

The pruning experiments run fully automatically and store all the results.
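The multiscale CNF configurations added in this release can be launched the same way; a sketch assuming the same relative invocation as the toy example above:

```bash
# normalizing flow (CNF) experiments added in this commit
python -m experiment.main param/cnf/mnist_multiscale.yaml
python -m experiment.main param/cnf/cifar_multiscale.yaml
```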

### Experimental evaluations

The [script](./script) folder contains the evaluation and plotting scripts
used to analyze the various experiments. Please take a look at each of them
to understand how to load the pruning experiments and how to analyze the
results.

Each plot and experiment presented in the paper can be reproduced this way.

## Citation
Please cite the following paper when using our work.

### Paper link
[Sparse flows: Pruning continuous-depth models](https://proceedings.neurips.cc/paper/2021/hash/bf1b2f4b901c21a1d8645018ea9aeb05-Abstract.html)

### Bibtex
```
@article{liebenwein2021sparse,
title={Sparse flows: Pruning continuous-depth models},
author={Liebenwein, Lucas and Hasani, Ramin and Amini, Alexander and Rus, Daniela},
journal={Advances in Neural Information Processing Systems},
volume={34},
pages={22628--22642},
year={2021}
}
```
68 changes: 68 additions & 0 deletions paper/node/param/cnf/cifar_multiscale.yaml
@@ -0,0 +1,68 @@
network:
name: "ffjord_multiscale_cifar"
dataset: "CIFAR10"
outputSize: 10

training:
transformsTrain:
- type: RandomHorizontalFlip
kwargs: {}
transformsTest: []
transformsFinal:
- type: Resize
kwargs: { size: 32 }
- type: ToTensor
kwargs: {}
- type: RandomNoise
kwargs: { "normalization": 255.0 }

loss: "NLLBitsLoss"
lossKwargs: {}

metricsTest:
- type: NLLBits
kwargs: {}
- type: Dummy
kwargs: {}

batchSize: 200 # don't change that since it's hard-coded

optimizer: "Adam"
optimizerKwargs:
lr: 1.0e-3
weight_decay: 0.0

numEpochs: 50
earlyStopEpoch: 0
enableAMP: False

lrSchedulers:
- type: MultiStepLR
stepKwargs: { milestones: [45] }
kwargs: { gamma: 0.1 }

file: "paper/node/param/directories.yaml"

retraining:
startEpoch: 0

experiments:
methods:
- "ThresNet"
- "FilterThresNet"
mode: "cascade"

numRepetitions: 1
numNets: 1

plotting:
minVal: 0.02
maxVal: 0.85

spacing:
- type: "geometric"
numIntervals: 12
maxVal: 0.80
minVal: 0.05

retrainIterations: -1
66 changes: 66 additions & 0 deletions paper/node/param/cnf/mnist_multiscale.yaml
@@ -0,0 +1,66 @@
network:
name: "ffjord_multiscale_mnist"
dataset: "MNIST"
outputSize: 10

training:
transformsTrain: []
transformsTest: []
transformsFinal:
- type: Resize
kwargs: { size: 28 }
- type: ToTensor
kwargs: {}
- type: RandomNoise
kwargs: { "normalization": 255.0 }

loss: "NLLBitsLoss"
lossKwargs: {}

metricsTest:
- type: NLLBits
kwargs: {}
- type: Dummy
kwargs: {}

batchSize: 200 # don't change that since it's hard-coded

optimizer: "Adam"
optimizerKwargs:
lr: 1.0e-3
weight_decay: 0.0

numEpochs: 50
earlyStopEpoch: 0
enableAMP: False

lrSchedulers:
- type: MultiStepLR
stepKwargs: { milestones: [45] }
kwargs: { gamma: 0.1 }

file: "paper/node/param/directories.yaml"

retraining:
startEpoch: 0

experiments:
methods:
- "ThresNet"
- "FilterThresNet"
mode: "cascade"

numRepetitions: 1
numNets: 1

plotting:
minVal: 0.02
maxVal: 0.85

spacing:
- type: "geometric"
numIntervals: 12
maxVal: 0.80
minVal: 0.05

retrainIterations: -1
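Both multiscale CNF configs pair the `RandomNoise` transform (`normalization: 255.0`) with the `NLLBitsLoss` objective, which points at the usual uniform-dequantization setup used when training normalizing flows on images and reporting bits per dimension. A purely illustrative sketch of what such a transform typically does (the actual torchprune implementation may differ):

```python
import torch


class UniformDequantizationNoise:
    """Illustrative stand-in for the RandomNoise transform.

    Assumes 8-bit images already scaled to [0, 1] by ToTensor, so one
    quantization bin has width 1 / normalization; adding uniform noise of
    that width turns discrete pixel values into a continuous density.
    """

    def __init__(self, normalization: float = 255.0):
        self.normalization = normalization

    def __call__(self, img: torch.Tensor) -> torch.Tensor:
        # spread each discrete pixel value uniformly across its bin
        return img + torch.rand_like(img) / self.normalization
```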
6 changes: 6 additions & 0 deletions paper/node/param/directories.yaml
@@ -0,0 +1,6 @@
# relative directories from where main.py was called
directories:
results: "./data/node/results"
trained_networks: null
training_data: "./data/training"
local_data: "./local"
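Together with the experiment command above, these paths determine where everything lands on disk; a quick sketch, assuming experiments are launched from the repository root:

```bash
ls ./data/node/results   # "results" directory from above
ls ./data/training       # "training_data" directory from above
ls ./local               # "local_data" directory from above
```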
