This repository implements pre-defined sparse neural networks, as per research done by the USC HAL team. Pre-defined sparsity lowers the complexity of neural networks with minimal performance degradation. This leads to simpler architectures and a better understanding of the 'black box' that is a neural network.
This research paper has more details. Please consider citing it if you use or benefit from this work:
Sourya Dey, Kuan-Wen Huang, Peter A. Beerel, Keith M. Chugg, "Pre-Defined Sparse Neural Networks with Hardware Acceleration" in IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 9, no. 2, pp. 332-345, June 2019.
Available on IEEE and arXiv (copyright owned by IEEE).
Read this Medium blog post for a quick description.
Software used:
- Python 3
- Keras 2.2.4 with TensorFlow 1.10.0 backend
- numpy, scipy
Main file: `keras_impl`

Run the `sim_net` method with these arguments:
- `config`: Neuron configuration
- `fo`: Out-degree (fanout) configuration
- `l2_val`: L2 regularization coefficient
- `z`: Degree of parallelism, if simulating clash-free adjacency matrices
- `dataset_filename`: Path to dataset
- `preds_compare`: Path to benchmark results with which current results will be compared. Some benchmark results are in `timit_FCs`
For example:
```python
recs, model = sim_net(
    config = np.array([800,100,10]),
    fo = np.array([50,10]),
    l2_val = 8e-5,
    z = None,
    dataset_filename = data_folder + 'dataset_MNIST/mnist.npz',
    preds_compare = 0
)
```
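As a rough illustration of the savings this configuration implies (a hedged reading, assuming `fo[i]` is the out-degree of each neuron in junction `i`, as described above):

```python
import numpy as np

config = np.array([800, 100, 10])  # neurons per layer
fo = np.array([50, 10])            # out-degree per neuron in each junction

# Weights per junction: pre-defined sparse (out-degree based) vs. fully connected
sparse_weights = config[:-1] * fo           # [40000, 1000]
dense_weights  = config[:-1] * config[1:]   # [80000, 1000]
print(sparse_weights.sum() / dense_weights.sum())  # ~0.51, i.e. about half the weights
```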
A complete explanation of these terms and concepts is given in the research paper. The documentation of the `run_model` method also has useful details. After a run, the final results and the model itself are stored in `results_new` by default (some examples are given).
Supporting files:
- `adjmatint`: Create adjacency matrices which describe the pre-defined sparse connection pattern in different junctions. Different types -- random, basic, clash-free (a sketch of the random type is given after this list).
- `data_loadstore`: Data management (see datasets section below).
- `data_processing`: Normalization functions.
- `keras_nets`: Methods to create MLPs, as well as the conv nets used in the research paper to experiment on CIFAR.
- `utils`: Has the useful `merge_dicts` method for managing training records.
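For intuition, here is a minimal sketch of the kind of adjacency matrix the random type could produce. The function name and matrix conventions are illustrative assumptions, not the actual `adjmatint` API:

```python
import numpy as np

def random_adjmat(n_in, n_out, fanout, seed=0):
    """Binary adjacency matrix where each of the n_in input neurons
    connects to exactly `fanout` of the n_out output neurons, chosen
    at random. Illustrates the 'random' type only; the 'basic' and
    'clash-free' patterns impose more structure."""
    rng = np.random.RandomState(seed)
    adj = np.zeros((n_in, n_out), dtype=np.int8)
    for i in range(n_in):
        adj[i, rng.choice(n_out, size=fanout, replace=False)] = 1
    return adj

# Junction 1 of the MNIST example above: 800 inputs, 100 outputs, fanout 50
adj = random_adjmat(800, 100, 50)
assert adj.sum() == 800 * 50  # 40000 connections instead of 80000
```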
Datasets for experimentation are used in the `.npz` format with 6 keys -- `xtr`, `ytr`, `xva`, `yva`, `xte`, `yte` -- for data (x) and labels (y) of the training, validation and test splits (a minimal sketch of this format follows the list below). Our experiments included:
- MNIST
- CIFAR
- Reuters RCV1 v2: Links to download and methods to process this dataset are given in `data_loadstore`.
- TIMIT: Not freely available, hence not provided.
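As promised above, a minimal sketch of building a dataset file in the expected `.npz` format (the dummy shapes are illustrative assumptions; real data would come from `data_loadstore`):

```python
import numpy as np

# Illustrative dummy splits for an MNIST-like task
xtr, ytr = np.random.rand(50000, 784), np.random.randint(0, 10, 50000)
xva, yva = np.random.rand(10000, 784), np.random.randint(0, 10, 10000)
xte, yte = np.random.rand(10000, 784), np.random.randint(0, 10, 10000)

np.savez('my_dataset.npz', xtr=xtr, ytr=ytr, xva=xva, yva=yva, xte=xte, yte=yte)

# Loading back, e.g. the way sim_net might via dataset_filename
data = np.load('my_dataset.npz')
xtr, ytr = data['xtr'], data['ytr']
```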
Further research details: Our group at USC has been researching and developing pre-defined sparsity since 2016. Our other publications can be found here. You can also check out my website for other deep learning projects and details.