- You can create a network in C++ directly. As an example, to create an `8C5-MP2-16C5-MP3-150N-10N` network, for MNIST, you could do:
EasyCL *cl = new EasyCL();
NeuralNet *net = new NeuralNet(cl);
net->addLayer( InputLayerMaker::instance()->numPlanes(1)->imageSize(28) );
net->addLayer( NormalizationLayerMaker::instance()->translate( -mean )->scale( 1.0f / standardDeviation ) );
net->addLayer( ConvolutionalMaker::instance()->numFilters(8)->filterSize(5)->biased() );
net->addLayer( ActivationMaker::instance()->relu() );
net->addLayer( PoolingMaker::instance()->poolingSize(2) );
net->addLayer( ConvolutionalMaker::instance()->numFilters(16)->filterSize(5)->biased() );
net->addLayer( ActivationMaker::instance()->relu() );
net->addLayer( PoolingMaker::instance()->poolingSize(3) );
net->addLayer( FullyConnectedMaker::instance()->numPlanes(150)->imageSize(1)->biased() );
net->addLayer( ActivationMaker::instance()->relu() );
net->addLayer( FullyConnectedMaker::instance()->numPlanes(10)->imageSize(1)->biased() );
net->addLayer( ActivationMaker::instance()->linear() );
net->addLayer( SoftMaxMaker::instance() );
net->print();
- The following sections will detail the various layers available, and the options available for each layer type
- Data must be provided in contiguous, 1d format, see below
#include "DeepCL.h"
OpenCLHelper *cl = OpenCLHelper::createForFirstGpuOtherwiseCpu();
NeuralNet *net = new NeuralNet( cl );
- You need exactly one input layer:
net->addLayer( InputMaker::instance()->numPlanes(10)->imageSize(19) );
- You need to set the number of input planes, and the image size.
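- For example, with `numPlanes(10)` and `imageSize(19)`, each training example is expected to occupy 10 * 19 * 19 = 3610 contiguous floats (see the notes on data format below)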
- You can add a normalization layer, to translate and scale input data. Put it just after the input layer, like this:
NeuralNet *net = new NeuralNet();
net->addLayer( InputMaker::instance()->numPlanes(10)->imageSize(19) );
net->addLayer( NormalizationMaker::instance()->translate( - mean )->scale( 1.0f / standardDeviation ) );
// other layers here...
To add a drop out layer:
net->addLayer( DropoutMaker::instance()->dropRatio(0.5f) );
This should probably go between a fully-connected layer and its associated activation layer, like:
net->addLayer( FullyConnectedMaker::instance()->numPlanes(10)->imageSize(1)->linear()->biased() );
net->addLayer( DropoutMaker::instance()->dropRatio(0.5f) );
net->addLayer( ActivationMaker::instance()->tanh() );
- You can add a random patch layer, to cut a patch from each image, in a random location, and train against that
- You need to specify the patch size, eg on mnist, which is 28x28 images, you might use a patch size of 24
- During training the patch location is chosen randomly, per image, per epoch
- Size of output image from this layer is the size of the patch
- During testing, the patch is cut from the centre of the image
net->addLayer( RandomPatchMaker::instance()->patchSize(24) );
- You can add a random translations layer, to randomly translate each input image by a random amount, during training
- During testing, no translation is done
- If you put eg `translateSize(2)`, then the translation amount will be chosen uniformly from the set {-2,-1,0,1,2}, for each axis
- Output image from this layer is same size as input image
net->addLayer( RandomTranslationsMaker::instance()->translateSize(2) );
Eg:
net->addLayer( ConvolutionalMaker::instance()->numFilters(32)->filterSize(5)->relu()->biased() );
- You can change the number of filters, and their size. If you want, you can use any of the following options:
  - `->padZeros()`: pad the input image with zeros, so the output image is same size as the input
  - `->biased()`: turn on bias
  - `->biased(1)`: same as `->biased()`
  - `->biased(0)`: turn off bias (default)
- convolutional layers forward-prop and backward-prop both run on GPU, via OpenCL
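For example, a minimal sketch combining the options above, ie a zero-padded, biased convolutional layer:
// zero-padded so output image size matches input image size, with bias enabled
net->addLayer( ConvolutionalMaker::instance()->numFilters(32)->filterSize(5)->padZeros()->biased() );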
Eg:
net->addLayer( ActivationMaker::instance()->relu() );
- You can choose one of the following activations, which is applied to the output of the previous layer:
  - `->linear()`: choose linear activation
  - `->relu()`: choose RELU activation
  - `->elu()`: choose ELU activation
  - `->sigmoid()`: choose sigmoid activation
  - `->tanh()`: choose tanh activation (current default, but defaults can change...)
  - `->scaledtanh()`: 1.7159 * tanh( 0.66667 * x )
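For example, a sketch using the scaled tanh option listed above:
// applies 1.7159 * tanh( 0.66667 * x ) to the previous layer's output
net->addLayer( ActivationMaker::instance()->scaledtanh() );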
eg:
net->addLayer( FullyConnectedMaker::instance()->numPlanes(2)->imageSize(28) );
Available options:
  - `->biased()`: turn on bias
  - `->biased(1)`: same as `->biased()`
  - `->biased(0)`: turn off bias (default)
  - `->linear()`: choose linear activation
  - `->relu()`: choose relu activation
  - `->sigmoid()`: choose sigmoid activation
  - `->tanh()`: choose tanh activation (current default, but defaults can change...)
  - `->scaledtanh()`: 1.7159 * tanh( 0.66667 * x )
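For example, a minimal sketch combining several of these options, ie a biased, tanh-activated fully connected layer of 150 neurons:
net->addLayer( FullyConnectedMaker::instance()->numPlanes(150)->imageSize(1)->biased()->tanh() );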
net->addLayer( PoolingMaker::instance()->poolingSize(2) );
- By default, if the input image size is not an exact multiple of the `poolingSize`, the extra margin will be ignored
- You can specify `padZeros` to include this margin:
net->addLayer( PoolingMaker::instance()->poolingSize(2)->padZeros() );
You need to add exactly one loss layer, as the last layer of the net. The following loss layers are available:
net->addLayer( SquareLossMaker::instance() );
net->addLayer( CrossEntropyMaker::instance() );
net->addLayer( SoftMaxMaker::instance() );
- if your outputs are categorical, 1-of-N, then the softmax layer is probably what you want
- otherwise, you can choose square loss, or cross-entropy loss:
  - squared loss works well with a `tanh` last layer
  - cross-entropy loss works well with a `sigmoid` last layer
  - if you're not sure, then a `tanh` last layer, with squared loss, works well
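For example, a minimal sketch of the last few layers for real-valued targets, pairing a `tanh` last layer with squared loss, as suggested above:
net->addLayer( FullyConnectedMaker::instance()->numPlanes(10)->imageSize(1)->biased() );
net->addLayer( ActivationMaker::instance()->tanh() );
net->addLayer( SquareLossMaker::instance() );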
- the softmax layer:
  - creates a probability distribution, ie a set of outputs that sum to 1, each lying in the range `0 <= x <= 1`
  - can create this probability distribution either across all output planes, with an imageSize of 1
    - this is the default
  - or else a per-plane probability distribution
    - add option `->perPlane()`
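For example, a sketch using the per-plane option described above:
net->addLayer( SoftMaxMaker::instance()->perPlane() );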
Input data should be provided in a contiguous array of `float`s. "group by" order should be:
- training example id
- input plane
- image row
- image column
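For example, a sketch of the flat indexing implied by this ordering, where `numPlanes` and `imageSize` are whatever you configured on the input layer (the helper name here is just for illustration):
// returns the value for training example n, input plane p, row r, column c
float getInputValue( float *data, int numPlanes, int imageSize, int n, int p, int r, int c ) {
    return data[ ( ( n * numPlanes + p ) * imageSize + r ) * imageSize + c ];
}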
Providing labels, as an integer array, is the most efficient way of training, if you are training against categorical data. The labels should be provided as one integer per example, zero-based.
- in this case, the last layer of the net should have the same number of nodes as categories, eg a `netdef` ending in `-5n`, if there are 5 categories
- if using the C++ API, you would probably want to use a `softmax` loss layer
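For example, a minimal sketch of a labels array for 4 training examples and 5 categories (so a `netdef` ending in `-5n`), one zero-based integer per example:
// one zero-based label per training example, each in the range 0..4
int trainLabels[] = { 0, 3, 4, 1 };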
For non-categorical data, you can provide expected output values as a contiguous array of floats. "group by" order for the floats should be:
- training example id
- output plane (eg, corresponds to filter id, for convolutional network)
- output row
- output column
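For example, a minimal sketch for a net whose last layer has 2 output planes of imageSize 1 (the target values here are made up, purely for illustration):
// 3 training examples x 2 output planes x 1 x 1, grouped by example, then plane
float expectedOutput[] = {
    0.5f, -0.5f,   // example 0: plane 0, plane 1
    1.0f,  0.0f,   // example 1
   -1.0f,  1.0f    // example 2
};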
// create a Trainer object, currently SGD,
// passing in learning rate, and momentum:
Trainer *trainer = SGD::instance( cl, 0.02f, 0.0f );
You can set the weight decay, momentum, and learning rate:
SGD *sgd = SGD::instance( cl );
sgd->setLearningRate( 0.002f );
sgd->setMomentum( 0.1f );
sgd->setWeightDecay( 0.001f );
Other trainers:
Adagrad *adagrad = new Adagrad( cl );
adagrad->setLearningRate( 0.002f );
Trainer *trainer = adagrad;
Rmsprop *rmsprop = new Rmsprop( cl );
rmsprop->setLearningRate( 0.002f );
Trainer *trainer = rmsprop;
Nesterov *nesterov = new Nesterov( cl );
nesterov->setLearningRate( 0.002f );
nesterov->setMomentum( 0.1f );
Trainer *trainer = nesterov;
Annealer *annealer = new Annealer( cl );
annealer->setLearningRate( 0.002f );
annealer->setAnneal( 0.97f );
Trainer *trainer = annealer;
eg:
NetLearner netLearner(
trainer, net,
Ntrain, trainData, trainLabels,
Ntest, testData, testLabels );
netLearner.setSchedule( numEpochs );
netLearner.setBatchSize( batchSize );
netLearner.learn();
// learning is now done :-)
eg
// (create a net, as above)
// (train it, as above)
// test, eg:
BatchLearner batchLearner( net );
int testNumRight = batchLearner.test( batchSize, Ntest, testData, testLabels );
- By default an `OriginalInitializer` object is used to initialize weights (a bit hacky, but changing this would need a major version bump)
- You can create an instance of `UniformInitializer`, and assign this to the ConvolutionalMaker, by doing for example `->setWeightInitializer( new UniformInitializer(1.0f) )`, to use a uniform initializer
- uniform initializer assigns weights sampled uniformly from the range +/- ( initialWeights divided by fanin )
- possible to create other WeightsInitializers if we want
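For example, a sketch chaining the option shown above onto a convolutional layer (if the chained form does not compile in your version, call it on the maker separately before `addLayer`):
// sample initial weights uniformly from +/- ( 1.0f / fanin )
net->addLayer( ConvolutionalMaker::instance()->numFilters(32)->filterSize(5)->biased()
    ->setWeightInitializer( new UniformInitializer(1.0f) ) );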
You can find more details in the Doxygen-generated docs at doxy docs for 4.x.x