Skip to content

Commit

Permalink
Merge pull request BVLC#1100 from cNikolaou/issue1099
Browse files Browse the repository at this point in the history
Polish mnist + cifar10 examples.
  • Loading branch information
shelhamer committed Sep 18, 2014
2 parents 6f94eac + 8ec223e commit 987d370
Show file tree
Hide file tree
Showing 2 changed files with 26 additions and 6 deletions.
6 changes: 3 additions & 3 deletions examples/cifar10/readme.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@ Alex's CIFAR-10 tutorial, Caffe style

Alex Krizhevsky's [cuda-convnet](https://code.google.com/p/cuda-convnet/) details the model definitions, parameters, and training procedure for good performance on CIFAR-10. This example reproduces his results in Caffe.

We will assume that you have Caffe successfully compiled. If not, please refer to the [Installation page](installation.html). In this tutorial, we will assume that your caffe installation is located at `CAFFE_ROOT`.
We will assume that you have Caffe successfully compiled. If not, please refer to the [Installation page](/installation.html). In this tutorial, we will assume that your caffe installation is located at `CAFFE_ROOT`.

We thank @chyojn for the pull request that defined the model schemas and solver configurations.

Expand All @@ -32,12 +32,12 @@ If it complains that `wget` or `gunzip` are not installed, you need to install t
The Model
---------

The CIFAR-10 model is a CNN that composes layers of convolution, pooling, rectified linear unit (ReLU) nonlinearities, and local contrast normalization with a linear classifier on top of it all. We have defined the model in the `CAFFE_ROOT/examples/cifar10` directory's `cifar10_quick_train.prototxt`.
The CIFAR-10 model is a CNN that composes layers of convolution, pooling, rectified linear unit (ReLU) nonlinearities, and local contrast normalization with a linear classifier on top of it all. We have defined the model in the `CAFFE_ROOT/examples/cifar10` directory's `cifar10_quick_train_test.prototxt`.

Training and Testing the "Quick" Model
--------------------------------------

Training the model is simple after you have written the network definition protobuf and solver protobuf files. Simply run `train_quick.sh`, or the following command directly:
Training the model is simple after you have written the network definition protobuf and solver protobuf files (refer to [MNIST Tutorial](../examples/mnist.html)). Simply run `train_quick.sh`, or the following command directly:

cd $CAFFE_ROOT/examples/cifar10
./train_quick.sh
Expand Down
26 changes: 23 additions & 3 deletions examples/mnist/readme.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ priority: 1

# Training MNIST with Caffe

We will assume that you have caffe successfully compiled. If not, please refer to the [Installation page](installation.html). In this tutorial, we will assume that your caffe installation is located at `CAFFE_ROOT`.
We will assume that you have Caffe successfully compiled. If not, please refer to the [Installation page](/installation.html). In this tutorial, we will assume that your Caffe installation is located at `CAFFE_ROOT`.

## Prepare Datasets

Expand All @@ -29,7 +29,7 @@ The design of LeNet contains the essence of CNNs that are still used in larger m

## Define the MNIST Network

This section explains the prototxt file `lenet_train.prototxt` used in the MNIST demo. We assume that you are familiar with [Google Protobuf](https://developers.google.com/protocol-buffers/docs/overview), and assume that you have read the protobuf definitions used by Caffe, which can be found at `$CAFFE_ROOT/src/caffe/proto/caffe.proto`.
This section explains the `lenet_train_test.prototxt` model definition that specifies the LeNet model for MNIST handwritten digit classification. We assume that you are familiar with [Google Protobuf](https://developers.google.com/protocol-buffers/docs/overview), and assume that you have read the protobuf definitions used by Caffe, which can be found at `$CAFFE_ROOT/src/caffe/proto/caffe.proto`.

Specifically, we will write a `caffe::NetParameter` (or in python, `caffe.proto.caffe_pb2.NetParameter`) protobuf. We will start by giving the network a name:

Expand Down Expand Up @@ -173,6 +173,25 @@ Finally, we will write the loss!

The `softmax_loss` layer implements both the softmax and the multinomial logistic loss (that saves time and improves numerical stability). It takes two blobs, the first one being the prediction and the second one being the `label` provided by the data layer (remember it?). It does not produce any outputs - all it does is to compute the loss function value, report it when backpropagation starts, and initiates the gradient with respect to `ip2`. This is where all magic starts.


### Additional Notes: Writing Layer Rules

Layer definitions can include rules for whether and when they are included in the network definition, like the one below:

layers {
// ...layer definition...
include: { phase: TRAIN }
}

This is a rule, which controls layer inclusion in the network, based on current network's state.
You can refer to `$CAFFE_ROOT/src/caffe/proto/caffe.proto` for more information about layer rules and model schema.

In the above example, this layer will be included only in `TRAIN` phase.
If we change `TRAIN` with `TEST`, then this layer will be used only in test phase.
By default, that is without layer rules, a layer is always included in the network.
Thus, `lenet_train_test.prototxt` has two `DATA` layers defined (with different `batch_size`), one for the training phase and one for the testing phase.
Also, there is an `ACCURACY` layer which is included only in `TEST` phase for reporting the model accuracy every 100 iteration, as defined in `lenet_solver.prototxt`.

## Define the MNIST Solver

Check out the comments explaining each line in the prototxt `$CAFFE_ROOT/examples/mnist/lenet_solver.prototxt`:
Expand Down Expand Up @@ -203,9 +222,10 @@ Check out the comments explaining each line in the prototxt `$CAFFE_ROOT/example
# solver mode: CPU or GPU
solver_mode: GPU


## Training and Testing the Model

Training the model is simple after you have written the network definition protobuf and solver protobuf files. Simply run `train_mnist.sh`, or the following command directly:
Training the model is simple after you have written the network definition protobuf and solver protobuf files. Simply run `train_lenet.sh`, or the following command directly:

cd $CAFFE_ROOT/examples/mnist
./train_lenet.sh
Expand Down

0 comments on commit 987d370

Please sign in to comment.