Built a handwritten digit recognition system using a Convolutional Neural Network (CNN) trained on the MNIST dataset.
I used a simple architecture based on LeNet, a very popular network (k = kernel size, s = stride, p = padding):
1. Input - 1×28×28
2. Convolution - k = 5, s = 1, p = 0, 20 filters
3. ReLU
4. Max pooling - k = 2, s = 2, p = 0
5. Convolution - k = 5, s = 1, p = 0, 50 filters
6. ReLU
7. Max pooling - k = 2, s = 2, p = 0
8. Fully Connected layer - 500 neurons
9. ReLU
10. Loss layer - a softmax function converts the network output into a probability for each digit class
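As a quick sanity check on the layer list above, the feature-map size after each convolution and pooling step can be worked out with the standard formula out = (in + 2p - k) / s + 1, rounded down. The sketch below is not the project's actual code; the helper names `conv_out`, `LAYERS`, and `trace_shapes` are made up for illustration.

```python
def conv_out(size, k, s=1, p=0):
    """Output side length of a convolution or pooling layer on a square input."""
    return (size + 2 * p - k) // s + 1

# (name, kernel, stride, padding, output channels); None keeps the channel count.
# Values are copied from the layer list above.
LAYERS = [
    ("conv1", 5, 1, 0, 20),
    ("pool1", 2, 2, 0, None),
    ("conv2", 5, 1, 0, 50),
    ("pool2", 2, 2, 0, None),
]

def trace_shapes(channels=1, size=28):
    """Return (layer name, channels, height, width) for the input and each layer."""
    shapes = [("input", channels, size, size)]
    for name, k, s, p, out_channels in LAYERS:
        size = conv_out(size, k, s, p)
        if out_channels is not None:
            channels = out_channels
        shapes.append((name, channels, size, size))
    return shapes

shapes = trace_shapes()
for name, c, h, w in shapes:
    print(f"{name}: {c}x{h}x{w}")

# Number of values flattened into the 500-neuron fully connected layer
_, c, h, w = shapes[-1]
fc_inputs = c * h * w
print("fully connected inputs:", fc_inputs)
```

Under these settings the 28×28 input shrinks to 24×24, 12×12, 8×8 and finally 4×4, so the fully connected layer receives 50 × 4 × 4 = 800 values.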
The original image given as input to the trained model:
The same image immediately before it is passed to the CNN forward pass as input data:
The following shows the output of the loss layer, which represents the probability assigned to each digit (from 0 to 9):
P = 0.0009 0.0000 0.0388 0.8451 0.0000 0.0000 0.0000 0.0000 0.1152 0.0000
The model correctly labelled this image as a 3 with 84.51% probability. The incorrect label 8 received 11.52%.
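The read-out above can be reproduced in a few lines. This is a minimal sketch and not the project's code: `P` is copied from the output shown above, while the `softmax` helper and the `example_scores` values are illustrative assumptions, included only to show how raw scores would become probabilities.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(x - m) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores, only to show that softmax yields a distribution
example_scores = [1.0, 2.0, 3.0]
example_probs = softmax(example_scores)

# Probabilities reported by the loss layer for digits 0 through 9
P = [0.0009, 0.0000, 0.0388, 0.8451, 0.0000,
     0.0000, 0.0000, 0.0000, 0.1152, 0.0000]

# Rank digits by probability, highest first
ranked = sorted(range(len(P)), key=lambda d: P[d], reverse=True)
predicted, runner_up = ranked[0], ranked[1]

print(f"predicted digit: {predicted} ({P[predicted]:.2%})")
print(f"runner-up: {runner_up} ({P[runner_up]:.2%})")
```

The predicted class is simply the index of the largest entry in `P`. The runner-up is useful for spotting which digits the model tends to confuse, here 3 and 8.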