
Deep Learning Toy


Lightweight deep learning library implemented in Python. Designed for studying how contemporary deep learning libraries are implemented.


Architecture

There are several core ideas used by the framework: the computational graph, forward propagation, the loss/cost function, gradient descent, and backward propagation. A computational graph is a graph representing an ordered set of primitive algebraic operations. Forward propagation feeds an input into the computational graph and produces the output. A loss function is a metric measuring how well a model estimates a class or a value based on the input; usually, a loss function produces a scalar value. Gradient descent is a calculus-based approach to minimizing the loss function: to minimize a function, follow the direction opposite to the gradients of its variables. Backward propagation takes a graph in the state left by forward propagation and calculates gradients starting from the output towards the input; this direction from the head of the computational graph towards the tail follows from the chain rule of calculus.
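
None of these ideas require framework machinery. Here is a minimal, self-contained sketch in plain Python of one training loop chaining a forward pass, a scalar loss, the chain rule, and a gradient descent update (the model, data, and learning rate are illustrative):

# Fit prediction = w * x to a single training pair using gradient descent.
w = 0.0                    # model parameter (a "variable")
x, target = 2.0, 10.0      # training input and the desired output

for step in range(100):
    prediction = w * x                  # forward propagation
    loss = (prediction - target) ** 2   # scalar loss function
    # backward propagation: chain rule from the loss back to w
    dloss_dw = 2 * (prediction - target) * x
    w -= 0.01 * dloss_dw                # gradient descent update

print(w)  # converges towards 5.0, since 5.0 * 2 == 10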

Computational Graph

The ComputationalGraph class is equipped with methods representing primitive algebraic operations. Each method takes an input and produces an output. Inputs and outputs are represented by the Connection class, and operations by the Node class. There are two types of connections: constants and variables. Constants do not change during model optimization, while variables may be updated by the optimization process. Here is an example of a primitive computational graph which adds two numbers:

from pydeeptoy.computational_graph import *

cg = ComputationalGraph()
sum_result = cg.sum(cg.constant(1), cg.constant(2))

The code listed above builds the computational graph, but doesn't execute it. In order to execute the graph, the SimulationContext class should be used. The simulation context implements forward/backward propagation. In addition, it stores the computation results produced by each and every operation, including the gradients obtained during the backward phase. The following code executes the computational graph described above:

from pydeeptoy.computational_graph import *
from pydeeptoy.simulation import *

cg = ComputationalGraph()
sum_result = cg.sum(cg.constant(1), cg.constant(2))

ctx = SimulationContext()
ctx.forward(cg)

print("1+2={}".format(ctx[sum_result].value))

Atomic Operations

A computational graph is composed of a set of operations. An operation is the minimal building block of a computational graph. In the framework, an operation is represented by the abstract Node class. Every operation takes an input in the form of a numpy array or a scalar value and produces either a scalar value or a numpy array. In other words, a computational graph passes a tensor through itself; that is why one of the most popular deep learning frameworks is called TensorFlow. The following operations are implemented in the computational_graph module (a short composition example follows the table):

Operation        Description
sum              Computes the sum of two tensors.
multiply         Computes the product of two tensors.
matrix_multiply  Computes the product of two matrices (i.e. 2-dimensional tensors).
div              Divides one tensor by another.
exp              Calculates the exponential of all elements in the input tensor.
log              Computes the natural logarithm, element-wise.
reduce_sum       Computes the sum of elements across dimensions of a tensor.
max              Computes the element-wise maximum of tensor elements.
broadcast        Broadcasts a tensor to a compatible shape.
transpose        Permutes the dimensions of a tensor.
reshape          Gives a new shape to a tensor without changing its data.
conv2d           Computes a 2-D convolution given 4-D input and filter tensors.
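
Since every operation returns a connection that can feed another operation, these primitives compose into larger expressions. A small sketch, assuming the operations follow the two-argument style of sum shown earlier, evaluates a * b + c:

import numpy as np

from pydeeptoy.computational_graph import *
from pydeeptoy.simulation import *

cg = ComputationalGraph()
a = cg.constant(np.array([1.0, 2.0]))
b = cg.constant(np.array([3.0, 4.0]))
c = cg.constant(np.array([0.5, 0.5]))

# a * b + c, composed from the multiply and sum primitives
result = cg.sum(cg.multiply(a, b), c)

ctx = SimulationContext()
ctx.forward(cg)
print(ctx[result].value)  # expected: [3.5 8.5]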

Activation Functions

Activation functions are used for thresholding a single neuron's output. First, a neuron calculates its output as the weighted sum of its inputs. Second, the calculated weighted sum is fed into the activation function. Finally, the activation function produces the final neuron output. Usually, an activation function's output is normalized to lie between 0 and 1, or between -1 and 1. Several common activation functions are implemented by the framework.
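
Activation functions can also be expressed with the primitives above. For instance, assuming max computes the element-wise maximum of two tensors (numpy.maximum semantics, which is an assumption about its signature), a ReLU activation can be sketched directly in the graph:

import numpy as np

from pydeeptoy.computational_graph import *
from pydeeptoy.simulation import *

cg = ComputationalGraph()
x = cg.constant(np.array([-1.0, 0.5, 2.0]))
# ReLU(x) = max(x, 0), built from the max primitive
relu_out = cg.max(x, cg.constant(0))

ctx = SimulationContext()
ctx.forward(cg)
print(ctx[relu_out].value)  # expected: [0.  0.5 2. ]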

Loss Functions

Loss functions are used as a measure of model performance. Usually, a loss function produces a scalar value telling how well a model estimates the output based on the input. Needless to say, a universal loss function which fits all model flavours doesn't exist. The loss functions provided by the framework are implemented in the losses module.
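
To illustrate how a loss reduces to the primitives above, here is a hedged sketch of a sum-of-squared-errors loss (the data is illustrative, and subtraction is expressed as addition of a negated tensor, since only sum, multiply, and reduce_sum are assumed):

import numpy as np

from pydeeptoy.computational_graph import *
from pydeeptoy.simulation import *

cg = ComputationalGraph()
prediction = cg.constant(np.array([0.9, 0.1]))
target = cg.constant(np.array([1.0, 0.0]))

# error = prediction - target, written as prediction + (-1 * target)
error = cg.sum(prediction, cg.multiply(cg.constant(-1), target))
loss = cg.reduce_sum(cg.multiply(error, error))  # sum of squared errors

ctx = SimulationContext()
ctx.forward(cg)
print(ctx[loss].value)  # expected: 0.02 (= 0.1**2 + 0.1**2)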

Computational Graph Visualization

It is important to be able to visualize a complex computational graph. First, it helps to understand how a model works. Second, having a visual representation of a computational graph helps with debugging and finding issues.

The example demonstrates how to render a computational graph on a web page, using the d3.js library on the frontend and the Flask web framework on the backend. An interactive demo renders the computational graph of a 4-layer neural network.


Usage Examples

The set of primitive building blocks provided by the framework can be used to build robust estimators. The benefit of using the framework is that you do not have to implement forward/backward propagation from scratch for every kind of estimator.

Estimator                     Iris     MNIST    CIFAR-10
Support Vector Machine (SVM)  Example  -        -
Multilayer Perceptron         Example  Example  -

License

MIT license
