
Commit

Merge numpy.mxnet.io into mxnet official website
Author: Yang Shi <yangshia@amazon.com>
Date:   Sun Jul 5 17:11:25 2020 -0700

    update links

commit 4b5af0c40d844783b990316b4e98db9575e5e92d
Author: Yang Shi <yangshia@amazon.com>
Date:   Mon Jul 6 11:23:59 2020 -0700

    add deepnumy doc

commit 481dfdef5e55fd7935b64168c83444b80352df9c
Author: Yang Shi <yangshia@amazon.com>
Date:   Sun Jul 5 17:07:06 2020 -0700

    update links

commit feee30627a836a730b5a31593fd85d74bb94a78e
Author: Yang Shi <yangshia@amazon.com>
Date:   Sun Jul 5 16:39:32 2020 -0700

    fix links

commit 4b0ec4c9b5d10e79cce89f520c85791802f1a83a
Author: Yang Shi <yangshia@amazon.com>
Date:   Sun Jul 5 16:13:46 2020 -0700

    test

commit babda9818df2c0d6e459136f7869b73b90964064
Author: Yang Shi <yangshia@amazon.com>
Date:   Sun Jul 5 15:44:12 2020 -0700

    fix internal links

commit 34230a9feaee98d114c5eb861f62cdcce1ad38b1
Author: Yang Shi <yangshia@amazon.com>
Date:   Sun Jul 5 15:07:04 2020 -0700

    update inline markup

commit e6e083e943d99cc209e7b21eb1b402990ee8b29e
Author: Yang Shi <yangshia@amazon.com>
Date:   Sun Jul 5 14:28:17 2020 -0700

    merge numpy site guides
ys2843 committed Jul 13, 2020
1 parent 19e373d commit f436dcc
Showing 26 changed files with 2,940 additions and 236 deletions.
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 1: Manipulate data with NP on MXNet

This getting started exercise introduces the `np` package, which is the primary tool for storing and transforming data on MXNet. If you’ve worked with NumPy before, you’ll notice that `np` is, by design, similar to NumPy.

## Import packages and create an array

To get started, run the following commands to import the `np` package together with the NumPy extensions package `npx`. Together, `np` and `npx` make up the NP on MXNet front end.

```{.python .input n=1}
from mxnet import np, npx
npx.set_np()  # Activate NumPy-like mode.
```

In this step, create a 2D array (also called a matrix). The following code example creates a matrix with values from two sets of numbers: 1, 2, 3 and 5, 6, 7. This might also be referred to as a tuple of a tuple of integers.

```{.python .input n=2}
np.array(((1,2,3),(5,6,7)))
```

You can also create a very simple matrix with the same shape (2 rows by 3 columns), but fill it with 1s.

```{.python .input n=3}
x = np.ones((2,3))
x
```

You can create arrays whose values are sampled randomly, for example, uniformly between -1 and 1. The following code example creates an array with the same shape as before, but filled with random samples.

```{.python .input n=15}
y = np.random.uniform(-1, 1, (2,3))
y
```

You can also fill an array of a given shape with a given value, such as `2.0`.
<!-- added to improve multiplication example -->

```{.python .input n=16}
x = np.full((2,3), 2.0)
x
```

As with NumPy, the dimensions of each ndarray are accessible through the `.shape` attribute. As the following code example shows, you can also query its `.size`, which is equal to the product of the components of the shape. In addition, `.dtype` tells you the data type of the stored values.

```{.python .input n=17}
(x.shape, x.size, x.dtype)
```
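As a quick check (a small optional sketch reusing the `x` defined above, not part of the original tutorial), you can confirm that `size` is the product of the shape components.

```{.python .input}
# 2 rows * 3 columns == 6 elements
x.shape[0] * x.shape[1] == x.size
```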

## Performing operations on an array

An ndarray supports a large number of standard mathematical operations. Here are three examples. You can perform element-wise multiplication by using the following code example.

```{.python .input n=18}
x * y
```

You can perform exponentiation by using the following code example.

```{.python .input n=23}
np.exp(y)
```

You can also find a matrix’s transpose to compute a proper matrix-matrix product by using the following code example.

```{.python .input n=24}
np.dot(x, y.T)
```
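To make the difference between the element-wise product and the matrix product concrete, the following small check (an optional sketch reusing `x` and `y` from above) compares their output shapes.

```{.python .input}
# Element-wise multiplication keeps the (2, 3) shape, while the matrix
# product of a (2, 3) array and its (3, 2) transpose is (2, 2).
((x * y).shape, np.dot(x, y.T).shape)
```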

## Indexing an array

An ndarray supports slicing in many of the ways you might want to access your data. The following code example shows how to read a particular element, which returns a 1D array with shape `(1,)`.

```{.python .input n=25}
y[1,2]
```

This example shows how to read the second and third columns from `y`.

```{.python .input n=26}
y[:,1:3]
```

This example shows how to write to a specific range of elements.

```{.python .input n=27}
y[:,1:3] = 2
y
```

You can perform multi-dimensional slicing, which is shown in the following code example.

```{.python .input n=28}
y[1:2,0:2] = 4
y
```

## Converting between MXNet ndarrays and NumPy ndarrays

You can convert MXNet ndarrays to and from NumPy ndarrays, as shown in the following example. The converted arrays do not share memory.

```{.python .input n=29}
a = x.asnumpy()
(type(a), a)
```

```{.python .input n=30}
np.array(a)
```
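Because the conversion copies the data, modifying the NumPy array does not change the original ndarray. The following small check (an optional sketch reusing `a` and `x` from above) illustrates this.

```{.python .input}
a[0, 0] = -1.0      # Modify the NumPy copy.
(a[0, 0], x[0, 0])  # The MXNet ndarray x still holds its original value.
```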

## Next steps

Learn how to construct a neural network with the Gluon module: [Step 2: Create a neural network](2-nn.md).
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 2: Create a neural network

In this step, you learn how to use NP on MXNet to create neural networks in Gluon. In addition to the `np` package that you learned about in the previous step [Step 1: Manipulate data with NP on MXNet](1-ndarray.md), you also import the neural network `nn` package from `gluon`.

Use the following commands to import the packages required for this step.

```{.python .input n=2}
from mxnet import np, npx
from mxnet.gluon import nn
npx.set_np() # Change MXNet to the numpy-like mode.
```

## Create your neural network's first layer

Use the following code example to start with a dense layer with two output units.
<!-- mention what the none and the linear parts mean? -->

```{.python .input n=31}
layer = nn.Dense(2)
layer
```

Initialize its weights with the default initialization method, which draws random values uniformly from $[-0.07, 0.07]$, as shown in the following example.

```{.python .input n=32}
layer.initialize()
```

Do a forward pass with random data, as shown in the following example. Create a random input `x` with shape $(3,4)$ and feed it into the layer to compute the output.

```{.python .input n=34}
x = np.random.uniform(-1,1,(3,4))
layer(x)
```

As you can see, the layer's two output units produced a $(3,2)$ shape output from the $(3,4)$ input. You didn't specify the input size of `layer` beforehand, though you can specify it with the argument `in_units=4`. The system automatically infers the input size, and creates and initializes the weights, the first time you feed in data. You can access the weight after the first forward pass, as shown in this example.

```{.python .input n=35}
layer.weight.data()
```
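If you prefer to specify the input size up front rather than rely on shape inference, you can pass `in_units` when you construct the layer. The following optional sketch (using a hypothetical `layer2` name, not part of the original tutorial) shows that the weight is then available right after initialization, before any forward pass.

```{.python .input}
layer2 = nn.Dense(2, in_units=4)
layer2.initialize()
layer2.weight.data().shape  # (2, 4), allocated without a forward pass
```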

## Chain layers into a neural network

Consider a simple case where a neural network is a chain of layers. During the forward pass, you run layers sequentially one-by-one. Use the following code to implement a famous network called [LeNet](http://yann.lecun.com/exdb/lenet/) through `nn.Sequential`.

```{.python .input}
net = nn.Sequential()
# Add a sequence of layers. As with Dense, the input channels of the Conv2D
# layers are inferred automatically during the first forward pass.
net.add(nn.Conv2D(channels=6, kernel_size=5, activation='relu'),
        nn.MaxPool2D(pool_size=2, strides=2),
        nn.Conv2D(channels=16, kernel_size=3, activation='relu'),
        nn.MaxPool2D(pool_size=2, strides=2),
        nn.Dense(120, activation='relu'),
        nn.Dense(84, activation='relu'),
        nn.Dense(10))
net
```

<!--Mention the tuple option for kernel and stride as an exercise for the reader? Or leave it out as too much info for now?-->

Using `nn.Sequential` is similar to `nn.Dense`. In fact, both of them are subclasses of `nn.Block`. Use the following code to initialize the weights and run the forward pass.

```{.python .input}
net.initialize()
# Input shape is (batch_size, color_channels, height, width)
x = np.random.uniform(size=(4,1,28,28))
y = net(x)
y.shape
```

You can use `[]` to index a particular layer. For example, the following
accesses the first layer's weight and sixth layer's bias.

```{.python .input}
(net[0].weight.data().shape, net[5].bias.data().shape)
```

## Create a neural network flexibly

In `nn.Sequential`, MXNet will automatically construct the forward function that sequentially executes added layers.
Here is another way to construct a network with a flexible forward function.

Create a subclass of `nn.Block` and implement two methods by using the following code.

- `__init__` creates the layers
- `forward` defines the forward function.
```{.python .input}
class MixMLP(nn.Block):
    def __init__(self, **kwargs):
        # Run `nn.Block`'s init method
        super(MixMLP, self).__init__(**kwargs)
        self.blk = nn.Sequential()
        self.blk.add(nn.Dense(3, activation='relu'),
                     nn.Dense(4, activation='relu'))
        self.dense = nn.Dense(5)
    def forward(self, x):
        y = npx.relu(self.blk(x))
        print(y)
        return self.dense(y)
net = MixMLP()
net
```

In the sequential chaining approach, you can only add instances with `nn.Block` as the base class and then run them in a forward pass. In this example, you used `print` to get the intermediate results and `npx.relu` to apply the ReLU activation. This approach provides a more flexible way to define the forward function.

The following code example uses `net` in a similar manner as earlier.

```{.python .input}
net.initialize()
x = np.random.uniform(size=(2,2))
net(x)
```

Finally, access a particular layer's weight with this code.

```{.python .input n=8}
net.blk[1].weight.data()
```

## Next steps

After you create a neural network, learn how to automatically
compute the gradients in [Step 3: Automatic differentiation with autograd](3-autograd.md).
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

# Step 3: Automatic differentiation with autograd

In this step, you learn how to use the MXNet `autograd` package to perform gradient calculations by automatically calculating derivatives.

You train models to get better as a function of experience. Usually, getting better means minimizing a loss function. To achieve this goal, you often iteratively compute the gradient of the loss with respect to the weights and then update the weights accordingly. The gradient calculation itself is straightforward through the chain rule, but for complex models, working it out by hand is tedious and error-prone.

The `autograd` package saves you this time and effort by calculating the derivatives automatically.

## Basic use

To get started, import the `autograd` package as in the following code.

```{.python .input}
from mxnet import np, npx
from mxnet import autograd
npx.set_np()
```

As an example, you could differentiate a function $f(x) = 2 x^2$ with respect to parameter $x$. You can start by assigning an initial value of $x$, as follows:

```{.python .input n=3}
x = np.array([[1, 2], [3, 4]])
x
```

After you compute the gradient of $f(x)$ with respect to $x$, you need a place to store it. In MXNet, you can tell an ndarray that you plan to store a gradient by invoking its `attach_grad` method, shown in the following example.

```{.python .input n=6}
x.attach_grad()
```
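As a small check (an optional sketch, not required by the rest of this tutorial), you can peek at the storage that `attach_grad` allocated: it has the same shape as `x` and stays zero until a backward pass writes into it.

```{.python .input}
x.grad  # Same shape as x, filled with zeros until y.backward() runs.
```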

Next, define the function $y=f(x)$. To let MXNet store $y$, so that you can compute gradients later, use the following code to put the definition inside an `autograd.record()` scope.

```{.python .input n=7}
with autograd.record():
    y = 2 * x * x
```

You can invoke back propagation (backprop) by calling `y.backward()`. When $y$ has more than one entry, `y.backward()` is equivalent to `y.sum().backward()`, because MXNet implicitly sums a non-scalar output to a scalar before differentiating it.
<!-- I'm not sure what this second part really means. I don't have enough context. TMI?-->

```{.python .input n=8}
y.backward()
```

Next, verify whether this is the expected output. Note that $y=2x^2$ and $\frac{dy}{dx} = 4x$, which should be `[[4, 8],[12, 16]]`. Check the automatically computed results.

```{.python .input n=9}
x.grad
```
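If you want to check the `y.sum().backward()` equivalence mentioned earlier, the following optional sketch (reusing the same `x`) recomputes the gradient from the summed output and overwrites `x.grad` with the same values.

```{.python .input}
with autograd.record():
    z = (2 * x * x).sum()
z.backward()
x.grad  # Same result: [[4, 8], [12, 16]]
```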

## Using Python control flows

Sometimes you want to write dynamic programs where the execution depends on real-time values. MXNet records the execution trace and computes the gradient as well.

Consider the function `f` in the following example code. It doubles its input until the input's absolute sum (`np.abs(b).sum()`) reaches 1000, and then selects one element of the result depending on the sign of the sum of its elements.
<!-- I wonder if there could be another less "mathy" demo of this -->

```{.python .input}
def f(a):
    b = a * 2
    while np.abs(b).sum() < 1000:
        b = b * 2
    if b.sum() >= 0:
        c = b[0]
    else:
        c = b[1]
    return c
```

In this example, you record the trace and feed in a random value.

```{.python .input}
a = np.random.uniform(size=2)
a.attach_grad()
with autograd.record():
    c = f(a)
c.backward()
```

You know that `b` is a linear function of `a`, and that `c` is chosen from `b`. The gradient with respect to `a` will therefore be either `[c/a[0], 0]` or `[0, c/a[1]]`, depending on which element of `b` was picked. You can check the result with the following code, which compares the gradient against `c/a` element by element:

```{.python .input}
a.grad == c/a
```

## Next steps

After you have used `autograd`, learn about training a neural network. See [Step 4: Train the neural network](4-train.md).