- Introduction
- Creating and Optimizing a TensorFlow Inference Graph
- Generating and Compiling Spatial
- Creating and Visualizing Input Data
- Running on the F1
This section provides end-to-end tutorials for running TensorFlow models on the Amazon F1. We will look at two CNNs:
- LeNet
- ResNet-50 (two versions: Original Publication and Official TensorFlow)
Each one will highlight different aspects of the toolchain.
The steps to run each example are also provided in the example's `models` directory.
The Official TensorFlow ResNet example contains a script to download the model trained by Google. Change into the `resnet_official` model example directory and run:

```bash
bash get_resnet_official.sh
```
Next, run the following commands to optimize the model (these instructions are also provided in the example's README).
First, create the unoptimized inference graph:

```bash
python create_inference_graph.py saved_model models/resnet_official/resnet_v1_fp32_savedmodel_NHWC/1538686669 softmax_tensor models/resnet_official/ resnet_official
```
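For intuition, here is a minimal sketch of the kind of freezing this step performs, assuming TensorFlow 1.x; the paths and node names mirror the command above, but this is not the actual `create_inference_graph.py` script:

```python
import tensorflow as tf  # assumes TensorFlow 1.x

# Load the downloaded SavedModel and freeze its variables into constants,
# producing a single .pb GraphDef suitable for inference.
saved_model_dir = "models/resnet_official/resnet_v1_fp32_savedmodel_NHWC/1538686669"

with tf.Session(graph=tf.Graph()) as sess:
    tf.saved_model.loader.load(sess, ["serve"], saved_model_dir)
    frozen = tf.graph_util.convert_variables_to_constants(
        sess, sess.graph.as_graph_def(), ["softmax_tensor"])

with tf.gfile.GFile("models/resnet_official/resnet_official.pb", "wb") as f:
    f.write(frozen.SerializeToString())
```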
Next, optimize it using TensorFlow's graph utilities:

```bash
python optimize_inference_graph.py models/resnet_official/resnet_official.pb input_tensor softmax_tensor 224,224,3
```

This script will create a new, optimized graph.
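For intuition, here is a minimal sketch of the kind of pass such a script can apply, assuming TensorFlow 1.x's `optimize_for_inference_lib`; the actual script may do more (the input-shape argument, for instance, is not used below), and the output filename is illustrative:

```python
import tensorflow as tf  # assumes TensorFlow 1.x
from tensorflow.python.framework import dtypes
from tensorflow.python.tools import optimize_for_inference_lib

# Read the frozen graph produced in the previous step.
graph_def = tf.GraphDef()
with tf.gfile.GFile("models/resnet_official/resnet_official.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

# Strip training-only nodes and fold ops between the named input and output.
optimized = optimize_for_inference_lib.optimize_for_inference(
    graph_def, ["input_tensor"], ["softmax_tensor"],
    dtypes.float32.as_datatype_enum)

# Write out the optimized graph (filename illustrative).
with tf.gfile.GFile("models/resnet_official/resnet_official_opt.pb", "wb") as f:
    f.write(optimized.SerializeToString())
```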
For LeNet and the Original ResNet, the optimized models are already included in the models directory.
Once the optimizations are run, generate Spatial. Once the `.scala` file is generated, follow the printed instructions to move the generated file to the Spatial apps directory and then run compilation.
```bash
python dnn_to_spatial.py models/lenet/lenet.pb
mv lenet.scala spatial/test/spatial/tests/apps/
cd spatial
bin/spatial lenet --synth --forceFuseFMA --fpga=AWS_F1 && cd gen/lenet && make aws-F1
```
Now, modify `dnn_to_spatial.py` to set `include_imagenet_classification` to `True`. Eventually this will be a flag passed into the script or part of a configuration file. It adds extra host code, run after the application execution completes, which reads in the ImageNet class file and prints classification results.
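The edit itself is just flipping a boolean; the variable name comes from the text above, though its exact location in the script is not shown here:

```python
# In dnn_to_spatial.py: enable the extra ImageNet-classification host code.
include_imagenet_classification = True
```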
Now generate Spatial for the ResNets as well.
```bash
python dnn_to_spatial.py models/resnet/ResNet50_opt.pb
mv resnet50opt.scala spatial/test/spatial/tests/apps/
cd spatial
bin/spatial resnet50opt --synth --forceFuseFMA --noBindParallels --fpga=AWS_F1 && cd gen/resnet50opt && make aws-F1
```
```bash
python dnn_to_spatial.py models/resnet_official/resnet_official_opt2.pb
mv resnetofficialopt2.scala spatial/test/spatial/tests/apps/
cd spatial
bin/spatial resnetofficialopt2 --synth --forceFuseFMA --noBindParallels --fpga=AWS_F1 && cd gen/resnetofficialopt2 && make aws-F1
```
For LeNet, we've included an example input. To visualize this file, which is grayscale, run:

```bash
python data/vis.py models/lenet/7.csv grayscale 28,28
```

The result is a rendering of the example input, a handwritten 7.
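For reference, here is a minimal sketch of what a visualizer like `data/vis.py` does, assuming NumPy and matplotlib (this is not the actual script):

```python
import sys
import numpy as np
import matplotlib.pyplot as plt

# Usage: python vis_sketch.py 7.csv grayscale 28,28
path, mode, shape = sys.argv[1], sys.argv[2], sys.argv[3]
dims = tuple(int(d) for d in shape.split(","))

# Read the flat CSV of pixel values and reshape to the requested dimensions.
pixels = np.loadtxt(path, delimiter=",").reshape(dims)
plt.imshow(pixels, cmap="gray" if mode == "grayscale" else None)
plt.show()
```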
For the ResNets, we will use an image from the internet. We have selected a fire truck, but encourage you to select any image from the ImageNet classification list and test that the classification works for a variety of images.
Once you download this image (we tested with a .jpg), convert it to a .csv format:

```bash
python data/img_to_csv.py /path/to/img.jpg
```
You can then rename this .csv and also visualize it:

```bash
python data/vis.py input0.csv
```

No additional arguments are needed because RGB is the default and 224,224 is the standard input size.
Note that the `img_to_csv.py` usage above converts the image to a .csv and reshapes it, but does not preprocess it (e.g. scaling, mean subtraction). This is because for the Original ResNet, this preprocessing is done in the TensorFlow graph. For the Official TensorFlow ResNet, however, the preprocessing is done in Python before the TensorFlow graph is built (for details, see the preprocessing code in the official TensorFlow models repository). This means that the Official ResNet needs an input with the preprocessing already performed: first a division by 255.0, then mean subtraction from each channel with the mean vector [123.68, 116.78, 103.94].
To generate another .csv with this preprocessing, run:

```bash
python data/img_to_csv.py /path/to/img.jpg 224,224,3 123.68,116.78,103.94 255.0
```
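For intuition, here is a minimal sketch of this preprocessing, assuming Pillow and NumPy and following the order stated above; the real `img_to_csv.py` may differ in details such as resizing and CSV layout:

```python
import numpy as np
from PIL import Image

# Per-channel means from the tutorial text.
MEANS = np.array([123.68, 116.78, 103.94], dtype=np.float32)

# Load the image as RGB and resize it to the 224x224 network input size.
img = Image.open("/path/to/img.jpg").convert("RGB").resize((224, 224))
pixels = np.asarray(img, dtype=np.float32)

# Preprocessing order as stated above: divide by 255.0, then subtract the
# per-channel means (how the real script combines these is an assumption).
pixels = pixels / 255.0 - MEANS

# Flatten to a single CSV row (the exact layout is an assumption).
np.savetxt("input0_preprocessed.csv", pixels.reshape(1, -1), delimiter=",")
```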
When running the Spatial `Top` executable, pass the appropriate .csv file as an argument.
The remaining steps follow the AWS instructions. Once the Spatial compilation finishes, create the AGFI ID in each generated directory by running:

```bash
bash create_spatial_AFI_instructions.sh
```
The output will look something like this:
{
"FpgaImageId": "afi-xxxxxxxxxxxxxxxxx",
"FpgaImageGlobalId": "agfi-xxxxxxxxxxxxxxxxx"
}
For the three examples, our global (AGFI) IDs were:

- LeNet: `054891e89ff91ffed`
- ResNet (Official): `0603ca85f32ab0a6a`
- ResNet (Original): `0b6c5a9d112305248`
Now edit `load.sh` in each `software/runtime` directory and paste in your AGFI ID.
Open an F1 instance as the AWS instructions describe, then copy the following to the F1 instance for each app:

- `load.sh`
- the `Top` binary
- the input .csv files
- the class list file (for ImageNet), which is in `data/imagenet_classes.csv`
- the weights binary files, which exist in the example's directory under `models`
Run the ResNet example using:

```bash
bash load.sh
sudo ./Top /path/to/input.csv /path/to/classes.csv /path/to/weights/directory/
```
For example:

```bash
bash load.sh
sudo ./Top /home/centos/src/project_data/input.csv /home/centos/src/project_data/imagenet_classes.csv /home/centos/src/project_data/ResNet50_opt_spatial_weights/
```
First the weights will be copied from the host DRAM to the FPGA DRAM, and then the inference will be run.
The output is as follows (shown for both versions):

[Screenshots for each version: one taken before the output prints, one at the end of the output]
After the application finishes, the profiling information is printed first, followed by the output of the final DNN layer (before SoftMax). Then, because ImageNet classification was set to `True` and the class file was passed as an argument, the class names are also printed.
Both versions have roughly the same execution times (within 5%) and the same Top-1 classification.
Note that currently only 32-bit computation is supported; lower precisions may be supported in the future.
Note also that while only a single inference was run and timed here, once the weights are copied to the FPGA DRAM an unlimited number of inferences can be run, i.e. the weights only need to be copied once. For now this requires editing `TopHost.cpp` to wrap a loop around the call to `run()`.
Now run the LeNet app using:

```bash
bash load.sh
sudo ./Top /path/to/input.csv /path/to/weights/directory/
```
For example:

```bash
bash load.sh
sudo ./Top 7.csv /home/centos/src/project_data/lenet_spatial_weights/
```
The output is as follows:

[Screenshot: LeNet output]

Notice that for LeNet, only the 10 outputs were printed. These are the pre-SoftMax outputs corresponding to classes 0 through 9. The eighth output is the highest, which represents class 7, as expected.
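To map the printed outputs to a digit yourself, take the argmax of the 10 values (a sketch with dummy values; paste in the numbers printed by `Top`):

```python
import numpy as np

# Replace with the 10 pre-SoftMax outputs printed by Top (dummy values shown).
logits = np.array([-1.2, 0.3, -0.5, 1.1, -2.0, 0.0, -0.7, 9.8, -1.5, 0.4])

# Index 7 (the eighth value) is the largest, so the predicted digit is 7.
print("Predicted class:", int(np.argmax(logits)))
```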
In order to print arbitrary class names for a custom dataset other than ImageNet, follow these steps:

- Set `include_imagenet_classification` to `True` as described above
- After running `dnn_to_spatial.py`, edit the bottom of the generated `.scala` file to specify the number of classes and how many top predictions should be printed
In the future, this support for custom class names may also be added to the `dnn_to_spatial.py` script.