[PYTHON, TVM] Python TVM library, unit tests and end to end example

* VTA python library
* Python unit tests
* End to end example with Resnet18
* README instructions
* Bug fixes
tqchen · Jul 12, 2018 · commit 16b5877
1 parent e7557db

Showing 35 changed files with 4,046 additions and 77 deletions.
diff --git a/Makefile b/Makefile
@@ -55,10 +55,10 @@ endif
 all: lib/libvta.$(SHARED_LIBRARY_SUFFIX)
 
 VTA_LIB_SRC = $(wildcard src/*.cc src/tvm/*.cc)
-ifeq ($(TARGET), PYNQ_TARGET)
+ifeq ($(TARGET), VTA_PYNQ_TARGET)
 	VTA_LIB_SRC += $(wildcard src/pynq/*.cc)
 	LDFLAGS += -L/usr/lib -lsds_lib
-	LDFLAGS += -L/opt/python3.6/lib/python3.6/site-packages/pynq/drivers/ -l:libdma.so
+	LDFLAGS += -L/opt/python3.6/lib/python3.6/site-packages/pynq/lib/ -l:libdma.so
 endif
 VTA_LIB_OBJ = $(patsubst %.cc, build/%.o, $(VTA_LIB_SRC))
 
@@ -79,7 +79,7 @@ cpplint:
 	python nnvm/dmlc-core/scripts/lint.py vta cpp include src hardware tests
 
 pylint:
-	pylint python/vta --rcfile=$(ROOTDIR)/tests/lint/pylintrc
+	pylint python/tvm_vta --rcfile=$(ROOTDIR)/tests/lint/pylintrc
 
 doc:
 	doxygen docs/Doxyfile

diff --git a/apps/pynq_rpc/README.md b/apps/pynq_rpc/README.md
@@ -0,0 +1,80 @@
+# PYNQ RPC Server for VTA
+
+This guide describes how to set up a Pynq-based RPC server to accelerate deep learning workloads with VTA.
+
+## Pynq Setup
+
+Follow the getting started tutorial for the [Pynq board](http://pynq.readthedocs.io/en/latest/getting_started.html).
+* For this RPC setup make sure to go with the *Connect to a Computer* Ethernet setup.
+
+Make sure that you can ssh into your Pynq board successfully:
+```bash
+ssh xilinx@192.168.2.99
+```
+
+When ssh-ing onto the board, the default password for the `xilinx` account is `xilinx`.
+
+For convenience, mount the Pynq board's file system on your host PC so that you can access and maintain it easily:
+```bash
+sshfs xilinx@192.168.2.99:/home/xilinx <mountpoint>
+```
+
+## Pynq TVM & VTA installation
+
+On your **host PC**, go to the `<mountpoint>` directory of your Pynq board file system.
+```bash
+cd <mountpoint>
+```
+
+From there, clone the VTA repository:
+```bash
+git clone git@github.com:uwsaml/vta.git --recursive
+```
+
+Next, clone the TVM repository:
+```bash
+git clone git@github.com:dmlc/tvm.git --recursive
+```
+
+TVM is rapidly changing, and to ensure stability, we keep track of working TVM checkpoints.
+As of now, the TVM checkpoint `e4c2af9abdcb3c7aabafba8084414d7739c17c4c` is known to work with VTA.
+```bash
+git -C tvm checkout e4c2af9abdcb3c7aabafba8084414d7739c17c4c
+```
+
+Now, ssh into your **Pynq board** to build the TVM runtime with the following commands:
+```bash
+ssh xilinx@192.168.2.99 # ssh if you haven't done so
+cd ~/tvm
+cp make/config.mk .
+echo USE_RPC=1 >> config.mk
+make runtime -j2
+```
+
+## Pynq RPC server setup
+
+We're now ready to build the Pynq RPC server on the Pynq board.
+```bash
+ssh xilinx@192.168.2.99 # ssh if you haven't done so
+cd ~/vta
+export TVM_PATH=/home/xilinx/tvm
+make
+```
+
+The last step builds the `192.168.2.99:/home/xilinx/vta/lib/libvta.so` library file. We are now ready to launch the RPC server on the Pynq. To enable the FPGA drivers, the RPC server must run with administrator privileges (using `su`, account: `xilinx`, password: `xilinx`).
+```bash
+ssh xilinx@192.168.2.99 # ssh if you haven't done so
+cd ~/vta
+su
+./apps/pynq_rpc/start_rpc_server.sh
+```
+
+You should see the following output when the RPC server starts:
+```
+INFO:root:Load additional library /home/xilinx/vta/lib/libvta.so
+INFO:root:RPCServer: bind to 0.0.0.0:9091
+```
+
+Note that the server should be listening on port `9091`.
+
+To stop the RPC server, press `Ctrl + C`.
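Not part of this commit: a minimal Python 3 sketch for checking from the host PC that the RPC server described in the README above is reachable. It only tests that a TCP connection can be opened. It does not speak the TVM RPC protocol, and the helper name `is_port_open` is hypothetical.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

With the board address and port used in the README, the call would be `is_port_open("192.168.2.99", 9091)`. A `False` result suggests checking the Ethernet setup or whether `start_rpc_server.sh` is still running.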
diff --git a/apps/pynq_rpc/start_rpc_server.sh b/apps/pynq_rpc/start_rpc_server.sh
@@ -1,4 +1,4 @@
 #!/bin/bash
 export PYTHONPATH=${PYTHONPATH}:/home/xilinx/tvm/python
-export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/opt/python3.6/lib/python3.6/site-packages/pynq/drivers/
+export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/opt/python3.6/lib/python3.6/site-packages/pynq/lib/
 python -m  tvm.exec.rpc_server --load-library /home/xilinx/vta/lib/libvta.so
diff --git a/examples/resnet18/pynq/.gitignore b/examples/resnet18/pynq/.gitignore
@@ -0,0 +1,5 @@
+quantize_graph.json
+quantize_params.pkl
+synset.txt
+*.jpg
+vta.bit
diff --git a/examples/resnet18/pynq/README.md b/examples/resnet18/pynq/README.md
@@ -0,0 +1,98 @@
+# Resnet-18 Example on Pynq-based VTA Design
+
+In order to run this example you'll need to have:
+* VTA installed
+* TVM installed
+* NNVM installed
+* A Pynq-based RPC server running
+
+## VTA installation
+
+Clone the VTA repository in the directory of your choosing:
+```bash
+git clone git@github.com:uwsaml/vta.git --recursive
+```
+
+Update your `~/.bashrc` file to include the VTA python libraries in your `PYTHONPATH` (don't forget to source the newly modified `.bashrc` file!):
+```bash
+export PYTHONPATH=<vta root>/python:${PYTHONPATH}
+```
+
+## TVM installation
+
+Clone the TVM repository in the directory of your choosing:
+```bash
+git clone git@github.com:dmlc/tvm.git --recursive
+```
+
+TVM is rapidly changing, and to ensure stability, we keep track of working TVM checkpoints.
+As of now, the TVM checkpoint `e4c2af9abdcb3c7aabafba8084414d7739c17c4c` is known to work with VTA.
+```bash
+git -C tvm checkout e4c2af9abdcb3c7aabafba8084414d7739c17c4c
+```
+
+Before building TVM, copy the `make/config.mk` file into the root TVM directory:
+```bash
+cd <tvm root>
+cp make/config.mk .
+```
+
+In the `config.mk` file, make sure that:
+* `LLVM_CONFIG` points to the llvm-config executable (e.g. `LLVM_CONFIG = /usr/bin/llvm-config-4.0`). You'll need LLVM 4.0 or later installed.
+* `USE_RPC` is set to 1
+
+Launch the compilation; this takes about 5 minutes.
+```bash
+cd <tvm root>
+make -j4
+```
+
+Finally update your `~/.bashrc` file to include the TVM python libraries in your `PYTHONPATH` (don't forget to source the newly modified `.bashrc` file!):
+```bash
+export PYTHONPATH=<tvm root>/python:<tvm root>/topi/python:${PYTHONPATH}
+```
+
+## NNVM installation
+
+Clone the NNVM repository from `tqchen` in the directory of your choosing:
+```bash
+git clone git@github.com:tqchen/nnvm.git --recursive
+```
+
+To run this example, we rely on a special branch of NNVM: `qt`:
+```bash
+cd <nnvm root>
+git checkout qt
+```
+
+Launch the compilation; this takes less than a minute.
+```bash
+cd <nnvm root>
+make -j4
+```
+
+Finally update your `~/.bashrc` file to include the NNVM python libraries in your `PYTHONPATH` (don't forget to source the newly modified `.bashrc` file!):
+```bash
+export PYTHONPATH=<nnvm root>/python:${PYTHONPATH}
+```
+
+## Pynq RPC Server Setup
+
+Follow the [Pynq RPC Server Guide](https://github.com/uwsaml/vta/tree/master/apps/pynq_rpc/README.md).
+
+## Running the example
+
+Simply run the following python script:
+```bash
+python imagenet_predict.py
+```
+
+This runs ImageNet classification on a cat image, `cat.jpg`, using the ResNet18 architecture on a VTA design that performs 8-bit integer inference.
+
+The script reports runtime measured on the Pynq board, and the top-1 result category:
+```
+('x', (1, 3, 224, 224))
+Build complete...
+('TVM prediction top-1:', 281, 'tabby, tabby cat')
+t-cost=0.41906
+```
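Not part of this commit: a minimal Python 3 sketch for confirming that the `PYTHONPATH` entries added in the README above are picked up before running the example. The helper name `missing_modules` is hypothetical, and the check only confirms that a top-level module can be located, not that its native libraries load.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on sys.path."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

For this example, `missing_modules(["tvm", "topi", "nnvm", "vta"])` should return an empty list once `.bashrc` has been updated and sourced. Any name it returns points at the `PYTHONPATH` line to revisit.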
diff --git a/examples/resnet18/pynq/imagenet_predict.py b/examples/resnet18/pynq/imagenet_predict.py
@@ -0,0 +1,174 @@
+# some standard imports
+import nnvm
+import tvm
+from nnvm.compiler import graph_attr
+import vta
+import os
+import numpy as np
+from PIL import Image
+import pickle
+import json
+import logging
+import wget
+from tvm.contrib import graph_runtime, rpc, util
+
+factor = 16
+host = "pynq"
+port = 9091
+verbose = False
+# only run fpga component, mark non-conv ops as nop
+debug_fpga_only = False
+
+# Obtain model and hardware files (they're too large to check-in)
+url = "https://homes.cs.washington.edu/~moreau/media/vta/"
+TEST_FILE = 'cat.jpg'
+CATEG_FILE = 'synset.txt'
+RESNET_GRAPH_FILE = 'quantize_graph.json'
+RESNET_PARAMS_FILE = 'quantize_params.pkl'
+BITSTREAM_FILE = 'vta.bit'
+for file in [TEST_FILE, CATEG_FILE, RESNET_GRAPH_FILE, RESNET_PARAMS_FILE, BITSTREAM_FILE]:
+    if not os.path.isfile(file):
+        print("Downloading {}".format(file))
+        wget.download(url+file)
+
+# Program the FPGA remotely
+assert tvm.module.enabled("rpc")
+remote = rpc.connect(host, port)
+remote.upload(BITSTREAM_FILE, BITSTREAM_FILE)
+fprogram = remote.get_function("tvm.contrib.vta.init")
+fprogram(BITSTREAM_FILE)
+
+if verbose:
+    logging.basicConfig(level=logging.INFO)
+
+# Change to -device=tcpu to run cpu only inference.
+target = "llvm -device=vta"
+
+synset = eval(open(os.path.join(CATEG_FILE)).read())
+image = Image.open(os.path.join(TEST_FILE)).resize((224, 224))
+
+def transform_image(image):
+    image = np.array(image) - np.array([123., 117., 104.])
+    image /= np.array([58.395, 57.12, 57.375])
+    image = image.transpose((2, 0, 1))
+    image = image[np.newaxis, :]
+    return image
+
+def mark_nop(graph, conv_layer=-1, skip_conv_layer=()):
+    """Helper function to mark certain op as nop
+
+    Useful to debug performance issues.
+    """
+    jgraph = json.loads(graph.json())
+    counter = 0
+    for nid, node in enumerate(jgraph["nodes"]):
+        op_name = node["op"]
+        if op_name != "tvm_op":
+            continue
+        attrs = node["attrs"]
+        node_name = node["name"]
+        func_name = attrs["func_name"]
+        if func_name.find("quantized_conv2d") != -1:
+            if conv_layer >= 0:
+                if counter != conv_layer:
+                    attrs["func_name"] = "__nop"
+            if counter in skip_conv_layer:
+                attrs["func_name"] = "__nop"
+            counter += 1
+        else:
+            if conv_layer >= 0:
+                attrs["func_name"] = "__nop"
+            attrs["func_name"] = "__nop"
+        if attrs["func_name"] != "__nop":
+            print("Run function %s"% func_name)
+    graph = nnvm.graph.load_json(json.dumps(jgraph))
+    return graph
+
+x = transform_image(image)
+print('x', x.shape)
+
+######################################################################
+# now compile the graph
+import nnvm.compiler
+np.random.seed(0)
+sym = nnvm.graph.load_json(
+    open(os.path.join(RESNET_GRAPH_FILE)).read())
+params = pickle.load(
+    open(os.path.join(RESNET_PARAMS_FILE)))
+
+shape_dict = {"data": x.shape}
+dtype_dict = {"data": 'float32'}
+shape_dict.update({k: v.shape for k, v in params.items()})
+dtype_dict.update({k: str(v.dtype) for k, v in params.items()})
+
+graph = nnvm.graph.create(sym)
+graph_attr.set_shape_inputs(sym, shape_dict)
+graph_attr.set_dtype_inputs(sym, dtype_dict)
+graph = graph.apply("InferShape").apply("InferType")
+
+dtype = "float32"
+sym = vta.graph.remove_stochastic(sym)
+sym = vta.graph.clean_cast(sym)
+sym = vta.graph.clean_conv_fuse(sym)
+if "vta" in target:
+    sym = vta.graph.pack(sym, shape_dict, factor)
+
+graph_attr.set_shape_inputs(sym, shape_dict)
+sym = sym.apply("InferShape")
+graph_attr.set_dtype_inputs(sym, dtype_dict)
+sym = sym.apply("InferType")
+
+with nnvm.compiler.build_config(opt_level=3):
+    bdict = {}
+    if "vta" not in target:
+        bdict = {"add_lower_pass": []}
+    else:
+        bdict = {"add_lower_pass": vta.debug_mode(0)}
+    with tvm.build_config(**bdict):
+        graph, lib, params = nnvm.compiler.build(
+            sym, target, shape_dict, dtype_dict,
+            params=params)
+
+remote = rpc.connect(host, port)
+temp = util.tempdir()
+lib.save(temp.relpath("graphlib.o"))
+remote.upload(temp.relpath("graphlib.o"))
+lib = remote.load_module("graphlib.o")
+ctx = remote.ext_dev(0) if "vta" in target else remote.cpu(0)
+
+print("Build complete...")
+
+def run_e2e(graph):
+    """Running end to end example
+    """
+    if debug_fpga_only:
+        graph = mark_nop(graph, skip_conv_layer=(0,))
+    m = graph_runtime.create(graph, lib, ctx)
+    # set inputs
+    m.set_input('data', tvm.nd.array(x.astype("float32")))
+    m.set_input(**params)
+    # execute
+    timer = m.module.time_evaluator("run", ctx, number=10)
+    tcost = timer()
+    # get outputs
+    tvm_output = m.get_output(
+        0, tvm.nd.empty((1000,), dtype, remote.cpu(0)))
+    top1 = np.argmax(tvm_output.asnumpy())
+    print('TVM prediction top-1:', top1, synset[top1])
+    print("t-cost=%g" % tcost.mean)
+
+
+def run_layer(old_graph):
+    """Run a certain layer."""
+    for layer_id in range(1, 2):
+        graph = mark_nop(old_graph, layer_id)
+        m = graph_runtime.create(graph, lib, ctx)
+        # set inputs
+        m.set_input('data', tvm.nd.array(x.astype("float32")))
+        m.set_input(**params)
+        # execute
+        timer = m.module.time_evaluator("run", ctx, number=10)
+        tcost = timer()
+        print("resnet[%d]: %g\n"% (layer_id, tcost.mean))
+
+run_e2e(graph)
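Not part of this commit: a minimal Python 3 sketch of the selection rule that `mark_nop` applies in `imagenet_predict.py`, written against a plain dictionary so it runs without NNVM. The graph layout (`nodes`, `op`, `attrs`, `func_name`) follows what the script reads. The toy node and function names, and the helpers `mark_nop_nodes` and `active_functions`, are hypothetical.

```python
import copy


def mark_nop_nodes(jgraph, conv_layer=-1, skip_conv_layer=()):
    """Apply the mark_nop selection rule to a plain dict graph.

    Returns a new graph dict; the input is left unchanged.
    """
    jgraph = copy.deepcopy(jgraph)
    counter = 0
    for node in jgraph["nodes"]:
        if node["op"] != "tvm_op":
            continue  # inputs and parameters are left untouched
        attrs = node["attrs"]
        if "quantized_conv2d" in attrs["func_name"]:
            # When conv_layer >= 0, keep only the selected convolution.
            if conv_layer >= 0 and counter != conv_layer:
                attrs["func_name"] = "__nop"
            if counter in skip_conv_layer:
                attrs["func_name"] = "__nop"
            counter += 1
        else:
            # Every non-convolution operator is disabled.
            attrs["func_name"] = "__nop"
    return jgraph


def active_functions(jgraph):
    """List func_name of tvm_op nodes that were not turned into __nop."""
    return [n["attrs"]["func_name"] for n in jgraph["nodes"]
            if n["op"] == "tvm_op" and n["attrs"]["func_name"] != "__nop"]


# Toy graph with hypothetical node and function names.
toy = {"nodes": [
    {"op": "null", "name": "data"},
    {"op": "tvm_op", "name": "conv0",
     "attrs": {"func_name": "fuse_quantized_conv2d"}},
    {"op": "tvm_op", "name": "relu0",
     "attrs": {"func_name": "fuse_relu"}},
    {"op": "tvm_op", "name": "conv1",
     "attrs": {"func_name": "fuse_quantized_conv2d_1"}},
]}
```

On the toy graph, `mark_nop_nodes(toy, skip_conv_layer=(0,))` mirrors the `debug_fpga_only` call in `run_e2e` and leaves only the second convolution active. `mark_nop_nodes(toy, conv_layer=0)` mirrors `run_layer` and keeps only the first.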
diff --git a/hardware/vivado/.gitignore b/hardware/xilinx/.gitignore