muRISCV-NN is a collection of efficient deep learning kernels for embedded platforms and microcontrollers. It is based on ARM's CMSIS-NN library but targets the RISC-V ISA instead.
It offers accelerated kernels using the RISC-V "V" vector extension v1.0 and the RISC-V packed "P" extension v0.9.6.
muRISCV-NN aims to stay functionally equivalent to CMSIS-NN, so users of either library should notice no functional difference. This makes muRISCV-NN a drop-in replacement for CMSIS-NN that can be used with embedded deep learning frameworks such as TensorFlow Lite for Microcontrollers (TFLM) or microTVM.
We provide integration for both TFLM and microTVM in the `Integration/` directory. Using these deep learning frameworks, we are able to run the complete suite of MLPerf Tiny Deep Learning Benchmarks, consisting of MobileNet, ResNet, and AutoEncoder models.
You can simulate muRISCV-NN using a number of different simulators. We provide support for instruction-level simulators (such as Spike or riscvOVPsim), as well as register transfer level (RTL) implementations (Vicuna running on Verilator).
Please refer to the `Sim/` directory for more information on each simulator and its corresponding files.
To ensure functional correctness at the level of individual kernels, we provide a suite of unit tests in `Tests/`. The unit tests use the same data as upstream CMSIS-NN, thus ensuring functional equivalence.
muRISCV-NN supports both the RISC-V GNU Compiler vector toolchain and LLVM (which has built-in RISC-V vector support). We provide pre-compiled toolchains in the `Toolchain/` directory, along with instructions on how to compile and install your own toolchain.
muRISCV-NN is not a Git fork in the traditional sense. Instead, we manually pull in changes from upstream CMSIS-NN on a regular basis in order to stay consistent and up to date. A direct fork would not make much sense, as our code differs too much from CMSIS-NN in functionality and naming.
The latest upstream CMSIS-NN commit muRISCV-NN is based on is `8ec46de` (only respecting commits affecting the `CMSIS/NN/` directory).
When running ResNet on TensorFlow Lite for Microcontrollers (TFLM), muRISCV-NN reduces the dynamic instruction count by up to roughly 95x relative to the baseline (V extension, VLEN = 1024):
Kernels | Extension | VLEN | Dynamic Instr. [x10^6] |
---|---|---|---|
Baseline | - | - | 688 |
muRISCV-NN | - | - | 62.5 |
muRISCV-NN | P-Ext. | - | 49.5 |
muRISCV-NN | V-Ext. | 64 | 12.3 |
muRISCV-NN | V-Ext. | 128 | 9.67 |
muRISCV-NN | V-Ext. | 256 | 8.41 |
muRISCV-NN | V-Ext. | 512 | 7.47 |
muRISCV-NN | V-Ext. | 1024 | 7.21 |
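
The reduction factors implied by the table can be derived by dividing the baseline count by each configuration's count. The short Python sketch below does exactly that; the instruction counts are copied from the table, while the dictionary labels and the `reduction_factor` helper are illustrative names introduced here and are not part of muRISCV-NN.

```python
# Dynamic instruction counts in millions, copied from the table above.
BASELINE_MINSTR = 688.0

# Labels are illustrative; "no extension" is the muRISCV-NN row with "-" in both columns.
MURISCV_NN_MINSTR = {
    "no extension": 62.5,
    "P-Ext.": 49.5,
    "V-Ext., VLEN=64": 12.3,
    "V-Ext., VLEN=128": 9.67,
    "V-Ext., VLEN=256": 8.41,
    "V-Ext., VLEN=512": 7.47,
    "V-Ext., VLEN=1024": 7.21,
}


def reduction_factor(baseline: float, optimized: float) -> float:
    """Return how many times fewer instructions the optimized run executes."""
    return baseline / optimized


# Reduction factor of every muRISCV-NN configuration relative to the baseline.
factors = {
    name: reduction_factor(BASELINE_MINSTR, count)
    for name, count in MURISCV_NN_MINSTR.items()
}

for name, factor in factors.items():
    print(f"{name:<20} {factor:5.1f}x")
```

Read this way, the kernels without any extension already account for about an 11x reduction, and the largest vector length reaches about 95x. Most of the remaining gain comes from enabling the V extension at all, with diminishing returns as VLEN grows.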
Stay tuned for more performance numbers in the near future!
This research is partially funded by the German Federal Ministry of Education and Research (BMBF) within the project Scale4Edge (grant number 16ME0465).