From 49ec4145c365fd1af8eac24ceea9c64887f0014a Mon Sep 17 00:00:00 2001
From: Abhi Khobare
Date: Thu, 17 Jun 2021 19:51:12 -0700
Subject: [PATCH] Added AdaRound and RNN QAT results

Signed-off-by: Abhi Khobare
---
 README.md | 104 +++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 91 insertions(+), 13 deletions(-)

diff --git a/README.md b/README.md
index 4845254dd0c..e8371150c43 100644
--- a/README.md
+++ b/README.md
@@ -79,34 +79,110 @@ Some recently added features include

## Results

AIMET can quantize an existing 32-bit floating-point model to an 8-bit fixed-point model without sacrificing much accuracy and without model fine-tuning.

#### DFQ
The DFQ method, applied to several popular networks such as MobileNet-v2 and ResNet-50, results in less than 0.9% loss in accuracy all the way down to 8-bit quantization, in an automated way and without any training data.

| Models | FP32 | INT8 Simulation |
|---|---|---|
| MobileNet v2 (top1) | 71.72% | 71.08% |
| ResNet 50 (top1) | 76.05% | 75.45% |
| DeepLab v3 (mIOU) | 72.65% | 71.91% |
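As a rough sketch of how such a result can be reproduced, the snippet below applies cross-layer equalization (one of the core DFQ steps) and then measures simulated INT8 accuracy with AIMET's PyTorch API. Names follow the AIMET 1.x documentation; the MobileNet-v2 model and the single-batch calibration pass are illustrative stand-ins for a real setup.

```python
import torch
from torchvision.models import mobilenet_v2

from aimet_torch.cross_layer_equalization import equalize_model
from aimet_torch.quantsim import QuantizationSimModel

model = mobilenet_v2(pretrained=True).eval()
input_shape = (1, 3, 224, 224)
dummy_input = torch.randn(input_shape)

# Cross-layer equalization: folds batch norms and rescales adjacent layers'
# weights in place, so per-tensor INT8 quantization loses less accuracy.
equalize_model(model, input_shape)

# Wrap the equalized model with simulated 8-bit weight/activation quantizers.
sim = QuantizationSimModel(model, dummy_input=dummy_input,
                           default_param_bw=8, default_output_bw=8)

# Calibration: run unlabeled, representative data through the simulated model
# so AIMET can compute quantization ranges (random data here as a stand-in).
def calibrate(sim_model, _):
    with torch.no_grad():
        sim_model(dummy_input)

sim.compute_encodings(forward_pass_callback=calibrate,
                      forward_pass_callback_args=None)

# Evaluating sim.model now gives an INT8-simulation accuracy
# comparable to the column above.
```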

#### AdaRound (Adaptive Rounding)

##### ADAS Object Detect

For this example ADAS object detection model, which was challenging to quantize to 8-bit precision, AdaRound can recover the accuracy to within 1% of the FP32 accuracy.
| Configuration | mAP (Mean Average Precision) |
|---|---|
| FP32 | 82.20% |
| Nearest Rounding (INT8 weights, INT8 activations) | 49.85% |
| AdaRound (INT8 weights, INT8 activations) | 81.21% |
##### DeepLabv3 Semantic Segmentation

For some models, such as this DeepLabv3 semantic segmentation model, AdaRound can even quantize the model weights to 4-bit precision without a significant drop in accuracy.
| Configuration | mIOU (Mean Intersection over Union) |
|---|---|
| FP32 | 72.94% |
| Nearest Rounding (INT4 weights, INT8 activations) | 6.09% |
| AdaRound (INT4 weights, INT8 activations) | 70.86% |
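A minimal AdaRound flow with the AIMET 1.x PyTorch API might look like the sketch below. Here `model` and the `calibrate` callback are assumed to be defined as in the earlier DFQ sketch, `data_loader` is a DataLoader of unlabeled images you would supply, and the file paths and batch counts are illustrative.

```python
import torch

from aimet_common.defs import QuantScheme
from aimet_torch.adaround.adaround_weight import Adaround, AdaroundParameters
from aimet_torch.quantsim import QuantizationSimModel

dummy_input = torch.randn(1, 3, 224, 224)

# Unlabeled data drives the layer-wise optimization of rounding decisions.
params = AdaroundParameters(data_loader=data_loader, num_batches=4)

# Learn whether to round each weight up or down, instead of nearest rounding.
ada_model = Adaround.apply_adaround(
    model, dummy_input, params,
    path='./', filename_prefix='adaround',
    default_param_bw=8,       # 4 would match the INT4-weight experiment above
    default_quant_scheme=QuantScheme.post_training_tf_enhanced)

# Simulate quantized inference with the AdaRounded weights, reusing the
# weight encodings that AdaRound exported alongside the model.
sim = QuantizationSimModel(ada_model, dummy_input=dummy_input,
                           default_param_bw=8, default_output_bw=8)
sim.set_and_freeze_param_encodings(encoding_path='./adaround.encodings')

sim.compute_encodings(forward_pass_callback=calibrate,
                      forward_pass_callback_args=None)
```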

#### Quantization for Recurrent Models

AIMET supports quantization simulation and quantization-aware training (QAT) for recurrent models (RNN, LSTM, GRU). Using the QAT feature in AIMET, a DeepSpeech2 model with bi-directional LSTMs can be quantized to 8-bit precision with minimal drop in accuracy.
| DeepSpeech2 (using bi-directional LSTMs) | Word Error Rate |
|---|---|
| FP32 | 9.92% |
| INT8 | 10.22% |
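Quantization-aware training with AIMET follows the same `QuantizationSimModel` pattern: the quantizer nodes stay in the graph during fine-tuning, so gradients adapt the weights to quantization noise. The sketch below is purely illustrative, using a toy bi-directional LSTM and random data rather than the actual DeepSpeech2 recipe.

```python
import torch

from aimet_torch.quantsim import QuantizationSimModel

# Toy bi-directional LSTM standing in for a speech model.
model = torch.nn.LSTM(input_size=161, hidden_size=256,
                      num_layers=2, bidirectional=True, batch_first=True)
dummy_input = torch.randn(1, 50, 161)    # (batch, time, features)

sim = QuantizationSimModel(model, dummy_input=dummy_input,
                           default_param_bw=8, default_output_bw=8)
sim.compute_encodings(forward_pass_callback=lambda m, _: m(dummy_input),
                      forward_pass_callback_args=None)

# QAT: fine-tune the simulated-quantization model with an ordinary loop.
optimizer = torch.optim.Adam(sim.model.parameters(), lr=1e-5)
for _ in range(10):                      # stand-in for real epochs/batches
    features = torch.randn(8, 50, 161)   # random data as a placeholder
    output, _ = sim.model(features)
    loss = output.pow(2).mean()          # dummy loss; use CTC etc. in practice
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```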

#### Model Compression

AIMET can also significantly compress models. For popular models, such as ResNet-50 and ResNet-18, compression with spatial SVD plus channel pruning achieves a 50% MAC (multiply-accumulate) reduction while retaining accuracy within approximately 1% of the original uncompressed model.

@@ -116,16 +192,18 @@
| Models | FP32 | 50% MAC reduction |
|---|---|---|
| ResNet-18 (top1) | 69.76% | 68.56% |
| ResNet-50 (top1) | 76.05% | 75.75% |
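Compression is driven through a similar high-level API. The sketch below configures an automatic spatial-SVD pass targeting a 50% MAC budget, following the AIMET 1.x `ModelCompressor` interface; the constant-accuracy `eval_callback` is a placeholder for a real validation function, and a channel pruning stage would be configured analogously with `ChannelPruningParameters`.

```python
from decimal import Decimal

from torchvision.models import resnet18

from aimet_common.defs import (CompressionScheme, CostMetric,
                               GreedySelectionParameters)
from aimet_torch.compress import ModelCompressor
from aimet_torch.defs import SpatialSvdParameters

model = resnet18(pretrained=True).eval()

def eval_callback(model, iterations, use_cuda):
    # Placeholder: should return the validation accuracy of `model`.
    return 0.5

# Greedy search for per-layer ranks that hit a 50% compression ratio in MACs.
greedy = GreedySelectionParameters(target_comp_ratio=Decimal(0.5),
                                   num_comp_ratio_candidates=10)
auto_params = SpatialSvdParameters.AutoModeParams(greedy_select_params=greedy)
params = SpatialSvdParameters(mode=SpatialSvdParameters.Mode.auto,
                              params=auto_params)

compressed_model, stats = ModelCompressor.compress_model(
    model,
    eval_callback=eval_callback,
    eval_iterations=10,
    input_shape=(1, 3, 224, 224),
    compress_scheme=CompressionScheme.spatial_svd,
    cost_metric=CostMetric.mac,
    parameters=params)
print(stats)   # per-layer ratios and estimated MAC savings
```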
## Installation Instructions

To install and use the pre-built version of the AIMET package, please follow one of the below links:

- [Install and run AIMET in *Ubuntu* environment](./packaging/install.md)