From 49ec4145c365fd1af8eac24ceea9c64887f0014a Mon Sep 17 00:00:00 2001
From: Abhi Khobare
Date: Thu, 17 Jun 2021 19:51:12 -0700
Subject: [PATCH] Added AdaRound and RNN QAT results

Signed-off-by: Abhi Khobare
---
 README.md | 104 +++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 91 insertions(+), 13 deletions(-)

diff --git a/README.md b/README.md
index 4845254dd0c..e8371150c43 100644
--- a/README.md
+++ b/README.md
@@ -79,34 +79,110 @@ Some recently added features include

## Results

AIMET can quantize an existing 32-bit floating-point model to an 8-bit fixed-point model without sacrificing much accuracy and without model fine-tuning.

#### DFQ
The DFQ method, applied to several popular networks such as MobileNet-v2 and ResNet-50, results in less than 0.9% loss in accuracy all the way down to 8-bit quantization, in an automated way and without any training data.

| Models | FP32 | INT8 Simulation |
|---|---|---|
| MobileNet v2 (top1) | 71.72% | 71.08% |
| ResNet 50 (top1) | 76.05% | 75.45% |
| DeepLab v3 (mIOU) | 72.65% | 71.91% |
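As a rough sketch of how such a result can be reproduced, the snippet below applies cross-layer equalization (one of the core DFQ steps) and then measures simulated INT8 accuracy with AIMET's PyTorch API. Names follow the AIMET 1.x documentation; the MobileNet-v2 model and the single-batch calibration pass are illustrative stand-ins for a real setup.

```python
import torch
from torchvision.models import mobilenet_v2

from aimet_torch.cross_layer_equalization import equalize_model
from aimet_torch.quantsim import QuantizationSimModel

model = mobilenet_v2(pretrained=True).eval()
input_shape = (1, 3, 224, 224)
dummy_input = torch.randn(input_shape)

# Cross-layer equalization: folds batch norms and rescales adjacent layers'
# weights in place, so per-tensor INT8 quantization loses less accuracy.
equalize_model(model, input_shape)

# Wrap the equalized model with simulated 8-bit weight/activation quantizers.
sim = QuantizationSimModel(model, dummy_input=dummy_input,
                           default_param_bw=8, default_output_bw=8)

# Calibration: run unlabeled, representative data through the simulated model
# so AIMET can compute quantization ranges (random data here as a stand-in).
def calibrate(sim_model, _):
    with torch.no_grad():
        sim_model(dummy_input)

sim.compute_encodings(forward_pass_callback=calibrate,
                      forward_pass_callback_args=None)

# Evaluating sim.model now gives an INT8-simulation accuracy
# comparable to the column above.
```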

#### AdaRound (Adaptive Rounding)

##### ADAS Object Detect

For this example ADAS object detection model, which was challenging to quantize to 8-bit precision, AdaRound can recover the accuracy to within 1% of the FP32 accuracy.
| Configuration | mAP (Mean Average Precision) |
|---|---|
| FP32 | 82.20% |
| Nearest Rounding (INT8 weights, INT8 activations) | 49.85% |
| AdaRound (INT8 weights, INT8 activations) | 81.21% |
##### DeepLabv3 Semantic Segmentation

For some models, such as this DeepLabv3 semantic segmentation model, AdaRound can even quantize the model weights to 4-bit precision without a significant drop in accuracy.
| Configuration | mIOU (Mean Intersection over Union) |
|---|---|
| FP32 | 72.94% |
| Nearest Rounding (INT4 weights, INT8 activations) | 6.09% |
| AdaRound (INT4 weights, INT8 activations) | 70.86% |
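A minimal AdaRound flow with the AIMET 1.x PyTorch API might look like the sketch below. Here `model` and the `calibrate` callback are assumed to be defined as in the earlier DFQ sketch, `data_loader` is a DataLoader of unlabeled images you would supply, and the file paths and batch counts are illustrative.

```python
import torch

from aimet_common.defs import QuantScheme
from aimet_torch.adaround.adaround_weight import Adaround, AdaroundParameters
from aimet_torch.quantsim import QuantizationSimModel

dummy_input = torch.randn(1, 3, 224, 224)

# Unlabeled data drives the layer-wise optimization of rounding decisions.
params = AdaroundParameters(data_loader=data_loader, num_batches=4)

# Learn whether to round each weight up or down, instead of nearest rounding.
ada_model = Adaround.apply_adaround(
    model, dummy_input, params,
    path='./', filename_prefix='adaround',
    default_param_bw=8,       # 4 would match the INT4-weight experiment above
    default_quant_scheme=QuantScheme.post_training_tf_enhanced)

# Simulate quantized inference with the AdaRounded weights, reusing the
# weight encodings that AdaRound exported alongside the model.
sim = QuantizationSimModel(ada_model, dummy_input=dummy_input,
                           default_param_bw=8, default_output_bw=8)
sim.set_and_freeze_param_encodings(encoding_path='./adaround.encodings')

sim.compute_encodings(forward_pass_callback=calibrate,
                      forward_pass_callback_args=None)
```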

#### Quantization for Recurrent Models

AIMET supports quantization simulation and quantization-aware training (QAT) for recurrent models (RNN, LSTM, GRU). Using the QAT feature in AIMET, a DeepSpeech2 model with bi-directional LSTMs can be quantized to 8-bit precision with minimal drop in accuracy.
| DeepSpeech2 (using bi-directional LSTMs) | Word Error Rate |
|---|---|
| FP32 | 9.92% |
| INT8 | 10.22% |
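Quantization-aware training with AIMET follows the same `QuantizationSimModel` pattern: the quantizer nodes stay in the graph during fine-tuning, so gradients adapt the weights to quantization noise. The sketch below is purely illustrative, using a toy bi-directional LSTM and random data rather than the actual DeepSpeech2 recipe.

```python
import torch

from aimet_torch.quantsim import QuantizationSimModel

# Toy bi-directional LSTM standing in for a speech model.
model = torch.nn.LSTM(input_size=161, hidden_size=256,
                      num_layers=2, bidirectional=True, batch_first=True)
dummy_input = torch.randn(1, 50, 161)    # (batch, time, features)

sim = QuantizationSimModel(model, dummy_input=dummy_input,
                           default_param_bw=8, default_output_bw=8)
sim.compute_encodings(forward_pass_callback=lambda m, _: m(dummy_input),
                      forward_pass_callback_args=None)

# QAT: fine-tune the simulated-quantization model with an ordinary loop.
optimizer = torch.optim.Adam(sim.model.parameters(), lr=1e-5)
for _ in range(10):                      # stand-in for real epochs/batches
    features = torch.randn(8, 50, 161)   # random data as a placeholder
    output, _ = sim.model(features)
    loss = output.pow(2).mean()          # dummy loss; use CTC etc. in practice
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```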

#### Model Compression

AIMET can also significantly compress models. For popular models, such as ResNet-50 and ResNet-18, compression with spatial SVD plus channel pruning achieves a 50% MAC (multiply-accumulate) reduction while retaining accuracy within approximately 1% of the original uncompressed model.

@@ -116,16 +192,18 @@
| Models | FP32 | 50% MAC reduction |
|---|---|---|
| ResNet-18 (top1) | 69.76% | 68.56% |
| ResNet-50 (top1) | 76.05% | 75.75% |
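Compression is driven through a similar high-level API. The sketch below configures an automatic spatial-SVD pass targeting a 50% MAC budget, following the AIMET 1.x `ModelCompressor` interface; the constant-accuracy `eval_callback` is a placeholder for a real validation function, and a channel pruning stage would be configured analogously with `ChannelPruningParameters`.

```python
from decimal import Decimal

from torchvision.models import resnet18

from aimet_common.defs import (CompressionScheme, CostMetric,
                               GreedySelectionParameters)
from aimet_torch.compress import ModelCompressor
from aimet_torch.defs import SpatialSvdParameters

model = resnet18(pretrained=True).eval()

def eval_callback(model, iterations, use_cuda):
    # Placeholder: should return the validation accuracy of `model`.
    return 0.5

# Greedy search for per-layer ranks that hit a 50% compression ratio in MACs.
greedy = GreedySelectionParameters(target_comp_ratio=Decimal(0.5),
                                   num_comp_ratio_candidates=10)
auto_params = SpatialSvdParameters.AutoModeParams(greedy_select_params=greedy)
params = SpatialSvdParameters(mode=SpatialSvdParameters.Mode.auto,
                              params=auto_params)

compressed_model, stats = ModelCompressor.compress_model(
    model,
    eval_callback=eval_callback,
    eval_iterations=10,
    input_shape=(1, 3, 224, 224),
    compress_scheme=CompressionScheme.spatial_svd,
    cost_metric=CostMetric.mac,
    parameters=params)
print(stats)   # per-layer ratios and estimated MAC savings
```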
## Installation Instructions

To install and use the pre-built version of the AIMET package, please follow one of the below links:

- [Install and run AIMET in *Ubuntu* environment](./packaging/install.md)