Add benchmark models that are not easily accessible (#5)

* Add QAT BERT model
* Add Efficientnet v2 model
* Add README for model dir
* Add more details on export and talk about license stuff
tlc-pack · Mar 15, 2022 · commit 218ad10 · 1 parent 57eef48

Showing 4 changed files with 14 additions and 0 deletions.
diff --git a/models/.gitattributes b/models/.gitattributes
@@ -0,0 +1,4 @@
+bert-base-qat.onnx filter=lfs diff=lfs merge=lfs -text
+efficientnetv2.onnx filter=lfs diff=lfs merge=lfs -text
+efficientnetv2-s.onnx filter=lfs diff=lfs merge=lfs -text
+efficientnetv2-m.onnx filter=lfs diff=lfs merge=lfs -text
diff --git a/models/README.md b/models/README.md
@@ -0,0 +1,4 @@
+This directory stores models for benchmarking that are not easily accessible elsewhere.
+
+- [Int8 BERT quantized with Quantization-Aware training](bert-base-qat.onnx) following the steps in https://github.com/NVIDIA/FasterTransformer/tree/main/bert-quantization/bert-pyt-quantization#quantization-aware-fine-tuning and converted to ONNX manually using [this function](https://gist.github.com/masahi/19ff1e59a7558a21c80de9e6707108eb#file-qat_bert_export-py-L741). The model and `run_squad.py` script that the export code is based on are both licensed under Apache-2.0.
+- [EfficientNetv2-M](efficientnetv2-m.onnx): the original TF2 model is from https://github.com/google/automl/tree/master/efficientnetv2 and was converted to ONNX following the steps in https://github.com/NVIDIA/TensorRT/tree/master/samples/python/efficientnet#2-efficientnet-v2. Both the original model and the ONNX export code are licensed under Apache-2.0.
diff --git a/models/bert-base-qat.onnx b/models/bert-base-qat.onnx
diff --git a/models/efficientnetv2-m.onnx b/models/efficientnetv2-m.onnx
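
Because `models/.gitattributes` routes the `.onnx` files through Git LFS, a checkout made without fetching the LFS objects will likely contain small text pointer files in place of the real model binaries. The following is a minimal sketch, not part of this commit, of how a benchmark script might detect that case before trying to load a model. The helper names `is_lfs_pointer` and `parse_lfs_pointer` are hypothetical; the only fact relied on is that a Git LFS pointer file begins with the line `version https://git-lfs.github.com/spec/v1`.

```python
from pathlib import Path

# First line of every Git LFS pointer file (per the Git LFS pointer spec).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at `path` looks like a Git LFS pointer.

    Hypothetical helper: it only compares the first bytes of the file
    against the pointer prefix, so it never reads a large model into memory.
    """
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def parse_lfs_pointer(path):
    """Parse the `key value` lines of a pointer file into a dict.

    Hypothetical helper: call it only after is_lfs_pointer() returned True.
    """
    fields = {}
    for line in Path(path).read_text().splitlines():
        key, _, value = line.partition(" ")
        if key:
            fields[key] = value
    return fields
```

A caller could check `is_lfs_pointer("models/bert-base-qat.onnx")` and, if it returns True, fetch the LFS objects (for example with `git lfs pull`) before benchmarking.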