Intel® Neural Compressor provides validated examples covering multiple compression techniques, including quantization, pruning, knowledge distillation, and orchestration. Some of the validated cases are listed in the example tables, and the release data is available here.
- tf_example1: quantize with built-in dataloader and metric.
- tf_example2: quantize a Keras model with a customized metric and dataloader.
- tf_example3: quantize a slim model.
- tf_example4: quantize a checkpoint with a dummy dataloader.
- tf_example5: configure performance and accuracy measurement.
- tf_example6: use default user-facing APIs to quantize a pb model.
- tf_example7: enable quantization and benchmark with python-flavor config.
- tf_example8: quantize with pure python API.
- BERT Mini SST2 performance boost with INC: train a BERT-Mini model on the SST-2 dataset through distillation, then use quantization to accelerate inference while maintaining accuracy with Intel® Neural Compressor.
- Performance of FP32 Vs. INT8 ResNet50 Model: compare existing FP32 and INT8 ResNet50 models directly.
- Intel® Neural Compressor Sample for PyTorch*: an end-to-end pipeline that builds a CNN model with PyTorch to recognize fashion images and speeds up the model with Intel® Neural Compressor.
- Intel® Neural Compressor Sample for TensorFlow*: an end-to-end pipeline that builds a CNN model with TensorFlow to recognize handwritten digits and speeds up the model with Intel® Neural Compressor.
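
Several of the examples above mention a customized metric, a customized dataloader, or a dummy dataloader. The sketch below shows what such objects might look like in plain Python. It does not import Intel® Neural Compressor; the class names `DummyDataloader` and `Top1Accuracy` are hypothetical, and the `batch_size` attribute and `update` / `reset` / `result` method names are assumptions modeled on a common pattern, so check the linked examples for the exact interface they expect.

```python
import random


class DummyDataloader:
    """Yields batches of random inputs, useful when only input shapes matter."""

    def __init__(self, shape, num_batches=4, batch_size=2, seed=0):
        self.shape = shape              # per-sample shape, e.g. (2, 3)
        self.num_batches = num_batches
        self.batch_size = batch_size    # assumed attribute name
        self.seed = seed

    def __iter__(self):
        rng = random.Random(self.seed)  # reseeded so every pass is identical
        flat_size = 1
        for dim in self.shape:
            flat_size *= dim
        for _ in range(self.num_batches):
            inputs = [[rng.random() for _ in range(flat_size)]
                      for _ in range(self.batch_size)]
            labels = [0] * self.batch_size  # placeholder labels
            yield inputs, labels


class Top1Accuracy:
    """Accumulates top-1 accuracy through update / reset / result."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.correct = 0
        self.total = 0

    def update(self, preds, labels):
        # preds: one list of class scores per sample; labels: class indices.
        for scores, label in zip(preds, labels):
            predicted = max(range(len(scores)), key=scores.__getitem__)
            self.correct += int(predicted == label)
            self.total += 1

    def result(self):
        return self.correct / self.total if self.total else 0.0
```

A dummy dataloader like this only exercises the model with inputs of the right shape, so it can drive calibration but says nothing about real accuracy.
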
Model | Domain | Approach | Examples |
---|---|---|---|
ResNet50 V1.0 | Image Recognition | Post-Training Static Quantization | pb |
ResNet50 V1.5 | Image Recognition | Post-Training Static Quantization | pb |
ResNet101 | Image Recognition | Post-Training Static Quantization | pb |
MobileNet V1 | Image Recognition | Post-Training Static Quantization | pb / SavedModel |
MobileNet V2 | Image Recognition | Post-Training Static Quantization | pb / SavedModel |
MobileNet V3 | Image Recognition | Post-Training Static Quantization | pb |
Inception V1 | Image Recognition | Post-Training Static Quantization | pb |
Inception V2 | Image Recognition | Post-Training Static Quantization | pb |
Inception V3 | Image Recognition | Post-Training Static Quantization | pb |
Inception V4 | Image Recognition | Post-Training Static Quantization | pb |
Inception ResNet V2 | Image Recognition | Post-Training Static Quantization | pb |
VGG16 | Image Recognition | Post-Training Static Quantization | pb / keras |
VGG19 | Image Recognition | Post-Training Static Quantization | pb / keras |
ResNet V2 50 | Image Recognition | Post-Training Static Quantization | pb |
ResNet V2 101 | Image Recognition | Post-Training Static Quantization | pb |
ResNet V2 152 | Image Recognition | Post-Training Static Quantization | pb |
DenseNet121 | Image Recognition | Post-Training Static Quantization | pb |
DenseNet161 | Image Recognition | Post-Training Static Quantization | pb |
DenseNet169 | Image Recognition | Post-Training Static Quantization | pb |
EfficientNet B0 | Image Recognition | Post-Training Static Quantization | ckpt |
MNIST | Image Recognition | Quantization-Aware Training | keras |
ResNet50 | Image Recognition | Post-Training Static Quantization | keras |
ResNet50 Fashion | Image Recognition | Post-Training Static Quantization | keras |
ResNet50 V2 | Image Recognition | Post-Training Static Quantization | keras |
ResNet101 | Image Recognition | Post-Training Static Quantization | keras |
Inception V3 | Image Recognition | Post-Training Static Quantization | keras |
Inception ResNet V2 | Image Recognition | Post-Training Static Quantization | keras |
ResNet101 V2 | Image Recognition | Post-Training Static Quantization | keras |
MobileNet V2 | Image Recognition | Post-Training Static Quantization | keras |
Xception | Image Recognition | Post-Training Static Quantization | keras |
ResNet V2 | Image Recognition | Quantization-Aware Training | keras |
EfficientNet V2 B0 | Image Recognition | Post-Training Static Quantization | SavedModel |
BERT base MRPC | Natural Language Processing | Post-Training Static Quantization | ckpt |
BERT large SQuAD (Model Zoo) | Natural Language Processing | Post-Training Static Quantization | pb |
BERT large SQuAD | Natural Language Processing | Post-Training Static Quantization | pb |
Transformer LT | Natural Language Processing | Post-Training Static Quantization | pb |
SSD ResNet50 V1 | Object Detection | Post-Training Static Quantization | pb / ckpt |
SSD MobileNet V1 | Object Detection | Post-Training Static Quantization | pb / ckpt |
Faster R-CNN Inception ResNet V2 | Object Detection | Post-Training Static Quantization | pb / SavedModel |
Faster R-CNN ResNet101 | Object Detection | Post-Training Static Quantization | pb / SavedModel |
Faster R-CNN ResNet50 | Object Detection | Post-Training Static Quantization | pb |
Mask R-CNN Inception V2 | Object Detection | Post-Training Static Quantization | pb / ckpt |
SSD ResNet34 | Object Detection | Post-Training Static Quantization | pb |
YOLOv3 | Object Detection | Post-Training Static Quantization | pb |
Wide & Deep | Recommendation | Post-Training Static Quantization | pb |
Arbitrary Style Transfer | Style Transfer | Post-Training Static Quantization | ckpt |
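
Most rows in the table above use post-training static quantization. As a rough illustration of the idea, the sketch below derives a scale and zero point from calibration values and then maps floats to 8-bit integers and back. It is a minimal pure-Python sketch under simple assumptions (asymmetric min/max calibration, unsigned 8-bit range), not the algorithm the library implements; the function names are hypothetical.

```python
def calibrate(samples, num_bits=8):
    """Derive an asymmetric (scale, zero_point) pair from calibration values."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(samples), 0.0)
    hi = max(max(samples), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid a zero scale
    zero_point = round(qmin - lo / scale)
    return scale, zero_point


def quantize(x, scale, zero_point, num_bits=8):
    """Map a float to an unsigned integer, clamping out-of-range values."""
    q = round(x / scale) + zero_point
    return max(0, min(2 ** num_bits - 1, q))


def dequantize(q, scale, zero_point):
    """Map a quantized integer back to an approximate float."""
    return (q - zero_point) * scale


# Calibration happens once, ahead of inference, on representative data.
scale, zero_point = calibrate([-1.0, 0.0, 0.5, 3.0])
restored = dequantize(quantize(0.5, scale, zero_point), scale, zero_point)
```

Values outside the calibrated range are clamped, which is why the choice of calibration data matters for accuracy.
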
Model | Domain | Pruning Type | Approach | Examples |
---|---|---|---|---|
Inception V3 | Image Recognition | Unstructured | Magnitude | pb |
ResNet V2 | Image Recognition | Unstructured | Magnitude | pb |
ViT | Image Recognition | Unstructured | Magnitude | ckpt |
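
The rows above pair the unstructured pruning type with the magnitude approach. A minimal sketch of that idea follows, assuming a flat list of weights and a single global sparsity target; the function name `magnitude_prune` is hypothetical, and real examples typically prune gradually during training rather than in one step.

```python
def magnitude_prune(weights, sparsity):
    """Zero the `sparsity` fraction of weights with the smallest magnitude."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    num_to_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value; the first num_to_prune are removed.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:num_to_prune])
    return [0.0 if i in pruned_idx else w for i, w in enumerate(weights)]


weights = [0.5, -0.1, 0.03, -2.0, 0.7, 0.2, -0.05, 1.1]
half_sparse = magnitude_prune(weights, 0.5)
```

Because any individual weight can be removed, the resulting zeros follow no fixed pattern, which is what "unstructured" refers to.
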
Student Model | Teacher Model | Domain | Examples |
---|---|---|---|
MobileNet | DenseNet201 | Image Recognition | pb |
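
The table above lists a student model trained against a teacher model. One widely used way to express that objective is a weighted sum of the ordinary cross-entropy on the true label and a temperature-softened KL divergence between teacher and student outputs. The sketch below assumes that formulation; the linked examples may use different losses, and the `temperature` and `alpha` defaults are illustrative only.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """alpha * CE(student, label) + (1 - alpha) * T^2 * KL(teacher || student)."""
    hard = -math.log(softmax(student_logits)[label])
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s)
               for t, s in zip(p_teacher, p_student) if t > 0)
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft


loss = distillation_loss([2.0, 1.0, 0.1], [3.0, 0.5, 0.2], label=0)
```

The soft term vanishes when the student reproduces the teacher's distribution, leaving only the hard-label term.
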
Model | Domain | Approach | Examples |
---|---|---|---|
ResNet18 | Image Recognition | Post-Training Static Quantization | eager / fx / ipex |
ResNet18 | Image Recognition | Quantization-Aware Training | eager / fx |
ResNet50 | Image Recognition | Post-Training Static Quantization | eager / ipex |
ResNet50 | Image Recognition | Quantization-Aware Training | eager |
ResNeXt101_32x16d_wsl | Image Recognition | Post-Training Static Quantization | ipex |
ResNeXt101_32x8d | Image Recognition | Post-Training Static Quantization | eager |
Se_ResNeXt50_32x4d | Image Recognition | Post-Training Static Quantization | eager |
Inception V3 | Image Recognition | Post-Training Static Quantization | eager |
MobileNet V2 | Image Recognition | Post-Training Static Quantization | eager |
PeleeNet | Image Recognition | Post-Training Static Quantization | eager |
ResNeSt50 | Image Recognition | Post-Training Static Quantization | eager |
3D-UNet | Image Recognition | Post-Training Static Quantization | eager |
SSD ResNet34 | Object Detection | Post-Training Static Quantization | fx / ipex |
Mask R-CNN | Object Detection | Post-Training Static Quantization | fx |
YOLOv3 | Object Detection | Post-Training Static Quantization | eager |
DLRM | Recommendation | Post-Training Static Quantization | eager / ipex / fx |
RNN-T | Speech Recognition | Post-Training Dynamic / Static Quantization | eager / ipex |
Wav2Vec2 | Speech Recognition | Post-Training Dynamic Quantization | eager |
HuBERT | Speech Recognition | Post-Training Dynamic Quantization | eager |
BlendCNN | Natural Language Processing | Post-Training Static Quantization | eager |
bert-large-uncased-whole-word-masking-finetuned-squad | Natural Language Processing | Post-Training Static Quantization | fx / ipex |
distilbert-base-uncased-distilled-squad | Natural Language Processing | Post-Training Static Quantization | ipex |
t5-small | Natural Language Processing | Post-Training Dynamic Quantization | eager |
Helsinki-NLP/opus-mt-en-ro | Natural Language Processing | Post-Training Dynamic Quantization | eager |
lvwerra/pegasus-samsum | Natural Language Processing | Post-Training Dynamic Quantization | eager |
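
The Approach column above distinguishes static from dynamic post-training quantization. The difference can be sketched as follows: static quantization fixes the activation scale ahead of time from calibration data, while dynamic quantization recomputes it from each tensor at inference time. This is a conceptual pure-Python sketch assuming symmetric signed 8-bit quantization; the function names are hypothetical and it does not use the PyTorch quantization API.

```python
def symmetric_scale(values, num_bits=8):
    """Scale that maps the largest magnitude onto the signed integer limit."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    peak = max(abs(v) for v in values)
    return peak / qmax if peak else 1.0


def fake_quantize(values, scale, num_bits=8):
    """Quantize to signed integers, clamp, and map back to floats."""
    qmax = 2 ** (num_bits - 1) - 1
    out = []
    for v in values:
        q = max(-qmax, min(qmax, round(v / scale)))
        out.append(q * scale)
    return out


def static_quantize(activations, calibrated_scale):
    # Static: the scale was fixed beforehand from calibration data.
    return fake_quantize(activations, calibrated_scale)


def dynamic_quantize(activations):
    # Dynamic: the scale is derived from the tensor seen at inference time.
    return fake_quantize(activations, symmetric_scale(activations))


calibrated = symmetric_scale([-1.0, 0.5, 1.0])  # calibration range is [-1, 1]
runtime_input = [0.5, 4.0]                      # 4.0 lies outside that range
static_out = static_quantize(runtime_input, calibrated)
dynamic_out = dynamic_quantize(runtime_input)
```

In this sketch the static path clamps the out-of-range value to roughly 1.0, while the dynamic path preserves it at the cost of computing a scale per input.
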
Model | Domain | Pruning Type | Approach | Examples |
---|---|---|---|---|
ResNet18 | Image Recognition | Unstructured | Magnitude | eager |
ResNet34 | Image Recognition | Unstructured | Magnitude | eager |
ResNet50 | Image Recognition | Unstructured | Magnitude | eager |
ResNet101 | Image Recognition | Unstructured | Magnitude | eager |
BERT large | Natural Language Processing | Structured (2x1) | Group Lasso | eager |
Intel/bert-base-uncased-sparse-70-unstructured | Natural Language Processing (question-answering) | Unstructured | Pattern Lock | eager |
bert-base-uncased | Natural Language Processing | Structured (Filter/Channel-wise) | Gradient Sensitivity | eager |
DistilBERT | Natural Language Processing | Unstructured | Magnitude | eager |
Intel/bert-base-uncased-sparse-70-unstructured | Natural Language Processing (text-classification) | Unstructured | Pattern Lock | eager |
BERT-Mini | Natural Language Processing (text classification) | Structured (4x1, 2in4), Unstructured | Snip-momentum | eager |
BERT-Mini | Natural Language Processing (question answering) | Structured (4x1, 2in4), Unstructured | Snip-momentum | eager |
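
The structured pruning types above include a "2in4" pattern. Assuming this denotes N:M sparsity (at most 2 nonzero weights in every consecutive group of 4), the pattern can be sketched as below. Note that the sketch ranks weights by plain magnitude for simplicity, which is a swapped-in criterion; the Snip-momentum approach named in the table scores weights differently. The function name `prune_n_in_m` is hypothetical.

```python
def prune_n_in_m(weights, n=2, m=4):
    """Within each consecutive group of m weights, keep the n largest magnitudes."""
    if len(weights) % m:
        raise ValueError("weight count must be a multiple of m")
    pruned = []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        # Positions of the n largest-magnitude entries in this group.
        ranked = sorted(range(m), key=lambda i: abs(group[i]), reverse=True)
        keep = set(ranked[:n])
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned


row = [0.9, -0.1, 0.4, 0.05, -0.2, 0.3, -0.8, 0.1]
sparse_row = prune_n_in_m(row)
```

Unlike unstructured pruning, every group ends up with the same number of zeros, which gives hardware a regular layout to exploit.
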
Student Model | Teacher Model | Domain | Examples |
---|---|---|---|
CNN-2 | CNN-10 | Image Recognition | eager |
MobileNet V2-0.35 | WideResNet40-2 | Image Recognition | eager |
ResNet18 / ResNet34 / ResNet50 / ResNet101 | ResNet18 / ResNet34 / ResNet50 / ResNet101 | Image Recognition | eager |
VGG-8 | VGG-13 | Image Recognition | eager |
BlendCNN | BERT base | Natural Language Processing | eager |
distilbert-base-uncased | csarron/bert-base-uncased-squad-v1 | Natural Language Processing | eager |
BiLSTM | textattack/roberta-base-SST-2 | Natural Language Processing | eager |
huawei-noah/TinyBERT_General_4L_312D | blackbird/bert-base-uncased-MNLI-v1 | Natural Language Processing | eager |
nreimers | textattack/bert-base-uncased-QQP | Natural Language Processing | eager |
distilroberta-base | howey/roberta-large-cola | Natural Language Processing | eager |
Model | Domain | Approach | Examples |
---|---|---|---|
ResNet50 | Image Recognition | Multi-shot: Pruning and PTQ | link |
ResNet50 | Image Recognition | One-shot: QAT during Pruning | link |
Intel/bert-base-uncased-sparse-90-unstructured-pruneofa | Natural Language Processing (question-answering) | One-shot: Pruning, Distillation and QAT | link |
Intel/bert-base-uncased-sparse-90-unstructured-pruneofa | Natural Language Processing (text-classification) | One-shot: Pruning, Distillation and QAT | link |
Model | Domain | Approach | Examples |
---|---|---|---|
ResNet50 V1.5 | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
ResNet50 V1.5 MLPerf | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
VGG16 | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
MobileNet V2 | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
MobileNet V3 MLPerf | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
AlexNet | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
CaffeNet | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
DenseNet | Image Recognition | Post-Training Static Quantization | qlinearops |
EfficientNet | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
FCN | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
GoogleNet | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
Inception V1 | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
MNIST | Image Recognition | Post-Training Static Quantization | qlinearops |
MobileNet V2 (ONNX Model Zoo) | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
ResNet50 V1.5 (ONNX Model Zoo) | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
ShuffleNet V2 | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
SqueezeNet | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
VGG16 (ONNX Model Zoo) | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
ZFNet | Image Recognition | Post-Training Static Quantization | qlinearops / qdq |
ArcFace | Image Recognition | Post-Training Static Quantization | qlinearops |
BERT base MRPC | Natural Language Processing | Post-Training Static Quantization | integerops / qdq |
BERT base MRPC | Natural Language Processing | Post-Training Dynamic Quantization | integerops |
DistilBERT base MRPC | Natural Language Processing | Post-Training Dynamic / Static Quantization | integerops / qdq |
MobileBERT MRPC | Natural Language Processing | Post-Training Dynamic / Static Quantization | integerops / qdq |
RoBERTa base MRPC | Natural Language Processing | Post-Training Dynamic / Static Quantization | integerops / qdq |
BERT SQuAD | Natural Language Processing | Post-Training Dynamic / Static Quantization | integerops / qdq |
GPT2 lm head WikiText | Natural Language Processing | Post-Training Dynamic Quantization | integerops |
MobileBERT SQuAD MLPerf | Natural Language Processing | Post-Training Dynamic / Static Quantization | integerops / qdq |
BiDAF | Natural Language Processing | Post-Training Dynamic Quantization | integerops |
SSD MobileNet V1 | Object Detection | Post-Training Static Quantization | qlinearops / qdq |
SSD MobileNet V2 | Object Detection | Post-Training Static Quantization | qlinearops / qdq |
SSD MobileNet V1 (ONNX Model Zoo) | Object Detection | Post-Training Static Quantization | qlinearops / qdq |
DUC | Object Detection | Post-Training Static Quantization | qlinearops |
Faster R-CNN | Object Detection | Post-Training Static Quantization | qlinearops / qdq |
Mask R-CNN | Object Detection | Post-Training Static Quantization | qlinearops / qdq |
SSD | Object Detection | Post-Training Static Quantization | qlinearops / qdq |
Tiny YOLOv3 | Object Detection | Post-Training Static Quantization | qlinearops |
YOLOv3 | Object Detection | Post-Training Static Quantization | qlinearops |
YOLOv4 | Object Detection | Post-Training Static Quantization | qlinearops |
Emotion FERPlus | Body Analysis | Post-Training Static Quantization | qlinearops |
Ultra Face | Body Analysis | Post-Training Static Quantization | qlinearops |