Full Validated Models

The tables below list the models validated with the Intel® Neural Compressor.

TensorFlow 2.x models

Framework	Version	Model	INT8 Tuning Accuracy	FP32 Accuracy Baseline	Acc Ratio [(INT8-FP32)/FP32]	INT8 realtime (ms), CLX8280 1s 4c per instance	FP32 realtime (ms), CLX8280 1s 4c per instance	Realtime Latency Ratio [FP32/INT8]
tensorflow	2.5.0	resnet50v1.0	74.24%	74.27%	-0.04%	7.64	21.54	2.82x
tensorflow	2.5.0	resnet50v1.5	76.94%	76.46%	0.63%	9.54	24.28	2.54x
tensorflow	2.5.0	resnet101	77.21%	76.45%	0.99%	12.92	30.65	2.37x
tensorflow	2.5.0	inception_v1	70.30%	69.74%	0.80%	5.58	10.13	1.82x
tensorflow	2.5.0	inception_v2	74.27%	73.97%	0.41%	6.78	12.42	1.83x
tensorflow	2.5.0	inception_v3	77.29%	76.75%	0.70%	12.90	27.74	2.15x
tensorflow	2.5.0	inception_v4	80.36%	80.27%	0.11%	21.00	54.42	2.59x
tensorflow	2.5.0	inception_resnet_v2	80.42%	80.40%	0.02%	44.72	87.62	1.96x
tensorflow	2.5.0	mobilenetv1	73.93%	70.96%	4.19%	2.96	9.88	3.34x
tensorflow	2.5.0	mobilenetv2	71.96%	71.76%	0.28%	4.95	10.71	2.16x
tensorflow	2.5.0	ssd_resnet50_v1	37.91%	38.00%	-0.24%	145.96	422.11	2.89x
tensorflow	2.5.0	ssd_mobilenet_v1	23.02%	23.13%	-0.48%	12.19	26.85	2.20x
tensorflow	2.5.0	faster_rcnn_resnet101	30.33%	30.38%	-0.16%	152.71	541.75	3.55x
tensorflow	2.5.0	faster_rcnn_resnet101_saved	30.37%	30.38%	-0.03%	151.55	613.76	4.05x
tensorflow	2.5.0	mask_rcnn_inception_v2	28.61%	28.73%	-0.42%	77.73	201.69	2.59x
tensorflow	2.5.0	wide_deep_large_ds	77.61%	77.67%	-0.08%	1.24	1.86	1.50x
tensorflow	2.5.0	vgg16	72.13%	70.89%	1.75%	16.91	61.21	3.62x
tensorflow	2.5.0	vgg19	72.35%	71.01%	1.89%	20.58	74.47	3.62x
tensorflow	2.5.0	resnetv2_50	70.36%	69.64%	1.03%	15.20	18.59	1.22x
tensorflow	2.5.0	resnetv2_101	72.58%	71.87%	0.99%	25.54	34.33	1.34x
tensorflow	2.5.0	resnetv2_152	72.92%	72.37%	0.76%	37.25	49.86	1.34x
tensorflow	2.5.0	densenet121	72.31%	72.89%	-0.80%	30.56	44.87	1.47x
tensorflow	2.5.0	densenet161	76.36%	76.29%	0.09%	53.69	85.54	1.59x
tensorflow	2.5.0	densenet169	74.49%	74.65%	-0.21%	39.50	56.68	1.44x
tensorflow	2.5.0	ssd_resnet50_v1_ckpt	37.89%	38.00%	-0.29%	142.82	481.75	3.37x
tensorflow	2.5.0	ssd_mobilenet_v1_ckpt	23.02%	23.13%	-0.48%	12.22	32.22	2.64x
tensorflow	2.5.0	mask_rcnn_inception_v2_ckpt	28.61%	28.73%	-0.42%	82.38	204.74	2.49x
tensorflow	2.5.0	efficientnet_b0	78.53%	76.75%	2.32%	26.23	27.53	1.05x
tensorflow	2.5.0	resnet50_fashion	78.05%	78.12%	-0.09%	3.11	6.89	2.22x
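
The two ratio columns follow directly from the formulas in the column headers. As a minimal sketch (the function names `acc_ratio` and `latency_ratio` are illustrative and not part of Intel® Neural Compressor), the resnet50v1.0 row above can be reproduced like this:

```python
def acc_ratio(int8_acc: float, fp32_acc: float) -> float:
    """Relative accuracy change, (INT8 - FP32) / FP32, returned as a fraction."""
    return (int8_acc - fp32_acc) / fp32_acc


def latency_ratio(fp32_ms: float, int8_ms: float) -> float:
    """Speedup of INT8 over FP32, i.e. FP32 latency / INT8 latency."""
    return fp32_ms / int8_ms


# Values copied from the resnet50v1.0 row (TensorFlow 2.5.0).
int8_acc, fp32_acc = 74.24, 74.27   # accuracy in percent
int8_ms, fp32_ms = 7.64, 21.54      # realtime latency in milliseconds

print(f"Acc Ratio: {acc_ratio(int8_acc, fp32_acc):.2%}")
print(f"Realtime Latency Ratio: {latency_ratio(fp32_ms, int8_ms):.2f}x")
```

A negative Acc Ratio means the INT8 model is less accurate than the FP32 baseline, and a latency ratio above 1.00x means the INT8 model is faster. Recomputing from the rounded values shown in the tables may differ from the published ratio in the last digit, presumably because the published figures were derived from unrounded measurements.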

TensorFlow 1.x models

Framework	Version	Model	INT8 Tuning Accuracy	FP32 Accuracy Baseline	Acc Ratio [(INT8-FP32)/FP32]	INT8 realtime (ms), CLX8280 1s 4c per instance	FP32 realtime (ms), CLX8280 1s 4c per instance	Realtime Latency Ratio [FP32/INT8]
tensorflow	1.15.0-up3	bert_large_squad	92.35	92.98	-0.67%	397.58	875.35	2.20x
tensorflow	1.15.0-up3	bert_base_mrpc	86.03%	86.52%	-0.57%	42.25	75.95	1.80x
tensorflow	1.15.0-up3	resnet_v1_50_slim	76.03%	75.18%	1.13%	7.07	23.60	3.34x
tensorflow	1.15.0-up3	resnet_v1_101_slim	77.12%	76.40%	0.94%	12.53	43.21	3.45x
tensorflow	1.15.0-up3	resnet_v1_152_slim	77.58%	76.81%	1.00%	17.76	65.32	3.68x
tensorflow	1.15.0-up3	inception_v1_slim	70.41%	69.77%	0.92%	5.62	12.09	2.15x
tensorflow	1.15.0-up3	inception_v2_slim	74.38%	73.98%	0.54%	6.82	14.40	2.11x
tensorflow	1.15.0-up3	inception_v3_slim	78.32%	77.99%	0.42%	11.63	31.22	2.68x
tensorflow	1.15.0-up3	inception_v4_slim	80.35%	80.19%	0.20%	21.63	62.51	2.89x
tensorflow	1.15.0-up3	vgg16_slim	72.16%	70.89%	1.79%	17.09	60.87	3.56x
tensorflow	1.15.0-up3	vgg19_slim	72.22%	71.01%	1.70%	20.46	73.54	3.59x
tensorflow	1.15.0-up3	resnetv2_50_slim	70.36%	69.72%	0.92%	13.25	19.39	1.46x
tensorflow	1.15.0-up3	resnetv2_101_slim	72.59%	71.91%	0.95%	23.21	35.98	1.55x
tensorflow	1.15.0-up3	resnetv2_152_slim	72.93%	72.40%	0.73%	33.40	52.74	1.58x

PyTorch models

Framework	Version	Model	INT8 Tuning Accuracy	FP32 Accuracy Baseline	Acc Ratio [(INT8-FP32)/FP32]	INT8 realtime (ms), CLX8280 1s 4c per instance	FP32 realtime (ms), CLX8280 1s 4c per instance	Realtime Latency Ratio [FP32/INT8]
pytorch	1.9.0+cpu	resnet18	69.58%	69.76%	-0.26%	13.59	24.97	1.84x
pytorch	1.9.0+cpu	resnet50	75.87%	76.13%	-0.34%	25.67	54.12	2.11x
pytorch	1.9.0+cpu	resnext101_32x8d	79.09%	79.31%	-0.28%	62.44	147.88	2.37x
pytorch	1.9.0+cpu	bert_base_mrpc	88.16%	88.73%	-0.64%	41.33	81.93	1.98x
pytorch	1.9.0+cpu	bert_base_cola	58.29%	58.84%	-0.93%	39.30	86.58	2.20x
pytorch	1.9.0+cpu	bert_base_sts-b	88.65%	89.27%	-0.70%	39.46	86.97	2.20x
pytorch	1.9.0+cpu	bert_base_sst-2	91.63%	91.86%	-0.25%	39.12	82.59	2.11x
pytorch	1.9.0+cpu	bert_base_rte	69.31%	69.68%	-0.52%	39.81	81.98	2.06x
pytorch	1.9.0+cpu	bert_large_mrpc	87.48%	88.33%	-0.95%	112.61	287.44	2.55x
pytorch	1.9.0+cpu	bert_large_squad	92.79	93.05	-0.28%	497.79	953.74	1.92x
pytorch	1.9.0+cpu	bert_large_qnli	91.12%	91.82%	-0.76%	112.43	291.10	2.59x
pytorch	1.9.0+cpu	bert_large_rte	72.92%	72.56%	0.50%	148.60	287.03	1.93x
pytorch	1.9.0+cpu	bert_large_cola	62.85%	62.57%	0.45%	112.54	283.38	2.52x
pytorch	1.9.0+cpu	dlrm	80.27%	80.27%	0.00%	0.01	0.01	1.00x
pytorch	1.9.0+cpu	inception_v3	69.39%	69.54%	-0.21%	29.40	52.01	1.77x
pytorch	1.9.0+cpu	peleenet	71.54%	72.08%	-0.75%	24.99	33.14	1.33x
pytorch	1.9.0+cpu	yolo_v3	24.50%	24.54%	-0.17%	117.56	243.60	2.07x
pytorch	1.9.0+cpu	se_resnext50_32x4d	79.02%	79.08%	-0.07%	33.41	63.55	1.90x
pytorch	1.9.0+cpu	mobilenet_v2	70.73%	71.86%	-1.57%	15.34	23.27	1.52x
pytorch	1.9.0+cpu	blendcnn	68.40%	68.40%	0.00%	2.43	2.52	1.03x
pytorch	1.9.0+cpu	gpt_wikitext	60.06	60.20	-0.23%	545.94	590.43	1.08x
pytorch	1.9.0+cpu	roberta_base_mrpc	85.37%	85.51%	-0.17%	40.61	82.25	2.03x
pytorch	1.9.0+cpu	camembert_base_mrpc	84.72%	84.22%	0.60%	44.23	83.24	1.88x
pytorch	1.9.0+cpu	distilbert_base_mrpc	81.17%	80.99%	0.21%	26.24	45.65	1.74x
pytorch	1.9.0+cpu	albert_base_mrpc	88.77%	88.50%	0.31%	303.38	374.12	1.23x
pytorch	1.9.0+cpu	funnel_mrpc	91.72%	92.26%	-0.58%	86.83	89.71	1.03x
pytorch	1.9.0+cpu	bart_wnli	49.30%	52.11%	-5.41%	321.66	363.76	1.13x
pytorch	1.9.0+cpu	mbart_wnli	56.34%	56.34%	0.00%	175.87	342.64	1.95x
pytorch	1.9.0+cpu	t5_wmt_en_ro	24.39	24.52	-0.55%	2530.55	2674.40	1.06x
pytorch	1.9.0+cpu	marianmt_wmt_en_ro	22.39	22.23	0.72%	3522.83	3758.02	1.07x
pytorch	1.9.0+cpu	pegasus_billsum	50.23	51.21	-1.91%	40000.00	62500.00	1.56x
pytorch	1.9.0+cpu	rnnt	92.48	92.55	-0.08%	182.23	554.61	3.04x
pytorch	1.9.0+cpu	xlm-roberta-base_mrpc	87.93%	88.62%	-0.78%	88.30	90.27	1.02x
pytorch	1.9.0+cpu	flaubert_mrpc	79.81%	80.19%	-0.48%	19.46	24.80	1.27x
pytorch	1.9.0+cpu	barthez_mrpc	83.25%	83.81%	-0.66%	69.93	104.06	1.49x
pytorch	1.9.0+cpu	longformer_mrpc	90.97%	91.46%	-0.53%	528.43	656.89	1.24x
pytorch	1.9.0+cpu	layoutlm_mrpc	81.22%	78.01%	4.12%	48.18	88.37	1.83x
pytorch	1.9.0+cpu	deberta_mrpc	90.29%	90.91%	-0.68%	89.03	135.90	1.53x
pytorch	1.9.0+cpu	squeezebert_mrpc	87.96%	87.65%	0.36%	47.68	56.26	1.18x
pytorch	1.9.0+cpu	dlrm_fx	80.19%	80.27%	-0.10%	0.00	0.01	1.67x
pytorch	1.9.0+cpu	resnet18_fx	69.61%	69.76%	-0.22%	13.42	26.41	1.97x
pytorch	1.9.0+cpu	xlnet_base_mrpc	89.43%	89.47%	-0.04%	101.99	128.57	1.26x
pytorch	1.9.0+cpu	ctrl_mrpc	82.00%	82.00%	0.00%	474.58	1265.14	2.67x
pytorch	1.9.0+cpu	xlm_mrpc	80.50%	79.56%	1.18%	177.14	536.52	3.03x
pytorch	1.9.0+cpu	maskrcnn_fx	37.70%	37.80%	-0.26%	116.62	179.57	1.54x
pytorch	1.9.0+cpu	ssd_resnet34_fx	19.511	19.63	-0.61%	378.40	1347.00	3.56x

Quantization-aware training models

Framework	Version	Model	INT8 Tuning Accuracy	FP32 Accuracy Baseline	Acc Ratio [(INT8-FP32)/FP32]	INT8 realtime (ms), CLX8280 1s 4c per instance	FP32 realtime (ms), CLX8280 1s 4c per instance	Realtime Latency Ratio [FP32/INT8]
pytorch	1.9.0+cpu	resnet18_qat	69.75%	69.76%	-0.02%	13.66	25.60	1.87x
pytorch	1.9.0+cpu	resnet50_qat	76.05%	76.13%	-0.11%	25.22	54.32	2.15x
pytorch	1.9.0+cpu	resnet18_qat_fx	69.72%	69.76%	-0.05%	13.53	26.72	1.97x
pytorch	1.9.0+cpu	mobilenet_v2_qat	71.45%	71.86%	-0.56%	15.29	22.79	1.49x

MXNet models

Framework	Version	Model	INT8 Tuning Accuracy	FP32 Accuracy Baseline	Acc Ratio [(INT8-FP32)/FP32]	INT8 realtime (ms), CLX8280 1s 4c per instance	FP32 realtime (ms), CLX8280 1s 4c per instance	Realtime Latency Ratio [FP32/INT8]
mxnet	1.7.0	resnet50v1	76.08%	76.33%	-0.32%	6.29	20.85	3.32x
mxnet	1.7.0	inceptionv3	77.73%	77.64%	0.11%	11.18	31.76	2.84x
mxnet	1.7.0	mobilenet1.0	71.69%	72.22%	-0.74%	1.60	3.96	2.48x
mxnet	1.7.0	mobilenetv2_1.0	70.78%	70.87%	-0.12%	1.93	5.33	2.76x
mxnet	1.7.0	resnet18_v1	70.02%	70.14%	-0.17%	3.01	9.49	3.15x
mxnet	1.7.0	squeezenet1.0	56.74%	56.96%	-0.38%	2.38	6.24	2.62x
mxnet	1.7.0	ssd-resnet50_v1	80.21%	80.23%	-0.03%	37.68	178.55	4.74x
mxnet	1.7.0	ssd-mobilenet1.0	74.94%	75.54%	-0.79%	15.28	59.86	3.92x
mxnet	1.7.0	resnet152_v1	78.21%	78.54%	-0.42%	17.79	58.81	3.31x

ONNX models

Framework	Version	Model	INT8 Tuning Accuracy	FP32 Accuracy Baseline	Acc Ratio [(INT8-FP32)/FP32]	INT8 realtime (ms), CLX8280 1s 4c per instance	FP32 realtime (ms), CLX8280 1s 4c per instance	Realtime Latency Ratio [FP32/INT8]
onnxrt	1.8.0	resnet50_v1_5	73.83%	73.99%	-0.22%	11.99	20.62	1.72x
onnxrt	1.8.0	bert_base_mrpc_static	85.29%	86.03%	-0.86%	14.34	32.15	2.24x
onnxrt	1.8.0	bert_base_mrpc_dynamic	85.29%	86.03%	-0.86%	27.57	67.56	2.45x
onnxrt	1.8.0	vgg16	69.45%	69.44%	0.01%	72.53	95.64	1.32x
onnxrt	1.8.0	ssd_mobilenet_v1	22.41%	23.10%	-2.99%	16.27	18.74	1.15x
onnxrt	1.8.0	ssd_mobilenet_v2	23.80%	24.68%	-3.57%	20.59	25.11	1.22x
onnxrt	1.8.0	distilbert_base_mrpc	85.05%	84.56%	0.58%	6.35	17.24	2.72x
onnxrt	1.8.0	mobilebert_mrpc	86.03%	86.27%	-0.28%	15.40	17.52	1.14x
onnxrt	1.8.0	roberta_base_mrpc	88.73%	89.46%	-0.82%	14.08	35.92	2.55x
onnxrt	1.8.0	resnet50-v1-12	74.77%	74.97%	-0.27%	11.13	20.29	1.82x
onnxrt	1.8.0	resnet_v1_5_mlperf	76.11%	76.47%	-0.47%	12.66	20.51	1.62x
onnxrt	1.8.0	mobilenet_v3_mlperf	75.24%	75.39%	-0.20%	3.84	5.76	1.50x
onnxrt	1.8.0	bert_squad_model_zoo	79.93	80.67	-0.91%	91.35	168.07	1.84x
onnxrt	1.8.0	mobilebert_squad_mlperf	89.72	90.03	-0.34%	115.82	122.00	1.05x
