Support standard ONNX quantized operators? #9

Closed

Nullkooland opened this issue Oct 4, 2021 · 3 comments

Comments

@Nullkooland

Hi, I've been looking for an end-to-end quantization deployment solution. So far I've tried the onnxruntime + TVM stack, but onnxruntime only supports naive PTQ methods. Your work looks really promising; however, I wonder whether MQBench can export the quantized model in the form of standard ONNX quantized ops like QLinearConv, QuantizeLinear, DequantizeLinear, etc.?

See: apache/tvm#8838

@Tracin
Contributor

Tracin commented Oct 8, 2021

We consider deploying quantized models on TVM important; deployment on TVM with QuantDequant nodes will be supported very soon. Actually, we have already done this in a PTQ scheme for experiments.

@Nullkooland
Author

@Tracin
I noticed that v0.0.3 adds ONNX QNN op support; however, I cannot export an ONNX QNN model when running the example code in test_quantize_onnxqnn in test_backend.py:

No Op registered for LearnablePerTensorAffine with domain_version of 11

It seems that ONNX does not support the custom fake-quantize op LearnablePerTensorAffine in the quantized model. I also noticed that there's a function deploy_qparams_tvm which uses the ONNXQNNPass, in which fake-quantize ops are replaced with standard ONNX QNN ops, but this function is not called during convert_deploy. Is there any documentation explaining how to use these utilities to export a standard ONNX QNN model?

@Tracin
Copy link
Contributor

Tracin commented Nov 5, 2021

ONNXQNNPass is registered under the ONNX_QNN scheme; if you prepare and convert_deploy with backend=ONNX_QNN, it will be called.
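
A minimal sketch of that flow, pieced together from this comment and the test in test_backend.py (the calibration helpers and the exact convert_deploy arguments here are my assumptions and may differ across MQBench versions):

```python
import torch
import torchvision.models as models

from mqbench.prepare_by_platform import prepare_by_platform, BackendType
from mqbench.convert_deploy import convert_deploy
from mqbench.utils.state import enable_calibration, enable_quantization

# Trace the FP32 model and insert fake-quantize nodes for the ONNX QNN backend.
model = models.resnet18(pretrained=True).eval()
model = prepare_by_platform(model, BackendType.ONNX_QNN)

# Collect quantization statistics on calibration batches (random data here,
# just for illustration; use real calibration data in practice).
enable_calibration(model)
with torch.no_grad():
    model(torch.randn(1, 3, 224, 224))
enable_quantization(model)

# With backend=ONNX_QNN, convert_deploy should run ONNXQNNPass, replacing the
# fake-quantize ops with standard QuantizeLinear/DequantizeLinear QNN nodes.
convert_deploy(model, BackendType.ONNX_QNN,
               input_shape_dict={'data': [1, 3, 224, 224]},
               model_name='resnet18_onnx_qnn')
```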
