I used pytorch-quantization to apply INT8 post-training quantization (PTQ) to ResNet50, exported the model to ONNX, and then converted it to an engine (engine.trt). During inference, the engine was not faster; it was actually slower. What went wrong? #6715