I used pytorch-quantization to perform PTQ INT8 quantization on ResNet50, exported the model to ONNX, and then built a TensorRT engine from it. At inference time the speed did not increase; it actually slowed down. What went wrong? #4304

Open
jishenghuang opened this issue Dec 30, 2024 · 4 comments
Labels
Module:Performance General performance issues triaged Issue has been triaged by maintainers

Comments

@jishenghuang

Description

Environment

TensorRT Version: 10.7

NVIDIA GPU: RTX 3090

NVIDIA Driver Version:

CUDA Version: 11.7

CUDNN Version:

Operating System:

Python Version (if applicable): 3.10

Tensorflow Version (if applicable):

PyTorch Version (if applicable):

Baremetal or Container (if so, version):

Relevant Files

Model link:

Steps To Reproduce

Commands or scripts:

Have you tried the latest release?:

Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):

@lix19937

lix19937 commented Jan 1, 2025

Can you upload the ONNX model?

@jishenghuang
Author

I exported both quantized and unquantized models with fixed and dynamic batch sizes, and found that the inference speed did not increase in either case. Here are the ONNX models I exported.
1. Fixed batch:
Quantized:
Image

Unquantized:
Image

2. Dynamic batch:
Quantized:
Image

Unquantized:
Image
I was unable to upload the ONNX models, so I am using screenshots instead. If necessary, I can add your contact information and send you these ONNX models.
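For readers who want to reproduce this four-way comparison (quantized vs. unquantized, fixed vs. dynamic batch), here is a minimal sketch that assembles `trtexec` command lines for each variant. It is not the author's script. The file names, the input tensor name `input`, the 3x224x224 image shape, and the batch range are all assumptions for illustration. The sketch enables `--fp16` together with `--int8` for the quantized model, on the assumption that layers left unquantized would otherwise run in FP32, which is one possible reason an INT8 engine fails to beat its baseline.

```python
import shlex


def trtexec_command(onnx_path, int8=False, fp16=True, input_name="input",
                    chw=(3, 224, 224), dynamic_batches=None):
    """Build a trtexec argument list for one model variant.

    input_name, chw and dynamic_batches are hypothetical defaults; check the
    real input name and shape of the exported ONNX model before using them.
    """
    cmd = ["trtexec", f"--onnx={onnx_path}"]
    if int8:
        cmd.append("--int8")   # allow INT8 kernels for the Q/DQ-annotated layers
    if fp16:
        cmd.append("--fp16")   # allow FP16 for layers that are not quantized

    def shape(batch):
        # trtexec shape syntax: name:NxCxHxW
        return f"{input_name}:" + "x".join(str(d) for d in (batch, *chw))

    if dynamic_batches is not None:
        lo, opt, hi = dynamic_batches
        cmd += [f"--minShapes={shape(lo)}",
                f"--optShapes={shape(opt)}",
                f"--maxShapes={shape(hi)}"]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names for the four exported models.
    variants = [
        ("resnet50_fixed_qdq.onnx", True, None),
        ("resnet50_fixed_fp.onnx", False, None),
        ("resnet50_dynamic_qdq.onnx", True, (1, 8, 32)),
        ("resnet50_dynamic_fp.onnx", False, (1, 8, 32)),
    ]
    for path, quantized, batches in variants:
        print(shlex.join(trtexec_command(path, int8=quantized,
                                         dynamic_batches=batches)))
```

Running each printed command and comparing the throughput and latency that `trtexec` reports gives a like-for-like comparison, since both the quantized and unquantized engines are then built with the same reduced-precision options.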

@lix19937

> If necessary, I can add your contact information and send you these onnx models

lix19937@126.com

@jishenghuang
Author

> If necessary, I can add your contact information and send you these onnx models
>
> lix19937@126.com

You're from China too, right? I'll email them to you shortly.

@LeoZDong LeoZDong added Module:Performance General performance issues triaged Issue has been triaged by maintainers labels Feb 11, 2025