TensorRt result is different from onnx model in parseq model #3136
Comments
I can reproduce the issue with TRT 8.6.1, but it looks like the issue has been fixed in our latest internal code (commit id 94d9acac4e3), so please wait for the new release.
cc @nvpohanh who may know the fix info.
I suspect this was caused by accumulated numerical error, since I see there are a lot of transformer blocks.
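The accumulation effect described above can be illustrated with a minimal NumPy sketch: the same chain of matmuls (a toy stand-in for stacked transformer blocks, not the actual parseq graph) is run in FP64, FP32, and FP16, and the rounding error grows as precision drops.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_blocks = 64, 12
# Shared random weights, mimicking a chain of transformer-style matmuls.
weights = [rng.standard_normal((dim, dim)) / np.sqrt(dim) for _ in range(n_blocks)]
x0 = rng.standard_normal((1, dim))

def run_chain(dtype):
    # Run the same chain of matmuls, rounding to `dtype` after every layer.
    y = x0.astype(dtype)
    for w in weights:
        y = (y @ w.astype(dtype)).astype(dtype)
    return y.astype(np.float64)

ref = run_chain(np.float64)  # high-precision reference
err_fp32 = np.abs(run_chain(np.float32) - ref).max()
err_fp16 = np.abs(run_chain(np.float16) - ref).max()
print(f"max abs error vs fp64: fp32={err_fp32:.2e}, fp16={err_fp16:.2e}")
```

With 12 stacked blocks the FP16 error ends up several orders of magnitude above the FP32 error, which is consistent with the engine diverging much more under --fp16.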
I have the same problem with VitStr, which is based on the timm vision transformer. It seems that the problem is related to the vision transformer.
@nvpohanh Can you give me an approximate time for the next release?
With TRT 8.6, could you try exporting the ONNX model with opset 17 or above so that the LayerNorms use the fused LayerNormalization op? I don't have an ETA for the next TRT release yet.
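For reference, a LayerNorm over the last axis computes the following; a minimal NumPy sketch (ONNX opset 17 can express this as a single LayerNormalization node, whereas older opsets decompose it into a chain of primitive ops like ReduceMean/Sub/Div). The shapes and gamma/beta values here are hypothetical.

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    # Normalize over the last axis, then apply the learned scale and bias.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

x = np.random.default_rng(0).standard_normal((2, 8))
out = layer_norm(x, gamma=np.ones(8), beta=np.zeros(8))
print(out.mean(axis=-1))  # ~0 per row after normalization
```

Keeping the operation as one node gives the backend the whole reduction at once instead of a chain of separately-scheduled (and separately-rounded) primitive ops.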
The same problem remains with TRT 8.6 and opset 17. ONNX model with opset 17:
Thanks for your advice. I added Cast layers to the 11 encoder blocks. The TRT version of the model works better, but the ONNX checker does not pass validation. I computed the cosine similarity between the torch model result and the TRT model result, and the similarity was 0.9578. Modified ONNX model:
Could you give it a try and see if it solves the e2e accuracy issue? I am just wondering if the MHA part is the issue, or if there are other issues. I am trying this because in the next TRT version we have some heuristics to force some MatMuls in MHA to run in FP32, and I wonder if that explains why @zerollzeng was able to get better accuracy with the internal version of TRT.
I ran the torch and TRT models on the test data set and the result is:
Do you have any idea how I can achieve a better result?
I see. So adding Casts did recover the accuracy to some extent, but not fully. Several more things to experiment with:
The more Casts added, the slower it gets, but the better the accuracy. So the task is to find out which layers are most sensitive to FP16 precision and to run those layers in FP32.
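The layer-sensitivity scan described above can be sketched in NumPy on a toy matmul chain (a hypothetical stand-in, not the actual parseq graph): demote one layer at a time to FP16, measure the output error each demotion causes, and keep the worst offenders in FP32.

```python
import numpy as np

rng = np.random.default_rng(1)
dim, n_layers = 32, 6
# Toy stand-in for the network: a chain of matmul "layers".
weights = [rng.standard_normal((dim, dim)) / np.sqrt(dim) for _ in range(n_layers)]
x0 = rng.standard_normal((1, dim))

def forward(fp16_layers):
    # Run the chain in FP64, demoting only the chosen layers to FP16.
    y = x0.astype(np.float64)
    for i, w in enumerate(weights):
        if i in fp16_layers:
            y = (y.astype(np.float16) @ w.astype(np.float16)).astype(np.float64)
        else:
            y = y @ w
    return y

ref = forward(set())
# Demote one layer at a time and record the output error it causes;
# the layers with the largest error are the candidates to pin to FP32.
errors = {i: float(np.abs(forward({i}) - ref).max()) for i in range(n_layers)}
ranking = sorted(errors, key=errors.get, reverse=True)
print("layers ranked by FP16 sensitivity:", ranking)
```

In a real TensorRT workflow the same idea applies, except the "demotion" is controlled by where the Cast nodes (or per-layer precision constraints) are placed in the actual graph.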
I had a mistake in the preprocessing stage. I added Cast layers to the 11 encoder blocks, and the TRT version of the model now works exactly the same as the original torch model.
Closing since it is solved. Thanks all!
NGC 23.09 is still 8.6.1.6.
Have you ever tried dynamic input? Can you support it?
Cast layers? In the pytorch model or the onnx model?
I use onnx-modifier.
Can anyone share how to load .engine weights for inference?
https://github.com/fabio-sim/LightGlue-ONNX/blob/main/trt_infer.py
Hi, is this issue fixed with the latest TensorRT (10.x.x)?
Hi, is there any drive link that is working right now, as all of the above are not?
Description
I converted the parseq OCR model from PyTorch to ONNX and tested the ONNX model; everything is OK. But when I convert the ONNX model to an FP32 or FP16 TensorRT engine, the output of the model is very different from the ONNX model.
I use onnxsim to simplify the ONNX model; if I don't use onnxsim, all results are NaN.
model repo : https://github.com/baudm/parseq
Environment
TensorRT Version: TensorRT-8.6.1.6
NVIDIA GPU: RTX 3060
NVIDIA Driver Version: 531.79
CUDA Version: cuda-12.0
CUDNN Version: cudnn-8.9.1.23_cuda12
Operating System: Win 10
Python Version: 3.8
PyTorch Version: 1.13
ONNX opset: 14
Relevant Files
onnx model: https://drive.google.com/file/d/1CRXsD8Zk5Mo50JYCZytrAtBbFm2oOqvc/view?usp=sharing
trtexec.exe --onnx=parseq/test.onnx --workspace=10000 --saveEngine=parseq/test_fp32.trs --verbose
trt engine fp32: https://drive.google.com/file/d/17eecl4QrRrE1BiLqDE8HJT0wZCVm3BkB/view?usp=sharing
trt engine fp32 log: https://drive.google.com/file/d/1i9KkbKainaNIz5QQvolmScIu53DzFHHv/view?usp=sharing
trtexec.exe --onnx=parseq/test.onnx --fp16 --workspace=10000 --saveEngine=parseq/test_fp16.trs --verbose
trt engine fp16: https://drive.google.com/file/d/1CIzRZ-71a2hXZWnMNtWn7k2tuM3Pi6K_/view?usp=sharing
trt engine fp16 log: https://drive.google.com/file/d/15LOBtarM6RZiiyZaz66qt6Z8nu67JyrN/view?usp=sharing
Steps To Reproduce
I wrote sample code to compare the similarity of the ONNX and TRT inference results. When I use real data, the mean similarity is 0.3; when I use random numbers, it is near 0.85.
sample code:
https://drive.google.com/file/d/1dLo9iD3ZUPVuvU6LNFnwQSCjcLDTiKQr/view?usp=sharing
sample real data:
https://drive.google.com/file/d/1VtQgOYw5ZYQSZmUOGyJ7xPKElC7caFMl/view?usp=sharing
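The similarity metric used in the comparison above can be sketched as follows; onnx_out and trt_out are hypothetical stand-ins for the two backends' outputs (real code would take them from onnxruntime and the deserialized TensorRT engine).

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two flattened output tensors; 1.0 means the
    # two outputs point in exactly the same direction.
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom else 0.0

# Hypothetical stand-ins for the ONNX and TRT outputs.
onnx_out = np.array([0.10, 0.90, 0.05])
trt_out = np.array([0.12, 0.88, 0.06])
print(f"cosine similarity: {cosine_similarity(onnx_out, trt_out):.4f}")
```

Note that cosine similarity ignores overall scale, so it is worth also checking max absolute difference when diagnosing FP16 issues.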