Fix TRT destroying a runtime before destroying deserialized engines #53937

Merged

Conversation

Contributor

@Tom-Zheng Tom-Zheng commented May 18, 2023

PR types

Bug fixes

PR changes

Others

Description

Currently, TensorRT (TRT) unit tests (UTs) report the following error:

E0326 08:50:28.902647  3101 helper.h:114] 3: [runtime.cpp::~Runtime::346] Error Code 3: API Usage Error (Parameter check failed at: runtime/rt/runtime.cpp::~Runtime::346, condition: mEngineCounter.use_count() == 1. Destroying a runtime before destroying deserialized engines created by the runtime leads to undefined behavior. 

The error is self-explanatory: destroying a runtime before destroying the deserialized engines it created is not allowed. This PR fixes the issue.
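The ordering rule in the error message can be illustrated with a small, TensorRT-free sketch. The diff of this PR is not shown in this conversation, so the code below is not the actual change. `MockRuntime`, `MockEngine`, `EngineHolder`, and `DestructionOrder` are hypothetical names that only model the lifetime rule. The sketch relies on the C++ guarantee that class members are destroyed in reverse order of declaration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

using Log = std::vector<std::string>;

// Hypothetical stand-in for a TensorRT runtime. It counts live engines,
// loosely modelling the engine counter mentioned in the error message.
struct MockRuntime {
  explicit MockRuntime(Log* log) : log_(log) {}
  ~MockRuntime() {
    // No engine may outlive the runtime that created it.
    assert(live_engines_ == 0);
    log_->push_back("runtime");
  }
  Log* log_;
  int live_engines_ = 0;
};

// Hypothetical stand-in for an engine deserialized by a runtime.
struct MockEngine {
  explicit MockEngine(MockRuntime* runtime) : runtime_(runtime) {
    ++runtime_->live_engines_;
  }
  ~MockEngine() {
    --runtime_->live_engines_;
    runtime_->log_->push_back("engine");
  }
  MockRuntime* runtime_;
};

// Members are destroyed in reverse declaration order, so declaring the
// runtime before the engine releases the engine first.
class EngineHolder {
 public:
  explicit EngineHolder(Log* log)
      : runtime_(new MockRuntime(log)),
        engine_(new MockEngine(runtime_.get())) {}

 private:
  std::unique_ptr<MockRuntime> runtime_;  // declared first, destroyed last
  std::unique_ptr<MockEngine> engine_;    // declared last, destroyed first
};

// Builds and destroys a holder, returning the recorded destruction order.
Log DestructionOrder() {
  Log log;
  { EngineHolder holder(&log); }
  return log;
}
```

Calling `DestructionOrder()` records the engine's destructor before the runtime's. Swapping the two member declarations in `EngineHolder` would reverse that order and trip the assertion in `~MockRuntime`, which is the situation the TensorRT error message describes.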

paddle-bot bot commented May 18, 2023

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

paddle-bot bot added the "contributor" (External developers) and "status: proposed" labels on May 18, 2023

paddle-bot bot commented May 18, 2023

❌ The PR was not created using the PR template. You can refer to this Demo.
Please use the PR template; it saves our maintainers' time so that more developers can get help.

@jeng1220
Collaborator

@tianshuo78520a,
PR-CI-Coverage flags the DLA code as not covered by tests, but this PR has nothing to do with DLA. Could you please approve it manually?

@jeng1220
Collaborator

The PR-CI-GpuPS-PSLIB log shows:

/workspace/Paddle/build/third_party/install/pslib/include/proto/ps.pb.h:17:2: error: \
#error This file was generated by an older version of protoc which is

This PR does not touch any protocol buffer code either. The problem seems to be in the CI setup?
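For context, the `#error` in `ps.pb.h` appears to come from the version guard that protoc writes near the top of each generated header, which compares the protoc version that produced the file with the protobuf headers it is compiled against. The sketch below models that comparison under stated assumptions: `Guard` and `CheckGeneratedHeader` are hypothetical names, and the version numbers in the usage note are made-up values in the major * 1000000 + minor * 1000 + patch encoding. It is not the generated code itself.

```cpp
#include <cassert>

// Outcome of the compatibility check between a generated header and the
// protobuf headers installed in the build environment.
enum class Guard { kOk, kHeaderTooNew, kHeaderTooOld };

// generated_with:     version of the protoc that produced the .pb.h file.
// runtime_version:    version of the protobuf headers used for compiling.
// runtime_min_protoc: oldest protoc output those headers still accept.
constexpr Guard CheckGeneratedHeader(int generated_with,
                                     int runtime_version,
                                     int runtime_min_protoc) {
  return runtime_version < generated_with       ? Guard::kHeaderTooNew
         : generated_with < runtime_min_protoc  ? Guard::kHeaderTooOld
                                                : Guard::kOk;
}

// Compile-time example with hypothetical versions: a header generated by
// protoc 3.1.0 is rejected by headers that require protoc >= 3.21.0.
static_assert(CheckGeneratedHeader(3001000, 3021012, 3021000) ==
                  Guard::kHeaderTooOld,
              "an old generated header should be rejected");
```

Under this model, the "older version of protoc" message depends only on which pregenerated header and which protobuf installation the CI machine pairs together, which is consistent with the failure being independent of the source changes in a PR.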

@tianshuo78520a
Collaborator

Details

This job is not marked Required, so it does not prevent the PR from being merged normally.

@tianshuo78520a
Collaborator

@tianshuo78520a, PR-CI-Coverage flags the DLA code as not covered by tests, but this PR has nothing to do with DLA. Could you please approve it manually?

It has been exempted.

@jeng1220
Collaborator

The PR-CI-Windows log shows:

[2827/3874] Building CUDA object paddle\fluid\operators\CMakeFiles\paddle_operators_unity.dir\paddle_operators_unity_5_cu.cu.obj
FAILED: paddle/fluid/operators/CMakeFiles/paddle_operators_unity.dir/paddle_operators_unity_5_cu.cu.obj 
C:\Python37\Scripts\sccache.exe C:\PROGRA~1\NVIDIA~2\CUDA\v10.2\bin\nvcc.exe ... paddle_operators_unity_5_cu.cu
../paddle/fluid/framework/eigen.h(159): error C2065: 'NumIndices': undeclared identifier

The error occurs in eigen.h and is also unrelated to this PR.
It looks like an existing problem on the develop branch.

@tianshuo78520a
Collaborator

On Windows this may be caused by the cache. I am trying to rerun the CI.

@jeng1220
Collaborator

The error is still in Eigen:

c:\home\workspace\paddle\third_party\eigen3\eigen\src/Core/GlobalFunctions.h(88): \
error C2065: 'Dest': undeclared identifier

@tianshuo78520a
Collaborator

It has passed now. CheckPRTemplate can be re-triggered by modifying the PR template.

@jeng1220
Collaborator

@yuanlehome, @qingqing01, and @zhangjun,
All CI tests have passed. Please help review the code.
Thanks.

Contributor

@yuanlehome yuanlehome left a comment

LGTM

Contributor

@zhangjun zhangjun left a comment

LGTM

@zhangjun zhangjun merged commit 6e0cf61 into PaddlePaddle:develop May 23, 2023
Labels
contributor External developers NVIDIA