Bug Description
The warning below is sometimes seen while running the model.
WARNING: [Torch-TensorRT - Debug Build] - Using default stream in enqueueV3() may lead to performance issues due to additional calls to cudaStreamSynchronize() by TensorRT to ensure correct synchronization. Please use non-default stream instead.
To Reproduce
It was reproduced with a ResNet model over multiple inference calls. The issue occurs with both use_python_runtime=False and use_python_runtime=True.
import torch
import torch_tensorrt as torchtrt
import torchvision.models as models

model = models.resnet18(pretrained=True).eval().to("cuda")
input = torch.randn((1, 3, 224, 224)).to("cuda")
compile_spec = {
    "inputs": [
        torchtrt.Input(
            input.shape, dtype=torch.float, format=torch.contiguous_format
        )
    ],
    "device": torchtrt.Device("cuda:0"),
    "enabled_precisions": {torch.float},
    "ir": "dynamo",
    "cache_built_engines": False,
    "reuse_cached_engines": False,
    "use_python_runtime": True,  # the warning appears with False as well
}
trt_mod = torchtrt.compile(model, **compile_spec)
# Repeated inference calls on the default stream
for i in range(5):
    trt_mod(input)
# Clean up model env
torch._dynamo.reset()
Expected behavior
No warning message
Environment
Build information about Torch-TensorRT can be found by turning on debug messages
Torch-TensorRT Version (e.g. 1.0.0):
PyTorch Version (e.g. 1.0):
CPU Architecture:
OS (e.g., Linux):
How you installed PyTorch (conda, pip, libtorch, source):
Build command you used (if compiling from source):
Are you using local sources or building from archives:
Python version:
CUDA version:
GPU models and configuration:
Any other relevant information:
Additional context
with torch.cuda.stream(torch.cuda.Stream()):
    trt_mod(input)  # inference with your compiled model
Yes, running the model under a new CUDA stream sets the current stream, and enqueueV3() is then executed on that stream or on another side stream. Fix 3191 allocates and keeps a non-default CUDA stream and uses it for CUDA graph capture/replay and for the model's enqueueV3() call.
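For anyone who needs the workaround before the fix lands, here is a minimal sketch of running a module on a non-default stream while keeping it ordered against the current stream. `run_on_side_stream` is a hypothetical helper name, not part of Torch-TensorRT. The sketch assumes the compiled module can be called like any other `torch.nn.Module`, and it falls back to a plain call when no GPU is present.

```python
import torch


def run_on_side_stream(module, *inputs, stream=None):
    """Call module(*inputs) on a non-default CUDA stream.

    Hypothetical helper for illustration; not part of Torch-TensorRT.
    """
    if not torch.cuda.is_available():
        # No GPU available: there is no stream to switch, so just run the module.
        return module(*inputs)

    if stream is None:
        stream = torch.cuda.Stream()

    current = torch.cuda.current_stream()
    # The side stream must wait for work already queued on the current
    # stream, such as the kernels that produced the inputs.
    stream.wait_stream(current)
    with torch.cuda.stream(stream):
        out = module(*inputs)
    # The current stream must wait for the side stream before the
    # output is consumed by later work.
    current.wait_stream(stream)
    return out
```

With the reproduction script above, the loop body would become `run_on_side_stream(trt_mod, input)`. Creating the stream once and passing it through `stream=` avoids allocating a new stream on every call.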