
🐛 [Bug] Warning as default stream was used in enqueueV3() #3190

Closed
keehyuna opened this issue Sep 27, 2024 · 2 comments · Fixed by #3191
Comments

@keehyuna
Collaborator

Bug Description

Sometimes the warning below is seen while running the model.

WARNING: [Torch-TensorRT - Debug Build] - Using default stream in enqueueV3() may lead to performance issues due to additional calls to cudaStreamSynchronize() by TensorRT to ensure correct synchronization. Please use non-default stream instead.

To Reproduce

It was reproduced with a ResNet model and multiple inference calls. The issue occurs with both use_python_runtime=False and use_python_runtime=True.

import torch
import torch_tensorrt as torchtrt
import torchvision.models as models

model = models.resnet18(pretrained=True).eval().to("cuda")
input = torch.randn((1, 3, 224, 224)).to("cuda")
compile_spec = {
    "inputs": [
        torchtrt.Input(
            input.shape, dtype=torch.float, format=torch.contiguous_format
        )
    ],
    "device": torchtrt.Device("cuda:0"),
    "enabled_precisions": {torch.float},
    "ir": "dynamo",
    "cache_built_engines": False,
    "reuse_cached_engines": False,
    "use_python_runtime": True,
}

trt_mod = torchtrt.compile(model, **compile_spec)
for i in range(5):
    trt_mod(input)
# Clean up model env
torch._dynamo.reset()
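To confirm the condition that triggers the warning, one can check whether the current stream is CUDA's default stream before calling the compiled module. A minimal sketch (the helper name `on_default_stream` is mine, not from the issue):

```python
import torch

def on_default_stream() -> bool:
    # True when work submitted now would run on CUDA's default stream,
    # which is what makes TensorRT warn inside enqueueV3().
    if not torch.cuda.is_available():
        return True  # no GPU: only the default stream exists
    return torch.cuda.current_stream() == torch.cuda.default_stream()
```
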

Expected behavior

No warning message

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • Torch-TensorRT Version (e.g. 1.0.0):
  • PyTorch Version (e.g. 1.0):
  • CPU Architecture:
  • OS (e.g., Linux):
  • How you installed PyTorch (conda, pip, libtorch, source):
  • Build command you used (if compiling from source):
  • Are you using local sources or building from archives:
  • Python version:
  • CUDA version:
  • GPU models and configuration:
  • Any other relevant information:

Additional context

@keehyuna keehyuna added the bug Something isn't working label Sep 27, 2024
@keehyuna keehyuna self-assigned this Sep 27, 2024
@sean-xiang-applovin

I have seen this too, my solution is to:

with torch.cuda.stream(torch.cuda.Stream()):
    # inference with your compiled model
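Fleshed out, this workaround might look as follows. `infer_on_side_stream` is a hypothetical helper name; `model` stands in for the compiled trt_mod from the repro above:

```python
import torch

def infer_on_side_stream(model, x):
    # Run model(x) on a non-default CUDA stream, then synchronize back,
    # so enqueueV3() does not execute on the default stream.
    if not torch.cuda.is_available():
        return model(x)  # CPU fallback: streams do not apply
    side_stream = torch.cuda.Stream()
    # Wait for work already queued on the current (default) stream.
    side_stream.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(side_stream):
        out = model(x)
    # Make the default stream wait before the output is consumed.
    torch.cuda.current_stream().wait_stream(side_stream)
    return out
```

The cross-stream waits matter: without them, work on the side stream could race against tensors produced or consumed on the default stream.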

@keehyuna
Collaborator Author

I have seen this too, my solution is to:

with torch.cuda.stream(torch.cuda.Stream()):
    # inference with your compiled model

Yes, running the model under a new CUDA stream sets the current stream, and enqueueV3() will then execute on this stream or another side stream. Fix #3191 allocates and keeps a non-default CUDA stream and uses it for CUDA graph capture/replay and the model's enqueueV3() call.
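A simplified sketch of that idea (not the actual #3191 code, which lives in the Torch-TensorRT runtime): the runtime allocates one non-default stream up front and routes every execution through it, so callers never hit the default stream:

```python
import torch

class StreamedRuntime:
    # Illustrative stand-in for a runtime module: allocate one
    # non-default stream at init and reuse it for every execution.
    def __init__(self, engine_fn):
        self.engine_fn = engine_fn  # stands in for enqueueV3()
        self.stream = torch.cuda.Stream() if torch.cuda.is_available() else None

    def __call__(self, x):
        if self.stream is None:
            return self.engine_fn(x)  # CPU fallback for illustration
        # Order the kept stream after work queued on the caller's stream.
        self.stream.wait_stream(torch.cuda.current_stream())
        with torch.cuda.stream(self.stream):
            out = self.engine_fn(x)
        # Let the caller's stream wait before consuming the output.
        torch.cuda.current_stream().wait_stream(self.stream)
        return out
```

Keeping a single long-lived stream also avoids paying stream creation cost on every inference call.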
