Failure of TensorRT 10.7 to eliminate concatenation with upstream custom layer #4345
Labels: Documentation (lack of clarity in documentation), Enhancement (new feature or request), triaged (issue has been triaged by maintainers)
Description
It seems that TensorRT cannot eliminate a concatenation layer when one of its inputs comes from an upstream custom layer.
In a simple model that uses only standard operators, TensorRT engine building eliminates the concatenation, but after I replaced Add with a CustomAdd plugin that does the same thing as Add, engine building no longer eliminates the concatenation.
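For illustration, a hypothetical ONNX graph with the same pattern (the real models are sac16.onnx and sac16c.onnx in the linked repo; the names and shapes below are made up):

```python
# Minimal Add -> Concat pattern; swapping "Add" for a custom op (e.g. a
# "CustomAdd" registered under a custom domain) reproduces the issue.
import onnx
from onnx import TensorProto, helper

a = helper.make_tensor_value_info("a", TensorProto.FLOAT, [1, 16])
b = helper.make_tensor_value_info("b", TensorProto.FLOAT, [1, 16])
c = helper.make_tensor_value_info("c", TensorProto.FLOAT, [1, 16])
out = helper.make_tensor_value_info("out", TensorProto.FLOAT, [1, 32])

add_node = helper.make_node("Add", ["a", "b"], ["s"])
concat_node = helper.make_node("Concat", ["s", "c"], ["out"], axis=1)

graph = helper.make_graph([add_node, concat_node], "add_concat", [a, b, c], [out])
onnx.save(helper.make_model(graph), "add_concat.onnx")
```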
This failure to eliminate concatenation diminishes the benefit of using plugins whenever a plugin's output feeds a concatenation layer, especially in terms of reducing the number of kernels: when the concatenation is not eliminated, extra `copyVectorizedKernel` kernels are typically launched to do the copying. From the engine-building log, the failure appears to be related to a concept called "striding support", but I could not find any documentation on it, especially in relation to plugins.
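My guess is that "striding support" is tied to the I/O formats a plugin advertises during format negotiation, since that is the only place a plugin describes its memory-layout capabilities to the builder. A sketch of that hook, assuming an IPluginV2DynamicExt-style plugin in Python (the repo's actual plugin may differ):

```python
import tensorrt as trt

class CustomAddPlugin(trt.IPluginV2DynamicExt):
    # clone(), get_output_dimensions(), enqueue(), etc. omitted; see the
    # linked repo for the complete plugin.

    def supports_format_combination(self, pos, in_out, num_inputs):
        # Advertise plain linear FP32 for every input/output. If eliminating
        # the downstream concatenation requires advertising strided layouts
        # here, I could not find documentation saying how.
        desc = in_out[pos]
        return desc.type == trt.DataType.FLOAT and desc.format == trt.TensorFormat.LINEAR
```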
My goal is for the concatenation to be eliminated in the case involving custom layers as well, so that there are no unnecessary `copyVectorizedKernel` kernels. If the current behavior is by design, the documentation should mention this caveat about using plugins.
Environment
TensorRT Version: 10.7
NVIDIA GPU: RTX 3080
NVIDIA Driver Version: 565.57.01
CUDA Version: 12.7
CUDNN Version: N/A
Operating System: Ubuntu 24.04
Python Version (if applicable): 3.12 (but irrelevant)
Tensorflow Version (if applicable): N/A
PyTorch Version (if applicable): N/A
Baremetal or Container (if so, version): baremetal
Relevant Files
https://github.com/jchia/trt-copy contains all the details needed to reproduce the problem.
Steps To Reproduce
With the content of the repo at https://github.com/jchia/trt-copy, follow the steps in https://github.com/jchia/trt-copy/blob/master/README.md.
The output of the engine-building steps indicates that the concatenation is eliminated when Add is used but not when CustomAdd is used. Details are explained in the README.md.
In particular, for the model with Add (sac16.onnx), the build log contains lines showing the concatenation being eliminated, while for the model with CustomAdd (sac16c.onnx) it does not (see the README.md for the exact log excerpts).
Commands or scripts:
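The exact commands are in the README; as a rough sketch of the kind of script that builds the engine and lists its layers (file names here are illustrative, and the CustomAdd plugin must already be registered):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.VERBOSE)
builder = trt.Builder(logger)
network = builder.create_network(0)
parser = trt.OnnxParser(network, logger)
with open("sac16c.onnx", "rb") as f:
    assert parser.parse(f.read()), parser.get_error(0)

config = builder.create_builder_config()
config.profiling_verbosity = trt.ProfilingVerbosity.DETAILED  # keep per-layer detail

plan = builder.build_serialized_network(network, config)
engine = trt.Runtime(logger).deserialize_cuda_engine(plan)

# List the layers that survived into the engine; with CustomAdd, extra
# copy/reformat layers appear around the concatenation (these are what
# launch the copyVectorizedKernel kernels at runtime).
inspector = engine.create_engine_inspector()
print(inspector.get_engine_information(trt.LayerInformationFormat.ONELINE))
```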
Have you tried the latest release?: No
Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (`polygraphy run <model.onnx> --onnxrt`): Haven't tried, but it runs on TensorRT, just suboptimally.