Cannot find op_type: "LayerNormalization" when converting the ONNX model, using TensorRT 8.6 #2875
Comments
The model downloaded from the above link is controlnet_opt.onnx. Would you please double-check?
Yes, this is the right model; I renamed it to unet.opt.onnx.
And I just noticed that if I do not use a dynamic image shape, the conversion succeeds.
I feel like your dynamic shape is invalid.
And I didn't see the parsing error.
https://drive.google.com/file/d/1I_l0eOIf_Y4aItCWeJDUpOOqJeKf8zez/view?usp=share_link
[04/15/2023-06:06:42] [E] [TRT] ModelImporter.cpp:726: While parsing node number 293 [LayerNormalization -> "/down_blocks.0/attentions.0/transformer_blocks.0/norm1/LayerNormalization_output_0"]:
[04/15/2023-06:06:42] [E] [TRT] ModelImporter.cpp:729: --- End node ---
You said the version is 8.6, but the log above shows [TensorRT v8503].
Hi, this is the whole log from TensorRT 8.6; I use TensorRT 8.6:
&&&& RUNNING TensorRT.trtexec [TensorRT v8600] # /workspace/out/trtexec --onnx=unet.opt.onnx --saveEngine=unet.opt.plan --minShapes=sample:2x4x32x32,encoder_hidden_states:2x77x768,controlnet_cond:2x3x256x256 --optShapes=sample:4x4x64x64,encoder_hidden_states:4x77x768,controlnet_cond:4x3x512x512 --maxShapes=sample:8x4x128x128,encoder_hidden_states:8*77x768,controlnet_cond:4x3x1024x1024
[04/18/2023-02:10:30] [E] [TRT] ModelImporter.cpp:729: --- End node ---
Hello, can you tell me how you converted ControlNet to ONNX alone?
Yes, I changed the code of demo/Diffusion: I added the model to model.py and exported the model's ONNX (overriding its get_model and get_input_names methods); an export sketch is shown below. Without a dynamic shape it works correctly, but with a dynamic shape it gives me an error about the LayerNorm plugin. I also found a much easier way to reproduce this issue: just add the --build-dynamic-shape flag.
[I] Configuring with profiles: [Profile().add('sample', min=(2, 4, 32, 32), opt=(6, 4, 64, 64), max=(8, 4, 128, 128)).add('encoder_hidden_states', min=(2, 77, 768), opt=(6, 77, 768), max=(8, 77, 768)).add('timestep', min=[1], opt=[1], max=[1])]
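For anyone asking how the export was done: below is a minimal sketch of exporting a diffusers ControlNet to ONNX with dynamic shapes. The checkpoint name and the Wrapper class are hypothetical, not from the original code; the input names and opset 17 match the trtexec command and log in this thread (opset 17 is what emits the LayerNormalization op).

```python
# Minimal sketch: export a diffusers ControlNet to ONNX with dynamic shapes.
# The checkpoint name and Wrapper class are hypothetical; input names and
# opset 17 match the log in this thread (opset 17 emits LayerNormalization).
import torch
from diffusers import ControlNetModel


class Wrapper(torch.nn.Module):
    """Unpacks the diffusers output so ONNX export sees plain tensors."""

    def __init__(self, net):
        super().__init__()
        self.net = net

    def forward(self, sample, timestep, encoder_hidden_states, controlnet_cond):
        return self.net(sample, timestep, encoder_hidden_states,
                        controlnet_cond, return_dict=False)


controlnet = Wrapper(
    ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny")  # hypothetical checkpoint
).eval()

# Dummy inputs at the "opt" profile shapes from the trtexec command.
sample = torch.randn(2, 4, 64, 64)
timestep = torch.tensor([1.0])
encoder_hidden_states = torch.randn(2, 77, 768)
controlnet_cond = torch.randn(2, 3, 512, 512)

torch.onnx.export(
    controlnet,
    (sample, timestep, encoder_hidden_states, controlnet_cond),
    "unet.opt.onnx",
    opset_version=17,
    input_names=["sample", "timestep", "encoder_hidden_states", "controlnet_cond"],
    dynamic_axes={
        "sample": {0: "batch", 2: "height", 3: "width"},
        "encoder_hidden_states": {0: "batch"},
        "controlnet_cond": {0: "batch", 2: "cond_height", 3: "cond_width"},
    },
)
```

With opset_version below 17, torch.onnx instead decomposes LayerNorm into primitive ops, which is why the plugin lookup only fails for opset-17 exports.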
You can reproduce this issue with the official demo code.
Are you able to reproduce this issue? Do you need any other information?
Let me answer this issue by myself: in this case, you still need to recompile TensorRT 8.6 and set LD_LIBRARY_PATH to the TensorRT .so files.
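If you go the rebuilt-plugin route, a quick way to confirm that the rebuilt libnvinfer_plugin.so is actually the one being picked up is to load it explicitly and list the registered plugin creators. A minimal sketch, assuming the /workspace/out path used in this thread:

```python
# Sketch: confirm the rebuilt plugin library loads and see which plugin
# creators are registered. The .so path is the one used in this thread.
import ctypes
import tensorrt as trt

ctypes.CDLL("/workspace/out/libnvinfer_plugin.so")  # force-load the rebuilt library

logger = trt.Logger(trt.Logger.WARNING)
trt.init_libnvinfer_plugins(logger, "")             # register all plugin creators

registry = trt.get_plugin_registry()
print([c.name for c in registry.plugin_creator_list])
```

If no LayerNormalization-style creator appears in that list, trtexec will fail with the same "Plugin not found" assertion shown in the log above.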
Same problem in TensorRT 8.6. Tried to use onnx-simplifier, but it didn't work.
Make sure the TensorRT version is higher than 8.6 and the model is exported with torch.onnx.export(..., opset_version >= 17); then this issue can be resolved.
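As a quick sanity check that your installed TensorRT build parses the LayerNormalization node natively (no plugin needed), you can run the ONNX parser from Python. A minimal sketch, assuming the model file name from this thread:

```python
# Sketch: check whether the installed TensorRT parses LayerNormalization
# natively. Model file name taken from this thread.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
print("TensorRT version:", trt.__version__)  # native LayerNorm needs >= 8.6

builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("unet.opt.onnx", "rb") as f:
    ok = parser.parse(f.read())

if ok:
    print("Parsed OK: LayerNormalization is supported natively.")
else:
    for i in range(parser.num_errors):
        print(parser.get_error(i))
```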
"tensorrt version is higher than 8.6" ,u mean the tensorrt 10 or just 8.6.1. I met the same problem when convert sam2_model.onnx to tensorrt engine file using tensorRT version 8.6.1 |
Hi @zerollzeng @brainzha @dongjinxin123, I'm having the same issue. I'm using TensorRT version 8.4.1 (I know this is an old one, but all our other models depend on it, so I can't afford to change it). How do I make it work in 8.4.1? Do I need to write a custom plugin? Please help me with this, thanks!
[09/18/2024-09:34:49] [E] [TRT] parsers/onnx/ModelImporter.cpp:776: --- End node ---
Okay, so I finally got my matching versions:
Hi guys, |
Description
root@50203672e3df:/workspace/onnx# LD_PRELOAD="/workspace/out/libnvinfer_plugin.so" /usr/src/tensorrt/bin/trtexec --onnx=unet.opt.onnx --saveEngine=unet.opt.plan --minShapes=sample:2x4x32x32,encoder_hidden_states:2x77x768,controlnet_cond:2x3x256x256 --optShapes=sample:4x4x64x64,encoder_hidden_states:4x77x768,controlnet_cond:4x3x512x512 --maxShapes=sample:8x4x128x128,encoder_hidden_states:8*77x768,controlnet_cond:4x3x1024x1024
&&&& RUNNING TensorRT.trtexec [TensorRT v8503] # /usr/src/tensorrt/bin/trtexec --onnx=unet.opt.onnx --saveEngine=unet.opt.plan --minShapes=sample:2x4x32x32,encoder_hidden_states:2x77x768,controlnet_cond:2x3x256x256 --optShapes=sample:4x4x64x64,encoder_hidden_states:4x77x768,controlnet_cond:4x3x512x512 --maxShapes=sample:8x4x128x128,encoder_hidden_states:8*77x768,controlnet_cond:4x3x1024x1024
[04/14/2023-08:56:14] [I] === Model Options ===
[04/14/2023-08:56:14] [I] Format: ONNX
[04/14/2023-08:56:14] [I] Model: unet.opt.onnx
[04/14/2023-08:56:14] [I] Output:
[04/14/2023-08:56:14] [I] === Build Options ===
[04/14/2023-08:56:14] [I] Max batch: explicit batch
[04/14/2023-08:56:14] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[04/14/2023-08:56:14] [I] minTiming: 1
[04/14/2023-08:56:14] [I] avgTiming: 8
[04/14/2023-08:56:14] [I] Precision: FP32
[04/14/2023-08:56:14] [I] LayerPrecisions:
[04/14/2023-08:56:14] [I] Calibration:
[04/14/2023-08:56:14] [I] Refit: Disabled
[04/14/2023-08:56:14] [I] Sparsity: Disabled
[04/14/2023-08:56:14] [I] Safe mode: Disabled
[04/14/2023-08:56:14] [I] DirectIO mode: Disabled
[04/14/2023-08:56:14] [I] Restricted mode: Disabled
[04/14/2023-08:56:14] [I] Build only: Disabled
[04/14/2023-08:56:14] [I] Save engine: unet.opt.plan
[04/14/2023-08:56:14] [I] Load engine:
[04/14/2023-08:56:14] [I] Profiling verbosity: 0
[04/14/2023-08:56:14] [I] Tactic sources: Using default tactic sources
[04/14/2023-08:56:14] [I] timingCacheMode: local
[04/14/2023-08:56:14] [I] timingCacheFile:
[04/14/2023-08:56:14] [I] Heuristic: Disabled
[04/14/2023-08:56:14] [I] Preview Features: Use default preview flags.
[04/14/2023-08:56:14] [I] Input(s)s format: fp32:CHW
[04/14/2023-08:56:14] [I] Output(s)s format: fp32:CHW
[04/14/2023-08:56:14] [I] Input build shape: sample=2x4x32x32+4x4x64x64+8x4x128x128
[04/14/2023-08:56:14] [I] Input build shape: encoder_hidden_states=2x77x768+4x77x768+8x768
[04/14/2023-08:56:14] [I] Input build shape: controlnet_cond=2x3x256x256+4x3x512x512+4x3x1024x1024
[04/14/2023-08:56:14] [I] Input calibration shapes: model
[04/14/2023-08:56:14] [I] === System Options ===
[04/14/2023-08:56:14] [I] Device: 0
[04/14/2023-08:56:14] [I] DLACore:
[04/14/2023-08:56:14] [I] Plugins:
[04/14/2023-08:56:14] [I] === Inference Options ===
[04/14/2023-08:56:14] [I] Batch: Explicit
[04/14/2023-08:56:14] [I] Input inference shape: controlnet_cond=4x3x512x512
[04/14/2023-08:56:14] [I] Input inference shape: encoder_hidden_states=4x77x768
[04/14/2023-08:56:14] [I] Input inference shape: sample=4x4x64x64
[04/14/2023-08:56:14] [I] Iterations: 10
[04/14/2023-08:56:14] [I] Duration: 3s (+ 200ms warm up)
[04/14/2023-08:56:14] [I] Sleep time: 0ms
[04/14/2023-08:56:14] [I] Idle time: 0ms
[04/14/2023-08:56:14] [I] Streams: 1
[04/14/2023-08:56:14] [I] ExposeDMA: Disabled
[04/14/2023-08:56:14] [I] Data transfers: Enabled
[04/14/2023-08:56:14] [I] Spin-wait: Disabled
[04/14/2023-08:56:14] [I] Multithreading: Disabled
[04/14/2023-08:56:14] [I] CUDA Graph: Disabled
[04/14/2023-08:56:14] [I] Separate profiling: Disabled
[04/14/2023-08:56:14] [I] Time Deserialize: Disabled
[04/14/2023-08:56:14] [I] Time Refit: Disabled
[04/14/2023-08:56:14] [I] NVTX verbosity: 0
[04/14/2023-08:56:14] [I] Persistent Cache Ratio: 0
[04/14/2023-08:56:14] [I] Inputs:
[04/14/2023-08:56:14] [I] === Reporting Options ===
[04/14/2023-08:56:14] [I] Verbose: Disabled
[04/14/2023-08:56:14] [I] Averages: 10 inferences
[04/14/2023-08:56:14] [I] Percentiles: 90,95,99
[04/14/2023-08:56:14] [I] Dump refittable layers:Disabled
[04/14/2023-08:56:14] [I] Dump output: Disabled
[04/14/2023-08:56:14] [I] Profile: Disabled
[04/14/2023-08:56:14] [I] Export timing to JSON file:
[04/14/2023-08:56:14] [I] Export output to JSON file:
[04/14/2023-08:56:14] [I] Export profile to JSON file:
[04/14/2023-08:56:14] [I]
[04/14/2023-08:56:14] [I] === Device Information ===
[04/14/2023-08:56:14] [I] Selected Device: Tesla T4
[04/14/2023-08:56:14] [I] Compute Capability: 7.5
[04/14/2023-08:56:14] [I] SMs: 40
[04/14/2023-08:56:14] [I] Compute Clock Rate: 1.59 GHz
[04/14/2023-08:56:14] [I] Device Global Memory: 15109 MiB
[04/14/2023-08:56:14] [I] Shared Memory per SM: 64 KiB
[04/14/2023-08:56:14] [I] Memory Bus Width: 256 bits (ECC enabled)
[04/14/2023-08:56:14] [I] Memory Clock Rate: 5.001 GHz
[04/14/2023-08:56:14] [I]
[04/14/2023-08:56:14] [I] TensorRT version: 8.5.3
[04/14/2023-08:56:14] [I] [TRT] [MemUsageChange] Init CUDA: CPU +12, GPU +0, now: CPU 28, GPU 103 (MiB)
[04/14/2023-08:56:16] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +265, GPU +76, now: CPU 347, GPU 179 (MiB)
[04/14/2023-08:56:16] [I] Start parsing network model
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:604] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:81] The total number of bytes read was 723472052
[04/14/2023-08:56:16] [I] [TRT] ----------------------------------------------------------------
[04/14/2023-08:56:16] [I] [TRT] Input filename: unet.opt.onnx
[04/14/2023-08:56:16] [I] [TRT] ONNX IR version: 0.0.8
[04/14/2023-08:56:16] [I] [TRT] Opset version: 17
[04/14/2023-08:56:16] [I] [TRT] Producer name: pytorch
[04/14/2023-08:56:16] [I] [TRT] Producer version: 1.14.0
[04/14/2023-08:56:16] [I] [TRT] Domain:
[04/14/2023-08:56:16] [I] [TRT] Model version: 0
[04/14/2023-08:56:16] [I] [TRT] Doc string:
[04/14/2023-08:56:16] [I] [TRT] ----------------------------------------------------------------
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:604] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:81] The total number of bytes read was 723472052
[04/14/2023-08:56:17] [W] [TRT] onnx2trt_utils.cpp:377: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/14/2023-08:56:17] [W] [TRT] onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
[04/14/2023-08:56:17] [I] [TRT] No importer registered for op: LayerNormalization. Attempting to import as plugin.
[04/14/2023-08:56:17] [I] [TRT] Searching for plugin: LayerNormalization, plugin_version: 1, plugin_namespace:
[04/14/2023-08:56:17] [E] [TRT] ModelImporter.cpp:726: While parsing node number 293 [LayerNormalization -> "/down_blocks.0/attentions.0/transformer_blocks.0/norm1/LayerNormalization_output_0"]:
[04/14/2023-08:56:17] [E] [TRT] ModelImporter.cpp:727: --- Begin node ---
[04/14/2023-08:56:17] [E] [TRT] ModelImporter.cpp:728: input: "/down_blocks.0/attentions.0/transformer_blocks.0/norm1/Cast_output_0"
input: "onnx::LayerNormalization_4060"
input: "onnx::LayerNormalization_4059"
output: "/down_blocks.0/attentions.0/transformer_blocks.0/norm1/LayerNormalization_output_0"
name: "/down_blocks.0/attentions.0/transformer_blocks.0/norm1/LayerNormalization"
op_type: "LayerNormalization"
attribute {
name: "axis"
i: -1
type: INT
}
attribute {
name: "epsilon"
f: 1e-05
type: FLOAT
}
[04/14/2023-08:56:17] [E] [TRT] ModelImporter.cpp:729: --- End node ---
[04/14/2023-08:56:17] [E] [TRT] ModelImporter.cpp:732: ERROR: builtin_op_importers.cpp:5428 In function importFallbackPluginImporter:
[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
[04/14/2023-08:56:17] [E] Failed to parse onnx file
[04/14/2023-08:56:17] [I] Finish parsing network model
[04/14/2023-08:56:17] [E] Parsing model failed
[04/14/2023-08:56:17] [E] Failed to create engine from model or file.
[04/14/2023-08:56:17] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8503] # /usr/src/tensorrt/bin/trtexec --onnx=unet.opt.onnx --saveEngine=unet.opt.plan --minShapes=sample:2x4x32x32,encoder_hidden_states:2x77x768,controlnet_cond:2x3x256x256 --optShapes=sample:4x4x64x64,encoder_hidden_states:4x77x768,controlnet_cond:4x3x512x512 --maxShapes=sample:8x4x128x128,encoder_hidden_states:8*77x768,controlnet_cond:4x3x1024x1024
Environment
TensorRT Version: 8.6
NVIDIA GPU: T4
NVIDIA Driver Version: 450.
CUDA Version: cuda-12.0
CUDNN Version:
Operating System:
Python Version (if applicable):
Tensorflow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if so, version): nvcr.io/nvidia/pytorch:23.02-py3
Relevant Files
This is the model path:
https://drive.google.com/file/d/1b-7wg4IkErgQg8AAtRPgJjtMpqNjoWJY/view?usp=share_link
Steps To Reproduce
Run in the Docker image nvcr.io/nvidia/pytorch:23.02-py3.
Compile the TensorRT plugin and put it at /workspace/out/libnvinfer_plugin.so.
LD_PRELOAD="/workspace/out/libnvinfer_plugin.so" /usr/src/tensorrt/bin/trtexec --onnx=unet.opt.onnx --saveEngine=unet.opt.plan --minShapes=sample:2x4x32x32,encoder_hidden_states:2x77x768,controlnet_cond:2x3x256x256 --optShapes=sample:4x4x64x64,encoder_hidden_states:4x77x768,controlnet_cond:4x3x512x512 --maxShapes=sample:8x4x128x128,encoder_hidden_states:8*77x768,controlnet_cond:4x3x1024x1024