🐛 [Bug] Returning list of tensors fails when operations are applied to tensors: RuntimeError: [Error thrown at core/conversion/conversion.cpp:220] List type. Only a single tensor or a TensorList type is supported. #899

Closed
chaoz-dev opened this issue Feb 26, 2022 · 14 comments
Labels
component: core (Issues re: The core compiler) · feature request (New feature or request) · release: v1.2 (Tagged to be included in v1.2)

Comments

@chaoz-dev
Contributor

chaoz-dev commented Feb 26, 2022

Bug Description

Returning a list of tensors fails when operations are applied to the tensors before they are appended to the returned list.
Compilation succeeds if the input tensors are appended to the list directly, without any operations applied.

RuntimeError: [Error thrown at core/conversion/conversion.cpp:220] List type. Only a single tensor or a TensorList type is supported.

To Reproduce

Run the following:

  import torch    
  import torch_tensorrt as torchtrt    
      
      
  import torch_tensorrt.logging as logging    
      
  logging.set_reportable_log_level(logging.Level.Info)    
      
  torch.manual_seed(0)    
      
  DEVICE = torch.device("cuda:0")    
  SHAPE = (1, 2)    
      
      
  class Model(torch.nn.Module):    
      def __init__(self):    
          super().__init__()    
      
      def forward(self, x):    
          tensors = []    
          for i in range(3):    
              y = x + x    
              tensors.append(y)    
      
          return tensors    
      
      
  if __name__ == "__main__":    
      tensor = torch.randn(SHAPE, dtype=torch.float32, device=DEVICE)    
      
      model = Model().eval().to(DEVICE)    
      out = model(tensor)    
      print(out)    
      
      model_trt = torchtrt.compile(    
          model,    
          inputs=[    
              torchtrt.Input(shape=SHAPE),    
          ],    
          enabled_precisions={torch.float},    
      )    
      out_trt = model_trt(tensor)
      print(out_trt)    

This throws the following error:

(trtorch-1.0) ~/av-dbg/experimental/chaoz/trtorch (chaoz/trtorch-experiments) $ python index.py 
[tensor([[-1.8493, -0.8507]], device='cuda:0'), tensor([[-1.8493, -0.8507]], device='cuda:0'), tensor([[-1.8493, -0.8507]], device='cuda:0')]
INFO: [Torch-TensorRT] - ir was set to default, using TorchScript as ir
INFO: [Torch-TensorRT] - Module was provided as a torch.nn.Module, trying to script the module with torch.jit.script. In the event of a failure please preconvert your module to TorchScript
INFO: [Torch-TensorRT] - Lowered Graph: graph(%x.1 : Tensor):
  %2 : int = prim::Constant[value=1]()
  %y.1 : Tensor = aten::add(%x.1, %x.1, %2) # /home/chaoz/av-dbg/experimental/chaoz/trtorch/index.py:24:16
  %y.2 : Tensor = aten::add(%x.1, %x.1, %2) # /home/chaoz/av-dbg/experimental/chaoz/trtorch/index.py:24:16
  %y.4 : Tensor = aten::add(%x.1, %x.1, %2) # /home/chaoz/av-dbg/experimental/chaoz/trtorch/index.py:24:16
  %tensors.1 : Tensor[] = prim::ListConstruct(%y.1, %y.2, %y.4)
  return (%tensors.1)

WARNING: [Torch-TensorRT] - Cannot infer input type from calcuations in graph for input x.1. Assuming it is Float32. If not, specify input type explicity
INFO: [Torch-TensorRT] - Skipping partitioning since model is fully supported
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageChange] Init CUDA: CPU +449, GPU +0, now: CPU 3411, GPU 1873 (MiB)
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageSnapshot] Begin constructing builder kernel library: CPU 3411 MiB, GPU 1873 MiB
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageSnapshot] End constructing builder kernel library: CPU 3565 MiB, GPU 1915 MiB
INFO: [Torch-TensorRT] - Settings requested for TensorRT engine:
    Enabled Precisions: Float32 
    TF32 Floating Point Computation Enabled: 1
    Truncate Long and Double: 0
    Make Refittable Engine: 0
    Debuggable Engine: 0
    Strict Types: 0
    GPU ID: 0
    Allow GPU Fallback (if running on DLA): 0
    Min Timing Iterations: 2
    Avg Timing Iterations: 1
    Max Workspace Size: 1073741824
    Max Batch Size: Not set
    Device Type: GPU
    GPU ID: 0
    Engine Capability: standard
    Calibrator Created: 0
INFO: [Torch-TensorRT TorchScript Conversion Context] - Converting Block
INFO: [Torch-TensorRT TorchScript Conversion Context] - Adding Input x.1 (named: input_0): Input(shape: [1, 2], dtype: Float32, format: NCHW\Contiguous\Linear) in engine (conversion.AddInputs)
INFO: [Torch-TensorRT TorchScript Conversion Context] - Adding Layer %y.1 : Tensor = aten::add(%x.1, %x.1, %2) # /home/chaoz/av-dbg/experimental/chaoz/trtorch/index.py:24:16 (ctx.AddLayer)
INFO: [Torch-TensorRT TorchScript Conversion Context] - Adding Layer %y.2 : Tensor = aten::add(%x.1, %x.1, %2) # /home/chaoz/av-dbg/experimental/chaoz/trtorch/index.py:24:16 (ctx.AddLayer)
INFO: [Torch-TensorRT TorchScript Conversion Context] - Adding Layer %y.4 : Tensor = aten::add(%x.1, %x.1, %2) # /home/chaoz/av-dbg/experimental/chaoz/trtorch/index.py:24:16 (ctx.AddLayer)
Traceback (most recent call last):
  File "/home/chaoz/av-dbg/experimental/chaoz/trtorch/index.py", line 37, in <module>
    model_trt = torchtrt.compile(
  File "/home/chaoz/.anaconda3/envs/trtorch-1.0/lib/python3.9/site-packages/torch_tensorrt/_compile.py", line 97, in compile
    return torch_tensorrt.ts.compile(ts_mod, inputs=inputs, enabled_precisions=enabled_precisions, **kwargs)
  File "/home/chaoz/.anaconda3/envs/trtorch-1.0/lib/python3.9/site-packages/torch_tensorrt/ts/_compiler.py", line 119, in compile
    compiled_cpp_mod = _C.compile_graph(module._c, _parse_compile_spec(spec))
RuntimeError: [Error thrown at core/conversion/conversion.cpp:220] List type. Only a single tensor or a TensorList type is supported.

Expected behavior

The compiled graph should return a list of tensors without errors.

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • Torch-TensorRT Version (e.g. 1.0.0): 1.0
  • PyTorch Version (e.g. 1.0): 1.10.2
  • CPU Architecture: x86-64
  • OS (e.g., Linux): Ubuntu 18.04
  • How you installed PyTorch (conda, pip, libtorch, source): Conda
  • Build command you used (if compiling from source):
  • Are you using local sources or building from archives: local
  • Python version: 3.9
  • CUDA version: 11.6
  • GPU models and configuration: Nvidia A10
  • Any other relevant information:

Additional context

Note that changing the forward function to the following definition:

      def forward(self, x):    
          tensors = []    
          for i in range(3):    
              #  y = x + x    
              tensors.append(x)    
      
          return tensors    

will succeed with the following output:

(trtorch-1.0) ~/av-dbg/experimental/chaoz/trtorch (chaoz/trtorch-experiments) $ python index.py
[tensor([[-0.9247, -0.4253]], device='cuda:0'), tensor([[-0.9247, -0.4253]], device='cuda:0'), tensor([[-0.9247, -0.4253]], device='cuda:0')]
INFO: [Torch-TensorRT] - ir was set to default, using TorchScript as ir
INFO: [Torch-TensorRT] - Module was provided as a torch.nn.Module, trying to script the module with torch.jit.script. In the event of a failure please preconvert your module to TorchScript
INFO: [Torch-TensorRT] - Lowered Graph: graph(%x.1 : Tensor):
  %tensors.1 : Tensor[] = prim::ListConstruct(%x.1, %x.1, %x.1)
  return (%tensors.1)

WARNING: [Torch-TensorRT] - Cannot infer input type from calcuations in graph for input x.1. Assuming it is Float32. If not, specify input type explicity
ERROR: [Torch-TensorRT] - Method requested cannot be compiled by Torch-TensorRT.TorchScript.
There is no work to be done since the resulting compiled program will contain an engine that is empty.
This may be because there are no operators that can be added to the TensorRT graph or all operators have a resolved compile time value.

WARNING: [Torch-TensorRT] - Input type for doing shape analysis could not be determined, defaulting to F32
INFO: [Torch-TensorRT] - Partitioned Graph: []
INFO: [Torch-TensorRT] - Segmented Graph: graph(%x.1 : Tensor):
  return ()

WARNING: [Torch-TensorRT] - Didn't generate any TensorRT engines, the compiler did nothing

[tensor([[-0.9247, -0.4253]], device='cuda:0'), tensor([[-0.9247, -0.4253]], device='cuda:0'), tensor([[-0.9247, -0.4253]], device='cuda:0')]
@chaoz-dev added the "bug (Something isn't working)" label Feb 26, 2022
@chaoz-dev changed the title from "🐛 [Bug] Returning list of tensors fails when operations are applied to tensors" to "🐛 [Bug] Returning list of tensors fails when operations are applied to tensors: RuntimeError: [Error thrown at core/conversion/conversion.cpp:220] List type. Only a single tensor or a TensorList type is supported." Feb 26, 2022
@chaoz-dev
Contributor Author

May be due to #428

@narendasan
Collaborator

The reason the second example works is that you just get the original input module back: there was no tensor computation to optimize, so nothing was compiled.

WARNING: [Torch-TensorRT] - Didn't generate any TensorRT engines, the compiler did nothing

@peri044 @inocsin Where did we end up on handling tensor lists?

@narendasan
Collaborator

Ah, I think I understand what is happening. In cases like append, where PyTorch operates on groupings of tensors that have already been converted, we use a container class so that PyTorch and TensorRT can track the data. It looks like we end up with a list of these container classes instead of just ITensors or Tensors, so the PyTorch type system treats it as a generic List and not a TensorList.
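
One possible way to sidestep this while list-typed outputs are unsupported, assuming the number of outputs is fixed: keep the element-wise ops but return a single stacked tensor (or a fixed-size tuple) instead of building a Python list. This is only a sketch of a workaround and has not been verified against this Torch-TensorRT version:

  import torch

  class Model(torch.nn.Module):
      def forward(self, x):
          y0 = x + x
          y1 = x + x
          y2 = x + x
          # Return one stacked tensor instead of a Python list of tensors,
          # so the lowered graph never produces a List-typed output.
          return torch.stack([y0, y1, y2], dim=0)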

@narendasan
Collaborator

@peri044 I think we decided for this case to wait for collections right?

@chaoz-dev
Contributor Author

Ah, I think I understand what is happening. In cases like append, where PyTorch operates on groupings of tensors that have already been converted, we use a container class so that PyTorch and TensorRT can track the data. It looks like we end up with a list of these container classes instead of just ITensors or Tensors, so the PyTorch type system treats it as a generic List and not a TensorList.

Ah gotcha, yeah this makes sense, especially given the printed output:

List type. Only a single tensor or a TensorList type is supported.

@ashafaei

ashafaei commented Mar 2, 2022

I have a slightly different looking setup that triggers a similar problem. Is there a workaround?

# tuple.py
import torch
import torch.nn as nn
import torch_tensorrt as torchtrt


import torch_tensorrt.logging as torchtrt_logging

torchtrt_logging.set_reportable_log_level(torchtrt_logging.Level.Info)

torch.manual_seed(0)

DEVICE = torch.device("cuda:0")
SHAPE = (1, 3, 100, 100)


class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.seq1 = nn.Sequential(nn.Conv2d(3, 6, 3, 1, 1), nn.ReLU(), nn.MaxPool2d(2, 2))
        self.seq2 = nn.Sequential(nn.Conv2d(6, 6, 3, 1, 1), nn.ReLU(), nn.MaxPool2d(2, 2))
        self.seq3 = nn.Sequential(nn.Conv2d(6, 6, 3, 1, 1), nn.ReLU(), nn.MaxPool2d(2, 2))

    def forward(self, x):
        a = self.seq1(x)
        b = self.seq2(a)
        c = self.seq3(b)

        ## It works fine without this
        a1, a2 = b.split([5, 1], dim=1)
        a1 = a1.max(dim=1, keepdim=True)[0]
        b = torch.cat([a1, a2], dim=1)
        ## ^

        return (b, c)


if __name__ == "__main__":
    tensor = torch.randn(SHAPE, dtype=torch.float32, device=DEVICE)

    model = Model().eval().to(DEVICE)
    out = model(tensor)
    print(f"Model: {out}")

    model_trt = torchtrt.compile(
        model,
        inputs=[
            torchtrt.Input(shape=SHAPE),
        ],
        enabled_precisions={torch.float},
    )
    out_trt = model_trt(tensor)
    print(f"Model TRT: {out_trt}")

    assert torch.max(torch.abs(out[0] - out_trt[0])) < 1e-6
    assert torch.max(torch.abs(out[1] - out_trt[1])) < 1e-6

When I remove the marked part of the code (between the ## comments), the constructed graph correctly builds a TupleConstruct like this:

  %5 : Tensor, %6 : Tensor = prim::ListUnpack(%4)
  %7 : (Tensor, Tensor) = prim::TupleConstruct(%5, %6)
  return (%7)

But when that part is active, I get this error

RuntimeError: Method (but not graphs in general) require a single output. Use None/Tuple for 0 or 2+ outputs

and the constructed graph is wrong

  %13 : Tensor[] = prim::ListConstruct(%5, %10, %7)
  %14 : Tensor[] = tensorrt::execute_engine(%13, %__torch___Model_trt_engine_0x55a84c97b750)
  %15 : Tensor, %16 : Tensor = prim::ListUnpack(%14)
  return (%15, %16)

Is there a way to manually remove the ListUnpack from the generated code? Or is there a workaround to make this work?

@ashafaei

ashafaei commented Mar 2, 2022

So I was looking around in the built model_trt to see if I could either manually remove the ListUnpack node or add a TupleConstruct. It looks like I got lucky and found an interesting function that addresses this problem.

    model_trt.graph.makeMultiOutputIntoTuple()

If you call it right after compilation, the graph becomes the following and it works:

%14 : Tensor[] = tensorrt::execute_engine(%13, %__torch___Model_trt_engine_0x55b2ee48a140)
%15 : Tensor, %16 : Tensor = prim::ListUnpack(%14)
%17 : (Tensor, Tensor) = prim::TupleConstruct(%15, %16)
return (%17)

It's better not to have the ListUnpack to begin with, but this at least works. If you guys know how I can remove a node from the built graph, please do share.
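
For concreteness, a minimal sketch of how this workaround slots into a compile flow, reusing model, SHAPE, and tensor from the repro scripts above and assuming the compiled module exposes its TorchScript graph as model_trt.graph:

  model_trt = torchtrt.compile(
      model,
      inputs=[torchtrt.Input(shape=SHAPE)],
      enabled_precisions={torch.float},
  )

  # Wrap the graph's multiple outputs in a TupleConstruct so the scripted
  # forward() satisfies the single-output requirement.
  model_trt.graph.makeMultiOutputIntoTuple()

  out_trt = model_trt(tensor)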

@pupumao

pupumao commented Mar 11, 2022

I'm hitting the same problem. Is there any way to avoid this bug?

@ashafaei

Just call

model_trt.graph.makeMultiOutputIntoTuple()

after compilation and it'll work.

@pupumao

pupumao commented Mar 14, 2022

@ashafaei
Thanks very much for your help.
I hit this problem when calling:

tensorrt_engine_model = torch_tensorrt.ts.convert_method_to_trt_engine(traced_model, "forward", **compile_settings)

This is the error:

Traceback (most recent call last):
  File "model_converter.py", line 251, in <module>
    engine = get_engine(model_info.trt_engine_path, calib, int8_mode=int8_mode, optimize_params=optimize_params)
  File "model_converter.py", line 171, in get_engine
    return build_engine(max_batch_size)
  File "model_converter.py", line 93, in build_engine
    return build_engine_from_jit(max_batch_size)
  File "model_converter.py", line 77, in build_engine_from_jit
    tensorrt_engine_model = torch_tensorrt.ts.convert_method_to_trt_engine(traced_model, "forward", **compile_settings)
  File "/usr/local/lib/python3.6/dist-packages/torch_tensorrt/ts/_compiler.py", line 211, in convert_method_to_trt_engine
    return _C.convert_graph_to_trt_engine(module._c, method_name, _parse_compile_spec(compile_spec))
RuntimeError: [Error thrown at core/conversion/conversion.cpp:220] List type. Only a single tensor or a TensorList type is supported.

How can I call the function model_trt.graph.makeMultiOutputIntoTuple() in this case?

@ashafaei

@pupumao Are you getting this error during compilation? The workaround I shared is for a different problem: the model compiles fine, but then throws this error during inference. You may have better luck compiling the nn.Module directly, like the sample code I shared above.
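
A minimal sketch of that suggestion, assuming a Model class and SHAPE as in the earlier examples; it compiles the nn.Module directly with torch_tensorrt.compile instead of converting a traced module to a standalone TensorRT engine:

  import torch
  import torch_tensorrt as torchtrt

  model = Model().eval().to(torch.device("cuda:0"))

  # Compile the nn.Module directly; Torch-TensorRT scripts it and returns a
  # TorchScript module that wraps the generated TensorRT engine(s).
  model_trt = torchtrt.compile(
      model,
      inputs=[torchtrt.Input(shape=SHAPE)],
      enabled_precisions={torch.float},
  )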

@narendasan added the "component: core (Issues re: The core compiler)" label May 18, 2022
@narendasan self-assigned this May 18, 2022
@narendasan added the "feature request (New feature or request)" label and removed the "bug (Something isn't working)" label May 24, 2022
@ncomly-nvidia
Contributor

ncomly-nvidia commented Jul 26, 2022

@peri044 I think we decided for this case to wait for collections right?

@narendasan is this case covered and tested by the collections support for v1.2?

@narendasan
Collaborator

Yes, I tested this and it works.

@ncomly-nvidia
Contributor

Closing. @chaoz-dev please reopen / file a new issue if there is still a problem.

@ncomly-nvidia added the "release: v1.2 (Tagged to be included in v1.2)" label Aug 12, 2022