🐛 [Bug] Returning list of tensors fails when operations are applied to tensors: RuntimeError: [Error thrown at core/conversion/conversion.cpp:220] List type. Only a single tensor or a TensorList type is supported. #899

Closed
chaoz-dev opened this issue Feb 26, 2022 · 14 comments
Labels
component: core (Issues re: The core compiler) · feature request (New feature or request) · release: v1.2 (Tagged to be included in v1.2)

Comments

@chaoz-dev
Contributor

chaoz-dev commented Feb 26, 2022

Bug Description

Returning a list of tensors fails when operations are applied to the tensors before they are appended to the returned list.
Compilation succeeds if the input tensors are appended to the list directly, without any operations applied.

RuntimeError: [Error thrown at core/conversion/conversion.cpp:220] List type. Only a single tensor or a TensorList type is supported.

To Reproduce

Run the following:

  import torch    
  import torch_tensorrt as torchtrt    
      
      
  import torch_tensorrt.logging as logging    
      
  logging.set_reportable_log_level(logging.Level.Info)    
      
  torch.manual_seed(0)    
      
  DEVICE = torch.device("cuda:0")    
  SHAPE = (1, 2)    
      
      
  class Model(torch.nn.Module):    
      def __init__(self):    
          super().__init__()    
      
      def forward(self, x):    
          tensors = []    
          for i in range(3):    
              y = x + x    
              tensors.append(y)    
      
          return tensors    
      
      
  if __name__ == "__main__":    
      tensor = torch.randn(SHAPE, dtype=torch.float32, device=DEVICE)    
      
      model = Model().eval().to(DEVICE)    
      out = model(tensor)    
      print(out)    
      
      model_trt = torchtrt.compile(    
          model,    
          inputs=[    
              torchtrt.Input(shape=SHAPE),    
          ],    
          enabled_precisions={torch.float},    
      )    
      out_trt = model_trt(tensor)
      print(out_trt)    

This throws the following error:

(trtorch-1.0) ~/av-dbg/experimental/chaoz/trtorch (chaoz/trtorch-experiments) $ python index.py 
[tensor([[-1.8493, -0.8507]], device='cuda:0'), tensor([[-1.8493, -0.8507]], device='cuda:0'), tensor([[-1.8493, -0.8507]], device='cuda:0')]
INFO: [Torch-TensorRT] - ir was set to default, using TorchScript as ir
INFO: [Torch-TensorRT] - Module was provided as a torch.nn.Module, trying to script the module with torch.jit.script. In the event of a failure please preconvert your module to TorchScript
INFO: [Torch-TensorRT] - Lowered Graph: graph(%x.1 : Tensor):
  %2 : int = prim::Constant[value=1]()
  %y.1 : Tensor = aten::add(%x.1, %x.1, %2) # /home/chaoz/av-dbg/experimental/chaoz/trtorch/index.py:24:16
  %y.2 : Tensor = aten::add(%x.1, %x.1, %2) # /home/chaoz/av-dbg/experimental/chaoz/trtorch/index.py:24:16
  %y.4 : Tensor = aten::add(%x.1, %x.1, %2) # /home/chaoz/av-dbg/experimental/chaoz/trtorch/index.py:24:16
  %tensors.1 : Tensor[] = prim::ListConstruct(%y.1, %y.2, %y.4)
  return (%tensors.1)

WARNING: [Torch-TensorRT] - Cannot infer input type from calcuations in graph for input x.1. Assuming it is Float32. If not, specify input type explicity
INFO: [Torch-TensorRT] - Skipping partitioning since model is fully supported
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageChange] Init CUDA: CPU +449, GPU +0, now: CPU 3411, GPU 1873 (MiB)
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageSnapshot] Begin constructing builder kernel library: CPU 3411 MiB, GPU 1873 MiB
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageSnapshot] End constructing builder kernel library: CPU 3565 MiB, GPU 1915 MiB
INFO: [Torch-TensorRT] - Settings requested for TensorRT engine:
    Enabled Precisions: Float32 
    TF32 Floating Point Computation Enabled: 1
    Truncate Long and Double: 0
    Make Refittable Engine: 0
    Debuggable Engine: 0
    Strict Types: 0
    GPU ID: 0
    Allow GPU Fallback (if running on DLA): 0
    Min Timing Iterations: 2
    Avg Timing Iterations: 1
    Max Workspace Size: 1073741824
    Max Batch Size: Not set
    Device Type: GPU
    GPU ID: 0
    Engine Capability: standard
    Calibrator Created: 0
INFO: [Torch-TensorRT TorchScript Conversion Context] - Converting Block
INFO: [Torch-TensorRT TorchScript Conversion Context] - Adding Input x.1 (named: input_0): Input(shape: [1, 2], dtype: Float32, format: NCHW\Contiguous\Linear) in engine (conversion.AddInputs)
INFO: [Torch-TensorRT TorchScript Conversion Context] - Adding Layer %y.1 : Tensor = aten::add(%x.1, %x.1, %2) # /home/chaoz/av-dbg/experimental/chaoz/trtorch/index.py:24:16 (ctx.AddLayer)
INFO: [Torch-TensorRT TorchScript Conversion Context] - Adding Layer %y.2 : Tensor = aten::add(%x.1, %x.1, %2) # /home/chaoz/av-dbg/experimental/chaoz/trtorch/index.py:24:16 (ctx.AddLayer)
INFO: [Torch-TensorRT TorchScript Conversion Context] - Adding Layer %y.4 : Tensor = aten::add(%x.1, %x.1, %2) # /home/chaoz/av-dbg/experimental/chaoz/trtorch/index.py:24:16 (ctx.AddLayer)
Traceback (most recent call last):
  File "/home/chaoz/av-dbg/experimental/chaoz/trtorch/index.py", line 37, in <module>
    model_trt = torchtrt.compile(
  File "/home/chaoz/.anaconda3/envs/trtorch-1.0/lib/python3.9/site-packages/torch_tensorrt/_compile.py", line 97, in compile
    return torch_tensorrt.ts.compile(ts_mod, inputs=inputs, enabled_precisions=enabled_precisions, **kwargs)
  File "/home/chaoz/.anaconda3/envs/trtorch-1.0/lib/python3.9/site-packages/torch_tensorrt/ts/_compiler.py", line 119, in compile
    compiled_cpp_mod = _C.compile_graph(module._c, _parse_compile_spec(spec))
RuntimeError: [Error thrown at core/conversion/conversion.cpp:220] List type. Only a single tensor or a TensorList type is supported.

Expected behavior

The compiled graph should return a list of tensors without errors.

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • Torch-TensorRT Version (e.g. 1.0.0): 1.0
  • PyTorch Version (e.g. 1.0): 1.10.2
  • CPU Architecture: x86-64
  • OS (e.g., Linux): Ubuntu 18.04
  • How you installed PyTorch (conda, pip, libtorch, source): Conda
  • Build command you used (if compiling from source):
  • Are you using local sources or building from archives: local
  • Python version: 3.9
  • CUDA version: 11.6
  • GPU models and configuration: Nvidia A10
  • Any other relevant information:

Additional context

Note that changing the forward function to the following definition:

      def forward(self, x):    
          tensors = []    
          for i in range(3):    
              #  y = x + x    
              tensors.append(x)    
      
          return tensors    

will succeed with the following output:

(trtorch-1.0) ~/av-dbg/experimental/chaoz/trtorch (chaoz/trtorch-experiments) $ python index.py
[tensor([[-0.9247, -0.4253]], device='cuda:0'), tensor([[-0.9247, -0.4253]], device='cuda:0'), tensor([[-0.9247, -0.4253]], device='cuda:0')]
INFO: [Torch-TensorRT] - ir was set to default, using TorchScript as ir
INFO: [Torch-TensorRT] - Module was provided as a torch.nn.Module, trying to script the module with torch.jit.script. In the event of a failure please preconvert your module to TorchScript
INFO: [Torch-TensorRT] - Lowered Graph: graph(%x.1 : Tensor):
  %tensors.1 : Tensor[] = prim::ListConstruct(%x.1, %x.1, %x.1)
  return (%tensors.1)

WARNING: [Torch-TensorRT] - Cannot infer input type from calcuations in graph for input x.1. Assuming it is Float32. If not, specify input type explicity
ERROR: [Torch-TensorRT] - Method requested cannot be compiled by Torch-TensorRT.TorchScript.
There is no work to be done since the resulting compiled program will contain an engine that is empty.
This may be because there are no operators that can be added to the TensorRT graph or all operators have a resolved compile time value.

WARNING: [Torch-TensorRT] - Input type for doing shape analysis could not be determined, defaulting to F32
INFO: [Torch-TensorRT] - Partitioned Graph: []
INFO: [Torch-TensorRT] - Segmented Graph: graph(%x.1 : Tensor):
  return ()

WARNING: [Torch-TensorRT] - Didn't generate any TensorRT engines, the compiler did nothing

[tensor([[-0.9247, -0.4253]], device='cuda:0'), tensor([[-0.9247, -0.4253]], device='cuda:0'), tensor([[-0.9247, -0.4253]], device='cuda:0')]
@chaoz-dev added the "bug (Something isn't working)" label Feb 26, 2022
@chaoz-dev changed the title from "🐛 [Bug] Returning list of tensors fails when operations are applied to tensors" to "🐛 [Bug] Returning list of tensors fails when operations are applied to tensors: RuntimeError: [Error thrown at core/conversion/conversion.cpp:220] List type. Only a single tensor or a TensorList type is supported." Feb 26, 2022
@chaoz-dev
Contributor Author

May be due to #428

@narendasan
Collaborator

The reason the second example works is that you just get the original input module back: there was no tensor computation to optimize, so nothing was compiled.

WARNING: [Torch-TensorRT] - Didn't generate any TensorRT engines, the compiler did nothing

@peri044 @inocsin Where did we end up on handling tensor lists?

@narendasan
Collaborator

Ah, I think I understand what is happening. In cases like append, where PyTorch operates on groupings of tensors that have already been converted, we use a container class so that PyTorch and TensorRT can track the data. It looks like we end up with a list of these container classes instead of just ITensors or Tensors, so the PyTorch type system treats it as a generic List and not a TensorList.
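
One possible way to sidestep this while list-typed outputs are unsupported, assuming the number of outputs is fixed: keep the element-wise ops but return a single stacked tensor (or a fixed-size tuple) instead of building a Python list. This is only a sketch of a workaround and has not been verified against this Torch-TensorRT version:

  import torch

  class Model(torch.nn.Module):
      def forward(self, x):
          y0 = x + x
          y1 = x + x
          y2 = x + x
          # Return one stacked tensor instead of a Python list of tensors,
          # so the lowered graph never produces a List-typed output.
          return torch.stack([y0, y1, y2], dim=0)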

@narendasan
Collaborator

@peri044 I think we decided for this case to wait for collections right?

@chaoz-dev
Contributor Author

Ah, I think I understand what is happening. In cases like append, where PyTorch operates on groupings of tensors that have already been converted, we use a container class so that PyTorch and TensorRT can track the data. It looks like we end up with a list of these container classes instead of just ITensors or Tensors, so the PyTorch type system treats it as a generic List and not a TensorList.

Ah gotcha, yeah this makes sense, especially given the printed output:

List type. Only a single tensor or a TensorList type is supported.

@ashafaei

ashafaei commented Mar 2, 2022

I have a slightly different looking setup that triggers a similar problem. Is there a workaround?

# tuple.py
import torch
import torch.nn as nn
import torch_tensorrt as torchtrt


import torch_tensorrt.logging as torchtrt_logging

torchtrt_logging.set_reportable_log_level(torchtrt_logging.Level.Info)

torch.manual_seed(0)

DEVICE = torch.device("cuda:0")
SHAPE = (1, 3, 100, 100)


class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.seq1 = nn.Sequential(nn.Conv2d(3, 6, 3, 1, 1), nn.ReLU(), nn.MaxPool2d(2, 2))
        self.seq2 = nn.Sequential(nn.Conv2d(6, 6, 3, 1, 1), nn.ReLU(), nn.MaxPool2d(2, 2))
        self.seq3 = nn.Sequential(nn.Conv2d(6, 6, 3, 1, 1), nn.ReLU(), nn.MaxPool2d(2, 2))

    def forward(self, x):
        a = self.seq1(x)
        b = self.seq2(a)
        c = self.seq3(b)

        ## It works fine without this
        a1, a2 = b.split([5, 1], dim=1)
        a1 = a1.max(dim=1, keepdim=True)[0]
        b = torch.cat([a1, a2], dim=1)
        ## ^

        return (b, c)


if __name__ == "__main__":
    tensor = torch.randn(SHAPE, dtype=torch.float32, device=DEVICE)

    model = Model().eval().to(DEVICE)
    out = model(tensor)
    print(f"Model: {out}")

    model_trt = torchtrt.compile(
        model,
        inputs=[
            torchtrt.Input(shape=SHAPE),
        ],
        enabled_precisions={torch.float},
    )
    out_trt = model_trt(tensor)
    print(f"Model TRT: {out_trt}")

    assert torch.max(torch.abs(out[0] - out_trt[0])) < 1e-6
    assert torch.max(torch.abs(out[1] - out_trt[1])) < 1e-6

When I remove the marked part of the code (between the ## comments), the constructed graph correctly builds a TupleConstruct like this:

  %5 : Tensor, %6 : Tensor = prim::ListUnpack(%4)
  %7 : (Tensor, Tensor) = prim::TupleConstruct(%5, %6)
  return (%7)

But when that part is active, I get this error

RuntimeError: Method (but not graphs in general) require a single output. Use None/Tuple for 0 or 2+ outputs

and the constructed graph is wrong

  %13 : Tensor[] = prim::ListConstruct(%5, %10, %7)
  %14 : Tensor[] = tensorrt::execute_engine(%13, %__torch___Model_trt_engine_0x55a84c97b750)
  %15 : Tensor, %16 : Tensor = prim::ListUnpack(%14)
  return (%15, %16)

Is there a way to manually remove the ListUnpack from the generated code? Or is there a workaround to make this work?

@ashafaei

ashafaei commented Mar 2, 2022

So I was looking around in the built model_trt to see if I could either manually remove the ListUnpack node or add a TupleConstruct. It looks like I got lucky and found an interesting function that addresses this problem.

    model_trt.graph.makeMultiOutputIntoTuple()

If you call it right after compilation, the graph becomes the following and it works:

%14 : Tensor[] = tensorrt::execute_engine(%13, %__torch___Model_trt_engine_0x55b2ee48a140)
%15 : Tensor, %16 : Tensor = prim::ListUnpack(%14)
%17 : (Tensor, Tensor) = prim::TupleConstruct(%15, %16)
return (%17)

It's better not to have the ListUnpack to begin with, but this at least works. If you guys know how I can remove a node from the built graph, please do share.
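
For concreteness, a minimal sketch of how this workaround slots into a compile flow, reusing model, SHAPE, and tensor from the repro scripts above and assuming the compiled module exposes its TorchScript graph as model_trt.graph:

  model_trt = torchtrt.compile(
      model,
      inputs=[torchtrt.Input(shape=SHAPE)],
      enabled_precisions={torch.float},
  )

  # Wrap the graph's multiple outputs in a TupleConstruct so the scripted
  # forward() satisfies the single-output requirement.
  model_trt.graph.makeMultiOutputIntoTuple()

  out_trt = model_trt(tensor)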

@pupumao

pupumao commented Mar 11, 2022

I'm hitting the same problem. Is there any way to avoid this bug?

@ashafaei

Just call

model_trt.graph.makeMultiOutputIntoTuple()

after compilation and it'll work.

@pupumao

pupumao commented Mar 14, 2022

@ashafaei
Thanks very much for your help.
I hit this problem when calling:

tensorrt_engine_model = torch_tensorrt.ts.convert_method_to_trt_engine(traced_model, "forward", **compile_settings)

This is the error:

Traceback (most recent call last):
  File "model_converter.py", line 251, in <module>
    engine = get_engine(model_info.trt_engine_path, calib, int8_mode=int8_mode, optimize_params=optimize_params)
  File "model_converter.py", line 171, in get_engine
    return build_engine(max_batch_size)
  File "model_converter.py", line 93, in build_engine
    return build_engine_from_jit(max_batch_size)
  File "model_converter.py", line 77, in build_engine_from_jit
    tensorrt_engine_model = torch_tensorrt.ts.convert_method_to_trt_engine(traced_model, "forward", **compile_settings)
  File "/usr/local/lib/python3.6/dist-packages/torch_tensorrt/ts/_compiler.py", line 211, in convert_method_to_trt_engine
    return _C.convert_graph_to_trt_engine(module._c, method_name, _parse_compile_spec(compile_spec))
RuntimeError: [Error thrown at core/conversion/conversion.cpp:220] List type. Only a single tensor or a TensorList type is supported.

How can I call the function model_trt.graph.makeMultiOutputIntoTuple() in this case?

@ashafaei

@pupumao Are you getting this error during compilation? The workaround I shared is for a different problem: the model compiles fine, but then throws this error during inference. You may have better luck compiling the nn.Module directly, like the sample code I shared above.
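
A minimal sketch of that suggestion, assuming a Model class and SHAPE as in the earlier examples; it compiles the nn.Module directly with torch_tensorrt.compile instead of converting a traced module to a standalone TensorRT engine:

  import torch
  import torch_tensorrt as torchtrt

  model = Model().eval().to(torch.device("cuda:0"))

  # Compile the nn.Module directly; Torch-TensorRT scripts it and returns a
  # TorchScript module that wraps the generated TensorRT engine(s).
  model_trt = torchtrt.compile(
      model,
      inputs=[torchtrt.Input(shape=SHAPE)],
      enabled_precisions={torch.float},
  )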

@narendasan added the "component: core (Issues re: The core compiler)" label May 18, 2022
@narendasan self-assigned this May 18, 2022
@narendasan added the "feature request (New feature or request)" label and removed the "bug (Something isn't working)" label May 24, 2022
@ncomly-nvidia
Contributor

ncomly-nvidia commented Jul 26, 2022

@peri044 I think we decided for this case to wait for collections right?

@narendasan is this case covered and tested by the collections support for v1.2?

@narendasan
Collaborator

Yes, I tested this and it works.

@ncomly-nvidia
Contributor

Closing. @chaoz-dev please reopen / file a new issue if there is still a problem.

@ncomly-nvidia added the "release: v1.2 (Tagged to be included in v1.2)" label Aug 12, 2022