Support Unbounded Dynamism for torch.export #6393

Closed
lsy323 opened this issue Jan 26, 2024 · 7 comments
Labels: dynamism Dynamic Shape Features

lsy323 commented Jan 26, 2024

🚀 Feature

The unbounded dynamic shape needs to be propagated through ops.

Scope:

  • Only for the export use case.

Example:

opcode         name           target                   args                     kwargs
-------------  -------------  -----------------------  -----------------------  --------
placeholder    arg0_1         arg0_1                   ()                       {}
placeholder    l_embeddings_  l_embeddings_            ()                       {}
call_function  sym_size_int   aten.sym_size.int        (l_embeddings_, 0)       {}
call_function  mul            <built-in function mul>  (sym_size_int, 2)        {}
call_function  expand         aten.expand.default      (arg0_1, [mul, -1, -1])  {}
output         output         output                   ((expand,),)             {}

In LazyIR, we need to capture aten.sym_size.int and the subsequent arithmetic operations on the SymInt, so that the semantics can be lowered.
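
A graph of this shape can be produced with torch.export's dynamic-shapes API. A minimal sketch (the module below is hypothetical, chosen to match the shapes in the HLO example that follows):

import torch
from torch.export import Dim, export

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # (1, 1, 768) embedding, expanded to twice the input batch size.
        self.embeddings = torch.nn.Parameter(torch.randn(1, 1, 768))

    def forward(self, x):
        return self.embeddings.expand(x.shape[0] * 2, -1, -1)

x = torch.randn(4, 3, 224, 224)
ep = export(M(), (x,), dynamic_shapes={"x": {0: Dim("batch")}})
print(ep.graph_module.graph)  # contains sym_size.int -> mul -> expand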

In the lowered HLO graph, we should have something like

main(%arg0: tensor<?x3x224x224xf32>, %arg1: tensor<1x1x768xf32>) -> tensor<?x1x768xf32> {
    %1 = get_dimension_size(%arg0, dim = 0) 
    %2 = expand(%arg1, %1, dim=0)
    return %2 : tensor<?x1x768xf32>
}

Rough Plan

  • We need to trace and lower torch ops with SymInt output in LTC (e.g. aten.sym_size.int, which produces a SymInt).
  • The arithmetic on the SymInt needs to be traced. This shouldn't be hard to achieve if a corresponding LazyIR node can be created for the SymInt; see the sketch after this list.
  • When the SymInt version of an op is lowered, it needs to retrieve the underlying LazyIR node of the SymInt argument.
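
A hypothetical Python sketch of this plan (all names are made up for illustration, not torch_xla APIs): the SymInt produced by sym_size is backed by a lazy IR node, and arithmetic on it builds further IR nodes instead of collapsing to a concrete int.

from dataclasses import dataclass

@dataclass(frozen=True)
class SizeNode:
    # Would lower to get_dimension_size(%tensor, dim) in HLO.
    tensor: str
    dim: int

@dataclass(frozen=True)
class MulNode:
    # Would lower to a scalar multiply on the runtime size.
    operand: object
    scalar: int

def sym_size(tensor: str, dim: int) -> SizeNode:
    return SizeNode(tensor, dim)

def sym_mul(operand, scalar: int) -> MulNode:
    return MulNode(operand, scalar)

# Tracing `arg0.shape[0] * 2` builds MulNode(SizeNode("arg0", 0), 2);
# a SymInt-taking expand can then consume this node as its dynamic
# output dimension and lower it along with the rest of the graph.
dyn_dim = sym_mul(sym_size("arg0", 0), 2)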

Open questions

  1. Would it make sense to handle both bounded and unbounded dynamism under the same workflow/infra? The source of the dynamic dim needs to be traced in the unbounded case, but not in the bounded case.

Example

dynamic_dim = input.shape[0]
dynamic_dim = dynamic_dim * 2
expanded = input.expand([dynamic_dim, -1, -1])

Let's say input.shape[0] has an upper bound of 5. With bounded dynamism, knowing only the upper bound is enough to lower the op. With unbounded dynamism, the arithmetic on the SymInt needs to be traced and lowered in LTC.
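
For comparison, torch.export expresses both flavors through Dim. A minimal sketch (module M is hypothetical):

import torch
from torch.export import Dim, export

class M(torch.nn.Module):
    def forward(self, x):
        # Arithmetic on the symbolic batch size, as in the example above.
        return x.new_ones(x.shape[0] * 2, 3)

x = torch.randn(4, 3)

# Bounded: an explicit upper bound; a backend could lower this using
# only the bound, without tracing where the size came from.
bounded = Dim("batch", min=1, max=5)

# Unbounded: no explicit upper bound; the lowered graph must fetch the
# size at runtime (e.g. get_dimension_size) and lower the `* 2` as well.
unbounded = Dim("batch")

ep = export(M(), (x,), dynamic_shapes={"x": {0: unbounded}})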

  2. Not sure if there is any API to create an unbounded dynamic tensor, so the graph can be traced with it.
@lsy323 lsy323 added the dynamism Dynamic Shape Features label Jan 26, 2024
@JackCaoG

@ezyang We are trying to support unbounded dynamic shapes only for export.

The exported FX graph is dynamic and has a sym_size op that extracts the size of a tensor and stores it in a SymInt. During LTC tracing we are using static tensors (not sure if there is a way to create a tensor with an unbounded dynamic dimension and trace with it). In this process all tensors are static from PyTorch's perspective, so it will try to collapse the SymInt into a concrete integer instead of treating the size as symbolic. What's the correct way to make PyTorch always treat the SymInt as unbacked and not collapse it? We want to lower the SymInt to an IR node in the final HLO.

ezyang commented Jan 29, 2024

Are you dead set on lazy tensor for this, or can you be flexible?

Because if you can be flexible, I would tell you to use torch.export and you'd get a graph with non-collapsed dynamic dimensions and size compute for whatever you needed to be dynamic. And then life is good. To get it to work with LTC... ugh, I don't even wanna think about it lol.

@JackCaoG

@ezyang we are already using torch.export. The issue is that we need to lower the exported FX graph to HLO. I am wondering how Inductor handles this: does Inductor support unbounded dynamism?

ezyang commented Jan 29, 2024

Inductor does just handle it! For example, when generating kernels for pointwise operations, we support symbolic ranges. We can also deal with symbolic indexing formulas and use sympy to simplify them.
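
As a standalone illustration of that kind of simplification (plain sympy, not Inductor's actual code):

import sympy

s0 = sympy.Symbol("s0", integer=True, positive=True)

# A symbolic indexing formula over a dynamic size s0.
expr = (s0**2 + s0) / s0
print(sympy.simplify(expr))  # -> s0 + 1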

lsy323 commented Feb 5, 2024

cc @miladm @vanbasten23

lsy323 commented Mar 20, 2024

#6653 provides support, with limitations, for this use case by applying an FX pass that groups the sym_size and the following op into a single XLA op; the fused op is then lowered with dynamism semantics in torch_xla. The limitation is that if there is arithmetic on the SymInt in between, the FX pass is not flexible enough to capture those operations.
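
A hypothetical sketch of the kind of FX pass described above (xla_dynamic_expand is a made-up stand-in target, not the op #6653 actually introduces):

import torch
from torch.fx import GraphModule, Node

def xla_dynamic_expand(tensor, sizes):
    # Stand-in for the fused op; a real lowering would emit HLO with
    # dynamism semantics instead of calling expand eagerly.
    return tensor.expand(sizes)

def fuse_sym_size_expand(gm: GraphModule) -> GraphModule:
    for node in list(gm.graph.nodes):
        if node.target != torch.ops.aten.expand.default:
            continue
        tensor, sizes = node.args[0], node.args[1]
        # Only rewrite expands whose size list contains a sym_size result.
        if not any(
            isinstance(s, Node) and s.target == torch.ops.aten.sym_size.int
            for s in sizes
        ):
            continue
        with gm.graph.inserting_after(node):
            fused = gm.graph.call_function(
                xla_dynamic_expand, (tensor, tuple(sizes))
            )
        node.replace_all_uses_with(fused)
        gm.graph.erase_node(node)
    gm.graph.lint()
    gm.recompile()
    return gm

Note that a pattern match like this only sees a sym_size feeding the op directly; intermediate arithmetic on the SymInt (the mul in the graph at the top of this issue) is exactly what makes the pass brittle.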

@lsy323 lsy323 reopened this Sep 17, 2024
@lsy323 lsy323 closed this as completed Sep 17, 2024
lsy323 commented Sep 17, 2024
