Support Unbounded Dynamism for torch.export #6393

Closed
lsy323 opened this issue Jan 26, 2024 · 7 comments
Labels: dynamism Dynamic Shape Features

lsy323 commented Jan 26, 2024

🚀 Feature

The unbounded dynamic shape needs to be propagated through ops.

Scope:

  • Only for the export use case.

Example:

opcode         name           target                   args                     kwargs
-------------  -------------  -----------------------  -----------------------  --------
placeholder    arg0_1         arg0_1                   ()                       {}
placeholder    l_embeddings_  l_embeddings_            ()                       {}
call_function  sym_size_int   aten.sym_size.int        (l_embeddings_, 0)       {}
call_function  mul            <built-in function mul>  (sym_size_int, 2)        {}
call_function  expand         aten.expand.default      (arg0_1, [mul, -1, -1])  {}
output         output         output                   ((expand,),)             {}

In LazyIR, we need to capture aten.sym_size.int and the subsequent arithmetic operations on the SymInt, so that the semantics can be lowered.
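
A graph of this shape can be produced with torch.export's dynamic-shapes API. A minimal sketch (the module below is hypothetical, chosen to match the shapes in the HLO example that follows):

import torch
from torch.export import Dim, export

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # (1, 1, 768) embedding, expanded to twice the input batch size.
        self.embeddings = torch.nn.Parameter(torch.randn(1, 1, 768))

    def forward(self, x):
        return self.embeddings.expand(x.shape[0] * 2, -1, -1)

x = torch.randn(4, 3, 224, 224)
ep = export(M(), (x,), dynamic_shapes={"x": {0: Dim("batch")}})
print(ep.graph_module.graph)  # contains sym_size.int -> mul -> expand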

In the lowered HLO graph, we should have something like

main(%arg0: tensor<?x3x224x224xf32>, %arg1: tensor<1x1x768xf32>) -> tensor<?x1x768xf32> {
    %1 = get_dimension_size(%arg0, dim = 0) 
    %2 = expand(%arg1, %1, dim=0)
    return %2 : tensor<?x1x768xf32>
}

Rough Plan

  • We need to trace and lower torch ops with SymInt output in LTC (e.g. aten.sym_size.int, which produces a SymInt).
  • The arithmetic on the SymInt needs to be traced. This shouldn't be hard to achieve if a corresponding LazyIR node can be created for the SymInt; see the sketch after this list.
  • When the SymInt version of an op is lowered, it needs to retrieve the underlying LazyIR node of the SymInt argument.
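
A hypothetical Python sketch of this plan (all names are made up for illustration, not torch_xla APIs): the SymInt produced by sym_size is backed by a lazy IR node, and arithmetic on it builds further IR nodes instead of collapsing to a concrete int.

from dataclasses import dataclass

@dataclass(frozen=True)
class SizeNode:
    # Would lower to get_dimension_size(%tensor, dim) in HLO.
    tensor: str
    dim: int

@dataclass(frozen=True)
class MulNode:
    # Would lower to a scalar multiply on the runtime size.
    operand: object
    scalar: int

def sym_size(tensor: str, dim: int) -> SizeNode:
    return SizeNode(tensor, dim)

def sym_mul(operand, scalar: int) -> MulNode:
    return MulNode(operand, scalar)

# Tracing `arg0.shape[0] * 2` builds MulNode(SizeNode("arg0", 0), 2);
# a SymInt-taking expand can then consume this node as its dynamic
# output dimension and lower it along with the rest of the graph.
dyn_dim = sym_mul(sym_size("arg0", 0), 2)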

Open questions

  1. Would it make sense to handle both bounded and unbounded dynamism under the same workflow/infra? The source of the dynamic dim needs to be traced in the unbounded case, but not in the bounded case.

Example

dynamic_dim = input.shape[0]
dynamic_dim = dynamic_dim * 2
expanded = input.expand([dynamic_dim, -1, -1])

Let's say input.shape[0] has an upper bound of 5. With bounded dynamism, knowing only the upper bound is enough to lower the op. With unbounded dynamism, the arithmetic on the SymInt needs to be traced and lowered in LTC.
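
For comparison, torch.export expresses both flavors through Dim. A minimal sketch (module M is hypothetical):

import torch
from torch.export import Dim, export

class M(torch.nn.Module):
    def forward(self, x):
        # Arithmetic on the symbolic batch size, as in the example above.
        return x.new_ones(x.shape[0] * 2, 3)

x = torch.randn(4, 3)

# Bounded: an explicit upper bound; a backend could lower this using
# only the bound, without tracing where the size came from.
bounded = Dim("batch", min=1, max=5)

# Unbounded: no explicit upper bound; the lowered graph must fetch the
# size at runtime (e.g. get_dimension_size) and lower the `* 2` as well.
unbounded = Dim("batch")

ep = export(M(), (x,), dynamic_shapes={"x": {0: unbounded}})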

  2. Not sure if there is any API to create an unbounded dynamic tensor, so the graph can be traced with it.
@lsy323 lsy323 added the dynamism Dynamic Shape Features label Jan 26, 2024
@JackCaoG

@ezyang We are trying to support unbounded dynamic shapes only for export.

The exported FX graph is dynamic and has a sym_size op that extracts the size of a tensor and stores it in a SymInt. During LTC tracing we are using static tensors (not sure if there is a way to create a tensor with an unbounded dynamic dimension and trace with it). In this process all tensors are static from PyTorch's perspective, so it will try to collapse the SymInt into a concrete integer instead of treating the size as symbolic. What's the correct way to make PyTorch always treat the SymInt as unbacked and not collapse it? We want to lower the SymInt to an IR node in the final HLO.

ezyang commented Jan 29, 2024

Are you dead set on lazy tensor for this, or can you be flexible?

Because if you can be flexible, I would tell you to use torch.export and you'd get a graph with non-collapsed dynamic dimensions and size compute for whatever you needed to be dynamic. And then life is good. To get it to work with LTC... ugh, I don't even wanna think about it lol.

@JackCaoG

@ezyang we are already using torch.export. The issue is that we need to lower the exported FX graph to HLO. I am wondering how Inductor handles this: does Inductor support unbounded dynamism?

ezyang commented Jan 29, 2024

Inductor does just handle it! For example, when generating kernels for pointwise operations, we support symbolic ranges. We can also deal with symbolic indexing formulas and use sympy to simplify them.
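
As a standalone illustration of that kind of simplification (plain sympy, not Inductor's actual code):

import sympy

s0 = sympy.Symbol("s0", integer=True, positive=True)

# A symbolic indexing formula over a dynamic size s0.
expr = (s0**2 + s0) / s0
print(sympy.simplify(expr))  # -> s0 + 1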

lsy323 commented Feb 5, 2024

cc @miladm @vanbasten23

lsy323 commented Mar 20, 2024

#6653 provides support, with limitations, for this use case by applying an FX pass that groups the sym_size and the following op into a single XLA op; the fused op is then lowered with dynamism semantics in torch_xla. The limitation is that if there is arithmetic on the SymInt in between, the FX pass is not flexible enough to capture those operations.
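
A hypothetical sketch of the kind of FX pass described above (xla_dynamic_expand is a made-up stand-in target, not the op #6653 actually introduces):

import torch
from torch.fx import GraphModule, Node

def xla_dynamic_expand(tensor, sizes):
    # Stand-in for the fused op; a real lowering would emit HLO with
    # dynamism semantics instead of calling expand eagerly.
    return tensor.expand(sizes)

def fuse_sym_size_expand(gm: GraphModule) -> GraphModule:
    for node in list(gm.graph.nodes):
        if node.target != torch.ops.aten.expand.default:
            continue
        tensor, sizes = node.args[0], node.args[1]
        # Only rewrite expands whose size list contains a sym_size result.
        if not any(
            isinstance(s, Node) and s.target == torch.ops.aten.sym_size.int
            for s in sizes
        ):
            continue
        with gm.graph.inserting_after(node):
            fused = gm.graph.call_function(
                xla_dynamic_expand, (tensor, tuple(sizes))
            )
        node.replace_all_uses_with(fused)
        gm.graph.erase_node(node)
    gm.graph.lint()
    gm.recompile()
    return gm

Note that a pattern match like this only sees a sym_size feeding the op directly; intermediate arithmetic on the SymInt (the mul in the graph at the top of this issue) is exactly what makes the pass brittle.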

@lsy323 lsy323 reopened this Sep 17, 2024
@lsy323 lsy323 closed this as completed Sep 17, 2024
lsy323 commented Sep 17, 2024
