
[RFC] Lowering aten::_index_put_impl_ into tm_tensor.scatter op #616

Closed
erman-gurses opened this issue Feb 21, 2022 · 8 comments

erman-gurses commented Feb 21, 2022

(#614)

@pashu123 pashu123 added the help wanted Extra attention is needed label Feb 21, 2022
@erman-gurses erman-gurses changed the title Lowering _index_put_impl_ into tm_tensor.scatter op [RFC] Lowering _index_put_impl_ into tm_tensor.scatter op Feb 21, 2022
@erman-gurses erman-gurses changed the title [RFC] Lowering _index_put_impl_ into tm_tensor.scatter op [RFC] Lowering aten::_index_put_impl_ into tm_tensor.scatter op Feb 22, 2022
cathyzhyi commented Feb 23, 2022

This op is quite tricky. We need to transform the operands to meet the requirements of tm_tensor.scatter.

(assuming readers are familiar with both aten.index_put and tm_tensor.scatter)

For all indices tensors:

  1. broadcast to the same shape (call it indices_shape for now)
  2. flatten to a 1-d tensor
  3. unsqueeze to make the last dimension size one
  4. concatenate all the tensors together

For values operand:

  1. broadcast to values_shape. values_shape consists of two parts: the first part is the same as indices_shape; the second part is the input tensor's shape with the dimensions matching the given indices tensors removed.
  2. collapse the indices_shape part into a single dimension, as in the sketch below.
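
A minimal sketch of both transformations in eager PyTorch, assuming the indexed dimensions are the leading dimensions of the input (all names and shapes here are hypothetical, for illustration only):

import torch

input = torch.zeros(5, 6, 7)
idx0 = torch.tensor([0, 2])  # indexes dim 0
idx1 = torch.tensor([1])     # indexes dim 1
values = torch.ones(2, 7)

# Indices: broadcast to a common indices_shape, flatten to 1d, unsqueeze
# the last dimension to size one, then concatenate.
bcast = torch.broadcast_tensors(idx0, idx1)  # indices_shape: (2,)
scatter_indices = torch.cat(
    [t.reshape(-1).unsqueeze(-1) for t in bcast], dim=-1)  # (N, 2)

# Values: broadcast to indices_shape + non-indexed input dims, then
# collapse the indices_shape part into a single dimension.
rest = input.shape[2:]                # dims not covered by index tensors: (7,)
values_shape = bcast[0].shape + rest  # (2, 7)
scatter_updates = values.broadcast_to(values_shape).reshape(-1, *rest)  # (N, 7)

# scatter_indices (one coordinate row per update) and scatter_updates now
# have the layout tm_tensor.scatter expects.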

Some of the steps above are more convenient to decompose into other torch ops (like flatten to a 1-d tensor, unsqueeze, concatenate), while others are more convenient to convert directly to tm_tensor or linalg ops. The conversion should go into the ConvertTorchToTMTensor pass to be added by #574.

We might not be able to handle all the cases aten.index_put supports. For example, it allows a mix of None and tensors in the indices list, in any order. We might want to start with the no-None-indices case first.
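
For illustration, a None entry in the indices list corresponds to a full slice in Python advanced indexing (hypothetical shapes):

import torch

t = torch.zeros(3, 4, 5)
idx = torch.tensor([0, 2])
v = torch.ones(3, 2, 5)
# Slicing dim 0 in full and indexing dim 1 traces to aten.index_put
# with an indices list of [None, idx]:
t[:, idx] = v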

@silvasean

> We will need a new pass called something like DecomposeOrConvertTorchOp to convert Torch ops to a mix of torch and aten ops.

Do you mean a mix of torch and TMTensor ops? I think this would naturally fall within the scope of a ConvertTorchToTMTensor pass.

@cathyzhyi

Sorry, I meant a mix of torch and linalg/tmtensor ops.

> ConvertTorchToTMTensor

We have another pass called lowerAlgorithmicOps (#574) which lowers algorithmic torch ops to linalg and tm_tensor ops. I feel that calling this new pass ConvertTorchToTMTensor would cause confusion, as it is not explicit that the torch ops can also be decomposed into other Torch ops.

@silvasean

I left a comment in #574. I think there should be a LowerAlgorithmicOps pass (similar to DecomposeComplexOps) as a Torch->Torch lowering, and then a ConvertTorchToTMTensor pass which happens in backends (i.e., after the Torch backend contract, just before ConvertTorchToLinalg).

@cathyzhyi

> I left a comment in #574. I think there should be a LowerAlgorithmicOps pass (similar to DecomposeComplexOps) as a Torch->Torch lowering, and then a ConvertTorchToTMTensor pass which happens in backends (i.e., after the Torch backend contract, just before ConvertTorchToLinalg).

Ha, I see. There seems to have been some confusion about pass naming. I updated my comments. Thanks for clarifying this!

@cathyzhyi cathyzhyi removed the help wanted Extra attention is needed label Feb 28, 2022
@vivekkhandelwal1 vivekkhandelwal1 added the help wanted Extra attention is needed label Mar 7, 2022
vivekkhandelwal1 commented Mar 7, 2022

Hi @cathyzhyi @silvasean
I am working on the implementation of this op; I need it for the functorch decomposition of aten::embedding_dense_backward:
https://github.com/pytorch/functorch/blob/main/functorch/_src/decompositions.py#L318-L333

@register_decomposition(aten.embedding_dense_backward)
def embedding_dense_backward(grad_output: Tensor, indices: Tensor, num_weights: int, padding_idx: int, scale_grad_by_freq: bool):
    numel = indices.numel()
    grad = grad_output.view(numel, grad_output.size(-1))
    grad_weight = aten.new_zeros(grad_output, (num_weights, grad_output.shape[-1]))
    indices_rank1 = indices.view(numel)
    if scale_grad_by_freq:
        counts = aten.new_zeros(indices, (num_weights,))
        ones = aten.new_ones(indices, (numel,))
        counts = aten.index_put(counts, [indices_rank1], ones, accumulate=True)
        grad_weights_scale = aten.index(counts, [indices_rank1])
        grad = grad / grad_weights_scale.unsqueeze(1)
    skip_padding = (indices_rank1 != padding_idx).unsqueeze(1)
    skip_padding = skip_padding.expand_as(grad)
    zero_grad = aten.full_like(grad, 0)
    return aten.index_put(grad_weight, [indices_rank1], aten.where(skip_padding, grad, zero_grad), accumulate=True)

I have gone through the comments above. Since this op seems a bit complex, should we implement it for now only for the case required here, i.e. a 1-d tensor of indices and a 1-d tensor of values with
indices.size() == values.size()?
We can leave the remaining cases as a TODO and add them as and when required. The current implementation of the tm_tensor.scatter op easily supports the 1-d case.
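
For reference, a minimal sketch of the semantics of that 1-d case (values here are hypothetical):

import torch

input = torch.zeros(5)
indices = torch.tensor([0, 2, 2])
values = torch.tensor([1., 2., 3.])
# With accumulate=True, updates at duplicate indices are summed:
out = torch.index_put(input, [indices], values, accumulate=True)
# out is tensor([1., 0., 5., 0., 0.])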

@silvasean

Starting with the 1D version of this and emitting a clear diagnostic for the unsupported cases seems like a good incremental first step.

@cathyzhyi cathyzhyi removed the help wanted Extra attention is needed label Mar 9, 2022
@cathyzhyi

Closed by #650.

qedawkins pushed a commit to nod-ai/torch-mlir that referenced this issue Oct 3, 2022
rsuderman pushed a commit that referenced this issue May 17, 2024
BaneTrifa pushed a commit to BaneTrifa/torch-mlir that referenced this issue May 24, 2024
sjarus pushed a commit to sjarus/torch-mlir that referenced this issue Jun 6, 2024