
[Hexagon][LLVM] Enable/test tensorized Hexagon DMA on 2d transformed layout #10905

Merged: 11 commits into apache:main on Apr 12, 2022

Conversation

Lunderberg (Contributor)


- In the `CodeGenLLVM::CreateIntrinsic` handler for
  `builtin::address_of()`, pass N-d indices to
  `CodeGenLLVM::CreateBufferPtr`.  The base class implementation still
  asserts that there is a flat memory space, while the
  `CodeGenHexagon::CreateBufferPtr` override allows 2-d memory.

- Enable tensorization in `test_cache_read_write.py`, using
  `tir.address_of` to pass the lowered value.

Co-authored-by: Adam Straw <astraw@octoml.ai>
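
For context, here is a condensed sketch of the tensorization pattern this enables, modeled on the pattern in `test_cache_read_write.py`; the helper name `intrin_mem_copy`, the buffer names, and the exact parameters are illustrative rather than the PR's verbatim test code:

```python
import numpy as np
import tvm
from tvm import te


def intrin_mem_copy(shape, dtype, dst_scope, src_scope):
    """Declare a tensor intrinsic that replaces a cache_read/cache_write
    stage with a single mem-copy call, passing pointers obtained through
    tir.address_of rather than flattened 1-d offsets."""
    src = te.placeholder(shape, dtype=dtype, name="src")
    dst = te.compute(shape, lambda *i: src(*i), name="dst")
    size = int(np.prod(shape)) * np.dtype(dtype).itemsize

    src_buffer = tvm.tir.decl_buffer(
        shape, dtype, scope=src_scope, offset_factor=1, name="src_buf"
    )
    dst_buffer = tvm.tir.decl_buffer(
        shape, dtype, scope=dst_scope, offset_factor=1, name="dst_buf"
    )

    zero_indices = [0 for _ in shape]

    def intrin_func(ins, outs):
        ib = tvm.tir.ir_builder.create()
        src_handle = ib.buffer_ptr(src_buffer)
        dst_handle = ib.buffer_ptr(dst_buffer)
        # tir.address_of is given the full N-d index list; with this PR,
        # codegen passes those indices through to CreateBufferPtr instead
        # of asserting that the buffer is flat.
        ib.emit(
            tvm.tir.call_intrin(
                "handle",
                "tir.mem_copy",
                tvm.tir.call_intrin("handle", "tir.address_of", dst_handle[zero_indices]),
                tvm.tir.call_intrin("handle", "tir.address_of", src_handle[zero_indices]),
                size,
            )
        )
        return ib.get()

    return te.decl_tensor_intrin(
        dst.op, intrin_func, binds={src: src_buffer, dst: dst_buffer}
    )
```

The relevant detail is that `tir.address_of` is handed a buffer access with the full N-d index list; with this change, `CodeGenLLVM` forwards those indices to `CreateBufferPtr`, where the `CodeGenHexagon` override accepts the 2-d form.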
@Lunderberg (Contributor, Author)

This change shouldn't depend on the changes introduced in #10878 and #10903, but local testing was done with those changes included as well.

Lunderberg and others added 3 commits on April 5, 2022

[TIR] Allow buffer_bind_scope of N-d buffers
Previously, any `buffer_bind_scope` attribute that provides a view
into a non-flat buffer would result in an error.  After this commit,
`buffer_bind_scope` may be used for non-flat buffers, but use of
`arg_buffer->elem_offset` within the body of the bind statement is
still an error.

The `BufferNode::elem_offset` field represents the offset between the
pointer of the backing allocation and the first element of the buffer.
This offset is only well-defined for flat memory spaces.
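
As a hypothetical illustration of that last point (plain Python arithmetic, not TVM's actual address computation): in a flat memory space one scalar locates the view's origin, while a 2-d physical memory needs an origin per axis, so no single `elem_offset` can describe the view.

```python
def flat_address(elem_offset, i):
    # Flat memory space: a single scalar elem_offset locates the view's
    # first element relative to the backing allocation.
    return elem_offset + i


def two_d_address(axis_offsets, i, j):
    # Non-flat (2-d) memory: each physical axis has its own origin, so
    # the view origin is a pair of offsets; one scalar cannot express it.
    return (axis_offsets[0] + i, axis_offsets[1] + j)
```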
@Lunderberg changed the title from "[Hexagon][LLVM] Enable/test tensorized Hexagon DMA" to "[Hexagon][LLVM] Enable/test tensorized Hexagon DMA on 2d transformed layout" on Apr 6, 2022
def layout_transform_2d(n):
    return [n // 16, te.AXIS_SEPARATOR, n % 16]


@requires_hexagon_toolchain
def test_cache_read_write(hexagon_session):
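
In the excerpt above, `te.AXIS_SEPARATOR` marks where the transformed indices split into separate physical dimensions, so the flat index `n` becomes a 2-d index with an inner width of 16. The index arithmetic alone, as illustrative plain Python with the separator omitted:

```python
def index_mapping(n):
    # Same arithmetic as layout_transform_2d: flat index n maps to
    # (outer, inner) with a fixed inner width of 16.
    return (n // 16, n % 16)


assert index_mapping(35) == (2, 3)
assert index_mapping(16) == (1, 0)
```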
Contributor:
Can we have a test demonstrating the approach for both discontiguous and contiguous memory? I notice this change overwrites the old test coverage, and it's likely useful to maintain coverage for both cases.

Contributor:
Done

Comment on lines 485 to 487
// Sentinel value for ArgBinder::BindBuffer to state that any usage
// of element offset is invalid.
slice.CopyOnWrite()->elem_offset = PrimExpr();
Contributor:
Expanding this comment to explain why the use of element offset is invalid in the N-d case, or better yet adding a short TODO to update it once the IR/buffer is changed to support N-d offsets, would help. It took me a little while to understand why this was necessary.

Contributor:
@Lunderberg Let me know if you are OK with the comment rewrite here including the TODO I wrote related to PR #10816.

Contributor (Author):
The change and the TODO look good to me.

@csullivan merged commit 11d22bd into apache:main on Apr 12, 2022
@csullivan (Contributor):

Thanks @adstraw @Lunderberg! This is merged.

@Lunderberg deleted the hexagon_2d_dma_tensorization branch on April 12, 2022
AndrewZhaoLuo added a commit to AndrewZhaoLuo/tvm that referenced this pull request Apr 15, 2022
* main: (527 commits)
  [hexagon] 'add_hvx' test to explore HVX usage. (apache#10604)
  [COMMUNITY] @yzh119 -> Reviewer (apache#10993)
  [Metaschedule] Make custom schedule_rule registration optional (apache#10975)
  [ONNX] Add imports for BERT contrib operators (apache#10949)
  sort axes (apache#10985)
  [Hexagon] Remove HexagonBuffer external constructor and support (apache#10978)
  [CI] Update GPU image (apache#10992)
  [Runtime][Vulkan] Add RGP support to TVM for vulkan device (apache#10953)
  [FIX] resolve int64/32 for AttrStmtNode (apache#10983)
  [TVMC] Allow output module name to be passed as a command line argument (apache#10962)
  [ONNX] Add MatMulInteger importer (apache#10450)
  [COMMUNITY] @guberti -> Reviewer (apache#10976)
  Support `qnn.conv2d` in FoldExplicitPading (apache#10982)
  change Hexagon docker version (apache#10981)
  remove exception handling of autotvm xgboost extract functions (apache#10948)
  [CUDNN] Add partitioning support for conv2d and log_softmax (apache#10961)
  [Hexagon][LLVM] Enable/test tensorized Hexagon DMA on 2d transformed layout (apache#10905)
  [Hexagon] Move aot/graph_executor interactions into launcher (apache#10907)
  [HEXAGON] Split huge 1D DMA Transfers into smaller transfers with legal sizes. (apache#10971)
  [CI][DOCKER] Add pytest-lazy-fixture to images (apache#10970)
  ...
Lucien0 pushed a commit to Lucien0/tvm that referenced this pull request Apr 19, 2022
…layout (apache#10905)

* [Hexagon][LLVM] Enable/test tensorized Hexagon DMA

* [TIR] Allow buffer_bind_scope of N-d buffers

* update test to tensorize cache_read `y` (works) and cache_write `z` (fails)

* add `split` to allow for tensorization of cache_write of `z`

* fix typo and cleanup comment

* add back original 1d test_cache_read_write

* update comments

* format error

Co-authored-by: Adam Straw <astraw@octoml.ai>
altanh pushed a commit to altanh/tvm that referenced this pull request Apr 28, 2022
…layout (apache#10905)
