Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lower cache_read and cache_write to Hexagon DMA via tensorize #10365

Merged
merged 4 commits into from
Feb 25, 2022

Conversation

adstraw
Copy link
Contributor

@adstraw adstraw commented Feb 23, 2022

Adds support to lower cache_read and cache_write schedule primitives to Hexagon User DMA using tensorize and new mem_copy intrinsic.

@adstraw
Copy link
Contributor Author

adstraw commented Feb 23, 2022

HUGE thanks to @masahi who also contributed to this PR.

@masahi masahi self-assigned this Feb 24, 2022
Copy link
Contributor

@tmoreau89 tmoreau89 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks @adstraw and @masahi for getting the pattern matching from cache_read/write to DMA working!

* Same semantics as memcpy(destination, source, size)
* Allows for device specific implementations e.g. direct memory access (DMA)
*/
TVM_DLL const Op& mem_copy();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @tqchen @junrushao1994 if they want a better name here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to make it more specific? We have a few examples above like texture2d_load/store. What about dma_mem_copy() or any hexagon-specific name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about something with "DMA" in the name but I realized that DMA is more of an implementation detail. I ended up referencing DMA in the comments instead. Happy to take suggestions including dma_mem_copy.

Copy link
Member

@masahi masahi Feb 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about:

  • mem_copy_local (to emphasize the local nature of the intended usage, unlike TVMArrayCopyFromTo @junrushao1994 mentioned which might involve RPC)
  • mem_copy_1d (if we intend to add 2d rectangular memcpy later)

But both of them don't particularly sound better.

@masahi
Copy link
Member

masahi commented Feb 24, 2022

cc @kparzysz-quic

@masahi masahi merged commit dcbdedd into apache:main Feb 25, 2022
@adstraw adstraw deleted the hexagon-tir-to-dma-lower branch February 25, 2022 19:16
pfk-beta pushed a commit to pfk-beta/tvm that referenced this pull request Apr 11, 2022
…#10365)

* Lower cache_read and cache_write to Hexagon DMA via tensorize

* rework test to be compatible with launcher

* remove cpu device api mem_copy implementation and test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants