[Dma] Don't check Dma unit dims during subsumption and fold them during controlCodeLowering #1070

Merged: 5 commits into nod-ai:main from dma_check_nonunit_dim, Jan 31, 2025

yzhang93 (Contributor) commented Jan 30, 2025

Before this PR, we checked the number of sizes/strides before DMA loop subsumption to make sure the number of dimensions would not exceed the maximum after subsumption. However, this blocks some loop subsumption opportunities when there are unit dimensions that are not canonicalized away because their offsets are nonzero. For example, the following loop cannot be subsumed because there are already 4 dimensions on the L3 source side.

scf.for %arg2 = %c0 to %c6 step %c1 {
  %1 = affine.apply affine_map<(d0) -> (d0 + 1)>(%arg2)
  amdaie.npu.dma_cpy_nd %0([] [] [], [1, %1, 0, 0] [1, 1, 32, 32] [8192, 1024, 32, 1])
}

This PR relaxes the constraint by checking only the non-unit dimensions, so the above loop can be subsumed into the DMA as:

amdaie.npu.dma_cpy_nd %0([] [] [], [0, 1, 1, 0, 0] [6, 1, 1, 32, 32] [1024, 8192, 1024, 32, 1])
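The difference between the two checks, and the arithmetic of the subsumption itself, can be sketched in a few lines of Python. This is an illustration only, not the pass's actual code: the function names, the `MAX_DIMS = 4` limit (inferred from the "already 4 dimensions" remark), and the convention that `offsets[iv_dim]` holds just the constant part of `iv + c` are all assumptions made for the example.

```python
from typing import List

# Assumed limit, inferred from the "already 4 dimensions" remark in the description.
MAX_DIMS = 4


def fits_counting_all_dims(sizes: List[int], max_dims: int = MAX_DIMS) -> bool:
    """Stricter check: every existing dim plus the new loop dim must fit."""
    return len(sizes) + 1 <= max_dims


def fits_counting_non_unit_dims(sizes: List[int], max_dims: int = MAX_DIMS) -> bool:
    """Relaxed check: only dims with size != 1 count against the limit."""
    non_unit = sum(1 for s in sizes if s != 1)
    return non_unit + 1 <= max_dims


def subsume_loop(offsets, sizes, strides, iv_dim, lb, ub, step):
    """Fold `for iv = lb to ub step step` into the access pattern.

    `offsets[iv_dim]` holds only the constant part c of the offset `iv + c`
    (a hypothetical convention chosen for this sketch).
    """
    trip_count = (ub - lb + step - 1) // step
    new_offsets = list(offsets)
    new_offsets[iv_dim] += lb  # iv starts at lb, so lb joins the constant part
    return (
        [0] + new_offsets,                         # new outer dim starts at 0
        [trip_count] + list(sizes),                # one step per loop iteration
        [strides[iv_dim] * step] + list(strides),  # stride of the dim iv indexed
    )


sizes = [1, 1, 32, 32]
print(fits_counting_all_dims(sizes))       # False: 4 + 1 > 4
print(fits_counting_non_unit_dims(sizes))  # True: 2 + 1 <= 4
print(subsume_loop([1, 1, 0, 0], sizes, [8192, 1024, 32, 1],
                   iv_dim=1, lb=0, ub=6, step=1))
```

With the loop from the example, `subsume_loop` yields offsets `[0, 1, 1, 0, 0]`, sizes `[6, 1, 1, 32, 32]` and strides `[1024, 8192, 1024, 32, 1]`, matching the subsumed DMA above.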

This DMA can then be further canonicalized.
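One way to picture that canonicalization: each unit dimension contributes only a fixed `offset * stride` to the base address, so it can be dropped as long as that contribution is absorbed elsewhere. The Python sketch below is a hypothetical illustration, not the project's canonicalization code: `fold_unit_dims` and `linear_addresses` are invented names, and absorbing the base into the first remaining dimension whose stride divides it is an assumed strategy.

```python
from itertools import product


def linear_addresses(offsets, sizes, strides):
    """Enumerate the element addresses an (offsets, sizes, strides) pattern visits."""
    base = sum(o * st for o, st in zip(offsets, strides))
    return [base + sum(i * st for i, st in zip(idx, strides))
            for idx in product(*(range(s) for s in sizes))]


def fold_unit_dims(offsets, sizes, strides):
    """Drop size-1 dims, moving their offset contribution into another dim."""
    dims = list(zip(offsets, sizes, strides))
    # Each unit dim adds a constant offset * stride to the base address.
    base = sum(o * st for o, s, st in dims if s == 1)
    kept = [(o, s, st) for o, s, st in dims if s != 1]
    # Assumed strategy: absorb the base into the first dim whose stride divides it.
    for i, (o, s, st) in enumerate(kept):
        if st != 0 and base % st == 0:
            kept[i] = (o + base // st, s, st)
            base = 0
            break
    if base != 0:
        # No stride divides the base: keep a single unit dim that carries it.
        kept.insert(0, (base, 1, 1))
    return ([o for o, _, _ in kept],
            [s for _, s, _ in kept],
            [st for _, _, st in kept])


before = ([0, 1, 1, 0, 0], [6, 1, 1, 32, 32], [1024, 8192, 1024, 32, 1])
after = fold_unit_dims(*before)
print(after)
print(linear_addresses(*before) == linear_addresses(*after))
```

For the subsumed DMA above, the two unit dimensions contribute `1 * 8192 + 1 * 1024 = 9216 = 9 * 1024`, so under these assumptions the pattern reduces to offsets `[9, 0, 0]`, sizes `[6, 32, 32]` and strides `[1024, 32, 1]`, and visits the same addresses in the same order.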

@yzhang93 yzhang93 changed the title [WIP][Dma] Don't check Dma unit dims during subsumption and fold them… [WIP][Dma] Don't check Dma unit dims during subsumption and fold them during controlCodeLowering Jan 30, 2025
@yzhang93 yzhang93 force-pushed the dma_check_nonunit_dim branch 2 times, most recently from c84333b to c25585f Compare January 30, 2025 21:08
@yzhang93 yzhang93 changed the title [WIP][Dma] Don't check Dma unit dims during subsumption and fold them during controlCodeLowering [Dma] Don't check Dma unit dims during subsumption and fold them during controlCodeLowering Jan 31, 2025
@yzhang93 yzhang93 marked this pull request as ready for review January 31, 2025 06:54
@yzhang93 yzhang93 force-pushed the dma_check_nonunit_dim branch from 8d25a83 to 6c23ff0 Compare January 31, 2025 19:28
jtuyls (Collaborator) left a comment

LGTM

@yzhang93 yzhang93 merged commit e2b0e9d into nod-ai:main Jan 31, 2025
7 checks passed
@yzhang93 yzhang93 deleted the dma_check_nonunit_dim branch February 3, 2025 20:51