
Fix attention + enable VMMA #43

Merged

merged 2 commits into nod-ai:main on Jan 23, 2025
Conversation

raikonenfnu (Member) commented:

Update attention to work with ToM IREE and support VMMA:

  1. Update translation info to use "pipeline"
  2. Update attention IREE IR to specify the QK and KV MMA schedules separately so that it works with ToM IREE
  3. Refactor to use enum.Enum to represent intrinsics
  4. Add VMMA support and helper functions to maximize perf (a hedged sketch of items 3 and 4 follows this list)
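
The enum refactor and the VMMA helpers themselves are not shown on this page, so below is a minimal, hypothetical Python sketch of what items 3 and 4 could look like. The intrinsic names, tile shapes, and the pick_intrinsic helper are illustrative assumptions, not the PR's actual code.

```python
import enum


class IntrinsicType(enum.Enum):
    """Illustrative MMA/VMMA intrinsics, keyed by their (M, N, K) tile shape."""

    # Plain MFMA intrinsics (names and shapes are assumptions for illustration).
    MFMA_F16_16x16x16_F32 = (16, 16, 16)
    MFMA_F16_32x32x8_F32 = (32, 32, 8)
    # Virtual MMA (VMMA) intrinsics: wider K that is unrolled internally.
    VMMA_F16_16x16x32_F32 = (16, 16, 32)
    VMMA_F16_32x32x16_F32 = (32, 32, 16)

    @property
    def mnk(self) -> tuple[int, int, int]:
        return self.value


def pick_intrinsic(m: int, n: int, k: int, prefer_vmma: bool = True) -> IntrinsicType:
    """Pick an intrinsic whose tile evenly divides the (m, n, k) problem shape."""
    candidates = list(IntrinsicType)
    if prefer_vmma:
        # Try VMMA intrinsics first, widest K first: fewer, larger virtual MMAs
        # is the rough intuition behind "maximize perf" in item 4.
        candidates.sort(key=lambda i: (i.name.startswith("VMMA"), i.mnk[2]), reverse=True)
    for intrinsic in candidates:
        tm, tn, tk = intrinsic.mnk
        if m % tm == 0 and n % tn == 0 and k % tk == 0:
            return intrinsic
    # Fall back to the smallest plain MFMA tile if nothing divides evenly.
    return IntrinsicType.MFMA_F16_16x16x16_F32
```

Representing each intrinsic as an enum.Enum member with its tile shape as the value keeps the selection logic table-driven instead of string-matching on intrinsic names.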

saienduri and others added 2 commits January 22, 2025 17:23
1. Update attention IREE IR to specify QK and KV MMA schedules separately
   so that it works with ToM IREE (a hedged sketch of this follows the commit message)
2. Refactor to use enum.Enum to represent intrinsics
3. Add VMMA support and helper functions to maximize perf

Signed-off-by: Stanley Winata <stanley.winata@amd.com>
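
For the first commit item (separate QK and KV MMA schedules), here is a hedged, self-contained sketch of how the two schedules might be threaded through the Python side that generates the attention IR. The field names, the render_compilation_info helper, and the rendered attribute spellings are placeholders invented for illustration, and the default pipeline string is only an assumption about which translation-info pipeline is meant.

```python
from dataclasses import dataclass


@dataclass
class AttentionMMASchedule:
    """Illustrative container: one MMA schedule per matmul inside attention."""

    qk_intrinsic: str  # intrinsic for the Q @ K^T matmul
    kv_intrinsic: str  # intrinsic for the softmax(QK) @ V matmul


def render_compilation_info(
    sched: AttentionMMASchedule,
    pipeline: str = "LLVMGPUVectorDistribute",  # assumed pipeline name
) -> str:
    """Render a pseudo translation-info snippet; attribute names are made up."""
    return (
        f"// pipeline = {pipeline}\n"
        f"// qk_mma   = {sched.qk_intrinsic}\n"
        f"// kv_mma   = {sched.kv_intrinsic}\n"
    )


if __name__ == "__main__":
    sched = AttentionMMASchedule(
        qk_intrinsic="VMMA_F16_16x16x32_F32",
        kv_intrinsic="MFMA_F16_16x16x16_F32",
    )
    print(render_compilation_info(sched))
```

Keeping the two schedules in separate fields is what lets the QK and KV matmuls be tuned independently, which is the compatibility point with ToM IREE mentioned in the commit message.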
@raikonenfnu requested a review from saienduri on January 23, 2025 02:42
@saienduri (Contributor) left a comment:

Thanks!!

@raikonenfnu merged commit 87c0c8c into nod-ai:main on Jan 23, 2025
1 check failed