
[GPU] Prefer TileAndFuse pipeline over SIMT pipeline #18793

Open · wants to merge 1 commit into main
Conversation

@nirvedhmeshram (Contributor) commented Oct 16, 2024

TileAndFuse is the modernized pipeline that we want to use instead of the older SIMT pipeline when possible.

@kuhar (Member) left a comment


Can we observe the difference using the gemm shapes in https://github.com/nod-ai/iree-kernel-benchmark ?

@nirvedhmeshram (Contributor, Author)
> Can we observe the difference using the gemm shapes in https://github.com/nod-ai/iree-kernel-benchmark?

As I am running it, I am realizing that all but three shapes in the benchmark are multiples of 32, so they don't go down this default path, but let's see what I get for those three.
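To make the alignment point concrete, here is a minimal, hypothetical Python sketch of the kind of filter being described. The shape list and the divisibility-by-32 check are illustrative assumptions, not code from iree-kernel-benchmark:

```python
# Hypothetical GEMM shapes (M, N, K); the first matches the matmul
# discussed below, the others are aligned examples for contrast.
shapes = [
    (14336, 4, 8192),
    (4096, 4096, 4096),
    (1024, 512, 256),
]

def takes_default_path(m, n, k, align=32):
    """Illustrative check: a shape goes down the default (unaligned)
    path if any dimension is not a multiple of `align`."""
    return any(d % align != 0 for d in (m, n, k))

unaligned = [s for s in shapes if takes_default_path(*s)]
```

Under this assumed check, only shapes with a dimension not divisible by 32 (like N = 4 above) would exercise the path this PR changes.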

@kuhar (Member) commented Oct 16, 2024

@nirvedhmeshram We can add a few more. Do you have some specific shapes you are interested in?

@nirvedhmeshram (Contributor, Author)
> @nirvedhmeshram We can add a few more. Do you have some specific shapes you are interested in?

@kuhar actually, on closer inspection I do see some shapes that go down SIMT, for example:

```mlir
%2 = linalg.matmul_transpose_a ins(%arg0, %arg1 : tensor<8192x14336xbf16>, tensor<8192x4xbf16>)
                               outs(%1 : tensor<14336x4xf32>)
```

However, I am realizing this PR is most likely just an NFC, because TileAndFuse simply bails on such shapes, as it won't find a schedule:
https://github.com/iree-org/iree/blob/main/compiler/src/iree/compiler/Codegen/Dialect/GPU/TargetUtils/ConfigUtils.cpp#L210-L213
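As a rough sketch of that bail-out behavior (hypothetical helper names in Python; the actual logic linked above lives in C++ in ConfigUtils.cpp), the fallback can be pictured as:

```python
def deduce_mma_schedule(m, n, k, align=32):
    """Hypothetical stand-in for schedule deduction: succeed only when
    every GEMM dimension is tile-aligned, otherwise report failure."""
    if all(d % align == 0 for d in (m, n, k)):
        return {"tile_m": align, "tile_n": align, "tile_k": align}
    return None  # no schedule found -> caller must bail

def select_pipeline(m, n, k):
    """Mirrors the behavior discussed above: when TileAndFuse cannot
    find a schedule, configuration falls through to the SIMT pipeline."""
    if deduce_mma_schedule(m, n, k) is not None:
        return "TileAndFuse"
    return "SIMT"
```

Under this assumed model, the unaligned matmul above (with N = 4) still lands on SIMT, which is why preferring TileAndFuse changes nothing for it yet.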

However, I will change that in a follow-up, next: #18791

@MaheshRavishankar (Contributor)
Let's fold this into the subsequent PR? It's hard to see what this flag is for.

@nirvedhmeshram (Contributor, Author)
> Let's fold this into the subsequent PR? It's hard to see what this flag is for.

Ya, that is fine by me. Quinn suggested not doing too much at once, but since this PR is doing nothing on its own, folding makes sense.

@kuhar (Member) commented Oct 16, 2024

Ah, interesting, could you add this shape to iree-kernel-benchmark? We can tag it as 'corner_case'.

@nirvedhmeshram (Contributor, Author)
> Ah, interesting, could you add this shape to iree-kernel-benchmark? We can tag it as 'corner_case'.

@jakub that shape is already generated by the benchmark, so I was saying there is no need to add more cases; the sizes pulled from the models seem to have enough diversity.
