GPU vectorize pass #970

adam-smnk · 2024-09-13T15:10:16Z

Adds GPU kernel vectorization pass.
Extends CUDA lowering pass to support vector operations.

The GPU-specific vectorization pass guides the upstream Linalg vectorizer to process operations that are inside a GPU kernel or prepared for outlining. The CUDA-specific pass is extended to allow lowering of vector ops within a GPU kernel.

Vectorization is disabled in the GPU pipeline for now because vector operation unrolling is not yet available. When vector sizes exceed the lengths supported by the hardware, the pipeline gets stuck at the GPU binary compilation step. This will be addressed by a separate transformation pass in the future.
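
To illustrate what the missing unrolling step needs to do, here is a minimal standalone sketch in plain C++. It is not the planned transformation pass and uses no MLIR APIs. The names `VectorSlice` and `unrollVector` and the `maxNativeLength` parameter are hypothetical, introduced only for this example. It splits a 1-D vector that is too long for the hardware into slices that each fit within an assumed native length.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper type (not part of this PR): one hardware-sized
// piece of a larger 1-D vector.
struct VectorSlice {
  int64_t offset; // index of the first element covered by this slice
  int64_t length; // number of elements in this slice
};

// Hypothetical sketch of unrolling: split a vector of `numElements`
// into consecutive slices of at most `maxNativeLength` elements.
// `maxNativeLength` stands in for whatever limit the target imposes.
std::vector<VectorSlice> unrollVector(int64_t numElements,
                                      int64_t maxNativeLength) {
  assert(numElements >= 0 && "element count must be non-negative");
  assert(maxNativeLength > 0 && "native length must be positive");

  std::vector<VectorSlice> slices;
  for (int64_t offset = 0; offset < numElements; offset += maxNativeLength) {
    int64_t remaining = numElements - offset;
    // The last slice may be shorter than the native length.
    int64_t length = remaining < maxNativeLength ? remaining : maxNativeLength;
    slices.push_back({offset, length});
  }
  return slices;
}
```

For example, under these assumptions a 10-element vector with a native length of 4 is split into three slices: two full slices of 4 elements and a tail of 2. A real pass would additionally have to rewrite each vector operation to work on these slices.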

rengolin

Looks good with comment. Nice that the GPU pipeline is getting closer to the CPU one.

lib/TPP/GPU/GpuToCuda.cpp

adam-smnk requested review from rolfmorel and KavithaTipturMadhu September 13, 2024 15:13

adam-smnk force-pushed the gpu-vectorize-pass branch from 25c9339 to ef50749 Compare October 3, 2024 09:30

rengolin approved these changes Oct 3, 2024

lib/TPP/GPU/GpuToCuda.cpp (outdated, resolved)

rengolin mentioned this pull request Oct 3, 2024

Outline to GPU launch before further lowering #971

Merged

adam-smnk added 3 commits October 3, 2024 11:40

GPU vectorizer

fb72b25

CUDA - handle vector ops

f7cf99c

Refactor after rebase

5bfe7a9

adam-smnk force-pushed the gpu-vectorize-pass branch from ef50749 to 5bfe7a9 Compare October 3, 2024 09:47

Cleanup passes

976b347

adam-smnk merged commit 894bbc4 into plaidml:main Oct 3, 2024
6 checks passed
