Add scalar reduction codegen schedule #1284

Open: Yancey1989 wants to merge 10 commits into main

Collaborator

@Yancey1989 Yancey1989 commented Mar 1, 2024

Add a scalar-reduction codegen template. The algorithm comes from https://developer.download.nvidia.com/assets/cuda/files/reduction.pdf
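
For readers who have not seen the referenced slides, the general idea can be sketched on the CPU as follows. This is only a rough Python model of a two-stage, tree-based sum reduction under stated assumptions, not the code this PR generates: the function names, the block count, and the block size are all hypothetical, and real kernels run the inner loops in parallel across threads.

```python
# CPU-side model of a two-stage scalar reduction (sum).
# All names and sizes are hypothetical, chosen for illustration only.

def block_reduce(values, block_size):
    """Model one thread block: strided accumulation, then a tree reduction."""
    # Each "thread" tid first accumulates a strided slice of the input,
    # so a block can handle more elements than it has threads.
    shared = [0.0] * block_size
    for tid in range(block_size):
        shared[tid] = sum(values[tid::block_size])
    # Sequential-addressing tree reduction: halve the active threads each step.
    stride = block_size // 2
    while stride > 0:
        for tid in range(stride):
            shared[tid] += shared[tid + stride]
        stride //= 2
    return shared[0]


def scalar_reduce(values, num_blocks=4, block_size=8):
    """Stage 1: one partial sum per block. Stage 2: reduce the partials."""
    assert block_size & (block_size - 1) == 0, "block_size must be a power of two"
    chunk = (len(values) + num_blocks - 1) // num_blocks
    partials = [
        block_reduce(values[b * chunk:(b + 1) * chunk], block_size)
        for b in range(num_blocks)
    ]
    # The second stage reuses the same tree reduction on the per-block partials.
    return block_reduce(partials, block_size)
```

Calling `scalar_reduce(list(range(100)))` sums the input in two stages: four per-block partial sums first, then one final reduction over those partials.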

benchmark with PyTorch:

| `$bs x $seqlen x 151936` | disc | PyTorch |
| --- | --- | --- |
| 2x768x151936xf32 | 0.53 ms | 0.55 ms |
| 2x1024x151936xf32 | 0.67 ms | 0.7 ms |
| 2x2048x151936xf32 | 1.38 ms | 1.4 ms |

@Yancey1989 Yancey1989 changed the title from "[WIP]support scalar reduction" to "support scalar reduction" Mar 8, 2024
eedalong previously approved these changes Mar 12, 2024
Collaborator

@eedalong eedalong left a comment

LGTM

@Yancey1989 Yancey1989 changed the title from "support scalar reduction" to "Add scalar reduction codegen schedule" Mar 20, 2024
@eedalong eedalong self-requested a review March 22, 2024 02:08
eedalong previously approved these changes Mar 22, 2024
3 participants