Add fused_attention_op: add impl wrappers. #35903
Conversation
Thanks for your contribution!
@@ -0,0 +1,324 @@
/* Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
Is fmha_ref.h the attention implementation stacked together from existing paddle ops? The name feels a bit ambiguous; consider renaming it in a follow-up.
Yes, it is the attention implementation stacked together from paddle ops. The name will be changed in the next PR.
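For context, here is a minimal sketch of the computation such a reference attention composes from existing ops, out = softmax(Q K^T / sqrt(d)) V for a single head. This is an illustration only and is not taken from fmha_ref.h; the function name and plain-loop layout are assumptions.

```cpp
// Hypothetical reference attention for one head (not the fmha_ref.h code):
// q, k, v are [seq_len x head_dim] row-major; returns [seq_len x head_dim].
#include <algorithm>
#include <cmath>
#include <vector>

std::vector<float> ReferenceAttention(const std::vector<float>& q,
                                      const std::vector<float>& k,
                                      const std::vector<float>& v,
                                      int seq_len, int head_dim) {
  const float scale = 1.0f / std::sqrt(static_cast<float>(head_dim));
  std::vector<float> out(static_cast<size_t>(seq_len) * head_dim, 0.0f);
  std::vector<float> scores(seq_len);
  for (int i = 0; i < seq_len; ++i) {
    // scores_j = Q_i . K_j / sqrt(d), with a running max for stable softmax.
    float max_s = -1e30f;
    for (int j = 0; j < seq_len; ++j) {
      float s = 0.0f;
      for (int d = 0; d < head_dim; ++d) {
        s += q[i * head_dim + d] * k[j * head_dim + d];
      }
      scores[j] = s * scale;
      max_s = std::max(max_s, scores[j]);
    }
    float sum = 0.0f;
    for (int j = 0; j < seq_len; ++j) {
      scores[j] = std::exp(scores[j] - max_s);
      sum += scores[j];
    }
    // out_i = sum_j softmax_ij * V_j.
    for (int j = 0; j < seq_len; ++j) {
      const float p = scores[j] / sum;
      for (int d = 0; d < head_dim; ++d) {
        out[i * head_dim + d] += p * v[j * head_dim + d];
      }
    }
  }
  return out;
}
```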
PR types
Function optimization
PR changes
OPs
Describe
This is the first PR of "add fused_attention_op":
1. Add impl wrappers for the gemm and fmha parts of fused_attention_op (see the sketch after this list).
2. Fix bugs in layer_norm and attn_bias_add.cu.h.
3. Fix bugs in elementwise_op_impl.cu.h for the ternary elementwise_add impl.
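As a rough illustration of what an "impl wrapper" means here, the sketch below shows a small class that owns the problem sizes and exposes a single ComputeForward entry point, so the fused op can call it without repeating the GEMM plumbing. This is an assumed, simplified shape of the pattern, not the actual Paddle code; the class name AttnMatmulSketch is hypothetical and a naive CPU loop stands in for the real cuBLAS call.

```cpp
// Hypothetical sketch of an impl wrapper for the gemm part (illustration only).
#include <cstddef>
#include <vector>

class AttnMatmulSketch {
 public:
  AttnMatmulSketch(int m, int n, int k) : m_(m), n_(n), k_(k) {}

  // out[m x n] = x[m x k] * weight[k x n] + bias (bias broadcast over rows).
  void ComputeForward(const std::vector<float>& x,
                      const std::vector<float>& weight,
                      const std::vector<float>& bias,
                      std::vector<float>* out) const {
    out->assign(static_cast<size_t>(m_) * n_, 0.0f);
    for (int i = 0; i < m_; ++i) {
      for (int j = 0; j < n_; ++j) {
        float acc = bias.empty() ? 0.0f : bias[j];
        for (int p = 0; p < k_; ++p) {
          acc += x[i * k_ + p] * weight[p * n_ + j];
        }
        (*out)[i * n_ + j] = acc;
      }
    }
  }

 private:
  int m_, n_, k_;
};
```

The point of the wrapper is that the fused attention kernel only sees ComputeForward (and, in the real code, a matching backward entry point), keeping the gemm and fmha plumbing out of the op body.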