Samedims elementwise #32148
using VecType = AlignedVector<T, VecSize>;

inline __device__ void load_vector(VecType args[], int idx) {
@JamesLim-sy Do you have any review suggestions for the load and store functions?
LGTM. I have a few minor suggestions; they can be considered in a follow-up PR.
};

template <typename T>
int GetVectorizedSizeImpl(const T *pointer) {
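In some complex, multi-dimensional cases, vectorization requires two conditions to hold:
- The addresses are aligned.
- The size is a multiple of VecSize. For example, the broadcast configuration [M, N][N] requires N to be a multiple of VecSize. Noting this for the record; the same-dims case is simpler.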
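The two conditions above can be sketched on the host side as follows. This is a minimal illustration, not the PR's implementation: `AlignedVecSize`, `ClampToExtent`, and `BroadcastVecSize` are hypothetical names, and the candidate widths 4, 2, 1 are an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (condition 1): the widest vector, in elements, for which
// `pointer` is aligned. Tries 4, then 2, then falls back to scalar access.
template <typename T>
int AlignedVecSize(const T* pointer) {
  auto address = reinterpret_cast<std::uintptr_t>(pointer);
  if (address % (sizeof(T) * 4) == 0) return 4;
  if (address % (sizeof(T) * 2) == 0) return 2;
  return 1;
}

// Hypothetical helper (condition 2): halve vec_size until it divides the
// innermost extent n, so every row of an [M, N] tensor starts on a vector
// boundary and no vector straddles two rows.
inline int ClampToExtent(int vec_size, std::size_t n) {
  while (vec_size > 1 && n % vec_size != 0) vec_size /= 2;
  return vec_size;
}

// Combines both conditions for a broadcast such as [M, N] with [N].
template <typename T>
int BroadcastVecSize(const T* x, const T* y, std::size_t n) {
  int by_alignment = std::min(AlignedVecSize(x), AlignedVecSize(y));
  return ClampToExtent(by_alignment, n);
}
```

In the same-dims case the tensors are treated as flat arrays, so only the alignment check is needed and the tail that does not fill a whole vector can be handled element by element.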
int remain = size - VecSize * tid;
remain = remain > 0 ? remain : 0;
if (remain >= VecSize) {
  auto data = ElementwiseDataWrapper<ET, VecSize, T>(out, in0, in1);
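Hmm, is ElementwiseDataWrapper defined inside the CUDA kernel? Could it be moved to the outer level, i.e. defined inside the LaunchElementwiseCudaKernel function? Also, ElementwiseDataWrapper itself provides load/store_vector and load/store_scalar, so the if and else branches here could use the same ElementwiseDataWrapper object.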
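A CPU-only sketch of this suggestion is shown below, assuming a binary add. `DataWrapperSketch`, `AddForThread`, and `LaunchAddSketch` are hypothetical stand-ins rather than the PR's types. The loop over `tid` imitates what each GPU thread would do. The wrapper is built once in the launcher, and both branches use that one object.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for ElementwiseDataWrapper (binary op, host code only).
template <typename T, int VecSize>
struct DataWrapperSketch {
  const T* in0;
  const T* in1;
  T* out;

  // Vector path: chunk `idx` covers elements [idx*VecSize, (idx+1)*VecSize).
  void load_vector(T (&a)[VecSize], T (&b)[VecSize], int idx) const {
    for (int i = 0; i < VecSize; ++i) {
      a[i] = in0[idx * VecSize + i];
      b[i] = in1[idx * VecSize + i];
    }
  }
  void store_vector(const T (&res)[VecSize], int idx) const {
    for (int i = 0; i < VecSize; ++i) out[idx * VecSize + i] = res[i];
  }
  // Scalar path: `idx` is a plain element index.
  void load_scalar(T& a, T& b, int idx) const { a = in0[idx]; b = in1[idx]; }
  void store_scalar(T res, int idx) const { out[idx] = res; }
};

// Work of one simulated thread; both branches share the same wrapper object.
template <typename T, int VecSize>
void AddForThread(const DataWrapperSketch<T, VecSize>& data, int size, int tid) {
  int remain = size - VecSize * tid;
  remain = remain > 0 ? remain : 0;
  if (remain >= VecSize) {
    T a[VecSize], b[VecSize], r[VecSize];
    data.load_vector(a, b, tid);
    for (int i = 0; i < VecSize; ++i) r[i] = a[i] + b[i];
    data.store_vector(r, tid);
  } else {
    for (int i = 0; i < remain; ++i) {
      int idx = VecSize * tid + i;
      T a, b;
      data.load_scalar(a, b, idx);
      data.store_scalar(a + b, idx);
    }
  }
}

// Launcher: the wrapper is constructed once here, outside the per-thread code.
template <typename T, int VecSize>
void LaunchAddSketch(const T* in0, const T* in1, T* out, int size) {
  DataWrapperSketch<T, VecSize> data{in0, in1, out};
  int threads = (size + VecSize - 1) / VecSize;
  for (int tid = 0; tid < threads; ++tid) AddForThread(data, size, tid);
}
```

With 7 elements and `VecSize = 4`, simulated thread 0 takes the vector branch and thread 1 handles the remaining 3 elements through the scalar branch.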
break;
default:
  PADDLE_THROW(platform::errors::Unimplemented(
      "Unsupported vectorized size: %d !", vec_size));
Would it be better for the default case to simply be handled as vec_size=1?
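One way this fallback might look is sketched below in host-only C++. `ScaleSketch` and `DispatchSketch` are hypothetical names, and the scaling operation is only a placeholder for the real elementwise functor.

```cpp
#include <cassert>
#include <vector>

// Hypothetical elementwise routine, specialized on the vector width.
template <int VecSize>
void ScaleSketch(const float* in, float* out, int size, float k) {
  int main_part = size - size % VecSize;
  // Whole vectors first.
  for (int i = 0; i < main_part; i += VecSize) {
    for (int j = 0; j < VecSize; ++j) out[i + j] = in[i + j] * k;
  }
  // Scalar tail for the elements that do not fill a whole vector.
  for (int i = main_part; i < size; ++i) out[i] = in[i] * k;
}

// Dispatch on vec_size; any unexpected value takes the scalar path instead of
// raising an error. Returns the width that was actually used.
inline int DispatchSketch(int vec_size, const float* in, float* out, int size,
                          float k) {
  switch (vec_size) {
    case 4:
      ScaleSketch<4>(in, out, size, k);
      return 4;
    case 2:
      ScaleSketch<2>(in, out, size, k);
      return 2;
    default:  // treated as vec_size = 1
      ScaleSketch<1>(in, out, size, k);
      return 1;
  }
}
```

The trade-off is that a scalar fallback always produces correct results, whereas throwing makes an unexpected width from the size-selection logic visible immediately instead of silently running the slower path.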
PR types
Performance optimization
PR changes
OPs
Describe
Common same dims elementwise op template