Samedims elementwise #32148

ZzSean · 2021-04-08T06:41:15Z

PR types

Performance optimization

PR changes

OPs

Describe

Common same dims elementwise op template

paddle/fluid/operators/elementwise/elementwise_op_impl.h

paddle/fluid/operators/elementwise/elementwise_add_op.h

paddle/fluid/operators/elementwise/elementwise_op_impl.cu.h


Xreki · 2021-04-14T10:30:43Z

paddle/fluid/operators/elementwise/elementwise_op_impl.cu.h

+
+  using VecType = AlignedVector<T, VecSize>;
+
+  inline __device__ void load_vector(VecType args[], int idx) {


@JamesLim-sy do you have any review suggestions for the load and store functions?

paddle/fluid/operators/elementwise/elementwise_add_op.cu

Xreki

LGTM. I have a few minor suggestions; they can be considered in a follow-up PR.

Xreki · 2021-04-15T07:10:20Z

paddle/fluid/operators/elementwise/elementwise_op_impl.cu.h

+};
+
+template <typename T>
+int GetVectorizedSizeImpl(const T *pointer) {


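In some complex, multi-dimensional cases, vectorization is possible only when two conditions hold:

The addresses are aligned.

The size is a multiple of VecSize. For example, the broadcast configuration [M, N][N] requires N to be a multiple of VecSize. Noting this for the record; the same-dims case is simpler.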
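The two conditions above can be sketched on the host side as follows. This is a minimal illustration, not the PR's implementation: `VectorWidthByAlignment` and `ClampWidthByExtent` are hypothetical names, and the alignment test against `sizeof(T) * width` is an assumption about how an aligned vector type would be laid out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not Paddle's API). Condition 1: pick the widest
// vector width, in elements, that the pointer's address alignment allows.
// Assumes a vector of `width` elements needs sizeof(T) * width alignment.
template <typename T>
int VectorWidthByAlignment(const T* pointer) {
  auto address = reinterpret_cast<std::uintptr_t>(pointer);
  if (address % (sizeof(T) * 4) == 0) return 4;
  if (address % (sizeof(T) * 2) == 0) return 2;
  return 1;
}

// Hypothetical helper. Condition 2: shrink the width until it divides the
// innermost extent (N in the [M, N][N] broadcast example).
inline int ClampWidthByExtent(int width, std::size_t inner_extent) {
  assert(width >= 1);
  while (width > 1 && inner_extent % static_cast<std::size_t>(width) != 0) {
    width /= 2;
  }
  return width;
}
```

For a same-dims op only the first check matters, because a tail that does not fill a whole vector can be handled element by element; the second check becomes necessary once rows of length N are reused across a broadcast dimension.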

Xreki · 2021-04-15T07:26:18Z

paddle/fluid/operators/elementwise/elementwise_op_impl.cu.h

+  int remain = size - VecSize * tid;
+  remain = remain > 0 ? remain : 0;
+  if (remain >= VecSize) {
+    auto data = ElementwiseDataWrapper<ET, VecSize, T>(out, in0, in1);


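Hmm, is ElementwiseDataWrapper constructed inside the CUDA kernel? Could it be moved out and defined in the outer function (i.e. LaunchElementwiseCudaKernel) instead? Also, ElementwiseDataWrapper itself already provides load/store_vector and load/store_scalar, so the if and else branches here could share a single ElementwiseDataWrapper object.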
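A rough host-side sketch of this suggestion, with a plain loop standing in for the CUDA threads. `DataWrapperSketch` and `ElementwiseSketch` are hypothetical names introduced only for illustration; the real ElementwiseDataWrapper has a different interface. The point is that one wrapper object is built once in the outer function and serves both the vectorized branch and the scalar tail.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the data wrapper: one object that offers both
// a vectorized and a scalar path for a binary same-dims op.
template <typename T, int VecSize>
struct DataWrapperSketch {
  T* out;
  const T* in0;
  const T* in1;

  // Process VecSize contiguous elements starting at vector index `idx`.
  template <typename Functor>
  void compute_vector(std::size_t idx, Functor func) const {
    for (int i = 0; i < VecSize; ++i) {
      const std::size_t offset = idx * VecSize + i;
      out[offset] = func(in0[offset], in1[offset]);
    }
  }

  // Process the single element at scalar index `idx`.
  template <typename Functor>
  void compute_scalar(std::size_t idx, Functor func) const {
    out[idx] = func(in0[idx], in1[idx]);
  }
};

// Host-side loop standing in for the kernel; `tid` plays the thread index.
template <typename T, int VecSize, typename Functor>
void ElementwiseSketch(const std::vector<T>& x, const std::vector<T>& y,
                       std::vector<T>* z, Functor func) {
  assert(x.size() == y.size());
  z->resize(x.size());
  const std::size_t size = x.size();
  // Built once in the outer function and shared by both branches below.
  const DataWrapperSketch<T, VecSize> data{z->data(), x.data(), y.data()};
  const std::size_t num_threads = (size + VecSize - 1) / VecSize;
  for (std::size_t tid = 0; tid < num_threads; ++tid) {
    const std::size_t begin = tid * VecSize;
    const std::size_t remain = size - begin;
    if (remain >= static_cast<std::size_t>(VecSize)) {
      data.compute_vector(tid, func);
    } else {
      // Tail: fewer than VecSize elements are left, so go element by element.
      for (std::size_t i = 0; i < remain; ++i) {
        data.compute_scalar(begin + i, func);
      }
    }
  }
}
```

Calling `ElementwiseSketch<float, 4>(x, y, &z, add)` on 7-element inputs would take the vector branch for the first four elements and the scalar branch for the last three.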

paddle/fluid/operators/elementwise/elementwise_op_impl.cu.h

Xreki · 2021-04-18T11:07:03Z

paddle/fluid/operators/elementwise/elementwise_op_impl.cu.h

+      break;
+    default:
+      PADDLE_THROW(platform::errors::Unimplemented(
+          "Unsupported vectorized size: %d !", vec_size));


Would it be better for the default case to simply be handled as vec_size=1?
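A minimal sketch of what that fallback could look like, assuming a dispatch over the widths 4, 2, and 1. `LaunchSketch` and `DispatchByVecSize` are hypothetical names, and the returned thread count stands in for an actual kernel launch; the thread does not say whether the merged code adopted this.

```cpp
#include <cassert>

// Hypothetical launcher: returns how many threads a launch with this vector
// width would need, in place of actually launching a kernel.
template <int VecSize>
int LaunchSketch(int size) {
  static_assert(VecSize >= 1, "vector width must be positive");
  return (size + VecSize - 1) / VecSize;
}

// Dispatch on the runtime vector width. Any unexpected value falls back to
// the scalar path instead of throwing, since VecSize = 1 is always valid.
inline int DispatchByVecSize(int vec_size, int size) {
  assert(size >= 0);
  switch (vec_size) {
    case 4:
      return LaunchSketch<4>(size);
    case 2:
      return LaunchSketch<2>(size);
    default:
      return LaunchSketch<1>(size);
  }
}
```

The trade-off is that a fallback keeps results correct for any input, while throwing makes an unexpected vec_size visible immediately instead of silently running the slower scalar path.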

paddle/fluid/operators/elementwise/elementwise_add_op.cu

ZzSean force-pushed the samedims_elementwise branch from b493ab6 to 5358737 on April 9, 2021 08:25

ZzSean force-pushed the samedims_elementwise branch from 94f79ac to 49de951 on April 12, 2021 02:54

Xreki reviewed Apr 12, 2021


ZzSean force-pushed the samedims_elementwise branch from 082730f to 5f667fc on April 14, 2021 08:07

Xreki reviewed Apr 14, 2021


ZzSean added 10 commits April 15, 2021 06:33

Common same dims elementwise op template

87d5514

add cuda kernel

5d11585

kernel impl

63a3056

pass add

71cfedf

change expr to func

f819787

fix

845de86

modified style

e4a4c00

fix

ac0c7a0

fix

aa181f0

fix

80ed55d

ZzSean force-pushed the samedims_elementwise branch from 04b5e61 to 80ed55d on April 15, 2021 06:33

Xreki approved these changes Apr 15, 2021


Xreki reviewed Apr 18, 2021


paddle/fluid/operators/elementwise/elementwise_op_impl.cu.h

Xreki reviewed Apr 18, 2021


paddle/fluid/operators/elementwise/elementwise_add_op.cu

Xreki merged commit 2c18258 into PaddlePaddle:develop Apr 18, 2021

ZzSean deleted the samedims_elementwise branch April 19, 2021 07:21

Xreki mentioned this pull request Sep 8, 2021

Implement FunctionTraits to support two kinds of elementwise functor and remove some old codes for broadcast. #35487

Merged
