【Hackathon No.59】Improve FP16/BF16 unit tests for the addmm operator #53111
Conversation
Your PR was submitted successfully. Thank you for your contribution to the open-source project!
Force-pushed from 60d7c61 to 11e09f1
@@ -982,6 +982,25 @@ inline void Blas<phi::GPUContext>::GEMM(bool transA,
  });
}

template <>
template <>
inline void Blas<phi::GPUContext>::GEMM(bool transA UNUSED,
Even though this function is unused, could it still follow the bf16 GEMM implementation at line 729?
Done, updated.
// VCOPY does not support float16 or bfloat16
if (!is_float16_or_bfloat16) {
  mt_blas.VCOPY(
      total_elems, out_grad.data<MPType>(), input_grad->data<MPType>());
Since fp16 and bf16 are not supported here anyway, is there any need to use MPType?
Without MPType there is a compile error: the compiler does not follow the runtime if, it instantiates the code for the type T, so if out_grad.data is used here the compiler reports that the float16 parameter type is unsupported.
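The compile error described above can be reproduced in isolation: a plain runtime `if` still instantiates both branches for the template parameter `T`, so a call that is invalid for `T` fails to compile even in the branch that is never taken, whereas `if constexpr` discards the untaken branch before instantiation. A minimal sketch, where `vcopy` is a hypothetical stand-in for blas VCOPY, not Paddle's actual API:

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for blas VCOPY, which only accepts
// full-precision floating-point element types.
template <typename T>
void vcopy(const std::vector<T>& src, std::vector<T>& dst) {
  static_assert(std::is_floating_point<T>::value,
                "vcopy supports float/double only");
  dst = src;
}

template <typename T>
std::vector<T> copy_grad(const std::vector<T>& src) {
  std::vector<T> dst(src.size());
  // With a plain `if (...)`, vcopy<T> would still be instantiated for a
  // low-precision type and the static_assert would fire at compile time,
  // even though the branch is never taken at runtime. `if constexpr`
  // discards the untaken branch before instantiation.
  if constexpr (std::is_floating_point<T>::value) {
    vcopy(src, dst);
  } else {
    for (std::size_t i = 0; i < src.size(); ++i) dst[i] = src[i];
  }
  return dst;
}
```

This is why the kernel must route fp16/bf16 away from VCOPY at compile time (or via differently-typed calls such as `mt_blas` with MPType), not just with a runtime check.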
@@ -78,19 +107,45 @@ void AddmmGradKernel(const Context& dev_ctx,
      Array2(input_grad->dims()[0], input_grad->dims()[1]);

  if (row_compress && col_compress) {
    eigen_dinput.device(place) =
        eigen_dout.sum().eval().reshape(eigen_dinput_shape);
    eigen_dinput.device(place) = eigen_dout.template cast<MPType>()
For non-fp16/bf16 types, the two extra casts here could hurt performance; how about adding a branch for that case as well?
Branch added.
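The branching idea the reviewer asked for can be sketched as follows: full-precision types accumulate directly with no casts on the hot path, while low-precision types cast each element up to the MPType, accumulate, and cast back once at the end. A hedged illustration of the pattern, not Paddle's actual Eigen code; the `Half` type is a toy stand-in for `phi::dtype::float16` and does not model its precision loss:

```cpp
#include <type_traits>
#include <vector>

// Toy stand-in for a low-precision element type such as phi::dtype::float16
// (assumption: real fp16 also loses precision, which this toy does not model).
struct Half {
  float v;
  explicit Half(float f) : v(f) {}
  explicit operator float() const { return v; }
};

template <typename T>
T reduce_sum(const std::vector<T>& x) {
  if constexpr (std::is_floating_point<T>::value) {
    // float/double: accumulate directly, no extra casts.
    T acc = T(0);
    for (const T& e : x) acc += e;
    return acc;
  } else {
    // fp16/bf16-style types: accumulate in float (the MPType idea)
    // and cast back to T once at the end.
    float acc = 0.0f;
    for (const T& e : x) acc += static_cast<float>(e);
    return static_cast<T>(acc);
  }
}
```

In the kernel the same split avoids `cast<MPType>()` round-trips for float/double while keeping fp16/bf16 numerically safe.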
@ZzSean The CI fixes are complete.
int ldb,
phi::dtype::bfloat16 beta,
phi::dtype::bfloat16 *C,
int ldc UNUSED) const {
This UNUSED should probably be removed.
Done, removed.
context_.GetComputeCapability(),
80,
phi::errors::InvalidArgument(
    "rocblas fp16 gemm requires GPU compute capability >= 80,"
This should say bf16, right?
Fixed.
LGTM
* Add addmm tests
* Fix code
PR types
Others
PR changes
Others
Description
Improve the FP16/BF16 unit tests for the addmm operator:
* float16 uses cublasHgemm.
* Add a bfloat16 path that calls cublasGemmEx.
* blas VCOPY and SCAL fail to compile for float16, so mt_blas is added to call VCOPY and SCAL with MPType.
* In the backward pass VCOPY and SCAL do not support float16/bfloat16, so a CopyOrScaleFunctor is added to copy or scale the data instead.
* The CI tests require a float16 tolerance of atol=1e-2 for check_output.
* Setting self.check_output(atol=1e-2) for float16 caused PR-CI-Model-benchmark failures, so the test was changed to compare via verify_output.
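The CopyOrScaleFunctor mentioned above can be sketched roughly as below. This is a hedged illustration reconstructed from the PR description, not Paddle's exact implementation; the `T`/`MPType` names follow the kernel's convention, and the per-index call style assumes a driver such as `phi::funcs::ForRange`:

```cpp
#include <cstdint>

// Sketch of a copy-or-scale functor: because blas VCOPY/SCAL reject
// float16/bfloat16, each gradient element is instead copied (scale == 1)
// or scaled individually, with the arithmetic done in a higher-precision
// MPType before casting back to the element type T.
template <typename T, typename MPType = float>
struct CopyOrScaleFunctor {
  CopyOrScaleFunctor(MPType scale, const T* x, T* out)
      : scale_(scale), x_(x), out_(out) {}

  // Invoked once per element index by the range driver.
  void operator()(int64_t i) const {
    out_[i] = static_cast<T>(scale_ * static_cast<MPType>(x_[i]));
  }

  MPType scale_;
  const T* x_;
  T* out_;
};
```

Applied with scale 1.0 it reproduces the VCOPY behavior; with scale beta it reproduces SCAL-style scaling, which is how one functor can replace both unsupported blas calls.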