
conv2d support bfloat16 #32221

Merged: 16 commits, Jun 2, 2021

Conversation

@Avin0323 (Contributor) commented Apr 13, 2021

PR types

Others

PR changes

Others

Describe

PR features

  • Add bfloat16 data type support to four OPs: conv2d, conv2d_grad, conv2d_grad_grad, and depthwise_conv2d_grad_grad;
  • In the test_conv2d_op.py unit test, add create_test_cudnn_bf16_class to run the conv2d tests under the bfloat16 data type;
  • In the OpTest framework, when check_output_with_place verifies forward results, use relative error instead of absolute error for the bfloat16 type;
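The relative-error choice follows from how bfloat16 stores values: only 8 exponent-matching significant bits, so the gap between representable values (and thus the absolute error) grows with magnitude while the relative error stays bounded. A self-contained sketch of the uint16 round trip (the helper names are my own, modeled loosely on the convert_float_to_uint16 utility mentioned below, not Paddle's actual implementation):

```python
import numpy as np

def float_to_bf16_bits(x):
    # Keep the upper 16 bits of each float32: sign, 8 exponent bits, 7 mantissa bits.
    return (np.asarray(x, dtype=np.float32).view(np.uint32) >> 16).astype(np.uint16)

def bf16_bits_to_float(b):
    # Re-expand the uint16 bit pattern into a float32.
    return (b.astype(np.uint32) << 16).view(np.float32)

x = np.array([0.001, 1.0, 1234.5], dtype=np.float32)
y = bf16_bits_to_float(float_to_bf16_bits(x))
abs_err = np.abs(x - y)
rel_err = abs_err / np.abs(x)
# abs_err spans several orders of magnitude across the three values,
# but rel_err stays below 2**-7, which is why a fixed rtol is a
# sounder forward check for bfloat16 than a fixed atol.
```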

PR changes

  • Add bfloat16 data type support to conv2d under cuDNN
  1. In the SearchAlgorithm<cudnnConvolutionFwdAlgoPerf_t> logic, handle the bfloat16 data type: when bfloat16 is in use, pass CUDNN_DEFAULT_MATH in the cudnnSetConvolutionMathType call;
  2. Add a runtime check to GetExpectedKernelType: when the bfloat16 type is used, require library to be framework::LibraryType::kCUDNN and platform::CudnnVersion() to be greater than 8100;
  3. At compile time, use CUDNN_VERSION to decide whether the bfloat16 kernel registrations are compiled in; they are registered only when building against cuDNN 8.1 or later;
  • Add bfloat16 unit tests in test_conv2d_op.py
  1. Building on the bfloat16 support added to the OpTest framework in relu supports bfloat16 data type #32542, add the create_test_cudnn_bf16_class method to extend the conv2d unit tests to the bfloat16 type; test_check_output exercises the forward computation, while test_check_grad_no_filter and test_check_filter_no_grad exercise the backward computation;
  2. bfloat16-specific handling in TestConv2DOP:
    1. The reference output is computed and stored in float; input and filter are converted from float to uint16 via convert_float_to_uint16 (inside Paddle, uint16 is treated as bfloat16);
    2. self.inputs_fp32 records the original float input and filter, which are used later when computing the backward reference values;
  3. When test_check_grad_no_filter and test_check_filter_no_grad verify the backward pass, they use the get_numeric_gradient method provided by the OpTest framework; the difference from the checks for other types is that the inputs argument uses self.inputs_fp32, reducing the error introduced by repeated data conversions;
  • Use relative error when the OpTest framework checks forward results

In check_output_with_place, accuracy is verified with numpy.allclose; an rtol argument is added to that call, set to rtol=1e-2 when the data type is bfloat16 and rtol=1e-5 (the default) otherwise;
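That selection logic can be sketched as follows (a hypothetical standalone helper; the real change lives inside check_output_with_place):

```python
import numpy as np

def forward_close(actual, expected, dtype):
    # Use a looser relative tolerance only when the outputs are bfloat16,
    # which Paddle carries as uint16 bit patterns.
    rtol = 1e-2 if dtype == np.uint16 else 1e-5
    return np.allclose(actual, expected, rtol=rtol)
```

So a 0.5% deviation passes for a bfloat16 output but fails the default float32 check.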

Self-test results

  • With rtol=1e-2 and atol=1e-2 set in the unit tests, the forward and backward results for the bfloat16 data type pass both locally and on CI;
  • The network used by the unit tests to check the bfloat16 forward and backward results is shown below; note that only conv2d runs in bfloat16 (currently uint16 represents bfloat16):

[images: program graphs of the test network]

@paddle-bot-old bot commented:

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@Avin0323 Avin0323 marked this pull request as draft April 22, 2021 15:12
@Avin0323 Avin0323 marked this pull request as ready for review April 22, 2021 15:12
@Avin0323 Avin0323 marked this pull request as draft April 22, 2021 15:14
@Avin0323 Avin0323 marked this pull request as ready for review April 22, 2021 15:14
@Avin0323 Avin0323 marked this pull request as draft April 23, 2021 02:35
@Avin0323 Avin0323 marked this pull request as ready for review April 23, 2021 02:35
@AshburnLee (Contributor):

Could you include the unit test results in the Describe section?

static const cudnnDataType_t type = CUDNN_DATA_BFLOAT16;
#else
static const cudnnDataType_t type = CUDNN_DATA_HALF;
#endif
@AshburnLee (Contributor) commented Apr 23, 2021:

The #else branch should not be needed. When the cuDNN version is < 8.1, the whole class should not be compiled at all, so perhaps wrap the entire class in #if / #endif instead.

@Avin0323 (Contributor, Author):

Because conv2d must still compile successfully with bfloat16, the CudnnDataType<bfloat16> part of the code gets instantiated; guarding the entire template specialization with the preprocessor would cause a compilation failure.

A reviewer (Contributor):

half should not be used below cuDNN 8.1 either; it should fail outright. Also, all the checks you added are compile-time #ifdef checks; shouldn't a runtime check be added as well? The following can serve as a reference (it is also incomplete; a cuDNN version check still needs to be added):

if (input_data_type == framework::proto::VarType::FP16) {
  PADDLE_ENFORCE_EQ(library, framework::LibraryType::kCUDNN,
                    platform::errors::InvalidArgument(
                        "float16 can only be used when CUDNN is used"));
}

@Avin0323 Avin0323 changed the title [WIP]conv2d support bfloat16 conv2d support bfloat16 Apr 26, 2021
@Avin0323 (Contributor, Author):

> Could you include the unit test results in the Describe section?

done

@Xreki (Contributor) left a comment:

cuDNN bf16 uses Tensor Cores for computation; does cudnnSetConvolutionMathType need to be set to a specific value?

@@ -51,6 +51,13 @@ template <typename T>
class CUDNNConvOpKernel : public framework::OpKernel<T> {
public:
void Compute(const framework::ExecutionContext& ctx) const override {
#if CUDNN_VERSION_MIN(8, 1, 0)
A reviewer (Contributor):

Could this check be moved somewhere common, for example into CudnnDataType<bfloat16>?

@Avin0323 (Contributor, Author):

CudnnDataType<bfloat16> can only perform compile-time checks; this was changed so that the bfloat16 kernels are simply not registered for cuDNN versions below 8.1.


create_test_cudnn_bf16_class(TestWithStride, grad_check=False)
create_test_cudnn_bf16_class(TestWithGroup, grad_check=False)
create_test_cudnn_bf16_class(TestWith1x1, grad_check=False)
create_test_cudnn_bf16_class(TestWithInput1x1Filter1x1, grad_check=False)
A reviewer (Contributor):

No gradient checks at all?

@Avin0323 (Contributor, Author):

This previously followed the bf16 tests on CPU; the newly committed code now enables the backward tests by default.

@@ -167,6 +167,37 @@ def test_check_grad_no_input(self):
globals()[cls_name] = TestConv2DCUDNNFp16


def create_test_cudnn_bf16_class(parent, grad_check=True):
A reviewer (Contributor):

Don't the conv tests depend on the enhancements to the OpTest framework?

@Avin0323 (Contributor, Author):

They do; the latest code has now been merged in to sync the OpTest framework changes.

@paddle-bot-old bot commented May 3, 2021:

Sorry to inform you that cd612c5's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.

@@ -32,7 +32,8 @@ def set_confs(self):
def test_check_output(self):
for use_seq in {True, False}:
self.attrs['use_seq'] = use_seq
self.check_output(check_dygraph=False, no_check_set=["Cell"])
self.check_output(
check_dygraph=False, no_check_set=["Cell"], atol=2e-2)
A reviewer (Contributor):

atol is specified here because you changed the atol value in op_test.py. That will still affect the unit tests of other ops; I think it is better not to modify op_test.py at all, since overriding the OpTest function would leave the other ops' tests unaffected.

@Avin0323 (Contributor, Author):

This only affects the bfloat16 forward-accuracy check. The test framework previously hard-coded 0.03; this PR merely removes that fixed value so that each unit test that needs a looser bound can specify its own.

A reviewer (Contributor):

The change to the accuracy check in the op unit tests affects the mkldnn tests; please have @luotao1 review.

platform::errors::InvalidArgument(
"bfloat16 can only be used when CUDNN is used"));
#else
PADDLE_ENFORCE_NE(
A reviewer (Contributor):

  • The logic in this else branch seems unnecessary; it looks like it can never be reached.
  • The runtime cuDNN version also needs to be checked to be >= 8.1.0

@Avin0323 (Contributor, Author):

done

@@ -167,6 +168,52 @@ def test_check_grad_no_input(self):
globals()[cls_name] = TestConv2DCUDNNFp16


def create_test_cudnn_bf16_class(parent, check_grad=False):
A reviewer (Contributor):

Shouldn't this default to check_grad=True, and shouldn't the gradient-check variable no_need_check_grad be set accordingly?

@Avin0323 (Contributor, Author):

done

}
self.inputs_fp32 = {
'Input': OpTest.np_dtype_to_fluid_dtype(input),
'Filter': OpTest.np_dtype_to_fluid_dtype(filter)
A reviewer (Contributor):

So an fp32 conv2d is constructed as well? Please explain the unit-test checking logic in the PR description.

@Avin0323 (Contributor, Author):

done


@@ -32,7 +32,8 @@ def set_confs(self):
def test_check_output(self):
for use_seq in {True, False}:
self.attrs['use_seq'] = use_seq
self.check_output(check_dygraph=False, no_check_set=["Cell"])
self.check_output(
check_dygraph=False, no_check_set=["Cell"], atol=2e-2)
@Avin0323 (Contributor, Author):

In python/paddle/fluid/tests/unittests/op_test.py, atol = 0.03 was not a good way to check forward accuracy. This PR changes the relative-error check for the bfloat16 data type and removes the hard-coded 0.03 limit; atol = 2e-2 is added here to keep the same accuracy bound as before so that this test still passes.
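The trade-off being described can be sketched as follows (an illustrative signature, not the real OpTest API): the framework stops imposing a hidden absolute-tolerance floor, and the one test that relied on the old 0.03 cap opts in to a looser bound explicitly.

```python
import numpy as np

def check_output(actual, expected, atol=1e-5, rtol=1e-5):
    # Framework default: tight tolerances, no hidden 0.03 floor.
    return np.allclose(actual, expected, atol=atol, rtol=rtol)

# The affected test passes atol=2e-2 itself to keep its previous limit:
loose = check_output(np.array([0.515]), np.array([0.50]), atol=2e-2)
strict = check_output(np.array([0.515]), np.array([0.50]))
```

With a 0.015 deviation, the explicit atol=2e-2 call passes while the default check fails, which is exactly why the per-test override is needed after the global cap is removed.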

luotao1 previously approved these changes May 26, 2021

@luotao1 (Contributor) left a comment:

LGTM

@Xreki (Contributor) left a comment:

LGTM.

@@ -1413,6 +1413,31 @@ REGISTER_OP_KERNEL(
paddle::operators::CUDNNConvDoubleGradOpKernel<float>,
paddle::operators::CUDNNConvDoubleGradOpKernel<plat::float16>);
#else
#if CUDNN_VERSION_MIN(8, 1, 0)
A reviewer (Contributor):

The registration code now exceeds 100 lines and could be simplified. The registrations boil down to three cases:

  • CUDA, cuDNN < 8.1: supports float, double, float16
  • CUDA, cuDNN >= 8.1: supports float, double, float16, bfloat16
  • ROCM: supports float, float16

Some registration macros could be defined, e.g. REGISTER_CONV_CUDNN_KERNEL_WITH_FP64_BF16, REGISTER_CONV_CUDNN_KERNEL_WITH_FP64, REGISTER_CONV_CUDNN_KERNEL_WITH_BF16?

@Avin0323 (Contributor, Author):

OK, will follow up in a later change.


def init_kernel_type(self):
self.use_cudnn = True
self.no_need_check_grad = True
A reviewer (Contributor):

self.no_need_check_grad = True is still kept; does it have any effect?

@Avin0323 (Contributor, Author):

It mainly prevents the unit tests in the parent class from being executed.

@luotao1 (Contributor) left a comment:

LGTM for skip unittest

@phlrain (Collaborator) left a comment:

LGTM for check_dygraph

@Xreki Xreki merged commit 5981bee into PaddlePaddle:develop Jun 2, 2021
@Avin0323 Avin0323 deleted the conv2d-support-bf16 branch June 2, 2021 02:56