conv2d support bfloat16 #32221
Conversation
Thanks for your contribution!
Could you also include the unit test results in the Describe section?
paddle/fluid/platform/cudnn_helper.h
Outdated
static const cudnnDataType_t type = CUDNN_DATA_BFLOAT16;
#else
static const cudnnDataType_t type = CUDNN_DATA_HALF;
#endif
The #else branch shouldn't be needed. When the cuDNN version is < 8.1, the whole class should not be compiled, so shouldn't the #if and #endif wrap the entire class instead?
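For illustration, the suggestion might look roughly like this (a sketch only, assuming the class in question is the CudnnDataType<bfloat16> specialization shown in the diff; the remaining members are elided):

// Sketch: compile the whole bfloat16 specialization only for cuDNN >= 8.1,
// instead of falling back to CUDNN_DATA_HALF in an #else branch.
#if CUDNN_VERSION_MIN(8, 1, 0)
template <>
class CudnnDataType<bfloat16> {
 public:
  static const cudnnDataType_t type = CUDNN_DATA_BFLOAT16;
  // ... other members of the specialization ...
};
#endif  // CUDNN_VERSION_MIN(8, 1, 0)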
Because conv2d still has to compile with bfloat16, the CudnnDataType<bfloat16> part gets instantiated by the code path; if the whole template specialization is guarded by the preprocessor, compilation fails.
With cuDNN below 8.1 it shouldn't fall back to the half type either; it should just fail. Also, everything you added is an #ifdef compile-time check; shouldn't a runtime check be added as well? You can refer to the following (it is also incomplete and still needs the cudnn version check):
Paddle/paddle/fluid/operators/conv_op.cc
Lines 187 to 191 in 79f7ba6
if (input_data_type == framework::proto::VarType::FP16) {
  PADDLE_ENFORCE_EQ(library, framework::LibraryType::kCUDNN,
                    platform::errors::InvalidArgument(
                        "float16 can only be used when CUDNN is used"));
}
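For illustration, the analogous bfloat16 check with a runtime cuDNN version condition might look roughly like this (a sketch, not code from this PR; VarType::BF16 and the second error message are assumptions, while the first message and platform::CudnnVersion() appear elsewhere in this conversation):

// Sketch: runtime checks for bfloat16, mirroring the float16 check above.
if (input_data_type == framework::proto::VarType::BF16) {
  PADDLE_ENFORCE_EQ(library, framework::LibraryType::kCUDNN,
                    platform::errors::InvalidArgument(
                        "bfloat16 can only be used when CUDNN is used"));
  PADDLE_ENFORCE_GE(platform::CudnnVersion(), 8100,
                    platform::errors::InvalidArgument(
                        "bfloat16 can only be used when CUDNN >= 8.1"));
}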
done
cuDNN bf16 uses Tensor Cores for computation; does cudnnSetConvolutionMathType need to be set to a specific value?
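For reference, the PR description below states that CUDNN_DEFAULT_MATH is used when the data type is bfloat16. A minimal sketch of such a call, assuming the convolution descriptor is reachable as args.cdesc.desc() inside SearchAlgorithm:

// Sketch: math type for the bfloat16 path; per the PR description,
// CUDNN_DEFAULT_MATH is used rather than a Tensor-Core-specific value.
PADDLE_ENFORCE_CUDA_SUCCESS(platform::dynload::cudnnSetConvolutionMathType(
    args.cdesc.desc(), CUDNN_DEFAULT_MATH));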
@@ -51,6 +51,13 @@ template <typename T>
class CUDNNConvOpKernel : public framework::OpKernel<T> {
 public:
  void Compute(const framework::ExecutionContext& ctx) const override {
#if CUDNN_VERSION_MIN(8, 1, 0)
Could this check be moved to a common place, e.g. inside CudnnDataType<bfloat16>?
Only a compile-time check can be done inside CudnnDataType<bfloat16>; this has been changed so that the bfloat16 kernel is simply not registered when cuDNN is below 8.1.
create_test_cudnn_bf16_class(TestWithStride, grad_check=False)
create_test_cudnn_bf16_class(TestWithGroup, grad_check=False)
create_test_cudnn_bf16_class(TestWith1x1, grad_check=False)
create_test_cudnn_bf16_class(TestWithInput1x1Filter1x1, grad_check=False)
None of these check gradients?
This previously followed the bf16 tests on CPU; the re-committed code now adds backward (gradient) tests by default.
@@ -167,6 +167,37 @@ def test_check_grad_no_input(self):
globals()[cls_name] = TestConv2DCUDNNFp16


def create_test_cudnn_bf16_class(parent, grad_check=True):
Doesn't the conv test need to depend on the enhancements to the OpTest framework?
It does. The latest code has now been merged to pick up the OpTest framework changes.
Sorry to inform you that cd612c5's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.
@@ -32,7 +32,8 @@ def set_confs(self):
def test_check_output(self):
    for use_seq in {True, False}:
        self.attrs['use_seq'] = use_seq
        self.check_output(check_dygraph=False, no_check_set=["Cell"])
        self.check_output(
            check_dygraph=False, no_check_set=["Cell"], atol=2e-2)
atol is specified here because you changed the atol value in op_test.py. That will still affect the unit tests of other ops, right? I think it's best not to change op_test.py; overriding the OpTest function would avoid affecting the other ops' tests.
This only affects the bfloat16 forward-accuracy check. The test framework previously hard-coded 0.03; this PR just removes that fixed value, and each unit test that needs it specifies its own tolerance.
The accuracy-check logic in the op unit tests was changed and it affects the mkldnn tests; @luotao1 please take a look.
paddle/fluid/operators/conv_op.cc
Outdated
platform::errors::InvalidArgument(
    "bfloat16 can only be used when CUDNN is used"));
#else
PADDLE_ENFORCE_NE(
- The logic in this else branch seems unnecessary; it looks like it can never be reached.
- The runtime cudnn version also needs to be checked to be >= 8.1.0.
done
@@ -167,6 +168,52 @@ def test_check_grad_no_input(self):
globals()[cls_name] = TestConv2DCUDNNFp16


def create_test_cudnn_bf16_class(parent, check_grad=False):
Shouldn't check_grad=True be set, along with the gradient-check related variable no_need_check_grad?
done
}
self.inputs_fp32 = {
    'Input': OpTest.np_dtype_to_fluid_dtype(input),
    'Filter': OpTest.np_dtype_to_fluid_dtype(filter)
So an fp32 conv2d is also constructed here? Please explain the unit-test checking logic in the PR description.
done
@@ -32,7 +32,8 @@ def set_confs(self):
def test_check_output(self):
    for use_seq in {True, False}:
        self.attrs['use_seq'] = use_seq
        self.check_output(check_dygraph=False, no_check_set=["Cell"])
        self.check_output(
            check_dygraph=False, no_check_set=["Cell"], atol=2e-2)
In python/paddle/fluid/tests/unittests/op_test.py, a hard-coded atol = 0.03 is not a good way to check forward accuracy. This PR switches the accuracy check for the bfloat16 data type to a relative error and removes the fixed 0.03 limit. atol = 2e-2 is added here to keep the same accuracy limit as before so the test still passes.
LGTM
LGTM.
@@ -1413,6 +1413,31 @@ REGISTER_OP_KERNEL(
    paddle::operators::CUDNNConvDoubleGradOpKernel<float>,
    paddle::operators::CUDNNConvDoubleGradOpKernel<plat::float16>);
#else
#if CUDNN_VERSION_MIN(8, 1, 0)
The registration code is now 100+ lines; it could be simplified. These registrations fall into only 3 cases:
- CUDA, cuDNN < 8.1: supports float, double, float16
- CUDA, cuDNN >= 8.1: supports float, double, float16, bfloat16
- ROCM: supports float, float16
Some registration macros could be defined, e.g. REGISTER_CONV_CUDNN_KERNEL_WITH_FP64_BF16, REGISTER_CONV_CUDNN_KERNEL_WITH_FP64, REGISTER_CONV_CUDNN_KERNEL_WITH_BF16?
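For illustration, one of the suggested macros might look roughly like this (the macro name is taken from the suggestion above; the kernel class list is only illustrative, and grad/double-grad variants would be needed as well):

// Sketch: a registration macro for the CUDA + cuDNN >= 8.1 case
// (float, double, float16, bfloat16).
#define REGISTER_CONV_CUDNN_KERNEL_WITH_FP64_BF16(op_name)                  \
  REGISTER_OP_KERNEL(op_name, CUDNN, plat::CUDAPlace,                       \
                     paddle::operators::CUDNNConvOpKernel<float>,           \
                     paddle::operators::CUDNNConvOpKernel<double>,          \
                     paddle::operators::CUDNNConvOpKernel<plat::float16>,   \
                     paddle::operators::CUDNNConvOpKernel<plat::bfloat16>)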
OK, will follow up on that.
def init_kernel_type(self):
    self.use_cudnn = True
    self.no_need_check_grad = True
self.no_need_check_grad = True is still kept here; does it have any effect?
This is mainly to prevent the unit tests in the parent class from being executed.
LGTM for skip unittest
LGTM for check_dygraph
PR types
Others
PR changes
Others
Describe
PR functionality
- Add bfloat16 data type support to four OPs: conv2d, conv2d_grad, conv2d_grad_grad and depthwise_conv2d_grad_grad.
- In the test_conv2d_op.py unit tests, add create_test_cudnn_bf16_class to test conv2d with the bfloat16 data type.
- In the OpTest framework, when check_output_with_place checks the forward results, use a relative error instead of an absolute error for the bfloat16 type.
PR changes
- conv2d adds bfloat16 data type support under CUDNN:
  - In the SearchAlgorithm<cudnnConvolutionFwdAlgoPerf_t> logic, handling for the bfloat16 data type is added: when bfloat16 is used, cudnnSetConvolutionMathType is called with CUDNN_DEFAULT_MATH.
  - GetExpectedKernelType adds a runtime check: when the bfloat16 type is used, library is restricted to framework::LibraryType::kCUDNN and platform::CudnnVersion() must be greater than 8100.
  - CUDNN_VERSION controls whether the bfloat16 kernel registration is compiled; the kernels are registered only when built against cuDNN 8.1 or later.
- test_conv2d_op.py adds bfloat16 unit tests, together with OpTest framework support for bfloat16:
  - create_test_cudnn_bf16_class extends the conv2d unit tests to the bfloat16 type; test_check_output tests the forward computation, and test_check_grad_no_filter / test_check_filter_no_grad test the backward computation.
  - In the TestConv2DOp bfloat16 tests: output is computed and represented in float; input and filter are converted from float with convert_float_to_uint16 and stored as uint16 (inside Paddle they are then treated as bfloat16); self.inputs_fp32 keeps the original float input and filter and is used later when computing the backward reference values; test_check_grad_no_filter and test_check_filter_no_grad check the backward pass with the get_numeric_gradient method provided by the OpTest framework, the difference from the other types being that the inputs argument uses self.inputs_fp32, which reduces the error introduced by converting the data multiple times.
- The OpTest framework uses a relative error for the forward check: in check_output_with_place, accuracy is checked with numpy.allclose; an rtol argument is added to that call, set to rtol=1e-2 when the data type is bfloat16 and to rtol=1e-5 (the default) otherwise.
Self-test results
- The forward and backward results for the bfloat16 data type pass both locally and on CI.
- The network used to check the bfloat16 forward and backward results is shown in the figure below; only conv2d uses the bfloat16 type (currently represented as uint16):