-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cudnn conv op #4195
Cudnn conv op #4195
Conversation
fix #3691 |
paddle/api/PaddleAPI.h
Outdated
@@ -19,6 +19,7 @@ limitations under the License. */ | |||
#include <stdexcept> | |||
#include <string> | |||
#include <vector> | |||
#include "paddle/gserver/dataproviders/MultiDataProvider.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
revert this file in this PR.
paddle/operators/cudnn_conv_op.cc
Outdated
@@ -0,0 +1,110 @@ | |||
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
文件命名:conv实现可以统一conv开头,这样所有conv相关的代码都放在一起。
paddle/operators/cudnn_conv_op.cc
Outdated
return output_size; | ||
} | ||
|
||
class CudnnConvOp : public framework::OperatorWithKernel { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Op的定义可以和ConvGemm公共一个吧
paddle/operators/cudnn_conv_op.cc
Outdated
} | ||
}; | ||
|
||
class CudnnConvGradOp : public framework::OperatorWithKernel { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
同forward的定义,可以和conv gemm公用一个
paddle/operators/cudnn_conv_op.cu
Outdated
@@ -0,0 +1,260 @@ | |||
/* Copyright (c) 2016 PaddlePaddle Authors All Rights Reserve. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cudnn实现可以用g++
编译,直接用.cc
是不是就行了,但是注册GPU kernel时,又需要在.cu
里,所以那还是这样吧..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
嗯,cudnn的API都是host执行的,但API调用后是在device执行的。
paddle/operators/cudnn_conv_op.cu
Outdated
cudnn_output_grad_desc, cudnn_conv_desc, | ||
// dxDesc: Handle to the previously initialized output tensor | ||
// descriptor. | ||
cudnn_input_grad_desc, CUDNN_CONVOLUTION_BWD_DATA_PREFER_FASTEST, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CUDNN_CONVOLUTION_BWD_DATA_PREFER_FASTEST
使用和forward同样的问题。
paddle/operators/cudnn_conv_op.cu
Outdated
cudnn_workspace = paddle::memory::Alloc(gpu, workspace_size_in_bytes); | ||
// ------------------- cudnn conv backward data --------------------- | ||
// FIXME(typhoonzero): template type T may not be the same as cudnn call. | ||
float alpha = 1.0f, beta = 0.0f; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
paddle/operators/cudnn_conv_op.cu
Outdated
T* input_grad_data = input_grad->mutable_data<T>(ctx.GetPlace()); | ||
PADDLE_ENFORCE(platform::dynload::cudnnConvolutionBackwardData( | ||
h, &alpha, cudnn_filter_desc, filter_data + i * group_offset_filter, | ||
cudnn_output_grad_desc, output_grad_data + i * group_offset_Y, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
感觉cudnn_xxx_desc
可以把cudnn_
前缀去了,名字会短一些。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cudnn_xxx_desc
都是诸如cudnnTensorDescriptor_t
的类型,xxx_desc
的名称是诸如ScopedTensorDescriptor
的类型。cudnn_xxx_desc
也会多次调用,所以保存一个临时变量也是有必要的。
paddle/operators/cudnn_conv_op.h
Outdated
using Tensor = framework::Tensor; | ||
|
||
// FIXME(typhoonzer): If CudnnConvOp is running on CPU | ||
// reuse the code from gemm_conv2d_op.h. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
觉得这个可以不用写,GPU上GEMM Conv、CUDNN Conv的选择可以后续Python里(或其他方式)自动选择。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
加这个是考虑如果用户配置了使用cudnn conv op,并且需要测试cpu上的执行这样用户的程序可以不用做修改。感觉也没必要增加在pybind中注册类似USE_GPU_ONLY_OP
const std::vector<int>& dims) { | ||
// the format is not used now, but it maybe useful feature | ||
const std::vector<int>& dims, | ||
const int groups = 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ScopedTensorDescriptor
不止Conv层可以用,Pooling/BatchNorm均可用,所以这里的group是不是应该在conv op里处理,而不是这个接口处理
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个groups觉得还是移到cudnn_conv_op里处理吧
paddle/operators/cudnn_conv_op.cc
Outdated
void InferShape(const framework::InferShapeContext &ctx) const override { | ||
auto in = ctx.Input<Tensor>("Input"); | ||
auto filter = ctx.Input<Tensor>("Filter"); | ||
auto out = ctx.Output<framework::Tensor>("Output"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
从ctx获取的数据要进行非空检测
PADDLE_ENFORCE_NOT_NULL(ctx.InputVar("Input"),
"Input(Input) of CudnnConvOp should not be null.");
PADDLE_ENFORCE_NOT_NULL(ctx.OutputVar("Output"),
"Output(Output) of CudnnConvOp should not be null.");
paddle/operators/cudnn_conv_op.cc
Outdated
void InferShape(const framework::InferShapeContext &ctx) const override { | ||
auto in = ctx.Input<Tensor>("Input"); | ||
auto filter = ctx.Input<Tensor>("Filter"); | ||
auto out = ctx.Output<framework::Tensor>("Output"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
auto out = ctx.Output<framework::Tensor>("Output");
=> auto out = ctx.Output<Tensor>("Output");
paddle/operators/cudnn_conv_op.cc
Outdated
auto d_in = ctx.Output<framework::Tensor>(framework::GradVarName("Input")); | ||
auto d_filter = | ||
ctx.Output<framework::Tensor>(framework::GradVarName("Filter")); | ||
if (d_in) d_in->Resize(in->dims()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- 同上,从ctx获取数据进行非空检测
- 另外:
auto d_in = ctx.Output<framework::Tensor>(framework::GradVarName("Input"));
auto d_filter = ctx.Output<framework::Tensor>(framework::GradVarName("Filter"));
=>
auto d_in = ctx.Output<Tensor>(framework::GradVarName("Input"));
auto d_filter = ctx.Output<Tensor>(framework::GradVarName("Filter"));
paddle/operators/cudnn_conv_op.cc
Outdated
AddAttr<std::vector<int>>("strides", "").SetDefault(std::vector<int>{}); | ||
AddAttr<std::vector<int>>("paddings", "paddings of convolution operator.") | ||
.SetDefault(std::vector<int>{}); | ||
// FIXME(typhoonzero): cudnn doesn't support "group" Attributes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dilations default{1,1},strides default{1,1},paddings default{0,0}
paddle/operators/cudnn_conv_op.cc
Outdated
@@ -0,0 +1,110 @@ | |||
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
conv => conv2d,下同
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
改成了conv_cudnn_op.x
后续考虑是否可以和3d合并?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
我觉得2d 和 3d的可以合并
… cudnn_conv_op
… cudnn_conv_op
@qingqing01 I've encountered some wired problem, I move dims update with groups to // TensorDimWithGroups convert 4d Tensor dims to dims split by groups.
std::vector<int> TensorDimWithGroups(const framework::DDim& dims,
const int groups) {
// Update tensor descriptor dims setting if groups > 1
// FIXME(typhoonzero): Assume using NCHW order
std::vector<int> ret = Dims2Vector(dims);
if (groups > 1) {
ret[1] = ret[1] / groups;
}
return ret;
}
// FilterDimWithGroups convert filter dims to dims split by groups.
std::vector<int> FilterDimWithGroups(const framework::DDim& dims,
const int groups) {
// filter layout: MCHW
std::vector<int> ret = Dims2Vector(dims);
if (groups > 1) {
ret[0] = ret[0] / groups;
}
return ret;
} and call Then running gradient check fails like:
but put modifying dims in Tested |
072eded
to
2065e83
Compare
paddle/operators/conv_cudnn_op.cc
Outdated
REGISTER_OP(conv_cudnn, ops::CudnnConvOp, ops::CudnnConvOpMaker, | ||
conv_cudnn_grad, ops::CudnnConvGradOp); | ||
REGISTER_OP_CPU_KERNEL(conv_cudnn, | ||
ops::CudnnConvKernel<paddle::platform::CPUPlace, float>); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is unnecessary, you can directly register "GemmConv2DKernel", like this:
REGISTER_OP_CPU_KERNEL(conv_cudnn,
ops:: GemmConv2DKernel <paddle::platform::CPUPlace, float>);
And
REGISTER_OP(conv_cudnn, ops:: Conv2DOp, ops::CudnnConvOpMaker,
conv_cudnn_grad, ops:: Conv2DOpGrad);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point will try.
paddle/operators/conv2d_op.h
Outdated
void ConvInferShape(framework::InferShapeContextBase* ctx); | ||
void ConvGradInferShape(framework::InferShapeContextBase* ctx); | ||
|
||
template <typename Place, typename T> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is not appropriate to carry out the function alone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Need to update this PR to get cudnn handle after #4593 |
paddle/operators/conv_cudnn_op.cu
Outdated
T alpha = 1.0f, beta = 0.0f; | ||
if (input_grad) { | ||
for (int i = 0; i < groups; i++) { | ||
T* input_grad_data = input_grad->mutable_data<T>(ctx.GetPlace()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
line 238 should be moved out of loop.
And input_grad
should be clean cleared. like this https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/operators/lookup_table_op.h#L60
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Done.
paddle/operators/conv_cudnn_op.cu
Outdated
// ------------------- cudnn conv backward filter --------------------- | ||
if (filter_grad) { | ||
for (int i = 0; i < groups; i++) { | ||
T* filter_grad_data = filter_grad->mutable_data<T>(ctx.GetPlace()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
另外,后续需要增加一个对比不同实现结果一致的单测,不止是conv需要,其他的一些op也需要,这个可以后续做~
paddle/operators/conv2d_op.h
Outdated
@@ -24,6 +24,38 @@ namespace operators { | |||
|
|||
using Tensor = framework::Tensor; | |||
|
|||
// Base convolution operator definations for other conv | |||
// like operators to reuse the implementation. | |||
inline int outputSize(int input_size, int filter_size, int padding, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
outputSize -> OutputSize
const std::vector<int>& dims) { | ||
// the format is not used now, but it maybe useful feature | ||
const std::vector<int>& dims, | ||
const int groups = 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个groups觉得还是移到cudnn_conv_op里处理吧
paddle/operators/conv_cudnn_op.h
Outdated
|
||
namespace paddle { | ||
namespace operators {} // namespace operators | ||
} // namespace paddle |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
paddle/operators/conv_cudnn_op.h
可以去掉吧,在conv_cudnn.cc/cu
里直接使用paddle/operators/conv2d_op.h
就可以吧。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个groups觉得还是移到cudnn_conv_op里处理吧
前面有个comment,groups的处理挪出去会有一个诡异的bug,后续跟进下,可否先放在这里,其他op在使用的时候,不加group参数和原先的表现一致。
这个merge后创建一个issue,后续PR fix这个问题。
paddle/platform/cudnn_helper.h
Outdated
@@ -14,6 +14,7 @@ limitations under the License. */ | |||
|
|||
#pragma once | |||
|
|||
#include <iostream> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this line.
paddle/operators/conv_cudnn_op.cc
Outdated
CudnnConvOpMaker(framework::OpProto* proto, | ||
framework::OpAttrChecker* op_checker) | ||
: Conv2DOpMaker(proto, op_checker) { | ||
AddAttr<std::vector<int>>("dilations", "paddings of convolution operator.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment is not correct for attr dilations
.
paddle/operators/conv_cudnn_op.cc
Outdated
: Conv2DOpMaker(proto, op_checker) { | ||
AddAttr<std::vector<int>>("dilations", "paddings of convolution operator.") | ||
.SetDefault(std::vector<int>{1, 1}); | ||
AddAttr<int>("workspace_size_MB", "workspace size for cudnn, in MB.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need more detailed comments.
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
License和conv_cudnn_op.cc
保持一致,去掉缩进。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
保持和其他一致,conv_cudnn_op.cc
增加了缩进。
paddle/operators/conv_cudnn_op.cu
Outdated
|
||
int group_offset_X = input_channels / groups * input_height * input_width; | ||
int group_offset_Y = | ||
output_channels / groups * output_height * output_width; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
group_offset_X
-> group_offset_in
group_offset_Y
-> group_offset_out
命名保持一致,不要X、input混用,Y、output混用。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approve this PR, some remaining issues can be updated later.
Fix #4194
Fix #4187
TODO list all done, please review, thanks~
paddle::memory
to allock cudnn workspace.