Delete extra input (Bias, ResidualData) in OpMaker of conv2d #49121
Conversation
Your PR was submitted successfully. Thank you for contributing to the open-source project!
Force-pushed from cccaafe to dc02b62 (… clear_conv2d_extra)
LGTM
@@ -354,7 +354,9 @@ void CPUQuantizeSquashPass::OpDequantSquash(Graph* graph) const {
        FindOutputNameByVarName(any_op->Op(), dequant_in->Name());

    if (output_name.empty()) return;

    if (any_op->Op()->Type() == "conv2d") {
      any_op->Op()->SetType("fused_conv2d");
So all int8-oneDNN kernels should be executed as fused kernels by default?
Yes, we are trying to delete the extra inputs and attributes in the base op, so some extra attributes for the int8-oneDNN kernel are also removed. For now we have to put them into the fused kernel to execute, because there is no better choice.
I think a good way to execute the int8-oneDNN kernel would be to create a new kernel dedicated to int8-oneDNN, but that is difficult to implement at the current stage; maybe we can come up with a better solution in the future.
Ok, thank you for explaining
@@ -556,6 +553,11 @@
  extra :
    attrs : [bool use_mkldnn = false]

- op : fused_conv2d
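For context, entries in op_compat.yaml follow the shape visible in the hunk above. A sketch of what the new fused_conv2d entry might look like (the attribute list here is illustrative, mirroring the extra attributes discussed in this PR, and is not copied verbatim from the change):

```yaml
- op : fused_conv2d
  extra :
    attrs : [bool use_mkldnn = true, str mkldnn_data_type = "float32",
             str fuse_activation = "", bool fuse_residual_connection = false]
```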
fused_conv2d has its own operator with extra attributes declared. Is it necessary to add op_compat anyway?
Yes, it is necessary.
The extra attributes declared in fused_conv2d.pbtxt are used in the Pass. The info about extra attributes is fetched from op_compat.yaml when the executor runs the kernel, so an op_compat entry is needed for the extra attributes.
… clear_conv2d_extra
@jczaja Does this PR need to run the tests?
@zyfncg Hi, I just noticed this PR. Yes, I started our tests and will be able to share results within 2 days.
@zyfncg We tested this PR and noticed a performance problem with the ERNIE 3.0 bf16 model on the most recent platform. Results: QPS (this PR): 122.81. Commandline:
@jczaja This command doesn't work on my local machine. Where can I get the model file?
Ok. @yaomichael will help in providing the model for you.
@yaomichael Hi, could you send the model file to my email: zhangyunfei07@baidu.com?
@jczaja Does this performance problem only occur on that specific CPU platform? The performance test results are the same between this PR and develop on my local machine. Do I need a CPU platform that supports bf16 to debug?
@zyfncg This performance regression was found on a recently published Intel processor, Sapphire Rapids (SPR), as it has hardware support for bf16 instructions. I do not have results from other processors, so I have started this test on the other processors we have to see if the problem also appears there (bf16 instructions are emulated on them). I will update you once I get some results.
Thanks!
@zyfncg We ran more tests and found that the ERNIE 3.0 test itself has an issue (a problem on our side). Your PR is fine according to our other tests. Please proceed with review and merge.
@jczaja Thank you very much! This PR will be merged after resolving conflicts.
… clear_conv2d_extra
bdb9a0e
LGTM
PR types
Others
PR changes
OPs
Describe
Delete the Extra-type input parameters (Bias, ResidualData) in the OpMaker of the conv2d operator. Based on previous work (PR47579, PR48848), the dependence on the extra input parameters (Bias, ResidualData) of the conv2d operator has essentially been removed, so this PR deletes these extra inputs from the OpMaker of conv2d. Extra attributes of conv2d that are used only in fuse scenarios (such as fuse_activation, fuse_residual_connection, etc.) are no longer used either, so this PR also deletes them.