-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
onednn: remove fc_elementwise_add fusion #55504
onednn: remove fc_elementwise_add fusion #55504
Conversation
你的PR提交成功,感谢你对开源项目的贡献! |
8bc05ad
to
8de4ee8
Compare
@kolinwei hi, please help to check the approval in coverage job. The tests will be removed along with the feature. |
@vivienfanghuagood hi, can you take a look at this? |
@@ -173,7 +173,6 @@ if(WITH_MKLDNN) | |||
pass_library(conv_elementwise_add_mkldnn_fuse_pass inference DIR mkldnn) | |||
pass_library(int8_scale_calculation_mkldnn_pass inference DIR mkldnn) | |||
pass_library(params_quantization_mkldnn_pass inference DIR mkldnn) | |||
pass_library(fc_elementwise_add_mkldnn_fuse_pass inference DIR mkldnn) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this pass is removed, will the performance of some models be affected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some will be impact. A simple case should be linear -> add
, add won't be fused into linear and cause extra computation. However, the current post-sum fusion need memory inplace and it will be conflict with memory reuse opimization and cause wrong accuracy. We'll re-enable the fusion by using post-binary fusion which doesn't actually need memory inplace but this needs more validation work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
单测删除
* onednn: remove fc+eltwiseadd fusion pass * onednn: remove post-sum fusion in fc kernel * onednn: tests: make unfused add run into f32
* onednn: remove fc+eltwiseadd fusion pass * onednn: remove post-sum fusion in fc kernel * onednn: tests: make unfused add run into f32
* onednn: remove fc+eltwiseadd fusion pass * onednn: remove post-sum fusion in fc kernel * onednn: tests: make unfused add run into f32
PR types
Bug fixes
PR changes
OPs
Description
post-sum fusion is conflicted with memory_optimization_pass because the post-sum and dst share the same buffer. This pr is to deprecate the fusion to make language model happy. We'll switch to use matmul and post-binary fusion to improve the performance then.