Refine clip_by_global_norm #38209
Thanks for your contribution!
… dev/clip_by_global_norm
LGTM
python/paddle/fluid/clip.py
Outdated
if g.dtype == core.VarDesc.VarType.FP16 else clip_var)
new_grad = layers.elementwise_mul(x=g, y=clip_input)
params_and_grads.append((p, new_grad))
if global_norm_var > max_global_norm:
Suggestion: evaluate global_norm_var > max_global_norm once above and store it as a bool flag, so the compare op does not have to run multiple times inside the loop.
tks, Done!
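For readers following the thread, here is a minimal sketch of the pattern the reviewer suggests: the comparison against the threshold is evaluated once, before the loop, and the loop only reads the resulting flag. This is plain Python rather than Paddle code; the function name, the list-of-floats gradient representation, and the `need_clip` and `scale` names are assumptions made for illustration and are not taken from python/paddle/fluid/clip.py.

```python
import math


def clip_by_global_norm_sketch(params_and_grads, max_global_norm):
    """Illustrative global-norm clipping on (param, grad) pairs.

    Each grad is a plain list of floats here (an assumption for this sketch).
    """
    # Global norm: square root of the sum of squares over all gradients.
    sq_sum = sum(x * x for _, g in params_and_grads for x in g)
    global_norm = math.sqrt(sq_sum)

    # Evaluate the comparison once, outside the loop, and keep it as a flag.
    need_clip = global_norm > max_global_norm
    # The scale factor is also loop-invariant, so compute it once.
    scale = max_global_norm / global_norm if need_clip else 1.0

    clipped = []
    for p, g in params_and_grads:
        if need_clip:
            g = [x * scale for x in g]
        clipped.append((p, g))
    return clipped


if __name__ == "__main__":
    pairs = [("w", [3.0, 0.0]), ("b", [0.0, 4.0])]  # global norm is 5.0
    print(clip_by_global_norm_sketch(pairs, 1.0))
```

In this sketch the loop body never recomputes the comparison, which is the saving the review comment is after; in a framework where each comparison launches an operator, that avoids one compare op per parameter.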
… dev/clip_by_global_norm
PR types
Performance optimization
PR changes
APIs
Describe
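Optimize ClipByGlobalNorm performance:
Taking a 10*10 paddle.nn.Linear as an example and running 100 optimization rounds, the time spent in clip_by_global_norm calls breaks down as follows:
(1) Before optimization:
(2) After optimization:
The ratio of time before to time after is 0.77/0.54 = 1.43, i.e. the optimized version takes 29.9% less time.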
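The two figures quoted above can be checked with a few lines of arithmetic. The sketch below assumes only the two timings given in the description (0.77 before, 0.54 after, in whatever unit the profiler reported); the variable names are illustrative.

```python
# Timings quoted in the PR description (unit as reported by the profiler).
before = 0.77
after = 0.54

# Ratio of time before optimization to time after.
ratio = before / after

# Fraction of the original time that was saved.
reduction = (before - after) / before

print(f"ratio: {ratio:.2f}, time reduction: {reduction:.1%}")
```

Note that the two numbers answer different questions: 1.43 is how many times longer the old code took, while 29.9% is the share of the original time that was removed.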