support stage2 for gradient merge. #47711

wuhuachaocoding · 2022-11-07T05:43:55Z

PR types

Others

PR changes

Others

Describe

Update dp (data parallel) + sharding stage2 to support gradient merge.

paddle-bot · 2022-11-07T05:43:58Z

Your PR has been submitted. Thanks for your contribution to the open-source project!
Please wait for the CI test results first. See the Paddle CI Manual for details.

haohongxiang · 2022-11-16T03:10:38Z

python/paddle/distributed/fleet/meta_parallel/sharding/group_sharded_stage2.py

@@ -658,6 +662,8 @@ def _opt_step(self):
                    # Wait for the last reduce task. This wait must before grad scale function.
                    assert self._comm_task is not None
                    self._comm_task.wait()
+
+                dp_allreduce_func()


It would be better to move the allreduce operation within the dp group to after grad_scale.
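
For illustration only, the ordering asked for in this review comment (gradient scaling first, dp-group allreduce second) can be sketched without any framework. This is not the code from the PR: the helpers `grad_scale`, `dp_allreduce`, and `opt_step` below are hypothetical stand-ins, the allreduce is assumed to be a plain sum across ranks, and the scale factor is an arbitrary example value.

```python
# Framework-free sketch; every name here is hypothetical, not a Paddle API.

def grad_scale(grads, scale):
    """Multiply each gradient by a scaling factor (assumed behavior)."""
    return [g * scale for g in grads]


def dp_allreduce(per_rank_grads):
    """Simulate a sum-allreduce across dp ranks: every rank gets the sum."""
    summed = [sum(vals) for vals in zip(*per_rank_grads)]
    return [list(summed) for _ in per_rank_grads]


def opt_step(per_rank_grads, scale):
    """Apply the order suggested in the review: scale first, then allreduce."""
    order = []
    # The real code first waits for the last reduce task; there is nothing
    # to wait for in this single-process simulation.
    scaled = [grad_scale(g, scale) for g in per_rank_grads]
    order.append("grad_scale")
    reduced = dp_allreduce(scaled)
    order.append("dp_allreduce")
    return reduced, order


# Two simulated dp ranks, two parameters each.
grads = [[2.0, 4.0], [6.0, 8.0]]
result, order = opt_step(grads, scale=0.5)
print(order)   # ['grad_scale', 'dp_allreduce']
print(result)  # [[4.0, 6.0], [4.0, 6.0]]
```

The sketch only shows the sequence of operations; it does not model the sharding group, the asynchronous reduce task, or gradient merge accumulation.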

haohongxiang

LGTM

update dp + stage for gradient merge.

03eadcf

wuhuachaocoding added 11 commits November 7, 2022 11:18

update.

8bf65f9

update.

622dbf8

update.

f4b56e0

update.

acd666d

update.

d0866db

update.

1a72911

update for GM

b5209df

Merge remote-tracking branch 'upstream/develop' into stage2_GM

cb3b898

Merge remote-tracking branch 'upstream/develop' into stage2_GM

d48ff41

update test.

8b334c8

update test.

4a9795a

haohongxiang reviewed Nov 16, 2022


move dp_allreduce behind the grad_func.

f340ecc

wuhuachaocoding changed the title ~~update dp + stage2 for gradient merge.~~ support stage2 for gradient merge. Nov 16, 2022

haohongxiang approved these changes Nov 16, 2022


haohongxiang merged commit c20eb7a into PaddlePaddle:develop Nov 17, 2022
