[heterps]fix pipe-train bug: save and pull overlap may cause error #37233

Merged 1 commit on Nov 16, 2021


danleifeng
Contributor

PR types

Bug fixes

PR changes

Others

Describe

Fix a pipe-train bug: when a model save overlaps with a sparse pull, a get_field error may occur.
before:

Wed Nov 10 23:00:12 INFO: going to save delta model after update
I1110 23:00:29.506243 15886 ps_gpu_wrapper.cc:293] pull sparse from CpuPS into GpuPS cost 28.0899 seconds.
I1110 23:00:45.613412 15886 ps_gpu_wrapper.cc:425] GpuPs prepare for build hbm cost 16.1071 seconds.
I1110 23:00:46.988054 15886 ps_gpu_wrapper.cc:468] GpuPs build table total costs: 1.37453 s.
I1110 23:00:46.988124 15886 ps_gpu_wrapper.cc:536] thread BuildGPUTask end, cost time: 1.37462s
Wed Nov 10 23:01:16 INFO: begin save delta model for 20211106 - 3

after:

Tue Nov 16 08:51:46 INFO: going to save delta model after update
Tue Nov 16 08:52:10 INFO: begin save delta model for 20210815 - 1
going to save_delta_model
Tue Nov 16 08:55:47 INFO: ===========going to train day/pass x/2===========
I1116 08:55:47.992974 33335 ps_gpu_wrapper.cc:535] BuildPull start.
Tue Nov 16 08:55:47 INFO: cur_pass: 3 cur_path:
I1116 08:55:48.008092 33335 ps_gpu_wrapper.cc:293] pull sparse from CpuPS into GpuPS cost 0.015066 seconds.
I1116 08:55:48.038103 33335 ps_gpu_wrapper.cc:425] GpuPs prepare for build hbm cost 0.029945 seconds.
I1116 08:55:48.121467 33335 ps_gpu_wrapper.cc:468] GpuPs build table total costs: 0.083289 s.
I1116 08:55:48.121517 33335 ps_gpu_wrapper.cc:541] BuildPull + BuildGPUTask end, cost time: 0.128521s
I1116 08:55:48.121523 33335 ps_gpu_wrapper.cc:569] BeginPass end, cost time: 0.128571s
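
The ordering difference between the two logs can be illustrated with a small sketch. This is not the code changed in this PR; it is a toy model, assuming a pipelined trainer that starts the pull thread early and needs it to wait until the save has finished. The names `run_gated`, `pull_sparse`, `save_delta_model` and `save_done` are hypothetical and are not Paddle APIs.

```python
import threading


def run_gated():
    """Toy model: a pull thread is started early but gated on save completion."""
    events = []                    # ordered record of what happened
    save_done = threading.Event()  # hypothetical gate, set once the save has finished

    def pull_sparse():
        # Block until the save has completed, so the two steps cannot overlap.
        save_done.wait()
        events.append("pull sparse")

    def save_delta_model():
        events.append("save delta model")
        # Only now is the pull allowed to proceed.
        save_done.set()

    puller = threading.Thread(target=pull_sparse)
    puller.start()        # started before the save, as a pipelined trainer might do
    save_delta_model()    # runs on the main thread
    puller.join()
    return events


if __name__ == "__main__":
    print(run_gated())
```

Without the gate, the pull could run while the save is still in progress, which is the overlap visible in the "before" log. With the gate, the save always completes first, matching the order in the "after" log.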

@paddle-bot-old

Thanks for your contribution!
Please wait for the CI result first. See the Paddle CI Manual for details.

@danleifeng danleifeng merged commit 62ec644 into PaddlePaddle:develop Nov 16, 2021