Run on_validation_end only on main process in DDP #1125

Merged
merged 1 commit into from Mar 12, 2020

@ghost ghost commented Mar 12, 2020

Fixes #1119.

Contributor

@jeffling jeffling left a comment

This looks fine. I can't think of any reason why we would want checkpoints on non-master processes.
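
For readers who have not seen the pattern, here is a minimal sketch of restricting a validation-end hook to the main process. It is not the code from this PR: `get_rank`, `main_process_only`, and `CheckpointCallback` are hypothetical names, and reading the rank from the `RANK` environment variable is an assumption about how the processes were launched.

```python
import os
from functools import wraps


def get_rank() -> int:
    # Assumption: the launcher exports RANK for each process.
    # Falls back to 0 so single-process runs behave as the main process.
    return int(os.environ.get("RANK", "0"))


def main_process_only(fn):
    """Run the wrapped function only when the process rank is 0."""

    @wraps(fn)
    def wrapper(*args, **kwargs):
        if get_rank() == 0:
            return fn(*args, **kwargs)
        # Other ranks skip the call entirely.
        return None

    return wrapper


class CheckpointCallback:
    """Hypothetical callback that records checkpoint names instead of writing files."""

    def __init__(self):
        self.saved = []

    @main_process_only
    def on_validation_end(self, epoch: int):
        # In a real callback this is where the checkpoint would be written.
        name = f"epoch={epoch}.ckpt"
        self.saved.append(name)
        return name
```

With a guard like this, only one process attempts the write, so the other processes cannot race on the same checkpoint path.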

@Borda Borda added the bug (Something isn't working) and ready (PRs ready to be merged) labels Mar 12, 2020
@Borda Borda added this to the 0.7.2 milestone Mar 12, 2020
@Borda Borda merged commit b4d4e48 into Lightning-AI:master Mar 12, 2020
@ghost ghost deleted the fix-checkpoint branch March 12, 2020 14:51
tullie pushed a commit to tullie/pytorch-lightning that referenced this pull request Apr 3, 2020
Co-authored-by: xingzhaolee <xingzhaolee@users.noreply.github.com>
@Borda Borda modified the milestones: v0.7., v0.7.x Apr 18, 2021
Development

Successfully merging this pull request may close these issues.

Checkpoint fails in single node multi-GPU mode using DDP
2 participants