validation_epoch_end with DDP #5808
Replies: 2 comments
-
Hey @adarsh-kr, are you using the PyTorch Lightning Metric API or implementing your own? The Metric API should take care of syncing across GPUs. We use this function to sync values: https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pytorch_lightning/utilities/distributed.py#L121 Would you mind creating a simple script with BoringModel to reproduce your use case, so we can assist you better? Best regards,
-
Is the outputs in If this property is guaranteed in |
-
What is your question?
I am trying to implement a metric which needs access to the whole dataset. So instead of updating the metric in the *_step() methods, I am trying to collect the outputs in the *_epoch_end() methods. However, the outputs contain only the output of the partition of the data each device gets. Basically, if there are n devices, then each device is getting 1/n of the total outputs.
Stackoverflow Post
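To make the partitioning concrete, here is a minimal sketch in plain Python, with no GPUs or process groups involved. It only simulates the behaviour: the helper names `shard_indices` and `gather_all` are hypothetical, the round-robin split with padding is an assumption modelled on how an unshuffled distributed sampler typically divides a dataset, and `gather_all` merely stands in for a real collective (in an actual DDP run that step would be something like `torch.distributed.all_gather`, which is not shown here).

```python
import math


def shard_indices(dataset_len, world_size, rank):
    """Return the sample indices one rank would see.

    Assumption: an unshuffled round-robin split, padded by repeating
    samples from the start so every rank gets the same number of items.
    """
    per_rank = math.ceil(dataset_len / world_size)
    total = per_rank * world_size
    indices = list(range(dataset_len))
    indices += indices[: total - dataset_len]  # pad to an even split
    return indices[rank:total:world_size]


def gather_all(per_rank_outputs):
    """Stand-in for a collective gather across ranks.

    Each shard is a list of (sample_index, prediction) pairs. Keying by
    sample index drops the duplicates introduced by padding and restores
    the original dataset order.
    """
    merged = {}
    for shard in per_rank_outputs:
        for idx, pred in shard:
            merged[idx] = pred
    return [merged[i] for i in sorted(merged)]


dataset_len, world_size = 10, 4

# What each simulated rank would hold at the end of the epoch:
# only its own slice of the predictions (here, prediction = index squared).
shards = [
    [(i, i * i) for i in shard_indices(dataset_len, world_size, rank)]
    for rank in range(world_size)
]

# After gathering, the full set of predictions is available again.
full = gather_all(shards)
```

Carrying the sample index alongside each prediction is a design choice of this sketch: it lets the gathered result be deduplicated and reordered, which matters for metrics that need the whole dataset exactly once.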
What's your environment?