Move _worker_check() logic to process_dataloader() in DDPSpawnStrategy class. #12216

Closed
ninginthecloud opened this issue Mar 3, 2022 · 1 comment
Labels
data handling Generic data-related topic priority: 2 Low priority task refactor strategy

Comments

@ninginthecloud
Contributor

ninginthecloud commented Mar 3, 2022

Proposed refactor

This issue follows up on @ananthsub's #11756 to move strategy-specific dataloader logic into the strategies.

Motivation

_worker_check() in data_connector.py checks the dataloader's number of workers when the user selects the DDP_SPAWN strategy.
https://github.com/PyTorchLightning/pytorch-lightning/blob/cc43d07db1ab77385feff04c01f040c5cad805a9/pytorch_lightning/trainer/connectors/data_connector.py#L274

Since this dataloader verification logic is specific to the ddp_spawn strategy, it would be better to move it into the DDPSpawnStrategy class. _worker_check() could then be removed from data_connector.py.

Pitch

The check could be moved into process_dataloader() in the DDPSpawnStrategy class.
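A rough sketch of what the pitch might look like, using only the standard library so it runs on its own. `StubDataLoader` and `SpawnStrategySketch` are hypothetical stand-ins for `torch.utils.data.DataLoader` and `DDPSpawnStrategy`, and the warning conditions and messages are illustrative assumptions rather than a copy of the current `_worker_check()`.

```python
import warnings
from dataclasses import dataclass


@dataclass
class StubDataLoader:
    """Hypothetical stand-in for torch.utils.data.DataLoader."""

    num_workers: int = 0
    persistent_workers: bool = False


class SpawnStrategySketch:
    """Hypothetical stand-in for DDPSpawnStrategy; not the library class."""

    def _worker_check(self, dataloader):
        # Illustrative conditions only; the real check may differ.
        if dataloader.num_workers == 0:
            warnings.warn(
                "strategy=ddp_spawn with num_workers=0 may slow down data loading"
            )
        elif not dataloader.persistent_workers:
            warnings.warn(
                "strategy=ddp_spawn with num_workers>0 and "
                "persistent_workers=False may slow down data loading"
            )

    def process_dataloader(self, dataloader):
        # The check runs inside the strategy hook, so the data connector
        # no longer needs to know which strategy is active.
        self._worker_check(dataloader)
        return dataloader
```

With this shape, the data connector would only call `strategy.process_dataloader(dataloader)`, and other strategies would be unaffected.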

Additional context

cc: @edward-io @four4fish @ananthsub


cc @justusschock @awaelchli @ninginthecloud @rohitgr7 @otaj @tchaton @akihironitta

@awaelchli
Contributor

Closing because no longer relevant. The logic was recently reworked (#18737, #18649, #18591) and the worker_check function is no longer just ddp-spawn specific.

@carmocca carmocca removed this from the future milestone Nov 29, 2023