
Commit

Fix DistribType for ddp_cpu (spawn) (#7492)
alanhdu authored and edgarriba committed May 18, 2021
1 parent 25115a1 commit f11917f
Showing 4 changed files with 41 additions and 4 deletions.
27 changes: 27 additions & 0 deletions CHANGELOG.md
@@ -42,6 +42,33 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).

### Removed

- Pruned deprecated classification metrics from `pytorch_lightning.metrics.functional.classification` ([#7499](https://github.com/PyTorchLightning/pytorch-lightning/pull/7499))


- Removed deprecated data parallel classes `LightningDataParallel` and `LightningDistributedDataParallel` from `pytorch_lightning.overrides.data_parallel` ([#7510](https://github.com/PyTorchLightning/pytorch-lightning/pull/7510))


- Removed deprecated trainer attributes - `get_model` and `accelerator_backend` ([#7502](https://github.com/PyTorchLightning/pytorch-lightning/pull/7502))


- Removed deprecated utils modules `model_utils`, `warning_utils`, `xla_device_utils` and partially `argparse_utils` ([#7503](https://github.com/PyTorchLightning/pytorch-lightning/pull/7503))


- Removed deprecated trainer attributes - `on_cpu`, `on_tpu`, `use_tpu`, `on_gpu`, `use_dp`, `use_ddp`, `use_ddp2`, `use_horovod`, `use_single_gpu` ([#7501](https://github.com/PyTorchLightning/pytorch-lightning/pull/7501))


### Fixed


- Fixed parsing of multiple training dataloaders ([#7433](https://github.com/PyTorchLightning/pytorch-lightning/pull/7433))


- Fixed recursive passing of `wrong_type` keyword argument in `pytorch_lightning.utilities.apply_to_collection` ([#7433](https://github.com/PyTorchLightning/pytorch-lightning/pull/7433))


- Fixed setting correct `DistribType` for `ddp_cpu` (spawn) backend ([#7492](https://github.com/PyTorchLightning/pytorch-lightning/pull/7492))


## [1.3.1] - 2021-05-11

### Fixed
@@ -522,7 +522,7 @@ def set_distributed_mode(self, distributed_backend: Optional[str] = None):

         # special case with DDP on CPUs
         if self.distributed_backend == "ddp_cpu":
-            self._distrib_type = DistributedType.DDP
+            self._distrib_type = DistributedType.DDP_SPAWN
             if self.num_gpus > 0:
                 rank_zero_warn(
                     'You requested one or more GPUs, but set the backend to `ddp_cpu`. Training will not use GPUs.'
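The one-line change above can be mimicked in isolation. This is a minimal sketch of the special case, not Lightning's actual connector code: `resolve_distrib_type` is a hypothetical helper, and the point is that `ddp_cpu` launches workers via spawn, so it must map to `DDP_SPAWN` rather than plain `DDP`.

```python
from enum import Enum


class DistributedType(Enum):
    DDP = "ddp"
    DDP_SPAWN = "ddp_spawn"


def resolve_distrib_type(distributed_backend: str, num_gpus: int) -> DistributedType:
    """Mimic the patched special case: `ddp_cpu` spawns CPU worker
    processes, so it resolves to DDP_SPAWN, not DDP."""
    if distributed_backend == "ddp_cpu":
        if num_gpus > 0:
            # Mirrors the rank_zero_warn in the real connector.
            print("You requested one or more GPUs, but set the backend to "
                  "`ddp_cpu`. Training will not use GPUs.")
        return DistributedType.DDP_SPAWN
    return DistributedType.DDP
```

Before this commit, the `ddp_cpu` branch returned the equivalent of `DistributedType.DDP`, which is the bug being fixed.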
8 changes: 5 additions & 3 deletions tests/accelerators/test_accelerator_connector.py
@@ -437,13 +437,15 @@ def test_ipython_incompatible_backend_error(*_):
     with pytest.raises(MisconfigurationException, match="backend ddp is not compatible"):
         Trainer(accelerator="ddp", gpus=2)

-    with pytest.raises(MisconfigurationException, match="backend ddp is not compatible"):
-        Trainer(accelerator="ddp_cpu", num_processes=2)
-
     with pytest.raises(MisconfigurationException, match="backend ddp2 is not compatible"):
         Trainer(accelerator="ddp2", gpus=2)


+@mock.patch("pytorch_lightning.utilities._IS_INTERACTIVE", return_value=True)
+def test_ipython_compatible_backend(*_):
+    Trainer(accelerator="ddp_cpu", num_processes=2)


 @pytest.mark.parametrize(
     ["accelerator", "plugin"],
     [('ddp_spawn', 'ddp_sharded'), (None, 'ddp_sharded')],
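The new test uses `mock.patch` to flip the interactive-session flag for the duration of a single test. The pattern can be shown in isolation with a stand-in namespace; this is a generic sketch of attribute patching, not Lightning's API.

```python
import types
from unittest import mock

# A stand-in module with a module-level flag, mirroring how the test
# patches `pytorch_lightning.utilities._IS_INTERACTIVE`.
fake_utilities = types.SimpleNamespace(_IS_INTERACTIVE=False)

with mock.patch.object(fake_utilities, "_IS_INTERACTIVE", True):
    # Inside the patch, code under test sees the interactive flag as set.
    assert fake_utilities._IS_INTERACTIVE is True

# The original value is restored automatically when the patch exits.
assert fake_utilities._IS_INTERACTIVE is False
```

Because the patch is scoped (decorator or context manager), the test never mutates global state permanently.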
8 changes: 8 additions & 0 deletions tests/trainer/test_trainer.py
@@ -1190,6 +1190,7 @@ def test_num_sanity_val_steps_neg_one(tmpdir, limit_val_batches):
     ),
     (
         dict(accelerator="ddp_cpu", num_processes=2, gpus=None),
+<<<<<<< HEAD
         dict(
             use_dp=False,
             use_ddp=True,
@@ -1199,6 +1200,9 @@ def test_num_sanity_val_steps_neg_one(tmpdir, limit_val_batches):
             use_single_gpu=False,
             num_processes=2,
         ),
+=======
+        dict(_distrib_type=DistributedType.DDP_SPAWN, _device_type=DeviceType.CPU, num_gpus=0, num_processes=2),
+>>>>>>> 6ac16ff3... Fix DistribType for `ddp_cpu` (spawn) (#7492)
     ),
     (
         dict(accelerator="ddp2", gpus=None),
@@ -1250,6 +1254,7 @@ def test_num_sanity_val_steps_neg_one(tmpdir, limit_val_batches):
     ),
     (
         dict(accelerator="ddp_cpu", num_processes=2, gpus=1),
+<<<<<<< HEAD
         dict(
             use_dp=False,
             use_ddp=True,
@@ -1259,6 +1264,9 @@ def test_num_sanity_val_steps_neg_one(tmpdir, limit_val_batches):
             use_single_gpu=False,
             num_processes=2,
         ),
+=======
+        dict(_distrib_type=DistributedType.DDP_SPAWN, _device_type=DeviceType.CPU, num_gpus=0, num_processes=2),
+>>>>>>> 6ac16ff3... Fix DistribType for `ddp_cpu` (spawn) (#7492)
     ),
     (
         dict(accelerator="ddp2", gpus=1),
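Setting the committed conflict markers aside, the intended post-fix expectation for `ddp_cpu` is the `DDP_SPAWN`-on-CPU state. As a stand-alone sketch, the `TrainerState` dataclass and the two enums below are simplified stand-ins for illustration, not Lightning's classes.

```python
from dataclasses import dataclass
from enum import Enum


class DistributedType(Enum):
    DDP_SPAWN = "ddp_spawn"


class DeviceType(Enum):
    CPU = "cpu"


@dataclass
class TrainerState:
    _distrib_type: DistributedType
    _device_type: DeviceType
    num_gpus: int
    num_processes: int


# Expected state for Trainer(accelerator="ddp_cpu", num_processes=2):
# spawn-based DDP on CPU, with any requested GPUs ignored.
expected = TrainerState(
    _distrib_type=DistributedType.DDP_SPAWN,
    _device_type=DeviceType.CPU,
    num_gpus=0,
    num_processes=2,
)
```

Note that the expectation is identical whether `gpus=None` or `gpus=1` is passed, since `ddp_cpu` never uses GPUs.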
