Commit

fix errors when setting zero3 leaf modules with torch.compile (#6564)
When zero3 leaf modules are set on a higher-level module and the model is
run with torch.compile, ZeROOrderedDict raises a few errors.

First, it doesn't support deep copy because it has no constructor that can
be called without parameters.
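
The deep copy failure can be reproduced outside DeepSpeed. The sketch below is an illustration only: `StrictDict` and `RelaxedDict` are hypothetical stand-ins, not the real `ZeROOrderedDict`. It relies on the fact that `copy.deepcopy` rebuilds an `OrderedDict` subclass by first calling the class with no arguments and then restoring its attributes and items.

```python
import copy
from collections import OrderedDict


class StrictDict(OrderedDict):
    """Hypothetical stand-in: the constructor requires parent_module."""

    def __init__(self, parent_module, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._parent_module = parent_module


class RelaxedDict(OrderedDict):
    """Hypothetical stand-in: parent_module defaults to None."""

    def __init__(self, parent_module=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._parent_module = parent_module


def try_deepcopy(obj):
    """Return the deep copy, or the TypeError raised while copying."""
    try:
        return copy.deepcopy(obj)
    except TypeError as exc:
        return exc


strict = StrictDict("owner")
strict["w"] = 1
# deepcopy calls StrictDict() with no arguments, which raises TypeError.
strict_result = try_deepcopy(strict)

relaxed = RelaxedDict("owner")
relaxed["w"] = 1
# RelaxedDict() succeeds; deepcopy then restores _parent_module and the items.
relaxed_result = try_deepcopy(relaxed)
```

With the default in place, the no-argument call succeeds and the instance attributes are restored from the copied state afterwards, so the copy still carries its `_parent_module`.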

Second, it doesn't check that the ds_status attribute exists on param
before accessing it.
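
A minimal sketch of the guard, assuming a parameter object that simply lacks `ds_status`. The names `Status`, `PlainParam`, `ZeroParam`, `needs_gather_unguarded`, and `needs_gather` are hypothetical and exist only for this illustration; they are not DeepSpeed APIs.

```python
from enum import Enum


class Status(Enum):
    """Hypothetical stand-in for ZeroParamStatus."""
    AVAILABLE = 1
    NOT_AVAILABLE = 2


class PlainParam:
    """Hypothetical parameter that carries no ds_status attribute."""


class ZeroParam:
    """Hypothetical parameter that carries a ds_status attribute."""

    def __init__(self, status):
        self.ds_status = status


def needs_gather_unguarded(param):
    # Mirrors the old check: raises AttributeError if ds_status is missing.
    return param.ds_status == Status.NOT_AVAILABLE


def needs_gather(param):
    # Mirrors the new check: hasattr short-circuits before the attribute access.
    return hasattr(param, "ds_status") and param.ds_status == Status.NOT_AVAILABLE


try:
    needs_gather_unguarded(PlainParam())
    unguarded_failed = False
except AttributeError:
    unguarded_failed = True

plain_result = needs_gather(PlainParam())
missing_result = needs_gather(ZeroParam(Status.NOT_AVAILABLE))
present_result = needs_gather(ZeroParam(Status.AVAILABLE))
```

Because `and` short-circuits, a parameter without `ds_status` is treated as not needing a gather instead of raising.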

Change contributed by Haifeng Chen

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
3 people authored Sep 26, 2024
1 parent c85c870 commit ba58682
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions deepspeed/runtime/zero/parameter_offload.py
@@ -38,7 +38,7 @@ def _apply_forward_and_backward_to_tensors_only(module, forward_function, backwa

 class ZeROOrderedDict(OrderedDict):

-    def __init__(self, parent_module, *args, **kwargs):
+    def __init__(self, parent_module=None, *args, **kwargs):
         """A replacement for ``collections.OrderedDict`` to detect external ZeRO params.
         Args:
@@ -56,7 +56,7 @@ def __getitem__(self, key):
         if param is None:
             return param

-        if param.ds_status == ZeroParamStatus.NOT_AVAILABLE:
+        if hasattr(param, "ds_status") and param.ds_status == ZeroParamStatus.NOT_AVAILABLE:
             if self._parent_module._parameters._in_forward:
                 register_external_parameter(FWD_MODULE_STACK[-1], param)
             param.all_gather()