Describe the bug
I get AttributeError: Can't pickle local object 'FlopsProfiler.start_profile.<locals>.register_module_hooks.<locals>.start_time_hook' when I run torch.save on a model that has previously been passed to get_model_profile.
I checked the flops_profiler code and found that the guard is written as if not hasattr(module, "__start_time_hook_handle"): (missing the trailing double underscore), whereas it should be if not hasattr(module, "__start_time_hook_handle__"):.
After correcting the above, the error no longer occurs.
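The effect of the mismatched attribute name can be illustrated without DeepSpeed or PyTorch. The following is only a minimal sketch: FakeModule, register, unregister and profile_shared_module are hypothetical stand-ins, not the profiler's actual code. It assumes the handle is stored under the name with the trailing double underscore while the guard checks the name without it, and that a shared module is visited once per place it appears in the model.

```python
import pickle


class FakeModule:
    """Hypothetical stand-in for a module that stores hooks keyed by handle."""

    def __init__(self):
        self.hooks = {}
        self._next_handle = 0

    def register_hook(self, fn):
        handle = self._next_handle
        self._next_handle += 1
        self.hooks[handle] = fn
        return handle

    def remove_hook(self, handle):
        self.hooks.pop(handle, None)


def register(module, guard_name):
    # A function defined inside another function cannot be pickled by reference.
    def start_time_hook(mod, inputs):
        pass

    # The guard checks `guard_name`, but the handle is always stored under
    # "__start_time_hook_handle__" (assumed to mirror the reported mismatch).
    if not hasattr(module, guard_name):
        module.__start_time_hook_handle__ = module.register_hook(start_time_hook)


def unregister(module):
    # Only the most recently stored handle can be removed.
    if hasattr(module, "__start_time_hook_handle__"):
        module.remove_hook(module.__start_time_hook_handle__)
        del module.__start_time_hook_handle__


def profile_shared_module(guard_name, visits=2):
    module = FakeModule()
    for _ in range(visits):  # the same instance is reached from several places
        register(module, guard_name)
    unregister(module)
    return module


def can_pickle_hooks(module):
    try:
        pickle.dumps(list(module.hooks.values()))
        return True
    except (AttributeError, pickle.PicklingError):
        return False


buggy = profile_shared_module("__start_time_hook_handle")    # guard never matches
fixed = profile_shared_module("__start_time_hook_handle__")  # guard matches
print(len(buggy.hooks), can_pickle_hooks(buggy))
print(len(fixed.hooks), can_pickle_hooks(fixed))
```

With the mismatched guard, the second visit registers a second hook and overwrites the stored handle, so cleanup removes only one of the two and a local function stays attached. That leftover is what fails to pickle.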
To Reproduce
Run get_model_profile and then torch.save on a model that uses the same module instance in several different places.
My reproduction code is here.
Expected behavior
AttributeError: Can't pickle local object... does not occur in torch.save after get_model_profile
(in other words, no module still contains start_time_hook in its _forward_pre_hooks).
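One way to check this condition before calling torch.save is to scan the model for leftover hooks. The helper below is a hedged sketch: find_leftover_pre_hooks is a hypothetical name, and it relies only on named_modules() and the _forward_pre_hooks dictionary that torch.nn.Module provides, so it takes the model as a plain argument and does not import torch itself.

```python
def find_leftover_pre_hooks(model, hook_name="start_time_hook"):
    """Return names of submodules whose forward pre-hooks still include `hook_name`.

    `model` is expected to behave like torch.nn.Module: it must provide
    named_modules(), and each submodule may carry a _forward_pre_hooks dict.
    """
    leftovers = []
    for name, module in model.named_modules():
        hooks = getattr(module, "_forward_pre_hooks", {})
        # Hooks are matched by function name, which is an assumption of this sketch.
        if any(getattr(fn, "__name__", "") == hook_name for fn in hooks.values()):
            leftovers.append(name)
    return leftovers
```

Calling find_leftover_pre_hooks(model) after get_model_profile should return an empty list when the hooks were cleaned up; any names it returns point at the modules that would break torch.save.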
ds_report output
ds_report could not complete due to AttributeError: module 'os' has no attribute 'statvfs'
I installed deepspeed-0.12.7+40342055-py3-none-any.whl on Python 3.12 on Windows 10.
System info (please complete the following information):
Collecting environment information...
PyTorch version: 2.2.2+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Microsoft Windows 10 Home
GCC version: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 8.1.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: N/A
Python version: 3.12.3 (tags/v3.12.3:f6650f9, Apr 9 2024, 14:05:25) [MSC v.1938 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.19045-SP0
Is CUDA available: True
CUDA runtime version: 8.0.60
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 4060 Ti
Nvidia driver version: 546.01
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Architecture=9
CurrentClockSpeed=3696
DeviceID=CPU0
Family=198
L2CacheSize=1536
L2CacheSpeed=
Manufacturer=GenuineIntel
MaxClockSpeed=3696
Name=Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz
ProcessorType=3
Revision=
Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] torch==2.2.2+cu121
[conda] Could not collect
Hi @threewayhandshake - could you confirm whether you are still hitting this on the current DeepSpeed release wheel that is built specifically for Windows?
I'll check the solution you proposed and create a PR; we can discuss it there as well, if this is still relevant.