
fix deprecated torch.cuda.amp.GradScaler FutureWarning for pytorch 2.4+ #3132

Merged
merged 9 commits into huggingface:main on Oct 7, 2024

Conversation

Mon-ius
Contributor

@Mon-ius Mon-ius commented Sep 29, 2024

With the release of PyTorch 2.4, the torch.cuda.amp.GradScaler API has been officially marked as deprecated.

This pull request (PR) addresses the following warning:

accelerate/accelerator.py:494: FutureWarning: `torch.cuda.amp.GradScaler(args...)` is deprecated. Please use `torch.amp.GradScaler('cuda', args...)` instead.
  self.scaler = torch.cuda.amp.GradScaler(**kwargs)
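
In plain terms, the warning asks for the rename sketched below; this is a minimal illustration of the change, not the PR's exact diff (the `kwargs` dict is a placeholder for whatever GradScaler options the caller passes):

```python
import torch

kwargs = {}  # placeholder for GradScaler options, e.g. {"init_scale": 2.0**16}

# Deprecated since PyTorch 2.4:
#   scaler = torch.cuda.amp.GradScaler(**kwargs)

# Replacement suggested by the warning: the device type becomes the first argument
scaler = torch.amp.GradScaler("cuda", **kwargs)
```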

@Mon-ius Mon-ius changed the title from "fix deprecated FutureWarning for pytorch 2.4+" to "fix deprecated torch.cuda.amp.GradScaler FutureWarning for pytorch 2.4+" on Sep 29, 2024
@BenjaminBossan
Member

Thanks for working on this deprecation. Could you please run make quality and make style on your PR?

@Mon-ius
Contributor Author

Mon-ius commented Oct 1, 2024

@BenjaminBossan I have addressed your requests; would you please trigger the workflows? 🤗

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Mon-ius
Contributor Author

Mon-ius commented Oct 1, 2024

Strange, I ran make quality without issues on my side:

make quality
ruff check .
All checks passed!
ruff format --check .
150 files already formatted
doc-builder style src/accelerate docs/source --max_len 119 --check_only

@BenjaminBossan I have checked the e583d2e results; it seems this workflow has a bug that needs fixing.

It indicates the earlier checks passed, but the job fails at the end:

make quality
  shell: /usr/bin/bash -e {0}
  env:
    pythonLocation: /opt/hostedtoolcache/Python/3.8.18/x64
    LD_LIBRARY_PATH: /opt/hostedtoolcache/Python/3.8.18/x64/lib
ruff check .
All checks passed!
ruff format --check .
150 files already formatted
doc-builder style src/accelerate docs/source --max_len 119 --check_only
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.8.18/x64/bin/doc-builder", line 8, in <module>
    sys.exit(main())
  File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/doc_builder/commands/doc_builder_cli.py", line 47, in main
    args.func(args)
  File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/doc_builder/commands/style.py", line 28, in style_command
    raise ValueError(f"{len(changed)} files should be restyled!")
ValueError: 1 files should be restyled!
make: *** [Makefile:17: quality] Error 1
Error: Process completed with exit code 2.

@BenjaminBossan
Copy link
Member

It's the doc-builder that is complaining. This is the diff I get from running doc-builder style src/accelerate docs/source --max_len 119:

@@ -127,10 +127,11 @@ def find_executable_batch_size(function: callable = None, starting_batch_size: i
 
 
     >>> @find_executable_batch_size(starting_batch_size=128)
-    ... def train(batch_size, model, optimizer): ...
+    ... def train(batch_size, model, optimizer):
+    ...     ...
 
 
-    ... train(model, optimizer)
+    >>> train(model, optimizer)
     ```
     """
     if function is None
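
For reference, the docstring example being restyled corresponds to usage roughly like the following sketch (the training-loop body is a placeholder; the decorator retries with a halved batch size on CUDA out-of-memory errors):

```python
from accelerate.utils import find_executable_batch_size

@find_executable_batch_size(starting_batch_size=128)
def train(batch_size, model, optimizer):
    # placeholder body: run training at the given batch size;
    # on OOM, the decorator halves batch_size and calls this function again
    ...

# called without batch_size -- the decorator injects and adjusts it:
# train(model, optimizer)
```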

@Mon-ius
Contributor Author

Mon-ius commented Oct 1, 2024

@find_executable_batch_size(starting_batch_size=128)

Interesting, but I didn't touch the file under src/accelerate/utils/memory.py. I ran your command on my side: doc-builder style src/accelerate docs/source --max_len 119, and there were no issues.

I just updated the code based on your message. I'm aware that the doc-builder on python3.8 is slightly different from the one running on python3.12 on my side.

@BenjaminBossan, could you please trigger the CI again?

@BenjaminBossan
Member

Thanks @Mon-ius. I think one of your previous commits changed memory.py, probably via ruff. So let's completely undo any changes to that file and it should pass; that means restoring the one line that's currently removed.

@Mon-ius
Contributor Author

Mon-ius commented Oct 1, 2024

@BenjaminBossan I just deleted the file and downloaded the original one; it seems a single blank line made the CI fail.

Could you please trigger it again? 🤗

@Mon-ius
Contributor Author

Mon-ius commented Oct 1, 2024

Hi, @BenjaminBossan

It seems all checks have passed now; could you please merge it? 🤗

Member

@BenjaminBossan BenjaminBossan left a comment


I checked this change with older PyTorch versions and unfortunately, it is only supported starting from 2.3. Since we want to support older versions as well, we need to add a torch version check.

Probably a similar argument applies to the other grad scalers, like torch.npu.amp.GradScaler, but I haven't tested.
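
A minimal sketch of the kind of guard being requested (not the PR's exact diff; `>=` is used here since the unified API is available from 2.3 on, and `kwargs` is a placeholder for the handler's scaler options):

```python
import torch
from packaging import version

kwargs = {}  # placeholder for GradScaler options

if version.parse(torch.__version__) >= version.parse("2.3"):
    # unified API introduced in PyTorch 2.3: the device type is the first argument
    scaler = torch.amp.GradScaler("cuda", **kwargs)
else:
    # older releases only expose the CUDA-specific class
    scaler = torch.cuda.amp.GradScaler(**kwargs)
```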

@Mon-ius
Contributor Author

Mon-ius commented Oct 2, 2024

@BenjaminBossan Thanks for your review. I applied a simple PyTorch version check for 2.3 on the GPU device; could you please trigger the CI?

@Mon-ius
Contributor Author

Mon-ius commented Oct 2, 2024

For the NPU device, I don't currently have one to test with, but I also applied the same strategy in the last, separate commit 🤗

Member

@BenjaminBossan BenjaminBossan left a comment


Thanks for the version fixes. I have a few smaller comments.

It would probably make sense to factor out the grad scaler instantiation to a utility function to avoid code duplication, but I'll let Zach be the judge of that once he's back in office. For now, it can stay as is.

-    scaler (`torch.cuda.amp.GradScaler`, *optional*):
-        An optional gradient scaler instance to save
+    scaler (`torch.amp.GradScaler`, *optional*) for pytorch>2.3:
+        An optional gradient scaler instance to save; for lower version, check `torch.cuda.amp.GradScaler`
Member


I think we don't need this extra explanation in the docstring; it should be quite clear as is what this refers to.

@@ -209,8 +209,9 @@ def register_comm_hook(self, model):
 class GradScalerKwargs(KwargsHandler):
     """
     Use this object in your [`Accelerator`] to customize the behavior of mixed precision, specifically how the
-    `torch.cuda.amp.GradScaler` used is created. Please refer to the documentation of this
-    [scaler](https://pytorch.org/docs/stable/amp.html?highlight=gradscaler) for more information on each argument.
+    `torch.amp.GradScaler` used is created for pytoch>2.3 or `torch.cuda.amp.GradScaler` for lower version. Please
Member


Same, let's not overexplain.

@@ -494,11 +495,17 @@ def __init__(
elif is_musa_available():
self.scalar = torch.musa.amp.GradScaler(**kwargs)
elif is_npu_available():
self.scaler = torch.npu.amp.GradScaler(**kwargs)
if version.parse(torch.__version__) > version.parse("2.3"):
Member


Since neither of us can test this, let's not touch NPU for now.

@Mon-ius
Contributor Author

Mon-ius commented Oct 2, 2024

@BenjaminBossan Thanks again for the comments. I have reverted the changes related to the documentation and NPU. 🤗

@Mon-ius
Contributor Author

Mon-ius commented Oct 2, 2024

It seems there is an issue with the CI itself: the test environment lacks pillow, which makes these tests fail:

=========================== short test summary info ============================
FAILED tests/trainer/test_trainer.py::TrainerIntegrationTest::test_trainer_saves_image_processor - ImportError: 
CLIPImageProcessor requires the PIL library but it was not found in your environment. You can install it with pip:
`pip install pillow`. Please note that you may need to restart your runtime after installation.
FAILED tests/trainer/test_trainer.py::TrainerIntegrationTest::test_trainer_saves_processor - ImportError: 
CLIPImageProcessor requires the PIL library but it was not found in your environment. You can install it with pip:
`pip install pillow`. Please note that you may need to restart your runtime after installation.
=========== 2 failed, 172 passed, 81 skipped, 42 warnings in 28.79s ============

@BenjaminBossan
Member

Yes, appears to be unrelated.

Member

@BenjaminBossan BenjaminBossan left a comment


LGTM, thanks for your continued work on this PR. I'll leave this open for Zach to give the final review.

@Mon-ius
Contributor Author

Mon-ius commented Oct 6, 2024

> Thanks for the version fixes. I have a few smaller comments.
>
> It would probably make sense to factor out the grad scaler instantiation to a utility function to avoid code duplication, but I'll let Zach be the judge of that once he's back in office. For now, it can stay as is.

@BenjaminBossan, any update on Zach's review of this PR? 🤗

Collaborator

@muellerzr muellerzr left a comment


Thanks, this looks okay to me. This hints that we should probably have a get_grad_scaler util func in Accelerate which can handle the backends automatically as part of the ops util package, but for the scope of this PR this works great.

Thanks!
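
A rough sketch of what such a get_grad_scaler util might look like (hypothetical and for illustration only; it is not what this PR adds, and a real Accelerate helper would cover the other backends as well):

```python
# hypothetical helper, not accelerate's actual API
import torch
from packaging import version


def get_grad_scaler(device_type: str = "cuda", **kwargs):
    """Return a GradScaler for `device_type`, handling the PyTorch 2.3+ API rename."""
    if version.parse(torch.__version__) >= version.parse("2.3"):
        # unified API: the device type is passed as the first argument
        return torch.amp.GradScaler(device_type, **kwargs)
    if device_type == "cuda":
        # older releases only expose the CUDA-specific class
        return torch.cuda.amp.GradScaler(**kwargs)
    raise NotImplementedError(f"No GradScaler fallback for {device_type!r} on torch < 2.3")


scaler = get_grad_scaler("cuda", init_scale=2.0**16)
```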

@muellerzr muellerzr merged commit e93b056 into huggingface:main Oct 7, 2024
24 of 25 checks passed