
Add Prodigy Plus Schedule Free optimizer #614

Open · saunderez wants to merge 12 commits into master from add-prodigy-plus-schedule-free

Conversation

@saunderez saunderez commented Dec 25, 2024


For more details, open the Copilot Workspace session.

…mizerParamsWindow.py`

* Add `split_groups` parameter with title, tooltip, and type.
* Adjust formatting for `cautious` parameter to align with new `split_groups` parameter.
…ate_optimizer` function in `modules/util/create.py`.

* **Remove `decouple` parameter:**
  - Remove `decouple` parameter from the `create_optimizer` function for different optimizer configurations.

* **Add `prodigy_steps` parameter:**
  - Add `prodigy_steps` parameter to the `create_optimizer` function for `PRODIGY_PLUS_SCHEDULE_FREE` optimizer configuration.
@@ -154,6 +159,11 @@ def create_dynamic_ui(

# Extract the keys for the selected optimizer
for index, key in enumerate(OPTIMIZER_DEFAULT_PARAMETERS[selected_optimizer].keys()):
    if selected_optimizer == Optimizer.PRODIGY_PLUS_SCHEDULE_FREE and key not in [

Owner commented:
Can you explain this change? I'm not quite sure if or why it's needed

Owner commented:

@saunderez Can you comment on this? I don't want to merge something if I don't understand why it's done

Contributor commented:

I think this is a side effect of setting the lr to 1.0 in the OPTIMIZER_DEFAULT_PARAMETERS and not wanting to present it as configurable.

            if selected_optimizer == Optimizer.PRODIGY_PLUS_SCHEDULE_FREE and key in [
                'lr'
            ]:
                continue

is really what it's doing here.

Given that the other learning-rate-free optimizers don't do this, it's probably better if lr is removed from Optimizer.PRODIGY_PLUS_SCHEDULE_FREE entirely, and then the conditional isn't necessary.

Out of scope for this change would be to fix this sharp edge for all the optimizers that expect an lr of 1.0.
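
A hypothetical sketch of that broader fix (editorial illustration, not code from this PR): keep one collection of learning-rate-free optimizers and skip the lr key for all of them inside create_dynamic_ui, rather than special-casing PRODIGY_PLUS_SCHEDULE_FREE.

    # Illustrative only: LR_FREE_OPTIMIZERS is a hypothetical constant, not existing OneTrainer code.
    LR_FREE_OPTIMIZERS = {
        Optimizer.PRODIGY_PLUS_SCHEDULE_FREE,
        # ...any other optimizers that expect lr to stay fixed at 1.0
    }

    for index, key in enumerate(OPTIMIZER_DEFAULT_PARAMETERS[selected_optimizer].keys()):
        # Hide the lr widget for every learning-rate-free optimizer instead of
        # special-casing a single one.
        if key == 'lr' and selected_optimizer in LR_FREE_OPTIMIZERS:
            continue
        # ...build the UI row for this key as before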

@Koratahiu

File "OneTrainer\modules\trainer\GenericTrainer.py", line 539, in __apply_fused_back_pass
    for param_group in self.model.optimizer.param_groups:
AttributeError: 'NoneType' object has no attribute 'param_groups'

pretty broken
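
One possible minimal guard, sketched only from the traceback above (it surfaces the problem with a clear error rather than fixing why model.optimizer is still None when the fused backward pass hook runs):

    # Sketch only: guard before the loop shown in the traceback, so a missing
    # optimizer fails with a descriptive message instead of an AttributeError.
    if self.model.optimizer is None:
        raise RuntimeError(
            "Fused backward pass requires the optimizer to be created first; "
            "model.optimizer is still None at this point."
        )
    for param_group in self.model.optimizer.param_groups:
        ...  # existing fused-back-pass logic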

  This commit adds a variant of the Prodigy Optimizer
  created by LoganBooker
  https://github.com/LoganBooker/prodigy-plus-schedule-free/tree/main

  It is both learning rate free and schedule free.
  It also contains experimental optimization techniques
  as well as general memory usage and performance
  improvements.

  Use with constant scheduler

  Based on code from:
  https://github.com/facebookresearch/schedule_free
  https://github.com/konstmish/prodigy

  Incorporates improvements from these pull requests (credit to https://github.com/dxqbYD and https://github.com/sangoi-exe):
  konstmish/prodigy#23
  konstmish/prodigy#22
  konstmish/prodigy#20

  Supports fused backwards pass.
  Experimental features:
  ADOPT https://arxiv.org/abs/2411.02853
  Cautious https://arxiv.org/pdf/2411.16085
  MuonPP https://github.com/KellerJordan/Muon/blob/master/muon.py
  StableAdamW https://optimi.benjaminwarner.dev/optimizers/stableadamw/

  Probably some other stuff I forgot. For full details, see the repository linked above.
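
For reference, a minimal usage sketch outside OneTrainer (not part of this PR); the import path and constructor arguments are assumptions based on the upstream project's README and should be checked against the installed prodigy-plus-schedule-free package:

    import torch

    # Import path assumed from the upstream README; verify against the installed package.
    from prodigyplus.prodigy_plus_schedulefree import ProdigyPlusScheduleFree

    model = torch.nn.Linear(128, 10)

    # lr stays at 1.0 because the optimizer adapts the step size itself (learning rate free),
    # and no LR scheduler is used because it is also schedule free (constant schedule).
    optimizer = ProdigyPlusScheduleFree(model.parameters(), lr=1.0)

    for _ in range(10):
        optimizer.zero_grad()
        loss = model(torch.randn(4, 128)).sum()
        loss.backward()
        optimizer.step()

As with other schedule-free optimizers, switching the optimizer between train and eval modes around validation (per the upstream README) may also be needed; the sketch omits that.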
@saunderez saunderez force-pushed the add-prodigy-plus-schedule-free branch from 2e934c9 to 878c9c2 on December 30, 2024 at 18:07
use_adopt: bool
prodigy_steps: int
use_adopt: bool
use_cautious: bool

Contributor commented:
Is this supposed to be here twice?

'use_cautious': {'title': 'Use Cautious', 'tooltip': 'Experimental. Perform "cautious" updates, as proposed in https://arxiv.org/pdf/2411.16085. Recommended: False', 'type': 'bool'},
'use_adopt': {'title': 'Use ADOPT', 'tooltip': 'Experimental. Partial implementation of (https://arxiv.org/abs/2411.02853). Recommended: False', 'type': 'bool'},
'lr': {'title': 'Learning Rate', 'tooltip': 'Learning rate adjustment parameter. Increases or decreases the Prodigy learning rate. Recommended: 1.0', 'type': 'float'},
'weignt_decay_by_lr': {'title': 'Weight Decay by LR', 'tooltip': 'If True, weight_decay is multiplied by the adaptive learning rate. Recommended: True', 'type': 'bool'},

Contributor commented:
weignt_decay_by_lr -> weight_decay_by_lr

Contributor commented:
Actually this is in a couple of places.

data.append(("use_muon_pp", False, bool, False))
data.append(("use_cautious", False, bool, False))
data.append(("use_adopt", False, bool, False))
data.append(("prodigy_steps", 0, int, False))

Contributor commented:
missing weight_decay_by_lr
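
Presumably the missing entry would follow the same pattern as the lines above, with the default of True taken from the tooltip earlier in the diff (a sketch, not a tested change):

    data.append(("weight_decay_by_lr", True, bool, False))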

@@ -34,6 +34,7 @@ lion-pytorch==0.2.2 # lion optimizer
prodigyopt==1.0 # prodigy optimizer
schedulefree==1.3.0 # schedule-free optimizers
pytorch_optimizer==3.3.0 # pytorch optimizers
prodigy-plus-schedule-free==1.8.0

Contributor commented:
1.9.0 got released rather recently. Only interface change is an extra parameter, factored_fp32 with a default of True.
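
If the pin is bumped to 1.9.0, the new parameter would presumably also need a matching defaults/tooltip entry; a hypothetical sketch following the existing pattern (tooltip wording is illustrative, the default of True comes from the comment above):

    'factored_fp32': {'title': 'Factored FP32', 'tooltip': 'New in 1.9.0; see the upstream project for details. Recommended: True', 'type': 'bool'},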

@O-J1 O-J1 added the followup label (Failure to provide config or other info or needs followup) on Feb 9, 2025