scale_by_adam issue #33
Comments
Nevermind, I understand now that it changes. So the issue is just with `OrthoAdamW`:

```python
class OrthoAdamW(C.BaseOpt):
    def __init__(self, params, lr=0.0025, betas=(0.9, 0.99), eps=1e-8, weight_decay=0, warmup_steps=0,
                 foreach: bool = True, storage_dtype: str = 'float32', mars: bool = False, caution: bool = False,
                 mars_gamma: float = 0.0025, gradient_clipping: C.str_or_fn = C.use_default,
                 update_clipping: C.str_or_fn = C.use_default, palm: bool = C.use_default, beta2_scale: float = 0.8):
        defaults = locals()
        defaults.pop("self")
        params = defaults.pop("params")
        super().__init__(params, defaults, foreach, gradient_clipping, update_clipping, palm,
                         C.scale_by_adam, C.orthogonalize_grad_to_param)
```
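For reference, here is roughly how the "which transform actually runs" check can be done; the `logged` wrapper below is my own illustration, not a heavyball helper, and it assumes both transforms on `C` are plain callables:

```python
# Rough sketch (not heavyball code): patch both transform functions on `C`
# with a logging wrapper *before* constructing the optimizer, then take a
# step and see which names get printed.
import functools

def logged(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        print(f"{getattr(fn, '__name__', fn)} called")
        return fn(*args, **kwargs)
    return wrapper

C.scale_by_adam = logged(C.scale_by_adam)
C.orthogonalize_grad_to_param = logged(C.orthogonalize_grad_to_param)

# With OrthoAdamW, only "orthogonalize_grad_to_param called" ever shows up,
# which matches the behaviour described in the report below.
```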
Good find, and thank you for the detailed report! I've added
Hi!
I was testing orthograd and found this: `scale_by_adam` causes `TypeError: adam_() takes 6 positional arguments but 7 were given`, since there is no epsilon argument in `adam_`. However, `heavyball.OrthoAdamW` doesn't seem to error. Well, I added a print to `scale_by_adam` and it never actually gets called when using `OrthoAdamW`; only `orthogonalize_grad_to_param` gets called. There might be something wrong there.
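In case it helps, a stripped-down illustration of the argument mismatch; the signatures below are placeholders, not heavyball's real functions:

```python
# Placeholder functions (not heavyball's actual code) showing how forwarding
# eps as an extra positional argument triggers the same TypeError.
def adam_(exp_avg, exp_avg_sq, grad, beta1, beta2, step):
    # stand-in kernel: exactly 6 positional parameters, no eps
    return grad

def scale_by_adam(exp_avg, exp_avg_sq, grad, beta1, beta2, step, eps):
    # passing eps through positionally raises:
    # TypeError: adam_() takes 6 positional arguments but 7 were given
    return adam_(exp_avg, exp_avg_sq, grad, beta1, beta2, step, eps)

scale_by_adam(0.0, 0.0, 1.0, 0.9, 0.99, 1, 1e-8)
```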