
lr_scheduler.step() should pass epoch variable to be consistent with PyTorch _LRScheduler #1333

Closed
thinline72 opened this issue Apr 1, 2020 · 3 comments
Labels: bug (Something isn't working), help wanted (Open to be worked on)

@thinline72

The PyTorch _LRScheduler base class for schedulers supports passing an epoch parameter to the lr_scheduler.step() function.

But when the Lightning training loop updates learning rates, it doesn't pass the current epoch index. That leads to wrong behaviour for schedulers that rely on the epoch param.

Some schedulers handle that under the hood, like CosineAnnealingWarmRestarts from PyTorch, which increments the last epoch index if epoch isn't provided. In my case I have a custom scheduler that updates the learning rate after each batch via "interval": "step", and it also uses the current epoch index. I can create a workaround for sure, but thought it'd be nice to get it fixed in PyTorch Lightning too. Plus it wasn't obvious to me until I logged the lr into TensorBoard.
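For context, a scheduler stepped after every batch is wired up in Lightning roughly like this (a sketch only; `MyCustomScheduler` is a hypothetical stand-in for the custom scheduler described above):

```python
import torch

# Sketch of a LightningModule hook; MyCustomScheduler is a hypothetical
# stand-in for the reporter's custom scheduler.
def configure_optimizers(self):
    optimizer = torch.optim.SGD(self.parameters(), lr=0.1)
    scheduler = {
        "scheduler": MyCustomScheduler(optimizer),  # hypothetical scheduler
        "interval": "step",  # Lightning calls scheduler.step() after every batch
    }
    return [optimizer], [scheduler]
```

With `"interval": "step"`, Lightning calls `scheduler.step()` once per batch, but (as reported here) without passing the current epoch index.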

@thinline72 thinline72 added bug Something isn't working help wanted Open to be worked on labels Apr 1, 2020
@github-actions
Contributor

github-actions bot commented Apr 1, 2020

Hi! Thanks for your contribution, great first issue!

@thinline72
Author

thinline72 commented Apr 1, 2020

Actually, _LRScheduler itself also handles that by incrementing the last epoch index if epoch isn't provided.

Then it's an issue for all schedulers that update the lr after each batch (i.e. with "interval": "step") and rely on self.last_epoch. Since self.last_epoch is incremented after each batch, it effectively becomes the last step index instead of the last epoch index.
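The fallback behaviour being described can be sketched with a pure-Python mock (not the real torch.optim class, just its documented increment-when-epoch-is-None logic):

```python
# Minimal mock of _LRScheduler's last_epoch bookkeeping, illustrating why
# per-batch stepping turns last_epoch into a step counter.
class MockLRScheduler:
    def __init__(self, last_epoch=-1):
        self.last_epoch = last_epoch
        self.step()  # mirrors _LRScheduler.__init__, which calls step() once

    def step(self, epoch=None):
        if epoch is None:
            # No epoch given: fall back to incrementing the counter.
            self.last_epoch += 1
        else:
            self.last_epoch = epoch

sched = MockLRScheduler()
# Calling step() once per batch ("interval": "step") over a single
# 10-batch epoch leaves last_epoch at 10, not 1:
for _ in range(10):
    sched.step()
assert sched.last_epoch == 10
```

So a scheduler whose schedule is written in terms of `last_epoch` ends up keyed to batch count, unless the caller passes the epoch explicitly.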

@thinline72
Author

thinline72 commented Apr 1, 2020

Oh, I see that PyTorch now has a deprecation warning for the epoch param. They still increment self.last_epoch at each step, though.

Anyway, it probably doesn't make sense to fix this, so I'll close the issue. At least it'll be available in search if someone gets stuck like I did today :)

@Borda Borda added this to the 0.7.2 milestone Apr 2, 2020
@Borda Borda modified the milestones: v0.7., v0.7.x Apr 18, 2021