Describe the issue
Note: this is not a bug, but an inconsistent design.
In the current implementation of the learning rate (LR) scheduler configuration in DeepSpeed, the LR scheduler is always initialized from the configuration file if it is defined there, regardless of whether a scheduler is provided programmatically. This behavior leads to inconsistency with the optimizer, which can be overwritten programmatically even if it is defined in the configuration file.
Here is the current implementation of the LR scheduler configuration:
def _configure_lr_scheduler(self, client_lr_scheduler):
    # First check for scheduler in json configuration
    lr_scheduler = self._scheduler_from_config(self.optimizer)
    if lr_scheduler:
        log_dist(f"DeepSpeed using configured LR scheduler = {self.scheduler_name()}", ranks=[0])
        self.lr_scheduler = lr_scheduler
    else:
        if isinstance(client_lr_scheduler, Callable):
            log_dist('DeepSpeed using client callable to create LR scheduler', ranks=[0])
            self.lr_scheduler = client_lr_scheduler(self.basic_optimizer)
        else:
            log_dist('DeepSpeed using client LR scheduler', ranks=[0])
            self.lr_scheduler = client_lr_scheduler

    log_dist(f'DeepSpeed LR Scheduler = {self.lr_scheduler}', ranks=[0])
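With this implementation, a scheduler passed at the call site is ignored whenever the config also defines one. As a minimal sketch of how this surfaces to users (assuming a single-process run launched via the deepspeed launcher; the model, optimizer, and config values are hypothetical):

import torch
import deepspeed

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
client_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=100)

ds_config = {
    "train_batch_size": 8,
    "scheduler": {
        "type": "WarmupLR",
        "params": {"warmup_num_steps": 1000},
    },
}

engine, _, _, lr_scheduler = deepspeed.initialize(
    model=model,
    optimizer=optimizer,
    lr_scheduler=client_scheduler,  # silently ignored: the config scheduler wins
    config=ds_config,
)

# With the current behavior, lr_scheduler is the WarmupLR built from ds_config,
# not the StepLR passed above.
print(type(lr_scheduler))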
Expected behavior
We should be able to override the LR scheduler defined in the config.
Ideally, I would prefer something like:
def _configure_lr_scheduler(self, client_lr_scheduler):
    # First check for a client-provided scheduler
    if client_lr_scheduler:
        if isinstance(client_lr_scheduler, Callable):
            log_dist('DeepSpeed using client callable to create LR scheduler', ranks=[0])
            self.lr_scheduler = client_lr_scheduler(self.basic_optimizer)
        else:
            log_dist('DeepSpeed using client LR scheduler', ranks=[0])
            self.lr_scheduler = client_lr_scheduler
    else:
        # Fall back to the scheduler defined in the json configuration
        lr_scheduler = self._scheduler_from_config(self.optimizer)
        log_dist(f"DeepSpeed using configured LR scheduler = {self.scheduler_name()}", ranks=[0])
        self.lr_scheduler = lr_scheduler

    log_dist(f'DeepSpeed LR Scheduler = {self.lr_scheduler}', ranks=[0])
Why does the current design enforce the initialization of the LR scheduler from the configuration file if it is defined there, while allowing the optimizer to be overwritten programmatically?
This PR is based on #5726.
The current LR scheduler initialization always prioritizes the config over a
scheduler defined manually in code. However, the optimizer initialization
prioritizes a manually defined optimizer over the config. This PR makes the
initialization behavior consistent for both the optimizer and the LR
scheduler: if an LR scheduler is defined in code, it overrides the config.
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
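For reference, a usage-level sketch of the behavior this PR targets (hypothetical model, optimizer, and config values; the scheduler may be passed either as an instance or as a callable that receives the optimizer, matching the isinstance(client_lr_scheduler, Callable) branch above):

import torch
import deepspeed

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

ds_config = {
    "train_batch_size": 8,
    "scheduler": {"type": "WarmupLR", "params": {"warmup_num_steps": 1000}},
}

# Option 1: a scheduler instance; with this PR it takes precedence over the config.
client_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=100)

# Option 2: a callable that builds the scheduler from the optimizer DeepSpeed passes in.
def build_scheduler(opt):
    return torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=1000)

engine, _, _, lr_scheduler = deepspeed.initialize(
    model=model,
    optimizer=optimizer,
    lr_scheduler=client_scheduler,  # or lr_scheduler=build_scheduler
    config=ds_config,
)

# After this change, lr_scheduler is the client-provided scheduler; the config
# scheduler is only used when no scheduler is passed in code.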