-
-
Notifications
You must be signed in to change notification settings - Fork 972
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support for custom lr groups for non-embedding modules #2213
Conversation
src/axolotl/core/trainer_builder.py
Outdated
if lr_groups_lookup and any( | ||
group_modules in name for group_modules in lr_groups_lookup | ||
): | ||
lr_group_module = [ | ||
group_modules | ||
for group_modules in lr_groups_lookup | ||
if group_modules in name | ||
][0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can refactor the check a bit. It seems like we're creating the list twice. The if check is not necessary as the variable would be empty otherwise
Perhaps
lr_group_modules = [group_modules for group_modules in lr_groups_lookup if group_modules in name]
if any(lr_group_modules):
# use lr_group_modules
src/axolotl/core/trainer_builder.py
Outdated
group_modules | ||
for group_modules in lr_groups_lookup | ||
if group_modules in name | ||
][0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we're taking index 0 only? Should we add a check that assert len(..) == 1
?
invert name check for group modules include lr_groups in training args additional conditional for creating optimizer fix regular params as w weight decay fix lookup and add docs
…oud#2213) * support for custom lr groups for non-embedding modules invert name check for group modules include lr_groups in training args additional conditional for creating optimizer fix regular params as w weight decay fix lookup and add docs * address pr feedback
No description provided.