-
Notifications
You must be signed in to change notification settings - Fork 27.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added use_liger flag to Trainer #32889
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work @shimizust ! Very excited to have this on transformers Trainer API ! Could you fix the CI tests by typing make style
or/and rebase from main + add a few simple tests to check that layers were indeed replaced and check that we are able to train without errors (check the TrainerIntegrationTest
in test_trainer.py file )?
docs/source/en/trainer.md
Outdated
) | ||
``` | ||
|
||
Currently, the kernel supports the Llama, Gemma, Mistral, and Mixtral model architectures. A list of supported models is defined [here](https://github.com/linkedin/Liger-Kernel/blob/main/src/liger_kernel/transformers/monkey_patch.py#L7). When use_liger is set to True, the corresponding layers in the original model will be patched with Liger's efficient implementation, so you don't need to do anything extra other than setting the argument value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it is better to link the readme and put the supported models there ? Right now, you are linking to a specific line in the monkey_patch
file, not sure if this will change in the future.
model_type = getattr(model, "config", None) and getattr(model.config, "model_type", None): | ||
if model_type: | ||
# Monkey patch the model with liger kernels. Use the default kernel configurations. | ||
_apply_liger_kernel(model_type=model_type) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice !
Quick question @shimizust , I see that there is already another PR opened here. Which one should we focus on ? |
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>
Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>
Hi @SunMarc thanks for the review! I'm going to incorporate your suggestions and merge into the original PR and close this one |
Using this PR instead: #32860 |
What does this PR do?
Fixes # (issue)
Before submitting
Pull Request section?
to it if that's the case.
documentation guidelines, and
here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.