Update XFormers Attention Monkeypatch to handle Llama-2 70B (GQA) #339
Conversation
Updated the XFormers monkeypatch to handle GQA as used in Llama-2 70B. All of the updated code is taken directly from the Transformers library's modeling_llama.py file: huggingface/transformers@07360b6#diff-06392bad3b9e97be9ade60d4ac46f73b6809388f4d507c2ba1384ab872711c51
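For context, here is a minimal sketch of the GQA handling that the Transformers code introduces (the repeat_kv helper follows modeling_llama.py; the surrounding attention code is omitted, so treat this as an illustration rather than the exact patch):

```python
import torch


def repeat_kv(hidden_states: torch.Tensor, n_rep: int) -> torch.Tensor:
    """Repeat key/value heads so (batch, num_kv_heads, seq_len, head_dim)
    becomes (batch, num_kv_heads * n_rep, seq_len, head_dim)."""
    batch, num_kv_heads, slen, head_dim = hidden_states.shape
    if n_rep == 1:
        return hidden_states  # no-op for models without GQA (7B / 13B)
    hidden_states = hidden_states[:, :, None, :, :].expand(
        batch, num_kv_heads, n_rep, slen, head_dim
    )
    return hidden_states.reshape(batch, num_kv_heads * n_rep, slen, head_dim)


# Llama-2 70B uses 64 query heads but only 8 key/value heads, so the patched
# attention repeats K and V 8x before the memory-efficient attention call:
key_states = torch.randn(1, 8, 16, 128)            # (batch, kv_heads, seq, head_dim)
key_states = repeat_kv(key_states, n_rep=64 // 8)
print(key_states.shape)                            # torch.Size([1, 64, 16, 128])
```

The same expansion is applied to the value states, after which xformers.ops.memory_efficient_attention is called as before.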
Maybe hold off on committing this - I just found that while it runs Llama-2-13B properly, the losses are completely different (starting at 8 rather than at 1). I need to look more closely into why, as this change shouldn't affect the 13B model at all.
The command had accidentally been moved out of the if-else block.
OK, this most recent commit fixes the difference in losses. It was a whitespace issue - darn Python! The patch now performs exactly as it did previously for 13B (and 7B) models without GQA, and it also works for 70B models with GQA. It's ready for review and merging.
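To illustrate the class of bug (a toy example added for clarity, not the actual patched attention code): in Python, dedenting a single line silently moves it out of its if-else block, so it runs on every path and changes the result.

```python
# Hypothetical example of the indentation pitfall described above.
def scale(x: float, fast_path: bool) -> float:
    if fast_path:
        x = x * 2
    else:
        x = x * 3
    x = x * 3  # dedented by mistake: now runs after *both* branches
    return x


print(scale(1.0, fast_path=True))  # 6.0 instead of the intended 2.0
```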
thank you!
@ssmi153 looks like there are some pre-commit formatting issues. pre-commit run --all-files should fix it.
OK, I'll look into this. Struggling to get pre-commit to work on Windows at the moment. I'll sort this out as soon as I can.
Thanks to @winglian
Champion! I've pulled your changes into the pull request (probably in the most awkward way possible, sorry - it's been a few years since I last committed code, and our last setup used Mercurial, so I'm just getting my head around GitHub's workflows).
Thanks @ssmi153. This is very much appreciated and wanted.
…olotl-ai-cloud#339)

* Fix XFormers attention for Llama-2 70B (GQA): updated the XFormers monkeypatch to handle GQA as used in Llama-2 70B. All of the updated code is taken directly from the Transformers library's modeling_llama.py file: huggingface/transformers@07360b6#diff-06392bad3b9e97be9ade60d4ac46f73b6809388f4d507c2ba1384ab872711c51
* Catch configs without pretraining_tp
* Whitespace bug fix: the command had accidentally been moved out of the if-else block.
* pre-commit formatting fixes: thanks to @winglian
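One of those commits, "Catch configs without pretraining_tp", amounts to reading the attribute defensively and falling back to a sensible default when the config predates it. A hedged sketch of the idea (the real change lives inside the patched attention forward, so the names here are illustrative):

```python
from transformers import LlamaConfig

# Illustrative only: older Llama configs may not define pretraining_tp, so read
# it with a default of 1 (no tensor-parallel weight slicing) instead of letting
# an AttributeError surface mid-forward.
config = LlamaConfig()
pretraining_tp = getattr(config, "pretraining_tp", 1)

if pretraining_tp > 1:
    # split the q/k/v projection weights into pretraining_tp slices
    print(f"using sliced projections across {pretraining_tp} partitions")
else:
    # the common path: single fused q/k/v projections
    print("using the standard fused projections")
```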
This is a fix for this issue: #338. It fixes the XFormersAttention monkeypatch. I've tested this with Llama-2 70B and Llama-2 13B, and both are able to start training properly without any errors.
Notes: