Update XFormers Attention Monkeypatch to handle Llama-2 70B (GQA) #339

Merged: 4 commits into axolotl-ai-cloud:main on Aug 6, 2023

Conversation

@ssmi153 (Contributor) commented Aug 5, 2023

This is a fix for issue #338. It repairs the XFormers attention monkeypatch. I've tested it with Llama-2 70B and Llama-2 13B, and both start training without errors.

Notes:

  1. I haven't yet tested this with a complete finetune from start to finish; that will take several days.
  2. We still need to fix the FlashAttention monkeypatch as well. I've been working on it but have run into a number of errors, so I'm pushing this change out in the meantime.
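
For context on the change itself: the GQA handling borrowed from Transformers centers on a repeat_kv helper that expands the key/value heads to match the query head count before attention is computed. Below is a minimal, illustrative sketch of that helper (a reimplementation for explanation, not the exact patched monkeypatch code; the example shapes assume Llama-2 70B's 64 query heads and 8 key/value heads):

```python
import torch


def repeat_kv(hidden_states: torch.Tensor, n_rep: int) -> torch.Tensor:
    """Repeat key/value heads n_rep times so a GQA model's K/V tensors
    match the number of query heads.

    (batch, num_key_value_heads, seq_len, head_dim)
        -> (batch, num_key_value_heads * n_rep, seq_len, head_dim)
    """
    batch, num_kv_heads, seq_len, head_dim = hidden_states.shape
    if n_rep == 1:  # plain multi-head attention (7B/13B): nothing to do
        return hidden_states
    hidden_states = hidden_states[:, :, None, :, :].expand(
        batch, num_kv_heads, n_rep, seq_len, head_dim
    )
    return hidden_states.reshape(batch, num_kv_heads * n_rep, seq_len, head_dim)


# Illustrative shapes for Llama-2 70B: 64 query heads, 8 key/value heads.
key_states = torch.randn(1, 8, 16, 128)
key_states = repeat_kv(key_states, n_rep=64 // 8)
print(key_states.shape)  # torch.Size([1, 64, 16, 128])
```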

ssmi153 added 2 commits August 5, 2023 11:01
Updated XFormers MonkeyPatch to handle GQA as used in Llama-2 70B. All the updated code is taken directly from the Transformers library: huggingface/transformers@07360b6#diff-06392bad3b9e97be9ade60d4ac46f73b6809388f4d507c2ba1384ab872711c51 from their modeling_llama.py file.
@ssmi153 (Contributor, Author) commented Aug 5, 2023

Maybe hold off on merging this: I just found that while it runs Llama-2 13B without errors, the losses are completely different (starting at 8 rather than 1). I need to look more closely into why, as the patch shouldn't change anything for the 13B model.

ssmi153 added a commit (Whitespace bug fix): a command had accidentally been moved out of the if-else block.
@ssmi153 (Contributor, Author) commented Aug 5, 2023

OK, the most recent commit fixes the difference in losses. It was a whitespace issue (darn Python!). The patch now performs exactly as it did previously for 13B (and 7B) models without GQA, and also works for 70B models with GQA. It's ready for review and merging.
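
To illustrate the class of bug (a purely hypothetical Python example, not the actual patch): dedenting a statement out of an if/else silently changes when it runs, which is exactly the kind of slip that shifts a loss curve without raising an error.

```python
# Intended: the extra adjustment belongs only to the first branch.
def project_intended(x: float, tensor_parallel: bool) -> float:
    if tensor_parallel:
        x = x / 2   # shard-split path
        x = x + 10  # belongs only to this branch
    else:
        x = x * 3   # ordinary path
    return x


# Buggy: "x = x + 10" was dedented out of the if/else, so it now also runs
# on the ordinary path and every result is silently shifted.
def project_buggy(x: float, tensor_parallel: bool) -> float:
    if tensor_parallel:
        x = x / 2
    else:
        x = x * 3
    x = x + 10
    return x


print(project_intended(4.0, False))  # 12.0
print(project_buggy(4.0, False))     # 22.0
```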

@winglian (Collaborator) left a review comment


thank you!

@winglian (Collaborator) commented Aug 5, 2023

@ssmi153 it looks like there are some pre-commit formatting issues. Running pre-commit run --all-files should fix them.

@ssmi153 (Contributor, Author) commented Aug 6, 2023 via email

@winglian (Collaborator) commented Aug 6, 2023

@ssmi153 you could cherry-pick this commit 9793faf

@ssmi153 (Contributor, Author) commented Aug 6, 2023

Champion! I've pulled your changes into the pull request (probably in the most awkward way possible, sorry; it's been a few years since I last committed code, and our last setup used Mercurial, so I'm still getting my head around GitHub's workflows).

@winglian (Collaborator) commented Aug 6, 2023

Thanks @ssmi153. This is very much appreciated and wanted.

@winglian merged commit 10405b9 into axolotl-ai-cloud:main on Aug 6, 2023
mkeoliya pushed a commit to mkeoliya/axolotl that referenced this pull request on Dec 15, 2023 (axolotl-ai-cloud#339):

* Fix XFormers attention for Llama-2 70B (GQA)

Updated XFormers MonkeyPatch to handle GQA as used in Llama-2 70B. All the updated code is taken directly from the Transformers library: huggingface/transformers@07360b6#diff-06392bad3b9e97be9ade60d4ac46f73b6809388f4d507c2ba1384ab872711c51 from their modeling_llama.py file.

* Catch configs without pretraining_tp

* Whitespace bug fix

Command had accidentally been moved out of the if-else block.

* pre-commit formatting fixes

Thanks to @winglian
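
As an aside on the "Catch configs without pretraining_tp" commit above: older or externally converted Llama configs may not define pretraining_tp, so the guard amounts to reading the attribute with a default of 1. A minimal sketch of the idea (hypothetical helper name, not the actual patch):

```python
from transformers import LlamaConfig


def get_pretraining_tp(config) -> int:
    """Return config.pretraining_tp, falling back to 1 (no tensor-parallel
    pretraining splits) when the config predates the attribute."""
    return getattr(config, "pretraining_tp", 1)


config = LlamaConfig()
print(get_pretraining_tp(config))  # 1, unless the config sets pretraining_tp
```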
djsaunde pushed a commit that referenced this pull request on Dec 17, 2024, with the same commit message as above.