Flash Attention support #78
Merged
Conversation
Starting from the PyTorch dev image instead of the Lightning image. This is important because flash-attn requires cudatoolkit-dev, which would otherwise need conda; it is easier to start from a dev Docker container. Added an install step for flash-attn.
The current install still works for those who don't need flash-attn, since the flash-attn imports live inside the FlashAttention function itself (see the sketch below).
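A minimal sketch of the deferred-import pattern described above (the class body is illustrative, not the exact gReLU implementation): the flash-attn import sits inside the forward call, so an environment without flash-attn only fails if flash attention is actually used.

```python
import torch
import torch.nn as nn


class FlashAttention(nn.Module):
    """Multi-head attention that defers the flash-attn import until it is called."""

    def __init__(self, dim: int, n_heads: int) -> None:
        super().__init__()
        self.n_heads = n_heads
        self.head_dim = dim // n_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Importing here (not at module level) means `pip install gReLU`
        # works without flash-attn; the package is only needed when this
        # module is actually called.
        from flash_attn import flash_attn_qkvpacked_func

        b, t, _ = x.shape
        # flash-attn expects fp16/bf16 tensors of shape (batch, seqlen, 3, heads, head_dim)
        qkv = self.qkv(x).view(b, t, 3, self.n_heads, self.head_dim)
        out = flash_attn_qkvpacked_func(qkv)  # (batch, seqlen, heads, head_dim)
        return self.out(out.reshape(b, t, -1))
```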
avantikalal requested changes on Oct 25, 2024
…ransformerBlock as optional and reordered them
for more information, see https://pre-commit.ci
avantikalal approved these changes on Oct 26, 2024
Adds a `flash_attn` flag in `BorzoiModel` that switches to flash attention. The Dockerfile includes `flash-attn`. `pip install gReLU` still works if `flash-attn` is not required, as it is only imported when called. Solves #64.
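A hedged usage sketch of the new flag; the import path and the assumption that other constructor arguments keep their defaults are not taken from this PR, only the `flash_attn` keyword is:

```python
from grelu.model.models import BorzoiModel  # import path is an assumption

# Build the model with flash attention enabled; with flash_attn=False (or the
# flag omitted) the standard attention path is used, so flash-attn does not
# need to be installed in that case.
model = BorzoiModel(flash_attn=True)
```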