Flash Attention support #78

Merged
merged 13 commits into from
Oct 26, 2024

@suragnair (Collaborator) commented Oct 25, 2024

  • Added support for Flash Attention layers.
  • Added a flash_attn flag to BorzoiModel that switches the model to Flash Attention.
  • Updated the Dockerfile to start from the PyTorch dev image, which provides the cudatoolkit that flash-attn requires. The Dockerfile now also installs flash-attn.
  • Installation is a bit more involved, so I have updated the README with instructions. However, a simple pip install gReLU still works when flash-attn is not needed, because flash-attn is only imported when it is called.
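
The deferred-import behaviour described in the last bullet can be sketched roughly as follows. This is a minimal illustration of the general pattern, not the code merged in this PR: `load_optional`, `AttentionChooser`, and the `backend_module` parameter are hypothetical names invented for the example, and only the `flash_attn` flag name is taken from the description above.

```python
import importlib


def load_optional(module_name: str):
    """Import an optional dependency only at the point where it is needed."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        # Re-raise with a hint so the user knows an extra install step is missing.
        raise ImportError(
            f"{module_name} is required for this feature; install it separately."
        ) from err


class AttentionChooser:
    """Hypothetical stand-in for a model that exposes a flash_attn-style flag."""

    def __init__(self, flash_attn: bool = False, backend_module: str = "flash_attn"):
        self.flash_attn = flash_attn
        # Assumption: the module name is configurable here only for illustration.
        self.backend_module = backend_module

    def backend(self) -> str:
        if not self.flash_attn:
            # Default path: the optional package is never imported.
            return "standard"
        # Flag set: the import happens now and fails loudly if the package is absent.
        load_optional(self.backend_module)
        return "flash"
```

With this layout, importing the module and using the default path never touches the optional package, which is why a plain install keeps working. The import error surfaces only when the flag is enabled without the package installed.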

Solves #64

Surag and others added 6 commits October 21, 2024 15:39
Start from the PyTorch dev image instead of the Lightning image. This
matters because flash-attn requires cudatoolkit-dev, which needs conda,
so it is easier to start from a dev Docker container. Added an install
step for flash-attn.
The current install should still work for those who don't need flash-attn,
since the flash-attn imports live inside the FlashAttention function
itself.
@suragnair suragnair requested a review from avantikalal October 25, 2024 22:04
Dockerfile (resolved)
src/grelu/model/blocks.py (outdated, resolved)
src/grelu/model/blocks.py (outdated, resolved)
Surag and others added 2 commits October 25, 2024 17:24
@avantikalal avantikalal merged commit 31a2133 into main Oct 26, 2024
2 checks passed
@suragnair suragnair deleted the surag branch October 26, 2024 01:30
2 participants