Add ragged paged attention #8659

Merged: 9 commits, Feb 4, 2025

Conversation

vanbasten23
Collaborator

@vanbasten23 vanbasten23 commented Jan 31, 2025

Test plan:

LIBTPU_INIT_ARGS=--xla_tpu_scoped_vmem_limit_kib=65536  python /workspaces/persist/pytorch/xla/test/test_ragged_paged_attention_kernel.py 2>&1 | tee out.txt
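The command above sets the flag inline for a single shell invocation. When launching the test from a Python driver instead, the same environment can be built programmatically. The following is only a sketch: the helper name `with_vmem_limit` is hypothetical, and it assumes that `LIBTPU_INIT_ARGS` holds space-separated flags, so any flags already present are kept rather than overwritten.

```python
import os


def with_vmem_limit(env, limit_kib):
    """Return a copy of env with the scoped VMEM flag appended to LIBTPU_INIT_ARGS.

    Hypothetical helper; assumes LIBTPU_INIT_ARGS is a space-separated flag list.
    """
    flag = f"--xla_tpu_scoped_vmem_limit_kib={limit_kib}"
    new_env = dict(env)
    existing = new_env.get("LIBTPU_INIT_ARGS", "").strip()
    # Append rather than replace, so flags set elsewhere survive.
    new_env["LIBTPU_INIT_ARGS"] = f"{existing} {flag}".strip()
    return new_env


env = with_vmem_limit(os.environ, 65536)
print(env["LIBTPU_INIT_ARGS"])
```

The resulting `env` mapping could then be passed to `subprocess.run(..., env=env)` when invoking the test script.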

cc: @miladm

@bythew3i

> Test plan:
>
> LIBTPU_INIT_ARGS=--xla_tpu_scoped_vmem_limit_kib=65536  python /workspaces/persist/pytorch/xla/test/test_ragged_paged_attention_kernel.py 2>&1 | tee out.txt

How is 65536 calculated?

@vanbasten23
Collaborator Author

> > Test plan:
> >
> > LIBTPU_INIT_ARGS=--xla_tpu_scoped_vmem_limit_kib=65536  python /workspaces/persist/pytorch/xla/test/test_ragged_paged_attention_kernel.py 2>&1 | tee out.txt
>
> How is 65536 calculated?

I found a ticket where someone uses this value. As I remember it, the number is the VMEM limit on one TPU generation.
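For reference, going by the `_kib` suffix in the flag name, the value is a count of KiB, so 65536 KiB works out to 64 MiB. A minimal sketch of that unit conversion follows; the helper names are hypothetical, and the arithmetic does not by itself confirm which TPU generation the limit corresponds to.

```python
# Hypothetical helpers: convert a KiB count, as passed to
# --xla_tpu_scoped_vmem_limit_kib, into bytes and MiB.
def kib_to_bytes(kib: int) -> int:
    return kib * 1024


def kib_to_mib(kib: int) -> float:
    return kib / 1024


limit_kib = 65536
limit_bytes = kib_to_bytes(limit_kib)  # 65536 * 1024 = 2**26 bytes
limit_mib = kib_to_mib(limit_kib)      # 65536 / 1024 = 64 MiB
print(limit_kib, "KiB =", limit_mib, "MiB =", limit_bytes, "bytes")
```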

@vanbasten23 vanbasten23 force-pushed the xiowei/add_ragged_paged_attention branch from ad2f87c to 9e4b227 Compare February 1, 2025 00:32
@vanbasten23 vanbasten23 force-pushed the xiowei/add_ragged_paged_attention branch from 9e4b227 to 7fe5071 Compare February 3, 2025 05:41
@vanbasten23 vanbasten23 requested review from lsy323 and miladm February 3, 2025 21:31
@vanbasten23
Collaborator Author

The "Build and test / CPU tests / test (benchmark_tests)" failure is unrelated to this PR. (PR #8668, which contains no changes, fails the same check.)

@miladm
Collaborator

miladm commented Feb 3, 2025

cc on-duty @lsy323 to assist with the CI test failure before we merge, @vanbasten23

@vanbasten23 vanbasten23 merged commit 8480094 into master Feb 4, 2025
11 of 12 checks passed