This repository was archived by the owner on Oct 11, 2024. It is now read-only.

[Bugfix] Fix marlin 2:4 kernel crash on H100 #243

Merged
merged 1 commit into from
May 16, 2024

mgoin (Member) commented May 15, 2024

The crash was caused by the inline PTX assembly that implemented async_copy with streaming behavior. The fix is to use the more standard PTX for async_copy, which omits the fractional L2 cache policy for "evict_first". There is no performance difference between the standard async_copy PTX and the previous version.
Ported from dense marlin: vllm-project#4218
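
For readers unfamiliar with the two PTX forms, the following is a minimal sketch of the change described above, not the diff from this PR. The helper names (`cp_async4_stream`, `cp_async4`, `cp_async_fence`, `cp_async_wait_all`), the 16-byte copy size, and the toy `copy_kernel` are assumptions chosen for illustration; the streaming variant is shown only for comparison and is never called.

```cuda
#include <cstdint>
#include <cstdio>
#include <cuda_runtime.h>

// Streaming variant (illustrative, unused): 16-byte async copy that attaches
// an L2 cache hint built from a fractional "evict_first" policy.
__device__ inline void cp_async4_stream(void* smem_ptr, const void* glob_ptr) {
  const int BYTES = 16;
  uint32_t smem = static_cast<uint32_t>(__cvta_generic_to_shared(smem_ptr));
  asm volatile(
      "{\n"
      "   .reg .b64 p;\n"
      "   createpolicy.fractional.L2::evict_first.b64 p, 1.0;\n"
      "   cp.async.cg.shared.global.L2::cache_hint [%0], [%1], %2, p;\n"
      "}\n" ::"r"(smem), "l"(glob_ptr), "n"(BYTES));
}

// Standard variant: the same 16-byte async copy with no cache policy.
__device__ inline void cp_async4(void* smem_ptr, const void* glob_ptr) {
  const int BYTES = 16;
  uint32_t smem = static_cast<uint32_t>(__cvta_generic_to_shared(smem_ptr));
  asm volatile(
      "{\n"
      "   cp.async.cg.shared.global [%0], [%1], %2;\n"
      "}\n" ::"r"(smem), "l"(glob_ptr), "n"(BYTES));
}

// Commit the outstanding async copies as one group, then wait for all groups.
__device__ inline void cp_async_fence() {
  asm volatile("cp.async.commit_group;\n" ::);
}
__device__ inline void cp_async_wait_all() {
  asm volatile("cp.async.wait_all;\n" ::);
}

// Toy kernel: each thread stages one int4 through shared memory.
__global__ void copy_kernel(const int4* in, int4* out) {
  __shared__ int4 tile[32];
  cp_async4(&tile[threadIdx.x], &in[threadIdx.x]);
  cp_async_fence();
  cp_async_wait_all();
  __syncthreads();
  out[threadIdx.x] = tile[threadIdx.x];
}

int main() {
  const int n = 32;
  int4 h_in[n], h_out[n];
  for (int i = 0; i < n; ++i) h_in[i] = make_int4(i, i + 1, i + 2, i + 3);

  int4 *d_in = nullptr, *d_out = nullptr;
  cudaMalloc(&d_in, sizeof(h_in));
  cudaMalloc(&d_out, sizeof(h_out));
  cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);

  copy_kernel<<<1, n>>>(d_in, d_out);
  cudaError_t err = cudaDeviceSynchronize();
  cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);

  // Check that every element round-tripped through shared memory unchanged.
  bool ok = (err == cudaSuccess);
  for (int i = 0; i < n && ok; ++i)
    ok = h_out[i].x == h_in[i].x && h_out[i].y == h_in[i].y &&
         h_out[i].z == h_in[i].z && h_out[i].w == h_in[i].w;
  printf("%s\n", ok ? "copy ok" : "copy failed");

  cudaFree(d_in);
  cudaFree(d_out);
  return ok ? 0 : 1;
}
```

Both helpers use `cp.async`, which requires compute capability 8.0 or newer, so a build along the lines of `nvcc -arch=sm_80` (or `sm_90` for H100) is assumed. The only difference between the two variants is the `createpolicy` instruction and the `.L2::cache_hint` qualifier.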

mgoin merged commit 84a8ea1 into main on May 16, 2024
12 checks passed
mgoin deleted the marlin-24-h100-fix branch on May 16, 2024, 14:24
mgoin added a commit that referenced this pull request on May 16, 2024