CUDA: add FP32 FlashAttention vector kernel #1324

Annotations: 5 warnings

bench-server-baseline (phi-2, q4_0): succeeded May 11, 2024 in 14m 10s