CUDA: add FP32 FlashAttention vector kernel #11682

windows-latest-cmake (noavx, -DLLAMA_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DLLAMA_AVX=OFF -DLLAMA_A...

succeeded May 11, 2024 in 6m 5s
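
The title refers to a FlashAttention-style "vector" kernel in FP32, i.e. a kernel specialized for the case where each head attends with a single query row (typical for batch-size-1 decoding), using the one-pass online-softmax recurrence so K and V are read only once. The sketch below is illustrative only and is not the kernel added by this PR: the function name `attn_vec_f32`, the layouts, and the launch assumptions (one block per head, power-of-two block size, head dimension D at most 8 * blockDim.x) are assumptions made for the example.

```cuda
#include <cuda_runtime.h>
#include <math.h>

// Minimal sketch: single-query FP32 attention for one head.
// q: [D], K: [N, D], V: [N, D], out: [D]. One thread block per head;
// threads parallelize over D, the loop over the N KV positions is sequential.
__global__ void attn_vec_f32(const float *q, const float *K, const float *V,
                             float *out, int N, int D, float scale) {
    extern __shared__ float smem[];   // blockDim.x floats for the reduction
    const int tid = threadIdx.x;

    // Per-thread slice of the output accumulator (assumes D <= 8 * blockDim.x).
    float acc[8];
    const int per_thread = (D + blockDim.x - 1) / blockDim.x;
    for (int i = 0; i < per_thread; ++i) acc[i] = 0.0f;

    float m = -INFINITY;              // running maximum of the logits
    float l = 0.0f;                   // running softmax denominator

    for (int j = 0; j < N; ++j) {
        // Parallel dot product q . K[j] (blockDim.x must be a power of two).
        float partial = 0.0f;
        for (int d = tid; d < D; d += blockDim.x)
            partial += q[d] * K[j * D + d];
        smem[tid] = partial;
        __syncthreads();
        for (int s = blockDim.x / 2; s > 0; s >>= 1) {
            if (tid < s) smem[tid] += smem[tid + s];
            __syncthreads();
        }
        const float score = smem[0] * scale;
        __syncthreads();

        // Online-softmax update: rescale the running state if the max grows.
        const float m_new = fmaxf(m, score);
        const float corr  = expf(m - m_new);
        const float p     = expf(score - m_new);
        l = l * corr + p;
        for (int i = 0; i < per_thread; ++i) {
            const int d = tid + i * blockDim.x;
            if (d < D) acc[i] = acc[i] * corr + p * V[j * D + d];
        }
        m = m_new;
    }

    // Normalize and write the attention output for this head.
    for (int i = 0; i < per_thread; ++i) {
        const int d = tid + i * blockDim.x;
        if (d < D) out[d] = acc[i] / l;
    }
}
```

A caller would launch one block per head, e.g. `attn_vec_f32<<<n_heads, 128, 128 * sizeof(float)>>>(...)`, with each head's q/K/V/out pointers offset accordingly; the actual kernel in the PR is more general (multiple heads per block, templated head sizes, masking), but the rescaling recurrence above is the core idea of the FP32 vector path.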