CUDA: generalize FP16 fattn vec kernel #7061
+351
−197
Merged