Fix usage of unpad_input function (huggingface#35925)
Fix usage of the unpad_input function

See huggingface#35899

In the [commit](Dao-AILab/flash-attention@cdbbe84), the return type of `unpad_input` was changed.
The code now supports both the older and newer versions.

Co-authored-by: Pavel Gein <pavel.gein@gmail.com>
2 people authored and sbucaille committed Feb 14, 2025
1 parent 7345ee2 commit 8613d09
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion src/transformers/modeling_flash_attention_utils.py
@@ -121,7 +121,7 @@ def _upad_input(
     else:
         # The -q_len: slice assumes left padding.
         attention_mask = attention_mask[:, -query_length:]
-        query_layer, indices_q, cu_seqlens_q, max_seqlen_in_batch_q = unpad_input(query_layer, attention_mask)
+        query_layer, indices_q, cu_seqlens_q, max_seqlen_in_batch_q, *_ = unpad_input(query_layer, attention_mask)

     return (
         query_layer,
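
For context, a minimal sketch (not the actual flash-attention API; the stand-in functions and return values below are hypothetical) of why the trailing `*_` keeps this call compatible with both library versions: the starred target absorbs any extra trailing return values, so the same unpacking works whether `unpad_input` returns four values (older flash-attention) or more (newer).

```python
# Hypothetical stand-ins for the two return shapes of unpad_input.

def unpad_input_v1(hidden_states, attention_mask):
    # older flash-attention: returns 4 values
    return "unpadded", "indices", "cu_seqlens", "max_seqlen"


def unpad_input_v2(hidden_states, attention_mask):
    # newer flash-attention: returns an extra trailing value
    return "unpadded", "indices", "cu_seqlens", "max_seqlen", "extra"


for unpad_input in (unpad_input_v1, unpad_input_v2):
    # same unpacking pattern as in the patch; *_ swallows any extra trailing values
    query_layer, indices_q, cu_seqlens_q, max_seqlen_in_batch_q, *_ = unpad_input(None, None)
    print(query_layer, max_seqlen_in_batch_q)  # works with either return shape
```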
