🚀 Feature
When applicable, automatically use the sparse or blocksparse attention for causal attention. Right now users have to select these explicitly, even when the causal flag is passed, so a lot of people are likely missing the optimization.
Motivation
There is free performance on the table. Handling all the cases can be a little complex, but it would make sense to do this directly in xFormers.
Pitch
Sort out the applicable cases first. In those cases, defer to the sparse or blocksparse attention whenever ScaledDotProduct is used with the causal flag.
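To make the pitch more concrete, here is a minimal sketch of what "sort out the applicable cases, then defer" could look like. It is not xFormers code: `AttentionRequest`, `pick_causal_backend`, `causal_block_layout`, the block size, and the sequence-length threshold are all hypothetical names and values chosen for illustration, and the applicability rules are assumptions that would need to be checked against the real kernels.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AttentionRequest:
    """Hypothetical description of one attention call (not an xFormers type)."""
    seq_len: int
    causal: bool
    has_extra_mask: bool = False  # a user-supplied mask on top of the causal one
    on_gpu: bool = True


def pick_causal_backend(req: AttentionRequest, block_size: int = 32,
                        min_seq_len: int = 1024) -> str:
    """Return "dense", "sparse", or "blocksparse" for a request.

    All rules and thresholds below are illustrative assumptions.
    """
    # Without the causal flag there is no known sparsity pattern to exploit.
    if not req.causal:
        return "dense"
    # Assumption: sparse kernels only pay off on GPU and for long sequences.
    if not req.on_gpu or req.seq_len < min_seq_len:
        return "dense"
    # Assumption: an extra mask breaks the regular block structure,
    # so fall back to the more general sparse path.
    if req.has_extra_mask:
        return "sparse"
    # Assumption: blocksparse needs the sequence to tile evenly into blocks.
    if req.seq_len % block_size == 0:
        return "blocksparse"
    return "sparse"


def causal_block_layout(seq_len: int, block_size: int) -> List[List[int]]:
    """Lower-triangular block layout: 1 marks a block that must be computed."""
    n = seq_len // block_size
    return [[1 if col <= row else 0 for col in range(n)] for row in range(n)]
```

With `n` blocks per side, the causal layout keeps `n * (n + 1) / 2` of the `n * n` blocks, which is where the potential saving comes from on long sequences.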
Alternatives
Warn users about this possible optimization.
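A rough sketch of this alternative, using only the standard `warnings` module. The helper name `warn_if_dense_causal` and the `backend` string are hypothetical and do not correspond to an existing xFormers API.

```python
import warnings


def warn_if_dense_causal(causal: bool, backend: str) -> bool:
    """Hypothetical helper: warn when causal attention runs on the dense path.

    Returns True if a warning was emitted.
    """
    if causal and backend == "dense":
        warnings.warn(
            "causal=True is being computed with dense attention; "
            "a sparse or blocksparse attention may be faster.",
            UserWarning,
            stacklevel=2,  # point the warning at the caller, not this helper
        )
        return True
    return False
```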
Additional context