[perf] Auto causal <> sparse #18

blefaudeux · 2021-10-21T00:08:41Z

🚀 Feature

When applicable, automatically use the sparse or blocksparse attention for causal attention. Right now people have to select these explicitly, even when the causal flag is passed, so a lot of users could miss the optimization.

Motivation

Free perf on the table. Handling all the cases can be a little complex, but it would make sense to do this directly in xFormers.

Pitch

Sort out the applicable cases first, then in those cases defer to the sparse or blocksparse implementation when ScaledDotProduct is used with the causal flag.
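To make the idea concrete, here is a rough sketch of what "sorting out the applicable cases" could look like. It is plain Python and not xFormers code: the function names, the `block_size` default, and the `min_blocks` threshold are all hypothetical and untuned. It only shows that a causal mask maps to a lower-triangular block layout and that a simple rule can decide when to fall back to the dense path.

```python
from typing import List


def causal_block_layout(seq_len: int, block_size: int) -> List[List[int]]:
    """Block-level layout of a causal (lower-triangular) mask.

    layout[i][j] == 1 means block (i, j) contains at least one unmasked
    entry and has to be computed; 0 means the block is fully masked.
    """
    if seq_len % block_size != 0:
        raise ValueError("seq_len must be a multiple of block_size")
    n_blocks = seq_len // block_size
    return [[1 if j <= i else 0 for j in range(n_blocks)] for i in range(n_blocks)]


def skipped_block_fraction(layout: List[List[int]]) -> float:
    """Fraction of blocks that a blocksparse kernel would not compute."""
    total = len(layout) ** 2
    kept = sum(sum(row) for row in layout)
    return 1 - kept / total


def pick_causal_backend(seq_len: int, block_size: int = 32, min_blocks: int = 4) -> str:
    """Hypothetical dispatch rule (thresholds are illustrative assumptions).

    Falls back to dense attention when the sequence does not tile evenly
    into blocks, or when there are too few blocks to expect a benefit.
    """
    if seq_len % block_size != 0:
        return "dense"
    if seq_len // block_size < min_blocks:
        return "dense"
    return "blocksparse"


layout = causal_block_layout(seq_len=128, block_size=32)
print(layout)                          # 4x4 lower-triangular layout
print(skipped_block_fraction(layout))  # share of blocks that can be skipped
print(pick_causal_backend(128))        # which path the sketch would take
```

With n blocks per side, n(n-1)/2 of the n² blocks sit strictly above the diagonal, so the share of skipped blocks approaches one half as the sequence grows. Whether that turns into a real speedup depends on the kernel and the hardware, which is why the applicable cases need sorting out first.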

Alternatives

Warn users about this possible optimization.

Additional context

blefaudeux self-assigned this Oct 21, 2021

blefaudeux added the enhancement New feature or request label Oct 21, 2021

blefaudeux mentioned this issue Oct 21, 2021

[docs] Adding some new content, better credits, missing authors #17

Merged

10 tasks

xwhan pushed a commit to xwhan/xformers that referenced this issue Feb 8, 2022

Merge pull request facebookresearch#18 from fairinternal/dataclasses_…

e3b35aa

…configs [refactor] AttrDict removal, dataclasses all the way + handle patchy defaults

dianaml0 assigned yuanandonly Jun 6, 2022

yuanandonly mentioned this issue Jun 14, 2022

Switch to blocksparse for causal attention #334

Merged

10 tasks

tenpercent pushed a commit to tenpercent/xformers that referenced this issue Aug 15, 2024

Merge pull request facebookresearch#18 from ROCm/fa_bwd_opt_test

520e6ed

Add integration of improved Fmha-bwd
