Bugfixes & perf improvement for `memory_efficient_attention`

Released by @danthe3rd

[0.0.19] - 2023-04-28

Added

  • Display the nvcc version used to compile xformers in python -m xformers.info
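
    To check the reported compiler version after installing, run the info command quoted above; the nvcc version now appears among the printed fields:

    ```
    python -m xformers.info
    ```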

Fixed

  • Fixed performance regression with nvcc>11.6 (#712)
  • fMHA/cutlass: Fixed nan in the output when using a torch.Tensor with -inf prefixes as attn_bias (#722); see the first sketch after this list
  • fMHA/cutlass: Fixed nan in the output when the sequence length is larger than 2 ** 15 (#719)
  • fMHA/cutlass: Significant performance improvements (up to 2x) for both the forward and backward passes
  • fMHA/cutlass: The kernels are now deterministic
  • fMHA/cutlass: Fixed backward pass correctness when using dropout (#724); see the second sketch after this list
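
A minimal sketch of the attn_bias case fixed in #722, assuming a CUDA build of xformers; the shapes and sizes below are illustrative, not taken from the release notes:

```python
import torch
import xformers.ops as xops

# Illustrative shapes: batch, sequence length, heads, head dim.
B, M, H, K = 2, 64, 8, 64
q = torch.randn(B, M, H, K, device="cuda", dtype=torch.float16)
k = torch.randn(B, M, H, K, device="cuda", dtype=torch.float16)
v = torch.randn(B, M, H, K, device="cuda", dtype=torch.float16)

# A bias whose rows start with an -inf prefix, e.g. masking out the
# first 16 key positions; this pattern previously could yield nan.
attn_bias = torch.zeros(B, H, M, M, device="cuda", dtype=torch.float16)
attn_bias[:, :, :, :16] = float("-inf")

out = xops.memory_efficient_attention(q, k, v, attn_bias=attn_bias)
assert not out.isnan().any()
```

The same call with a sequence length above 2 ** 15 exercises the related nan fix from #719.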
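
And a sketch of the dropout path whose backward pass was fixed in #724, again with hypothetical shapes; p=0.1 is an arbitrary example value:

```python
import torch
import xformers.ops as xops

q = torch.randn(2, 64, 8, 64, device="cuda", dtype=torch.float16, requires_grad=True)
k = torch.randn_like(q, requires_grad=True)
v = torch.randn_like(q, requires_grad=True)

# p > 0 applies dropout to the attention weights inside the fused kernel.
out = xops.memory_efficient_attention(q, k, v, p=0.1)
out.sum().backward()  # gradients for q, k, v flow through the fixed backward pass
```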