Bugfixes & perf improvement for `memory_efficient_attention`

Released by @danthe3rd

[0.0.19] - 2023-04-28

Added

  • Display the nvcc version used to compile xformers in python -m xformers.info
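
    To check the reported compiler version after installing, run the info command quoted above; the nvcc version now appears among the printed fields:

    ```
    python -m xformers.info
    ```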

Fixed

  • Fixed performance regression with nvcc>11.6 (#712)
  • fMHA/cutlass: Fixed nan in the output when using a torch.Tensor with -inf prefixes as attn_bias (#722); see the first sketch after this list
  • fMHA/cutlass: Fixed nan in the output when the sequence length is larger than 2 ** 15 (#719)
  • fMHA/cutlass: Significant performance improvements (up to 2x) for both the forward and backward passes
  • fMHA/cutlass: The kernels are now deterministic
  • fMHA/cutlass: Fixed backward pass correctness when using dropout (#724); see the second sketch after this list
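
A minimal sketch of the attn_bias case fixed in #722, assuming a CUDA build of xformers; the shapes and sizes below are illustrative, not taken from the release notes:

```python
import torch
import xformers.ops as xops

# Illustrative shapes: batch, sequence length, heads, head dim.
B, M, H, K = 2, 64, 8, 64
q = torch.randn(B, M, H, K, device="cuda", dtype=torch.float16)
k = torch.randn(B, M, H, K, device="cuda", dtype=torch.float16)
v = torch.randn(B, M, H, K, device="cuda", dtype=torch.float16)

# A bias whose rows start with an -inf prefix, e.g. masking out the
# first 16 key positions; this pattern previously could yield nan.
attn_bias = torch.zeros(B, H, M, M, device="cuda", dtype=torch.float16)
attn_bias[:, :, :, :16] = float("-inf")

out = xops.memory_efficient_attention(q, k, v, attn_bias=attn_bias)
assert not out.isnan().any()
```

The same call with a sequence length above 2 ** 15 exercises the related nan fix from #719.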
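
And a sketch of the dropout path whose backward pass was fixed in #724, again with hypothetical shapes; p=0.1 is an arbitrary example value:

```python
import torch
import xformers.ops as xops

q = torch.randn(2, 64, 8, 64, device="cuda", dtype=torch.float16, requires_grad=True)
k = torch.randn_like(q, requires_grad=True)
v = torch.randn_like(q, requires_grad=True)

# p > 0 applies dropout to the attention weights inside the fused kernel.
out = xops.memory_efficient_attention(q, k, v, p=0.1)
out.sum().backward()  # gradients for q, k, v flow through the fixed backward pass
```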