v0.0.10

@awni awni released this 18 Jan 20:02
f6e911c

Highlights:

  • Faster matmul: up to 2.5x faster for certain sizes (see benchmarks)
  • Fused matmul + addition (for faster linear layers)
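
As a rough illustration of what the fused operation computes, the pure-Python sketch below compares a matmul followed by a separate addition with a single pass that does both. It shows the arithmetic only: `matmul` and `addmm_reference` are hypothetical helpers written for this note, not MLX functions, and the sketch ignores broadcasting, scaling factors, and dtype handling.

```python
def matmul(a, b):
    """Naive matrix product of two matrices stored as lists of rows."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "shape mismatch"
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def addmm_reference(c, a, b):
    """Hypothetical reference for the fused op: c + a @ b in one pass."""
    inner = len(b)
    cols = len(b[0])
    return [
        [c[i][j] + sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(len(a))
    ]


x = [[1.0, 2.0], [3.0, 4.0]]      # layer input
w = [[0.5, -1.0], [2.0, 0.0]]     # layer weights
bias = [[0.1, 0.2], [0.1, 0.2]]   # bias, already expanded to the output shape

# Unfused: materialize x @ w, then add the bias in a second step.
product = matmul(x, w)
unfused = [[p + c for p, c in zip(prow, crow)] for prow, crow in zip(product, bias)]

# Fused: no intermediate product matrix is stored.
fused = addmm_reference(bias, x, w)
```

Both paths give the same values. The benefit of fusing is that the intermediate product is never written out and read back, which is why it helps linear layers.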

Core

  • Quantization supports sizes other than multiples of 32
  • Faster GEMM (matmul)
  • AddMM primitive (fused addition and matmul)
  • mx.isnan, mx.isinf, mx.isposinf, mx.isneginf
  • mx.tile
  • VJPs for scatter_min and scatter_max
  • Multi-output split primitive

NN

  • Losses: Gaussian negative log-likelihood
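
For readers unfamiliar with this loss, here is a minimal scalar sketch of the standard Gaussian negative log-likelihood formula. It assumes the usual definition with a variance clamp; the function name `gaussian_nll`, its parameter names, and the `eps` default are illustrative assumptions, not the signature of the MLX loss.

```python
import math


def gaussian_nll(mean, target, var, full=False, eps=1e-6):
    """Negative log-likelihood of `target` under N(mean, var) for scalars.

    Hypothetical helper: names and defaults are assumptions for illustration.
    """
    var = max(var, eps)  # clamp the variance to avoid log(0) and division by zero
    loss = 0.5 * (math.log(var) + (target - mean) ** 2 / var)
    if full:
        # Constant term, often dropped because it does not affect gradients.
        loss += 0.5 * math.log(2 * math.pi)
    return loss


perfect = gaussian_nll(mean=0.0, target=0.0, var=1.0)
off_by_one = gaussian_nll(mean=0.0, target=1.0, var=1.0)
with_constant = gaussian_nll(mean=0.0, target=1.0, var=1.0, full=True)
```

With unit variance the loss reduces to half the squared error. A larger predicted variance lowers the penalty for a miss but adds a `log(var)` cost.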

Misc

  • Performance enhancements for graph evaluation with lots of outputs
  • Default PRNG seed is based on current time instead of 0
  • Primitive VJPs take the primitive's outputs as input, which reduces redundant work without needing graph simplification
  • Booleans print in Python style when printed from Python

Bugfixes

  • Fix scatter for types narrower than 32 bits and fix an integer overflow
  • Fix overflow in mx.eye
  • Report Metal out of memory issues instead of silent failure
  • mx.round now follows NumPy and rounds halves to even
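
To illustrate the rounding rule in the last item, the sketch below uses Python's built-in `round`, which also rounds halves to even, and contrasts it with round-half-away-from-zero. It does not call MLX; `round_half_away` is a hypothetical helper added only for comparison.

```python
import math


def round_half_away(x):
    """Hypothetical comparison helper: ties round away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


values = [0.5, 1.5, 2.5, 3.5, -0.5, -1.5]

# Python's built-in round sends ties to the nearest even integer.
half_even = [round(v) for v in values]

# The alternative rule, for contrast.
half_away = [round_half_away(v) for v in values]
```

Under round-half-to-even, 0.5 and 2.5 round down to 0 and 2, while 1.5 and 3.5 round up to 2 and 4. Code that relied on ties always rounding away from zero may see different results after this change.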