V4.7.0: Performance improvements, bug fixes, assembly hpa_hgemm, initial source hpa_igemm

@amcamd released this 19 Dec 19:35
· 3478 commits to master since this release

Features

  • add dot2 instructions for fp16/fp32 hpa_hgemm on gfx906
  • initial i8/i32 hpa_igemm
  • enable fractional loads
  • enable precise bounds check
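
The i8/i32 pairing in hpa_igemm can be illustrated with a small reference sketch. It assumes "hpa" denotes high-precision accumulate, meaning 8-bit integer inputs summed in a 32-bit accumulator. The function names (`wrap_signed`, `igemm_ref`) and the `acc_bits` parameter are hypothetical and exist only for this illustration; this is not the library's kernel or API, just plain Python that simulates a fixed-width accumulator.

```python
def wrap_signed(value, bits):
    """Wrap an integer to a signed two's-complement value of the given width."""
    value &= (1 << bits) - 1
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


def igemm_ref(a, b, acc_bits=32):
    """Reference C = A * B for int8 inputs with an acc_bits-wide accumulator.

    a is an m x k list of lists, b is a k x n list of lists.
    acc_bits is a hypothetical knob used here to compare accumulator widths.
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0
            for l in range(k):
                # Each product of two int8 values fits in 16 bits;
                # the running sum is wrapped to the accumulator width.
                acc = wrap_signed(acc + a[i][l] * b[l][j], acc_bits)
            c[i][j] = acc
    return c


a = [[100, 100]]        # 1 x 2, values within int8 range
b = [[100], [100]]      # 2 x 1, values within int8 range

wide = igemm_ref(a, b, acc_bits=32)    # 32-bit accumulator keeps the exact sum
narrow = igemm_ref(a, b, acc_bits=8)   # 8-bit accumulator wraps around
```

With a 32-bit accumulator the result is the exact sum 20000, while accumulating at the input width wraps to 32, which is why the accumulator is kept wider than the inputs.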