wvsplitk templatized and better tuned for MI300 (opendatahub-io#132)
* improvements to wvSpltK

* wvsplt gemm; better handle MI300 and large A[] sizes

* lint fix

* Adjustments to better handle small weights in TP8.

* early-out bug fix

* better wave load balancing in wvSplt

* add missing skip for wvsplt_big

* Bug fix for wvSplt_big in load balancing at M4, lint fix.
amd-hhashemi authored Aug 16, 2024
1 parent 7382dd5 commit cfab178
Showing 1 changed file with 420 additions and 1,079 deletions.