
Q4_1 quantization compiling to vmfb megacommit #2

Merged

Conversation

@Max191 (Contributor) commented Feb 22, 2024

No description provided.

@Max191 (Contributor, Author) commented Feb 22, 2024

I should split this into multiple PRs, but my commits accidentally got too jumbled up, so I just squashed everything :P

I'll split it up tomorrow, but I'll leave this PR here in case anyone wants to see it or cherry-pick it.

@Max191 (Contributor, Author) commented Feb 22, 2024

nod-ai/SHARK-ModelDev#473 is also needed for this PR.

@stellaraccident (Owner) left a comment


It's fine. It's prototype code; we'll clean it up in a future revision.

@stellaraccident stellaraccident marked this pull request as ready for review February 23, 2024 00:14
@stellaraccident stellaraccident merged commit e2189c7 into stellaraccident:main Feb 23, 2024
dmahurin pushed a commit to persimmonsai/mlir-llm-runner that referenced this pull request Jun 6, 2024