Add Embedding Quantization to QAT module_swap flow #886
base: main
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/886
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 1430e0e with merge base a4221df.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D62664322
Summary: Pull Request resolved: pytorch#886
Adding the embedding quantizer in the same fashion as the other module swap setup.
Differential Revision: D62664322
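For readers unfamiliar with the module-swap QAT flow, here is a minimal sketch of the general idea being added: an `nn.Embedding` subclass whose weight is fake-quantized to 4-bit groupwise values on every forward pass. The class name `FakeQuantizedEmbedding4w`, the `group_size` parameter, and the symmetric int4 scheme are illustrative assumptions for this sketch, not the exact code in the PR.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FakeQuantizedEmbedding4w(nn.Embedding):
    """Embedding whose weight is fake-quantized to int4 per group on each forward.

    Hypothetical sketch; not the API added in this PR.
    """

    def __init__(self, num_embeddings: int, embedding_dim: int, group_size: int = 32, **kwargs):
        super().__init__(num_embeddings, embedding_dim, **kwargs)
        assert embedding_dim % group_size == 0, "embedding_dim must be divisible by group_size"
        self.group_size = group_size

    def forward(self, input: torch.Tensor) -> torch.Tensor:
        # Group the weight and compute one symmetric scale per group.
        w = self.weight
        grouped = w.reshape(-1, self.group_size)
        qmin, qmax = -8, 7  # symmetric int4 range
        scales = grouped.abs().amax(dim=1, keepdim=True).clamp(min=1e-6) / qmax
        # Fake quantize: snap to the int4 grid, then dequantize. The
        # straight-through estimator keeps gradients flowing to the fp weight.
        q = torch.clamp(torch.round(grouped / scales), qmin, qmax)
        dq = (q * scales).reshape_as(w)
        w_fq = w + (dq - w).detach()
        return F.embedding(
            input,
            w_fq,
            self.padding_idx,
            self.max_norm,
            self.norm_type,
            self.scale_grad_by_freq,
            self.sparse,
        )
```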
Force-pushed 0ac8d53 to 5259084
Force-pushed 5259084 to a087e50
@@ -965,6 +965,41 @@ def forward(self, input: torch.Tensor) -> torch.Tensor:
            self.precision,
        )

    def _replace_embedding_4w(
I'm wondering if this can be added at the user code side, since we are planning to deprecate the module swap API
Please don't deprecate the module swap API - it's the easiest to work with and extend.
I'd likely have a headache if I had to make things work quickly and effectively with the tensor subclass stuff.
If you guys have a few minutes, we can discuss together how to add this to the tensor subclass stuff as well... but...
OK, I think keeping multiple implementations of the same thing might be confusing. We can gather all the requirements and decide on the long-term plan; I'm asking Andrew to take a stab first.
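As a rough illustration of the alternative being discussed, doing the swap "at the user code side" could be as simple as walking the model and replacing each `nn.Embedding` with a fake-quantized variant. This sketch reuses the hypothetical `FakeQuantizedEmbedding4w` class from the earlier example; the function name and signature are illustrative assumptions, not code from this PR or from torchao.

```python
import torch.nn as nn


def swap_embeddings_for_qat(model: nn.Module, group_size: int = 32) -> nn.Module:
    """Recursively replace nn.Embedding modules with fake-quantized ones."""
    for name, child in list(model.named_children()):
        if isinstance(child, nn.Embedding) and child.embedding_dim % group_size == 0:
            new_mod = FakeQuantizedEmbedding4w(
                child.num_embeddings,
                child.embedding_dim,
                group_size=group_size,
                padding_idx=child.padding_idx,
            )
            # Reuse the original fp32 weight so QAT fine-tuning starts
            # from the pretrained parameters.
            new_mod.weight = child.weight
            setattr(model, name, new_mod)
        else:
            swap_embeddings_for_qat(child, group_size)
    return model
```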
Force-pushed a087e50 to 1430e0e
Summary: Adding the embedding quantizer in the same fashion as the other module swap setup.
Differential Revision: D62664322