Add tensor parallelism support for int4_weight_only quantization #1120

jerryzh168 · 2024-10-18T23:35:07Z

Summary:
Following #939, we added TP support for int4_weight_only quantization in torchao, which uses TensorCoreTiledLayout.

Addresses one work item in #988

Also clarified docs based on #386

Also restructured the tests in test/dtypes/test_affine_quantized_tensor_parallel.py so they no longer depend on torchao/utils.py, reducing the number of jumps needed to understand what is tested
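For readers unfamiliar with why a weight-only quantized linear layer can be sharded for TP at all, here is a minimal pure-Python sketch. It is not torchao's implementation and does not reproduce the TensorCoreTiledLayout packing; the helper names, the group size, and the simple two-codes-per-byte packing are all assumptions made for illustration. It shows only that when each output row carries its own packed 4-bit codes and per-group scale and minimum, splitting the rows across ranks and concatenating the per-rank outputs gives the same result as the unsharded layer.

```python
# Conceptual sketch only: this is NOT torchao's implementation and does NOT
# reproduce the TensorCoreTiledLayout packing. All helper names are hypothetical.

GROUP_SIZE = 4  # assumed group size for the per-group scale and minimum


def quantize_row(row):
    """Quantize one weight row to 4-bit codes with a per-group scale and minimum."""
    codes, scales, mins = [], [], []
    for start in range(0, len(row), GROUP_SIZE):
        group = row[start:start + GROUP_SIZE]
        lo, hi = min(group), max(group)
        scale = (hi - lo) / 15 or 1.0  # avoid a zero scale for constant groups
        codes += [min(15, max(0, round((v - lo) / scale))) for v in group]
        scales.append(scale)
        mins.append(lo)
    # Pack two 4-bit codes into each byte.
    packed = bytes((codes[i] << 4) | codes[i + 1] for i in range(0, len(codes), 2))
    return packed, scales, mins


def dequantize_row(qrow):
    """Unpack the 4-bit codes and map them back to floats."""
    packed, scales, mins = qrow
    codes = []
    for byte in packed:
        codes += [byte >> 4, byte & 0xF]
    return [c * scales[i // GROUP_SIZE] + mins[i // GROUP_SIZE]
            for i, c in enumerate(codes)]


def linear(x, qrows):
    """Compute one output feature per quantized weight row."""
    return [sum(a * b for a, b in zip(x, dequantize_row(q))) for q in qrows]


def shard_rows(qrows, rank, world_size):
    """Give each rank a contiguous slice of output rows."""
    per_rank = len(qrows) // world_size
    return qrows[rank * per_rank:(rank + 1) * per_rank]


# A small deterministic 4x8 weight (out_features x in_features) and input.
weight = [[((7 * r + 3 * c) % 11 - 5) / 4 for c in range(8)] for r in range(4)]
x = [0.5, -1.0, 2.0, 0.25, 1.5, -0.75, 1.0, -2.0]
qweight = [quantize_row(row) for row in weight]

full_out = linear(x, qweight)

# Simulate two ranks: each computes its slice of the output, then concatenate.
world_size = 2
sharded_out = []
for rank in range(world_size):
    sharded_out += linear(x, shard_rows(qweight, rank, world_size))
```

In the real layout the packed data is tiled for tensor cores, so slicing has to respect that tiling; the sketch sidesteps this by keeping every row self-contained.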

Test Plan:
python test/dtypes/test_affine_quantized_tensor_parallel.py


pytorch-bot · 2024-10-18T23:35:10Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1120

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit ca241ed with merge base 3475aed:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jerryzh168 requested review from jainapurva and msaroufim October 18, 2024 23:35

facebook-github-bot added the CLA Signed label Oct 18, 2024

jerryzh168 requested review from drisspg, kwen2501 and HDCharles October 18, 2024 23:35

typo

ca241ed

jainapurva approved these changes Oct 19, 2024

jerryzh168 merged commit a2faafe into pytorch:main Oct 19, 2024
17 checks passed

jerryzh168 deleted the int4wo-tp branch October 19, 2024 01:11

This was referenced Oct 19, 2024

Tensor Core Layout docs is not clear #386

Closed

Tensor Parallelism Support for AffineQuantizedTensor #988

Open
