Add sparsify API to torchao #473
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/473
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure, 1 Pending as of commit 950c476 with merge base a895699.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
In some cases we rewrote popular GenAI models to be significantly faster in native PyTorch, as in no C++/CUDA, to achieve SOTA inference performance at the time. These involve more intrusive code changes.
```python
from torchao.sparsity import sparsify
from torchao.sparsity.prototype.dynamic_quant_sparse import int8_dynamic_activation_int8_2x4_sparse_weight
```
what's the plan to move this out of prototype? is this related to composing sparsity and quant properly?
Yeah I plan to move this out as part of 0.4, implementing a layout like here: https://github.com/pytorch/ao/compare/jcaip/affine-quantize-sparse?expand=1
This works, but I am running into a performance regression, so I need to debug that first before we can merge.
#### With intrusive code changes
m = sparsify(m, to_sparse_semi_structured)
also this is using `to_sparse_semi_structured` while the other one is using `int8_dynamic_activation_int8_2x4_sparse_weight()`, which might be a bit confusing, I'd suggest to just align
OK I'll add a `semi_sparse_weight()` wrapper function.
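For illustration, such a wrapper could mirror the existing weight-transform style. This is a minimal sketch under the assumption that `sparsify` applies the returned callable to each matching module's weight; the actual name and location in torchao may differ:

```python
import torch
from torch.sparse import to_sparse_semi_structured


def semi_sparse_weight():
    """Return a callable that converts a dense, already 2:4-pruned weight
    into a semi-structured sparse tensor, mirroring the call style of
    int8_dynamic_activation_int8_2x4_sparse_weight()."""
    def apply_fn(weight: torch.Tensor) -> torch.Tensor:
        return to_sparse_semi_structured(weight)
    return apply_fn
```

With a wrapper like this, the README example could read `m = sparsify(m, semi_sparse_weight())`, matching the call style of the int8 path.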
Cool! Minor questions above
@@ -37,7 +38,7 @@ def test_sparse(self):
        apply_fake_sparsity(model)
        dense_result = model(input)

-       apply_sparse_semi_structured(model)
+       model = sparsify(model, to_sparse_semi_structured)
is `sparsify` an in-place op? This came up recently since `quantize` is in place, and here it looks like the API used to be in place but now it's not
Yeah, I think there's some discussion on whether we can use `quantize_` or `quantize` for the in-place op, I'm not sure if we came to a conclusion. cc @jerryzh168 do you have a preference for what to use here?
we can use the in-place version for now I think
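To make the naming question concrete, here is a rough sketch of the two conventions being weighed, using hypothetical `sparsify_`/`sparsify` helpers rather than the actual torchao implementation:

```python
import torch


def sparsify_(model: torch.nn.Module, apply_fn) -> None:
    """Hypothetical in-place variant (trailing underscore, like quantize_):
    mutates the module's weights and returns nothing."""
    for module in model.modules():
        if isinstance(module, torch.nn.Linear):
            module.weight = torch.nn.Parameter(apply_fn(module.weight))


def sparsify(model: torch.nn.Module, apply_fn) -> torch.nn.Module:
    """Hypothetical value-returning variant: callers write
    model = sparsify(model, ...), even if the weights are swapped in place."""
    sparsify_(model, apply_fn)
    return model
```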
In some cases we rewrote popular GenAI models to be significantly faster in native PyTorch, as in no C++/CUDA, to achieve SOTA inference performance at the time. These involve more intrusive code changes.
```python
from torchao.sparsity import sparsify
from torchao.sparsity.prototype.dynamic_quant_sparse import int8_dynamic_activation_int8_2x4_sparse_weight
```
We might have briefly chatted about this when we were discussing the quantize API, but just thinking out loud here: if I add `quantize(sparsify(m))`, is that different from `sparsify(quantize(m))`, and if so, in what order (if any) are optimizations applied in this case?
Right now, the composition of int8 quantization and 2:4 sparsity is treated as its own distinct technique, so you can either go `quantize(int8dynamic + 2:4 sparse)` or `sparsify(int8dynamic + 2:4 sparse)`. Once we implement sparsity as an AQTLayout we can add support for a "composable" API, where we go `quantize(int8 dynamic)` then `sparsify(to_sparse_semi_structured)`, or vice versa.
Mathematically, the order in which you apply the optimizations will matter, but I think we should make them equivalent so it doesn't matter for our API, for two reasons:
- Currently only quantize -> sparsify is supported; it would be extra work to support sparsify -> quantize.
- One of these orderings will be "better", and I can't really see a situation where the best order would differ across different layers. So we should always just default to the "best" one and not give users an option to shoot themselves in the foot.
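A minimal sketch of the two entry points described above, assuming the prototype import path shown earlier in this PR (module paths may change once this moves out of prototype):

```python
import torch
from torchao.sparsity import sparsify
from torchao.sparsity.prototype.dynamic_quant_sparse import (
    int8_dynamic_activation_int8_2x4_sparse_weight,
)

# Illustrative model; 2:4 sparsity generally requires half-precision
# weights on a supported CUDA GPU.
model = torch.nn.Sequential(torch.nn.Linear(1024, 1024)).half().cuda()

# Today the fused int8-dynamic + 2:4-sparse technique is applied in one shot:
model = sparsify(model, int8_dynamic_activation_int8_2x4_sparse_weight())

# The same transform is also reachable via the quantize entry point
# (quantize(model, int8_dynamic_activation_int8_2x4_sparse_weight())),
# rather than by composing quantize(...) and sparsify(...) as two steps.
```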
torchao/sparsity/sparse_api.py (Outdated)
m = sparsify(m, to_sparse_semi_structured, filter_fn)

# for int8 dynamic quantization + 2:4 sparsity
from torchao.sparity.prototype import int8_dynamic_activation_int8_2x4_sparse_weight
typo here sparity
mod.weight = torch.nn.Parameter(to_sparse_semi_structured(mod.weight))
Currently, we support two options for sparsity:
- semi-sturctured (2:4) sparsity with `to_sparse_semi_structured`
- int8 dynamic quantization + 2:4 sparsity with `int8_dynamic_activation_int8_2x4_sparse_weight`, which is also available via the quantize API
this is where I'm a bit confused, it's not clear from reading the docstrings whether the quantize and sparsify APIs compose
torchao/sparsity/sparse_api.py (Outdated)
if filter_fn(mod, name):
    mod.weight = torch.nn.Parameter(to_sparse_semi_structured(mod.weight))
Currently, we support two options for sparsity:
- semi-sturctured (2:4) sparsity with `to_sparse_semi_structured`
typo here as well sturctured
The failure in the nightly regression is a flake with smoothquant where it tries to download something from HF, so this is safe to merge.
* Add sparsify API to torchao
* fix typo
This PR removes the old `apply_sparse_semi_structured` API in favor of `sparsify`, so that we are more in line with the existing `quantize` API. I also updated the README so that it's a bit more clear that sparsity doesn't require intrusive code changes, and added a bit of example code.
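As a quick before/after sketch of the change described here (the model setup is illustrative, and `to_sparse_semi_structured` is assumed to come from `torch.sparse`):

```python
import torch
from torch.sparse import to_sparse_semi_structured
from torchao.sparsity import sparsify

# Illustrative model; weights would need to be pruned to a 2:4 pattern first.
model = torch.nn.Sequential(torch.nn.Linear(2048, 2048)).half().cuda()

# Before this PR (old API, now removed):
# apply_sparse_semi_structured(model)

# After this PR, mirroring the quantize-style API:
model = sparsify(model, to_sparse_semi_structured)
```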