fixing bug in GPTQ (pytorch#120)
* fixing bug in GPTQ

Summary: shape was always padded even when not needed.

Test Plan: python test/quantization/test_quant_api.py -k "test_gptq_quantizer_int4wo"

Reviewers:

Subscribers:

Tasks:

Tags:

* removing extra spaces

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
HDCharles authored Apr 4, 2024
1 parent 12f1080 commit ac76174
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion torchao/quantization/GPTQ.py
@@ -950,7 +950,10 @@ def __init__(
         # TODO: this is the gpt-fast version, merge with the main version later
         def make_names_and_values_dict_func(q, qparams):
             k = q.shape[1]
-            new_k = find_multiple(k, 1024)
+            if not _check_linear_int4_k(k, groupsize):
+                new_k = find_multiple(k, 1024)
+            else:
+                new_k = k
             # how much we need to pad the weight
             delta_k = new_k - q.shape[1]
             q = q.to(torch.int32)
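For context, here is a minimal runnable sketch of the padding decision before and after this fix. The bodies of find_multiple and _check_linear_int4_k below are assumptions inferred from their names and typical gpt-fast-style int4 packing (round up to a multiple; check group alignment); they are not copied from this repository.

import torch

def find_multiple(n: int, k: int) -> int:
    # Assumed semantics: smallest multiple of k that is >= n.
    if n % k == 0:
        return n
    return n + k - (n % k)

def _check_linear_int4_k(k: int, groupsize: int) -> bool:
    # Assumed semantics: True when k already satisfies the int4
    # kernel's layout requirement (divides evenly into groups).
    return k % groupsize == 0

def padded_k(q: torch.Tensor, groupsize: int) -> int:
    # Post-fix logic from the hunk above: only round k up to a
    # multiple of 1024 when the current shape is incompatible.
    k = q.shape[1]
    if not _check_linear_int4_k(k, groupsize):
        return find_multiple(k, 1024)
    return k

q = torch.zeros(8, 768, dtype=torch.int32)
print(padded_k(q, groupsize=128))  # 768: group-aligned, so no padding

With the pre-fix code, new_k was always find_multiple(768, 1024) == 1024, so delta_k == 256 and the weight was padded even though k == 768 was already acceptable; the fix makes padding conditional on the shape check.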
