Remove input_quant_func from AffineQuantizedTensor subclass #243
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/243
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure
As of commit 166353f with merge base cae3d82.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Summary: `AffineQuantizedTensor` currently carries an `input_quant_func`, which is a bit convoluted. We want to use a separate `LinearActAffineQuantizedTensor` subclass for activation quantization (dynamic quantization) instead.
Test Plan: python test/quantization/test_quant_api.py -k test_quantized_tensor_subclass_8da4w
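To illustrate the design change described in this commit, here is a minimal, framework-free sketch of keeping activation quantization in a separate wrapper rather than as a field on the quantized weight. It is not the torchao implementation: `QuantizedWeight`, `LinearActQuantizedWeight`, `linear`, and `fake_quant_int8` are hypothetical stand-ins built on plain Python lists, and per-tensor symmetric int8 quantization is an assumption made for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List, Union

Vector = List[float]


@dataclass
class QuantizedWeight:
    """Hypothetical weight-only quantized tensor: integer values plus one scale."""
    int_data: List[List[int]]  # shape (out_features, in_features)
    scale: float

    def dequantize(self) -> List[List[float]]:
        return [[v * self.scale for v in row] for row in self.int_data]


@dataclass
class LinearActQuantizedWeight:
    """Hypothetical wrapper: pairs a weight with the function that quantizes the activation."""
    weight: QuantizedWeight
    input_quant_func: Callable[[Vector], Vector]


def fake_quant_int8(x: Vector) -> Vector:
    """Symmetric per-tensor int8 quantize followed by dequantize (assumed scheme)."""
    amax = max(abs(v) for v in x)
    scale = amax / 127 if amax > 0 else 1.0
    return [max(-128, min(127, round(v / scale))) * scale for v in x]


def linear(x: Vector, w: Union[QuantizedWeight, LinearActQuantizedWeight]) -> Vector:
    """Compute x @ W^T. Only the wrapper knows about activation quantization."""
    if isinstance(w, LinearActQuantizedWeight):
        x = w.input_quant_func(x)  # dynamic quantization of the activation
        w = w.weight               # unwrap to the plain quantized weight
    dense = w.dequantize()
    return [sum(a * b for a, b in zip(x, row)) for row in dense]


weight = QuantizedWeight(int_data=[[1, 2], [3, 4]], scale=0.5)
wrapped = LinearActQuantizedWeight(weight, fake_quant_int8)
weight_only_out = linear([127.0, -127.0], weight)
dynamic_out = linear([127.0, -127.0], wrapped)
```

With this split, the weight class stays free of any activation-related field, and the weight-only path and the dynamic quantization path differ only in whether the weight is wrapped.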
Summary: This PR adds dispatch for int8 activation, int8 weight dynamic quantization, which ends up calling the `int_scaled_matmul` kernel.
Test Plan: python test/quantization/test_quant_api.py -k test_quantized_tensor_subclass_int8_dyn_quant
Great :) Let's move AffineQuantizedTensor into dtypes next and create a PyTorch-style conversion function? We should also not need to use torch_function to overwrite linear, but it makes sense to do that as a follow-up because it'll require us to add support for detach, view, addmm, etc. to AffineQuantizedTensor.
Sounds good. The main thing is transpose: we need to think about how to support that with the scales/zero_point and the block_size arg.
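A rough sketch of why transpose needs care, assuming `block_size[i]` is the number of elements along dimension `i` that share one scale/zero_point, so the quantization parameters have shape `shape[i] // block_size[i]` per dimension. The helpers `qparam_shape` and `transpose_2d` are hypothetical and only track shapes; the real subclass may store its parameters differently.

```python
from typing import Tuple

Shape = Tuple[int, ...]


def qparam_shape(shape: Shape, block_size: Shape) -> Shape:
    """Shape of scales/zero_point when each block shares one parameter (assumed layout)."""
    for dim, blk in zip(shape, block_size):
        if dim % blk != 0:
            raise ValueError(f"dimension {dim} is not divisible by block size {blk}")
    return tuple(dim // blk for dim, blk in zip(shape, block_size))


def transpose_2d(shape: Shape, block_size: Shape) -> Tuple[Shape, Shape]:
    """Transposing the data must swap block_size along with the shape."""
    return (shape[1], shape[0]), (block_size[1], block_size[0])


# Per-row quantization of a (4, 8) weight: one scale per row.
shape, block_size = (4, 8), (1, 8)
scales_before = qparam_shape(shape, block_size)

# After transpose the same groups of elements lie along columns.
t_shape, t_block_size = transpose_2d(shape, block_size)
scales_after = qparam_shape(t_shape, t_block_size)
```

Under this assumption, a transposed tensor stays consistent only if the scales and zero_point are transposed together with the integer data and `block_size` is permuted the same way. Reusing the original `block_size` of `(1, 8)` on the transposed `(8, 4)` shape fails the divisibility check.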
* Remove input_quant_func from AffineQuantizedTensor subclass
  Summary: `AffineQuantizedTensor` currently carries an `input_quant_func`, which is a bit convoluted. We want to use a separate `LinearActAffineQuantizedTensor` subclass for activation quantization (dynamic quantization) instead.
  Test Plan: python test/quantization/test_quant_api.py -k test_quantized_tensor_subclass_8da4w
* Add dispatch for dynamic quantization in `AffineQuantizedTensor`
  Summary: This PR adds dispatch for int8 activation, int8 weight dynamic quantization, which ends up calling the `int_scaled_matmul` kernel.
  Test Plan: python test/quantization/test_quant_api.py -k test_quantized_tensor_subclass_int8_dyn_quant
* Fix test
Summary:
`AffineQuantizedTensor` currently carries an `input_quant_func`, which is a bit convoluted. We want to use a separate `LinearActAffineQuantizedTensor` subclass for activation quantization (dynamic quantization) instead.
Also added dispatch for int8 activation, int8 weight dynamic quantization, which ends up calling the `int_scaled_matmul` kernel.
Test Plan:
python test/quantization/test_quant_api.py -k test_quantized_tensor_subclass_8da4w
python test/quantization/test_quant_api.py -k test_quantized_tensor_subclass_int8_dyn_quant
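For readers unfamiliar with the int8 activation, int8 weight path mentioned in the summary, the arithmetic can be sketched in plain Python. This is not the `int_scaled_matmul` kernel or its signature. `quantize_per_row`, `int_matmul_scaled`, and `dynamic_quant_linear` are hypothetical names, and symmetric per-row quantization for both operands is an assumption.

```python
from typing import List, Tuple

Matrix = List[List[float]]
IntMatrix = List[List[int]]


def quantize_per_row(m: Matrix) -> Tuple[IntMatrix, List[float]]:
    """Symmetric int8 quantization with one scale per row (assumed scheme)."""
    q, scales = [], []
    for row in m:
        amax = max(abs(v) for v in row)
        scale = amax / 127 if amax > 0 else 1.0
        q.append([max(-127, min(127, round(v / scale))) for v in row])
        scales.append(scale)
    return q, scales


def int_matmul_scaled(x_q: IntMatrix, x_scales: List[float],
                      w_q: IntMatrix, w_scales: List[float]) -> Matrix:
    """out[i][j] = (sum_k x_q[i][k] * w_q[j][k]) * x_scales[i] * w_scales[j]."""
    return [
        [sum(a * b for a, b in zip(x_row, w_row)) * xs * ws
         for w_row, ws in zip(w_q, w_scales)]
        for x_row, xs in zip(x_q, x_scales)
    ]


def dynamic_quant_linear(x: Matrix, w_q: IntMatrix, w_scales: List[float]) -> Matrix:
    """Quantize the activation at call time, then multiply in the integer domain."""
    x_q, x_scales = quantize_per_row(x)
    return int_matmul_scaled(x_q, x_scales, w_q, w_scales)


# The weight is quantized once ahead of time; the activation is quantized per call.
w_q, w_scales = quantize_per_row([[1.0, -2.0], [0.5, 0.25]])
out = dynamic_quant_linear([[2.0, 1.0]], w_q, w_scales)
reference = [[2.0 * 1.0 + 1.0 * -2.0, 2.0 * 0.5 + 1.0 * 0.25]]
```

The inner sum runs entirely on integers, and the two scales are applied once per output element. That structure is what makes a fused integer matmul kernel applicable. `out` should be close to `reference`, up to quantization error.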