Hello,
I have a few questions regarding the quantization of multimodal models:
1. Does the current version of AutoAWQ quantize only the language model, or does it also include the vision component for quantization?
2. What is the default calibration dataset used for quantization?
3. I noticed that the example code for Qwen2-VL uses a custom multimodal dataset. Is this dataset required for all multimodal model quantizations, or can we use the default dataset?
Thank you for your clarification!
As far as I know, AWQ itself quantizes only the language model, not the vision encoder. I recall a paper proposing quantization of vision encoders, but I'm not sure which paper it is.
It seems the authors used the pile-val-backup dataset in their paper.
I am not 100% sure, but I would guess a multimodal calibration dataset should be used for VLMs.
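For reference, here is a minimal sketch of how calibration data typically flows through AutoAWQ's `quantize()` call; the model path and output directory below are placeholders, and when `calib_data` is omitted AutoAWQ falls back to its built-in text calibration set ("pileval", i.e. mit-han-lab/pile-val-backup):

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

# Placeholder paths for illustration only.
model_path = "Qwen/Qwen2-7B-Instruct"
quant_path = "qwen2-7b-instruct-awq"

quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Load the full-precision model and its tokenizer.
model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# With no calib_data argument, AutoAWQ uses its default text calibration set
# ("pileval" -> mit-han-lab/pile-val-backup). A list of strings can be passed
# instead, e.g. text drawn from a multimodal dataset when calibrating a VLM.
model.quantize(tokenizer, quant_config=quant_config)

# Save the quantized weights and tokenizer.
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```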