-
Notifications
You must be signed in to change notification settings - Fork 262
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
support peft model quantization with SmoothQuant (#1282)
Peft model will use below arch: Linears in Linear. This pull request supports this arch with smoothquant. ``` (v): Linear( in_features=32, out_features=32, bias=False (lora_dropout): ModuleDict( (default): Dropout(p=0.1, inplace=False) ) (lora_A): ModuleDict( (default): Linear(in_features=32, out_features=8, bias=False) ) (lora_B): ModuleDict( (default): Linear(in_features=8, out_features=32, bias=False) ) (lora_embedding_A): ParameterDict() (lora_embedding_B): ParameterDict() ``` BTW, when IPEX version<=1.13, HistogramObserver doesn't support asym scheme, the zero_point is 0 for asym uint8, while the MinMaxObserver works well. Also, IPEX SmoothQuant Observer can only use save/load_qconf_summary once. The save_qconf_summary API will freeze the scale used in model and calibration won't work anymore. The load_qconf_summary will overwrite the scales used in model but only work in the first call. Here we implement normal observer to workaround this issue. --------- Signed-off-by: changwangss <chang1.wang@intel.com> Signed-off-by: Xin He <xin3.he@intel.com> Signed-off-by: y <xin3.he@intel.com> Signed-off-by: chensuyue <suyue.chen@intel.com> Co-authored-by: changwangss <chang1.wang@intel.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: chen, suyue <suyue.chen@intel.com>
- Loading branch information
1 parent
21668df
commit 5e21b70
Showing
8 changed files
with
15,163 additions
and
165 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.