[CPU] Decompression support for hybrid models #20973

v-Golubev
Contributor

@v-Golubev v-Golubev commented Nov 8, 2023

This PR adds decompression support for hybrid models, that is, models that contain both compressed and quantized layers.

Details:

  • A zero point with a Convert is now supported in the CPU-specific transformations and optimizations
  • The decompression-related passes in the common transformations part are moved to a separate manager
  • Decompression-related callbacks are added for the Cleanup LPT passes so that the decompression subgraph is not folded at the LPT stage
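For readers unfamiliar with the term, a decompression subgraph can be sketched roughly as below. This is a toy model in plain Python, not OpenVINO code: the function names, the u8 storage type, and the sample values are all assumptions made for illustration. It shows the "zero point with Convert" case, where the zero point is itself stored in low precision and converted before the Subtract.

```python
# Toy model of a weight-decompression subgraph (illustrative only, not OpenVINO code):
#   Constant(u8 weights) -> Convert(f32) -> Subtract(zero point) -> Multiply(scale)
# In the "zero point with Convert" case the zero point is also a low-precision
# constant that passes through its own Convert before the Subtract.

def convert_to_f32(values):
    """Stand-in for a Convert node: integers become Python floats."""
    return [float(v) for v in values]

def decompress(weights_u8, zero_point_u8, scale):
    """Apply (convert(w) - convert(zp)) * scale element-wise."""
    zp = convert_to_f32([zero_point_u8])[0]
    return [(w - zp) * scale for w in convert_to_f32(weights_u8)]

weights_u8 = [0, 64, 128, 255]   # hypothetical compressed weights
zero_point_u8 = 128              # hypothetical zero point, stored as u8
scale = 0.5                      # hypothetical per-tensor scale

decompressed = decompress(weights_u8, zero_point_u8, scale)
print(decompressed)
```

If such a subgraph were constant-folded, the weights would be stored in their decompressed floating-point form, which presumably is why the callbacks keep it intact until the plugin can handle it.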

Tickets:

@github-actions github-actions bot added the category: CPU (OpenVINO CPU plugin) and category: LP transformations (OpenVINO Low Precision transformations) labels Nov 8, 2023
@v-Golubev v-Golubev changed the title from "Vg/cpu/hybrid decompression quantization support" to "[CPU] Decompression support for hybrid models" Nov 8, 2023
@v-Golubev v-Golubev marked this pull request as ready for review November 9, 2023 10:09
@v-Golubev v-Golubev requested review from a team as code owners November 9, 2023 10:09
@v-Golubev v-Golubev force-pushed the vg/cpu/hybrid_decompression_quantization_support branch from 07d5d01 to ac7b93b November 9, 2023 10:09
@dmitry-gorokhov dmitry-gorokhov added this to the 2023.3 milestone Nov 9, 2023
@v-Golubev v-Golubev force-pushed the vg/cpu/hybrid_decompression_quantization_support branch 2 times, most recently from ee62151 to c1b8244 November 9, 2023 20:24
@dmitry-gorokhov dmitry-gorokhov merged commit b43b9f9 into openvinotoolkit:master Nov 14, 2023
66 checks passed
byungilm pushed a commit to byungilm/openvino that referenced this pull request Nov 17, 2023
allnes pushed a commit to allnes/openvino that referenced this pull request Nov 23, 2023
github-merge-queue bot pushed a commit that referenced this pull request Oct 25, 2024
### Details:
- Set LPT callbacks to handle compression and avoid constant folding for it (taken from #20973)
- Allow u8/i8 output data type for compressed oneDNN FC
- Disable Dequantize propagation through Transpose if it's a dependency of SDPA, to keep Transpose+SDPA fusion
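
The callback idea in the first item can be sketched as follows. This is a toy pass in plain Python, not the OpenVINO pass API: `Node`, `fold_constants`, the `is_decompression` marker, and the node names are hypothetical. It assumes the convention that a callback returning true tells the pass to skip that node.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    constant_inputs_only: bool      # True if the node could be folded
    is_decompression: bool = False  # hypothetical marker for decompression ops

def fold_constants(nodes, skip_callback=None):
    """Return the names of the nodes that get folded.

    A node is folded when all of its inputs are constant, unless the
    callback returns True for it, in which case it is left untouched.
    """
    folded = []
    for node in nodes:
        if not node.constant_inputs_only:
            continue
        if skip_callback is not None and skip_callback(node):
            continue
        folded.append(node.name)
    return folded

graph = [
    Node("weights_convert", True, is_decompression=True),
    Node("weights_subtract_zp", True, is_decompression=True),
    Node("weights_multiply_scale", True, is_decompression=True),
    Node("bias_convert", True),
    Node("matmul", False),
]

without_callback = fold_constants(graph)
with_callback = fold_constants(graph, skip_callback=lambda n: n.is_decompression)
print(without_callback)
print(with_callback)
```

With the callback set, only `bias_convert` is folded and the three decompression nodes survive for later handling.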