diff --git a/docs/low_precision_transformations/quantization/pipelines/fake_quantize_decomposition.md b/docs/low_precision_transformations/quantization/pipelines/fake_quantize_decomposition.md
new file mode 100644
index 00000000000000..fd10859c20abe3
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/fake_quantize_decomposition.md
@@ -0,0 +1,151 @@
+# OpenVINO™ Low Precision Transformations: FakeQuantizeDecompositionTransformation pipelines
+## Table of Contents
+1. [Introduction](#introduction)
+2. [Pipeline #1: FakeQuantize decomposition](#pipeline-1-fakequantize-decomposition)
+3. [Pipeline #2: Concat per-tensor quantization](#pipeline-2-concat-per-tensor-quantization)
+4. [Pipeline #3: Concat multi-channels quantization](#pipeline-3-concat-multi-channels-quantization)
+5. [Pipeline #4: FakeQuantize connects neighbor cascade Concat operations](#pipeline-4-fakequantize-connects-neighbor-cascade-concat-operations)
+6. [Pipeline #5: AvgPool precision propagation](#pipeline-5-avgpool-precision-propagation)
+
+## Introduction
+`FakeQuantizeDecompositionTransformation` decomposes the `FakeQuantize` operation into a quantize operation (`FakeQuantize` with low precision output) and dequantization operations (`Convert`, `Subtract` and `Multiply`). The resulting `FakeQuantize` output precision depends on:
+1. The input precisions supported by the next operation. The customizable `precisionsOnActivations` parameter is used to identify the supported input precisions.
+2. The operation output intervals.
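The decomposition can be illustrated with a small numeric sketch. The helper names and interval values below are hypothetical, and the logic is a simplified model of the transformation, not the LPT implementation:

```python
# Reference FakeQuantize semantics: clamp to the input interval, quantize to
# `levels` steps, then rescale to the output interval.
def fake_quantize(x, in_low, in_high, out_low, out_high, levels):
    x = min(max(x, in_low), in_high)
    q = round((x - in_low) / (in_high - in_low) * (levels - 1))
    return q * (out_high - out_low) / (levels - 1) + out_low

# The same computation split into a quantize step (FakeQuantize with signed
# int8 output, chosen here because the output interval is signed) followed by
# the dequantization chain: Convert (int8 -> float), Subtract (zero point),
# Multiply (scale).
def decomposed(x, in_low, in_high, out_low, out_high, levels):
    x = min(max(x, in_low), in_high)
    q = round((x - in_low) / (in_high - in_low) * (levels - 1)) - 128  # int8
    scale = (out_high - out_low) / (levels - 1)
    zero_point = -128 - out_low / scale
    return (float(q) - zero_point) * scale

a = fake_quantize(0.5, -1.28, 1.27, -1.28, 1.27, 256)
b = decomposed(0.5, -1.28, 1.27, -1.28, 1.27, 256)
assert abs(a - b) < 1e-9
```

Both paths produce the same result, which is what allows the transformation to replace the original `FakeQuantize` with the quantize/dequantize pair.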
+
+## Pipeline #1: FakeQuantize decomposition
+[NOT UPDATED]
+Features:
+1. The output intervals of the `FakeQuantize` on activations are signed, so the default precision would be `signed int8`, which is not supported by the `Convolution` below.
+2. Quantize and dequantize operations on activations are represented by one `FakeQuantize` operation.
+3. Quantize and dequantize operations on weights are represented by one `FakeQuantize` operation.
+4. There is no `FakeQuantize` between `AvgPool` and `Convolution`.
+5. `Convolution` weights are quantized.
+
+> TODO: if `Convolution` is not quantized then [[input] port] requirements are not set. <= WIP
+> TODO: if an operation is not precision preserved then the `PRECISION_PRESERVED` attribute can be skipped. <= WIP: right now it is created everywhere
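The interplay between the interval sign and the consumer's supported precisions (feature 1 above) can be sketched as follows. This is assumed, simplified selection logic for illustration; `select_precision` is a hypothetical helper, not part of LPT:

```python
def select_precision(out_low, out_high, supported_by_consumer):
    # An unsigned output interval suggests u8, a signed one suggests i8 ...
    preferred = "u8" if out_low >= 0.0 else "i8"
    if preferred in supported_by_consumer:
        return preferred
    # ... but the consumer's supported input precisions (the customizable
    # precisionsOnActivations parameter) win; the dequantization Subtract
    # (zero point) then compensates for the interval shift.
    return supported_by_consumer[0]

# Signed intervals, but the Convolution below accepts only u8 on activations:
assert select_precision(-1.28, 1.27, ["u8"]) == "u8"
```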
+
+### Original model
+![Original model](img/pipeline1/actual.svg)
+
+### Markup precisions
+![Markup precisions result](img/pipeline1/step1_markup_precisions.svg)
+
+### Markup AvgPool precisions (CPU/GPU specific)
+![Markup AvgPool precisions (CPU/GPU specific) result](img/pipeline1/step2_markup_avg_pool_precisions.svg)
+
+### Propagate precisions
+![Propagate precisions result](img/pipeline1/step3_propagate_precisions.svg)
+
+### Transformations
+![Transformations result](img/pipeline1/transformed.svg)
+
+## Pipeline #2: Concat per-tensor quantization
+[NOT UPDATED]
+Features:
+1. The output intervals of the `FakeQuantize` operations on activations are signed, so the default precision would be `signed int8`, which is not supported by the `Convolution` below.
+2. The `FakeQuantize` operations on activations have different output intervals, which will be aligned.
+3. Quantize and dequantize operations on activations are represented by one `FakeQuantize` operation.
+4. Quantize and dequantize operations on weights are represented by one `FakeQuantize` operation.
+5. There is no `FakeQuantize` between `AvgPool` and `Convolution`.
+6. `Convolution` weights are quantized.
+
+> TODO: the `Convolution` operation defines the `ConcatTransformation` behavior for each plugin, and the behavior is not configurable.
+
+> TODO: if `Convolution` is not quantized then `FakeQuantize` operations are not aligned <= WIP: the `MarkupPrecisions` transformation checks whether each operation is quantized and adds empty [input [port]] requirements if the operation is not quantized.
+> TODO: if `ConvolutionTransformation` is skipped ([input [port]] requirements are empty) then `FakeQuantize` operations are not aligned <= WIP
+> TODO: if the `Convolution` operation doesn't exist then `FakeQuantize` operations are not aligned <= WIP
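The interval alignment mentioned in feature 2 can be sketched numerically. The interval values and the `align_intervals` helper are hypothetical; in LPT the alignment is driven by the markup attributes:

```python
def align_intervals(intervals):
    # Per-tensor Concat quantization needs one shared scale, so each branch
    # FakeQuantize is re-targeted to the union of all output intervals.
    low = min(l for l, _ in intervals)
    high = max(h for _, h in intervals)
    return low, high

branches = [(0.0, 2.55), (0.0, 1.275)]
low, high = align_intervals(branches)
shared_scale = (high - low) / 255  # one scale for the whole Concat result

assert (low, high) == (0.0, 2.55)
assert abs(shared_scale - 0.01) < 1e-12
```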
+
+### Original model
+![Original model](img/pipeline2/actual.svg)
+
+### Markup precisions
+![Markup precisions result](img/pipeline2/step1_markup_precisions.svg)
+
+### Markup AvgPool precisions (CPU/GPU specific)
+![Markup AvgPool precisions (CPU/GPU specific) result](img/pipeline2/step2_markup_avg_pool_precisions.svg)
+
+### Propagate precisions
+![Propagate precisions result](img/pipeline2/step3_propagate_precisions.svg)
+
+### Align concat quantization
+![Align concat quantization result](img/pipeline2/step4_align_concat_quantization.svg)
+
+### Transformations
+![Transformations result](img/pipeline2/transformed.svg)
+
+## Pipeline #3: Concat multi-channels quantization
+[NOT UPDATED]
+Features:
+1. Quantize and dequantize operations on activations are represented by one `FakeQuantize` operation.
+2. There is no `FakeQuantize` between `AvgPool` and `Result`.
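In contrast to per-tensor quantization, multi-channel quantization keeps each branch's own intervals and moves the differences into a per-channel dequantization `Multiply` after `Concat`. A numeric sketch with assumed interval and channel values:

```python
levels = 256
# Each Concat input keeps its own output interval (no alignment needed):
branch_intervals = [(0.0, 2.55), (0.0, 1.275)]
branch_channels = [2, 2]

# The dequantization Multiply after Concat carries one scale per channel:
scales = []
for (low, high), channels in zip(branch_intervals, branch_channels):
    scales.extend([(high - low) / (levels - 1)] * channels)

expected = [0.01, 0.01, 0.005, 0.005]
assert all(abs(s - e) < 1e-12 for s, e in zip(scales, expected))
```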
+
+### Original model
+![Original model](img/pipeline3/actual.svg)
+
+### Markup precisions
+![Markup precisions result](img/pipeline3/step1_markup_precisions.svg)
+
+### Markup AvgPool precisions (CPU/GPU specific)
+![Markup AvgPool precisions (CPU/GPU specific) result](img/pipeline3/step2_markup_avg_pool_precisions.svg)
+
+### Propagate precisions
+![Propagate precisions result](img/pipeline3/step3_propagate_precisions.svg)
+
+### Align concat quantization
+![Align concat quantization result](img/pipeline3/step4_align_concat_quantization.svg)
+
+### Transformations
+![Transformations result](img/pipeline3/transformed.svg)
+
+## Pipeline #4: FakeQuantize connects neighbor cascade Concat operations
+Features:
+1. Quantize and dequantize operations on activations are represented by one `FakeQuantize` operation.
+2. There is a `FakeQuantize` between two `Concat` subgraphs: the first uses multi-channel quantization, the second uses per-tensor quantization.
+
+> Source: `ConcatWithNeighborsWithConvolutionTransformation` functional test.
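Whether each `Concat` subgraph ends up per-tensor or multi-channel depends on its consumers. The sketch below is an assumed simplification of that decision; `concat_mode` is a hypothetical helper, not an LPT API:

```python
def concat_mode(consumer_types, per_tensor_restricted_types):
    # A Concat subgraph falls back to per-tensor quantization when a consumer
    # with a per-tensor input restriction (e.g. Convolution) is reachable;
    # otherwise multi-channel quantization is kept.
    if any(t in per_tensor_restricted_types for t in consumer_types):
        return "per-tensor"
    return "multi-channel"

restricted = {"Convolution"}
assert concat_mode(["MaxPool", "Result"], restricted) == "multi-channel"
assert concat_mode(["Convolution"], restricted) == "per-tensor"
```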
+
+### Original model
+![Original model](img/pipeline4/actual.svg)
+
+### Markup precisions
+![Markup precisions result](img/pipeline4/step1_markup_precisions.svg)
+
+### Markup AvgPool precisions (CPU/GPU specific)
+![Markup AvgPool precisions (CPU/GPU specific) result](img/pipeline4/step2_markup_avg_pool_precisions.svg)
+
+### Propagate precisions
+![Propagate precisions result](img/pipeline4/step3_propagate_precisions.svg)
+
+### Align concat intervals
+![Align concat intervals result](img/pipeline4/step4_align_concat_intervals.svg)
+
+### Align concat quantization
+![Align concat quantization result](img/pipeline4/step5_align_concat_quantization.svg)
+
+### Transformations
+![Transformations result](img/pipeline4/transformed.svg)
+
+## Pipeline #5: AvgPool precision propagation
+
+Features:
+1. There is a `FakeQuantize` after `AvgPool`.
+
+> Source: `MarkupAvgPoolPrecisionsTransformation` functional test.
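Why `AvgPool` can be precision preserved follows from linearity: the dequantization `Subtract` and `Multiply` commute with averaging, so they can be moved behind the pooling. A small numeric check with assumed values:

```python
def dequantize(q, scale, zero_point):
    # Convert -> Subtract -> Multiply
    return (float(q) - zero_point) * scale

q_values = [10, 20, 30, 40]      # low precision AvgPool inputs
scale, zero_point = 0.01, 5.0

avg_then_dequantize = dequantize(sum(q_values) / len(q_values), scale, zero_point)
dequantize_then_avg = sum(dequantize(q, scale, zero_point) for q in q_values) / len(q_values)

assert abs(avg_then_dequantize - dequantize_then_avg) < 1e-12
```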
+
+### Original model
+![Original model](img/pipeline5/actual.svg)
+
+### Markup precisions
+![Markup precisions result](img/pipeline5/step1_markup_precisions.svg)
+
+### Markup AvgPool precisions (CPU/GPU specific)
+![Markup AvgPool precisions (CPU/GPU specific) result](img/pipeline5/step2_markup_avg_pool_precisions.svg)
+
+### Propagate precisions
+![Propagate precisions result](img/pipeline5/step3_propagate_precisions.svg)
+
+### Align concat quantization
+![Align concat quantization result](img/pipeline5/step4_align_concat_quantization.svg)
+
+### Transformations
+![Transformations result](img/pipeline5/transformed.svg)
\ No newline at end of file
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/actual.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/actual.svg
new file mode 100644
index 00000000000000..f3b33203eb1fab
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/actual.svg
@@ -0,0 +1,277 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/step1_markup_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/step1_markup_precisions.svg
new file mode 100644
index 00000000000000..8dc984562eab82
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/step1_markup_precisions.svg
@@ -0,0 +1,283 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/step2_markup_avg_pool_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/step2_markup_avg_pool_precisions.svg
new file mode 100644
index 00000000000000..a4aa50854eed7a
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/step2_markup_avg_pool_precisions.svg
@@ -0,0 +1,283 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/step3_propagate_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/step3_propagate_precisions.svg
new file mode 100644
index 00000000000000..2a7e6bc8b7661e
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/step3_propagate_precisions.svg
@@ -0,0 +1,283 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/transformed.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/transformed.svg
new file mode 100644
index 00000000000000..116422af1c3623
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/transformed.svg
@@ -0,0 +1,266 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/actual.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/actual.svg
new file mode 100644
index 00000000000000..b59be6f9db9103
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/actual.svg
@@ -0,0 +1,433 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step1_markup_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step1_markup_precisions.svg
new file mode 100644
index 00000000000000..39726b470b6f5d
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step1_markup_precisions.svg
@@ -0,0 +1,436 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step2_markup_avg_pool_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step2_markup_avg_pool_precisions.svg
new file mode 100644
index 00000000000000..e14309c2223cdb
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step2_markup_avg_pool_precisions.svg
@@ -0,0 +1,438 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step3_propagate_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step3_propagate_precisions.svg
new file mode 100644
index 00000000000000..4c7e2aa42dbfa3
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step3_propagate_precisions.svg
@@ -0,0 +1,438 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step4_align_concat_quantization.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step4_align_concat_quantization.svg
new file mode 100644
index 00000000000000..2251aed4549453
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step4_align_concat_quantization.svg
@@ -0,0 +1,446 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/transformed.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/transformed.svg
new file mode 100644
index 00000000000000..5a1c6f9410e202
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/transformed.svg
@@ -0,0 +1,428 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/actual.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/actual.svg
new file mode 100644
index 00000000000000..1e41e6b4bfec31
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/actual.svg
@@ -0,0 +1,308 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step1_markup_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step1_markup_precisions.svg
new file mode 100644
index 00000000000000..e040f632bdb24c
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step1_markup_precisions.svg
@@ -0,0 +1,311 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step2_markup_avg_pool_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step2_markup_avg_pool_precisions.svg
new file mode 100644
index 00000000000000..e9eb45b0573758
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step2_markup_avg_pool_precisions.svg
@@ -0,0 +1,313 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step3_propagate_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step3_propagate_precisions.svg
new file mode 100644
index 00000000000000..e9eb45b0573758
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step3_propagate_precisions.svg
@@ -0,0 +1,313 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step4_align_concat_quantization.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step4_align_concat_quantization.svg
new file mode 100644
index 00000000000000..74bd0d8bdef982
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step4_align_concat_quantization.svg
@@ -0,0 +1,320 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/transformed.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/transformed.svg
new file mode 100644
index 00000000000000..7a134cb8347eb0
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/transformed.svg
@@ -0,0 +1,374 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/actual.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/actual.svg
new file mode 100644
index 00000000000000..0c17e5a9f17f06
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/actual.svg
@@ -0,0 +1,480 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step1_markup_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step1_markup_precisions.svg
new file mode 100644
index 00000000000000..34dee1151691f9
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step1_markup_precisions.svg
@@ -0,0 +1,483 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step2_markup_avg_pool_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step2_markup_avg_pool_precisions.svg
new file mode 100644
index 00000000000000..34dee1151691f9
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step2_markup_avg_pool_precisions.svg
@@ -0,0 +1,483 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step3_propagate_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step3_propagate_precisions.svg
new file mode 100644
index 00000000000000..a0c2c786afbc3f
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step3_propagate_precisions.svg
@@ -0,0 +1,486 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step4_align_concat_intervals.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step4_align_concat_intervals.svg
new file mode 100644
index 00000000000000..986e7a58fb21aa
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step4_align_concat_intervals.svg
@@ -0,0 +1,493 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step5_align_concat_quantization.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step5_align_concat_quantization.svg
new file mode 100644
index 00000000000000..6444edbc0be3c2
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step5_align_concat_quantization.svg
@@ -0,0 +1,496 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/transformed.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/transformed.svg
new file mode 100644
index 00000000000000..29cda07e2bddef
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/transformed.svg
@@ -0,0 +1,574 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/actual.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/actual.svg
new file mode 100644
index 00000000000000..37ea9e70967c45
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/actual.svg
@@ -0,0 +1,391 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step1_markup_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step1_markup_precisions.svg
new file mode 100644
index 00000000000000..e439081e90242e
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step1_markup_precisions.svg
@@ -0,0 +1,395 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step2_markup_avg_pool_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step2_markup_avg_pool_precisions.svg
new file mode 100644
index 00000000000000..4a7df2b0890243
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step2_markup_avg_pool_precisions.svg
@@ -0,0 +1,400 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step3_propagate_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step3_propagate_precisions.svg
new file mode 100644
index 00000000000000..112c0d17f5fe91
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step3_propagate_precisions.svg
@@ -0,0 +1,400 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step4_align_concat_quantization.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step4_align_concat_quantization.svg
new file mode 100644
index 00000000000000..95fde030f25459
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step4_align_concat_quantization.svg
@@ -0,0 +1,413 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/transformed.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/transformed.svg
new file mode 100644
index 00000000000000..a4482fa54523f8
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/transformed.svg
@@ -0,0 +1,397 @@
+
+
+
+
+
diff --git a/inference-engine/src/cldnn_engine/cldnn_engine.cpp b/inference-engine/src/cldnn_engine/cldnn_engine.cpp
index 4aa53beb1e5a86..956d5b641dac66 100644
--- a/inference-engine/src/cldnn_engine/cldnn_engine.cpp
+++ b/inference-engine/src/cldnn_engine/cldnn_engine.cpp
@@ -69,8 +69,8 @@
#include
#include
#include
-#include
#include
+#include
#include
#include
#include
@@ -147,7 +147,7 @@ InferenceEngine::CNNNetwork clDNNEngine::CloneAndTransformNetwork(const Inferenc
bool enableInt8;
{
ngraph::pass::Manager manager;
- enableInt8 = config.enableInt8 && ngraph::pass::low_precision::LowPrecisionTransformer::isFunctionQuantized(nGraphFunc);
+ enableInt8 = config.enableInt8 && ngraph::pass::low_precision::LowPrecision::isFunctionQuantized(nGraphFunc);
if (enableInt8) {
manager.register_pass(
std::vector{ ngraph::element::i8, ngraph::element::u8, ngraph::element::i4, ngraph::element::u4 });
@@ -367,28 +367,28 @@ InferenceEngine::CNNNetwork clDNNEngine::CloneAndTransformNetwork(const Inferenc
if (!config.enable_fp16_for_quantized_models) {
manager.register_pass(precisions_array {{ ngraph::element::f16, ngraph::element::f32 }});
}
- auto lptPrerequisites = manager.register_pass();
- const std::vector supportedTypes = { ngraph::element::i8, ngraph::element::u8 };
- lptPrerequisites->add_matcher(supportedTypes);
- lptPrerequisites->add_matcher(supportedTypes);
- lptPrerequisites->add_matcher();
- manager.run_passes(nGraphFunc);
- auto params = LayerTransformation::Params(true, // updatePrecisions
- LayerTransformation::QuantizedTensorAlignment::UpdateLevel, // quantizedTensorAlignmentOnActivations
- LayerTransformation::QuantizedTensorAlignment::None, // quantizedTensorAlignmentOnWeights
- true); // supportAsymmetricQuantization
- LowPrecisionTransformer transformer(LowPrecisionTransformer::getAllTransformations(params)
- .add(LayerTransformation::Params(params)
- .setSupportAsymmetricQuantization(false)
- .setSupport3DTensorOnActivations(false))
- .add(LayerTransformation::Params(params)
- .setSupportAsymmetricQuantization(false)
- .setDeconvolutionSpecificChannelsRatio(true))
- // INT8 StridedSlice not supported
- .remove());
-
- transformer.transform(nGraphFunc);
+ // TODO: LPT: not implemented:
+ // - supportAsymmetricQuantization
+ // - support3DTensorOnActivations
+ // - deconvolutionSpecificChannelsRatio
+
+ auto supportedPrecisions = std::vector<OperationPrecisionRestriction>({
+ OperationPrecisionRestriction::create({})
+ });
+
+ auto perTensorQuantization = std::vector<OperationPerTensorQuantizationRestriction>({
+ OperationPerTensorQuantizationRestriction::create({0}),
+ OperationPerTensorQuantizationRestriction::create({0})
+ });
+
+ ngraph::pass::Manager lptManager;
+
+ auto lptPassConfig = lptManager.get_pass_config();
+ lptPassConfig->disable();
+
+ lptManager.register_pass<ngraph::pass::low_precision::LowPrecision>(supportedPrecisions, perTensorQuantization);
+ lptManager.run_passes(nGraphFunc);
}
{
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/add.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/add.hpp
index 274b61e8fb873a..ede29a848cc191 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/add.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/add.hpp
@@ -13,9 +13,8 @@ namespace low_precision {
class LP_TRANSFORMATIONS_API AddTransformation : public EltwiseBaseTransformation {
public:
- AddTransformation(const Params& params) : EltwiseBaseTransformation(params) {}
- ~AddTransformation() override {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ AddTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr layer) const override;
};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/align_quantization_intervals.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/align_quantization_intervals.hpp
new file mode 100644
index 00000000000000..ea7bf1d4a43eab
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/align_quantization_intervals.hpp
@@ -0,0 +1,34 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include
+
+#include
+#include
+
+#include
+#include
+#include
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+class LP_TRANSFORMATIONS_API AlignQuantizationIntervals;
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
+
+class ngraph::pass::low_precision::AlignQuantizationIntervals : public ngraph::pass::FunctionPass {
+public:
+ NGRAPH_RTTI_DECLARATION;
+ AlignQuantizationIntervals(LayerTransformation::Params params = LayerTransformation::Params());
+ bool run_on_function(std::shared_ptr<ngraph::Function> f) override;
+
+protected:
+ LayerTransformation::Params params;
+};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/align_quantization_parameters.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/align_quantization_parameters.hpp
new file mode 100644
index 00000000000000..c02836f29972fa
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/align_quantization_parameters.hpp
@@ -0,0 +1,34 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include
+
+#include
+#include
+
+#include
+#include
+#include
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+class LP_TRANSFORMATIONS_API AlignQuantizationParameters;
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
+
+class ngraph::pass::low_precision::AlignQuantizationParameters : public ngraph::pass::FunctionPass {
+public:
+ NGRAPH_RTTI_DECLARATION;
+ AlignQuantizationParameters(LayerTransformation::Params params = LayerTransformation::Params());
+ bool run_on_function(std::shared_ptr<ngraph::Function> f) override;
+
+protected:
+ LayerTransformation::Params params;
+};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/avg_pool.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/avg_pool.hpp
index 989d772f4fa2fd..1733ac0ed7c4f4 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/avg_pool.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/avg_pool.hpp
@@ -13,8 +13,8 @@ namespace low_precision {
class LP_TRANSFORMATIONS_API AvgPoolTransformation : public LayerTransformation {
public:
- AvgPoolTransformation(const Params& params);
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ AvgPoolTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool isPrecisionPreserved(std::shared_ptr layer) const noexcept override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr layer) const override;
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/base_matcher_pass.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/base_matcher_pass.hpp
new file mode 100644
index 00000000000000..4c637624e40f3d
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/base_matcher_pass.hpp
@@ -0,0 +1,24 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+#include
+#include
+#include "rt_info/attribute_parameters.hpp"
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+class LP_TRANSFORMATIONS_API BaseMatcherPass;
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
+
+class LP_TRANSFORMATIONS_API ngraph::pass::low_precision::BaseMatcherPass : public ngraph::pass::MatcherPass {
+public:
+ BaseMatcherPass(const AttributeParameters& params = AttributeParameters());
+ AttributeParameters params;
+};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/clamp.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/clamp.hpp
index d3b60802426736..0e62b0b645e296 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/clamp.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/clamp.hpp
@@ -14,8 +14,8 @@ namespace low_precision {
class LP_TRANSFORMATIONS_API ClampTransformation : public LayerTransformation {
public:
- ClampTransformation(const Params& params);
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ ClampTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher& m) const override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr op) const override;
bool isPrecisionPreserved(std::shared_ptr layer) const noexcept override;
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/common/fake_quantize_dequantization.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/common/fake_quantize_dequantization.hpp
index 67c522bb7e3fcf..a9fba5234d1846 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/common/fake_quantize_dequantization.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/common/fake_quantize_dequantization.hpp
@@ -8,6 +8,7 @@
#include
#include
#include
+#include
namespace ngraph {
namespace pass {
@@ -15,7 +16,7 @@ namespace low_precision {
typedef std::tuple, std::shared_ptr> FakeQuantizeDequantizationValues;
-class FakeQuantizeDequantization {
+class LP_TRANSFORMATIONS_API FakeQuantizeDequantization {
public:
FakeQuantizeDequantization();
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/common/operation_per_tensor_quantization_restriction.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/common/operation_per_tensor_quantization_restriction.hpp
new file mode 100644
index 00000000000000..4c5321b26bef99
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/common/operation_per_tensor_quantization_restriction.hpp
@@ -0,0 +1,56 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include
+
+#include
+#include
+
+#include
+#include
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+class OperationPerTensorQuantizationRestriction {
+public:
+ using RestrictedPorts = std::vector<size_t>;
+
+ ngraph::Node::type_info_t operationType;
+ bool specifyVersion;
+ RestrictedPorts restrictedPorts;
+
+ OperationPerTensorQuantizationRestriction() = default;
+ OperationPerTensorQuantizationRestriction(
+ const ngraph::Node::type_info_t operationType,
+ const bool specifyVersion,
+ const RestrictedPorts& restrictedPorts) :
+ operationType(operationType),
+ specifyVersion(specifyVersion),
+ restrictedPorts(restrictedPorts) {}
+
+ template
+ static OperationPerTensorQuantizationRestriction create(
+ const RestrictedPorts& restrictedPorts = {},
+ const bool specifyVersion = false) {
+ return OperationPerTensorQuantizationRestriction(T::get_type_info_static(), specifyVersion, restrictedPorts);
+ }
+
+ template
+ static RestrictedPorts getPrecisionsByOperationType(std::vector& restrictions) {
+ for (const auto& restriction : restrictions) {
+ if (restriction.operationType == T::get_type_info_static()) {
+ return restriction.restrictedPorts;
+ }
+ }
+ return {};
+ }
+};
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/common/operation_precision_restriction.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/common/operation_precision_restriction.hpp
new file mode 100644
index 00000000000000..d22252ee7afd88
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/common/operation_precision_restriction.hpp
@@ -0,0 +1,59 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include <memory>
+#include <string>
+#include <unordered_set>
+#include <vector>
+
+#include <ngraph/node.hpp>
+#include <ngraph/variant.hpp>
+
+#include <low_precision/lpt_visibility.hpp>
+#include <ngraph/pass/graph_rewrite.hpp>
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+class OperationPrecisionRestriction {
+public:
+ using PrecisionsByPort = std::vector<std::pair<size_t, std::vector<ngraph::element::Type>>>;
+
+ ngraph::Node::type_info_t operationType;
+ bool specifyVersion;
+ std::vector<std::pair<size_t, std::vector<ngraph::element::Type>>> precisionsByPort;
+
+ OperationPrecisionRestriction() = default;
+ OperationPrecisionRestriction(
+ const ngraph::Node::type_info_t operationType,
+ const bool specifyVersion,
+ const PrecisionsByPort& precisionsByPort) :
+ operationType(operationType),
+ specifyVersion(specifyVersion),
+ precisionsByPort(precisionsByPort) {}
+
+ template <typename T>
+ static OperationPrecisionRestriction create(
+ const PrecisionsByPort& precisionsByPort,
+ const bool specifyVersion = false) {
+ return OperationPrecisionRestriction(T::get_type_info_static(), specifyVersion, precisionsByPort);
+ }
+
+ template <typename T>
+ static PrecisionsByPort getPrecisionsByOperationType(std::vector<OperationPrecisionRestriction>& restrictions) {
+ for (const auto& restriction : restrictions) {
+ if (restriction.operationType == T::get_type_info_static()) {
+ return restriction.precisionsByPort;
+ }
+ }
+ return {};
+ }
+};
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/concat.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/concat.hpp
index 67bfa48226df5e..41d4f458c8f0a4 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/concat.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/concat.hpp
@@ -22,9 +22,8 @@ namespace low_precision {
class LP_TRANSFORMATIONS_API ConcatTransformation : public LayerTransformation {
public:
- ConcatTransformation(const Params& params) : LayerTransformation(params) {}
- ~ConcatTransformation() override {};
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ ConcatTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool isPrecisionPreserved(std::shared_ptr<Node> layer) const noexcept override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr<Node> layer) const override;
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/concat_multi_channels.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/concat_multi_channels.hpp
deleted file mode 100644
index 48c0a0ef9eaa5f..00000000000000
--- a/inference-engine/src/low_precision_transformations/include/low_precision/concat_multi_channels.hpp
+++ /dev/null
@@ -1,51 +0,0 @@
-// Copyright (C) 2018-2021 Intel Corporation
-// SPDX-License-Identifier: Apache-2.0
-//
-
-#pragma once
-
-#include <memory>
-#include <string>
-#include <vector>
-
-#include <ngraph/ngraph.hpp>
-
-#include "concat.hpp"
-#include "common/subgraph.hpp"
-#include "common/fake_quantize_dequantization.hpp"
-
-namespace ngraph {
-namespace pass {
-namespace low_precision {
-
-class TRANSFORMATIONS_API ConcatMultiChannelsTransformation : public ConcatTransformation {
-public:
- ConcatMultiChannelsTransformation(const Params& params) : ConcatTransformation(params) {}
- ~ConcatMultiChannelsTransformation() override {};
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
- bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
- bool isPrecisionPreserved(std::shared_ptr<Node> layer) const noexcept override;
-
-private:
- // Go through the parent elements of the layer and fill dequantization collection
- // with Dq operations that should be inserted before the layer.
- void fillDequantization(
- const std::shared_ptr<ngraph::Node> layer,
- const std::unordered_map<std::string, FakeQuantizeDequantization>& dequantizationByFakeQuantize,
- std::vector<FakeQuantizeDequantization>& dequantization) const;
-
- FakeQuantizeDequantization getConcatenatedDequantization(
- const std::shared_ptr<ngraph::opset1::Concat> concat,
- const std::vector<FakeQuantizeDequantization>& dequantization) const;
-
- static FakeQuantizeDequantization getFoldedDequantization(
- const std::shared_ptr<ngraph::Node> operation,
- const FakeQuantizeDequantization& dequantization,
- const size_t sourceOutputIdx);
-
- bool isMultiChannel(const std::vector<std::shared_ptr<ngraph::opset1::Concat>>& concatLayers) const noexcept;
-};
-
-} // namespace low_precision
-} // namespace pass
-} // namespace ngraph
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/convert.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/convert.hpp
index 7154590ee9d1d6..415830c47a2f90 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/convert.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/convert.hpp
@@ -13,9 +13,8 @@ namespace low_precision {
class LP_TRANSFORMATIONS_API ConvertTransformation : public LayerTransformation {
public:
- ConvertTransformation(const Params& params) : LayerTransformation(params) {}
- ~ConvertTransformation() override {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ ConvertTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool isPrecisionPreserved(std::shared_ptr<Node> layer) const noexcept override;
};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/convolution.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/convolution.hpp
index 68a160b5f971a9..86df8937727643 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/convolution.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/convolution.hpp
@@ -13,8 +13,8 @@ namespace low_precision {
class LP_TRANSFORMATIONS_API ConvolutionTransformation : public WeightableLayerTransformation {
public:
- ConvolutionTransformation(const Params& params);
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ ConvolutionTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool isQuantized(std::shared_ptr<Node> layer) const noexcept override;
};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/convolution_backprop_data.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/convolution_backprop_data.hpp
index 176dd44d3dc8ad..880c30a1c67943 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/convolution_backprop_data.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/convolution_backprop_data.hpp
@@ -13,8 +13,8 @@ namespace low_precision {
class LP_TRANSFORMATIONS_API ConvolutionBackpropDataTransformation : public WeightableLayerTransformation {
public:
- ConvolutionBackpropDataTransformation(const Params& params);
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ ConvolutionBackpropDataTransformation(const Params& params = Params());
+ //void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr<Node> op) const override;
bool isQuantized(std::shared_ptr<Node> layer) const noexcept override;
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/create_attribute.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/create_attribute.hpp
new file mode 100644
index 00000000000000..7162cbbd63f92b
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/create_attribute.hpp
@@ -0,0 +1,63 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include <memory>
+#include <vector>
+
+#include <ngraph/node.hpp>
+#include <ngraph/variant.hpp>
+#include <ngraph/pattern/op/wrap_type.hpp>
+
+#include <low_precision/lpt_visibility.hpp>
+#include <ngraph/pass/graph_rewrite.hpp>
+#include "base_matcher_pass.hpp"
+#include "network_helper.hpp"
+#include "lpt_itt.hpp"
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+template <typename AttributeType, typename OperationType>
+class CreateAttribute;
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
+
+enum class AttributeSource {
+ Node,
+ OutputPort
+};
+
+template <typename AttributeType, typename OperationType>
+class ngraph::pass::low_precision::CreateAttribute : public ngraph::pass::low_precision::BaseMatcherPass {
+public:
+ CreateAttribute(const AttributeSource source = AttributeSource::Node) {
+ assert((source == AttributeSource::Node) || (source == AttributeSource::OutputPort));
+ auto operation = std::is_same<OperationType, pattern::op::Label>::value ?
+ std::make_shared<pattern::op::Label>(element::f32, Shape{}, [](std::shared_ptr<Node> n) { return true; }) :
+ pattern::wrap_type<OperationType>();
+
+ ngraph::graph_rewrite_callback callback = [&](pattern::Matcher& m) {
+ auto op = m.get_match_root();
+ if (!op || transformation_callback(op)) {
+ return false;
+ }
+ {
+ OV_ITT_SCOPE(FIRST_INFERENCE, itt::domains::LPT_LT, "CreateAttribute");
+ const auto attribute = ngraph::VariantWrapper<AttributeType>::create(op, params);
+ if (attribute == nullptr) {
+ return false;
+ }
+ }
+ return true;
+ };
+
+ auto matcher = std::make_shared(operation, "CreateAttribute");
+ this->register_matcher(matcher, callback);
+ }
+};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/create_precisions_dependent_attribute.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/create_precisions_dependent_attribute.hpp
new file mode 100644
index 00000000000000..035b53ed8b89eb
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/create_precisions_dependent_attribute.hpp
@@ -0,0 +1,70 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include <memory>
+#include <vector>
+
+#include <ngraph/node.hpp>
+#include <ngraph/variant.hpp>
+#include <ngraph/pattern/op/wrap_type.hpp>
+
+#include <ngraph/pass/graph_rewrite.hpp>
+#include <ngraph/opsets/opset1.hpp>
+#include <low_precision/lpt_visibility.hpp>
+#include "rt_info/precision_preserved_attribute.hpp"
+#include "network_helper.hpp"
+#include "lpt_itt.hpp"
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+template <typename AttributeType, typename OperationType>
+class CreatePrecisionsDependentAttribute;
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
+
+template <typename AttributeType, typename OperationType>
+class ngraph::pass::low_precision::CreatePrecisionsDependentAttribute : public ngraph::pass::MatcherPass {
+public:
+ CreatePrecisionsDependentAttribute() {
+ auto operation = pattern::wrap_type<OperationType>();
+
+ ngraph::graph_rewrite_callback callback = [&](pattern::Matcher& m) {
+ auto node = m.get_match_root();
+ if (!node || transformation_callback(node)) {
+ return false;
+ }
+
+ {
+ OV_ITT_SCOPE(FIRST_INFERENCE, itt::domains::LPT_LT, "CreatePrecisionsDependentAttribute");
+ auto &rt = node->get_rt_info();
+
+ const auto precisionPreservedAttribute = std::make_shared<ngraph::VariantWrapper<PrecisionPreservedAttributePtr>>(
+ std::make_shared<PrecisionPreservedAttribute>(false));
+ rt[ngraph::VariantWrapper<PrecisionPreservedAttributePtr>::type_info.name] = precisionPreservedAttribute;
+ const auto &targetSharedValue = precisionPreservedAttribute->get()->sharedValue;
+
+ const auto attribute = std::make_shared<ngraph::VariantWrapper<std::shared_ptr<AttributeType>>>(
+ std::make_shared<AttributeType>());
+ rt[ngraph::VariantWrapper<std::shared_ptr<AttributeType>>::type_info.name] = attribute;
+
+ ngraph::pass::low_precision::NetworkHelper::reassign(
+ targetSharedValue,
+ {
std::dynamic_pointer_cast<SharedValueAttribute>(attribute->get()),
std::dynamic_pointer_cast<SharedValueAttribute>(precisionPreservedAttribute->get())
+ });
+ }
+ return true;
+ };
+
+ auto matcher = std::make_shared(operation, "CreateAttribute");
+ this->register_matcher(matcher, callback);
+ }
+};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/depth_to_space.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/depth_to_space.hpp
index 9159743fd3fc48..640ce710eada29 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/depth_to_space.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/depth_to_space.hpp
@@ -12,10 +12,9 @@ namespace low_precision {
class LP_TRANSFORMATIONS_API DepthToSpaceTransformation : public TransparentBaseTransformation {
public:
- DepthToSpaceTransformation(const Params& params) : TransparentBaseTransformation(params) {}
- ~DepthToSpaceTransformation() override {}
+ NGRAPH_RTTI_DECLARATION;
+ DepthToSpaceTransformation(const Params& params = Params());
bool transform(TransformationContext &context, ngraph::pattern::Matcher &m) const override;
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
bool isPrecisionPreserved(std::shared_ptr<Node> layer) const noexcept override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr<Node> layer) const override;
};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/fake_quantize.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/fake_quantize.hpp
index 49bc3fa2f9ee44..69fb8159dbc6d5 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/fake_quantize.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/fake_quantize.hpp
@@ -15,8 +15,8 @@ namespace low_precision {
class LP_TRANSFORMATIONS_API FakeQuantizeTransformation : public LayerTransformation {
public:
- FakeQuantizeTransformation(const Params& params) : LayerTransformation(params) {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ FakeQuantizeTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool isPrecisionPreserved(std::shared_ptr<Node> layer) const noexcept override;
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/fake_quantize_decomposition.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/fake_quantize_decomposition.hpp
index ef1de6bdd669fb..b1ee411e75d0dc 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/fake_quantize_decomposition.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/fake_quantize_decomposition.hpp
@@ -15,8 +15,8 @@ namespace low_precision {
class LP_TRANSFORMATIONS_API FakeQuantizeDecompositionTransformation : public LayerTransformation {
public:
- FakeQuantizeDecompositionTransformation(const Params& params) : LayerTransformation(params) {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ FakeQuantizeDecompositionTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool isPrecisionPreserved(std::shared_ptr<Node> layer) const noexcept override;
};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/fold_convert.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/fold_convert.hpp
index b371935dfeed99..2f8aa07ed134a6 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/fold_convert.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/fold_convert.hpp
@@ -14,9 +14,8 @@ namespace low_precision {
class LP_TRANSFORMATIONS_API FoldConvertTransformation : public LayerTransformation {
public:
- FoldConvertTransformation(const Params& params) : LayerTransformation(params) {}
- ~FoldConvertTransformation() override {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ FoldConvertTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr<Node> layer) const override;
bool isPrecisionPreserved(std::shared_ptr<Node> layer) const noexcept override;
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/fold_fake_quantize.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/fold_fake_quantize.hpp
new file mode 100644
index 00000000000000..921aa82aab97ab
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/fold_fake_quantize.hpp
@@ -0,0 +1,25 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include <memory>
+#include "low_precision/layer_transformation.hpp"
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+class LP_TRANSFORMATIONS_API FoldFakeQuantizeTransformation : public LayerTransformation {
+public:
+ NGRAPH_RTTI_DECLARATION;
+ FoldFakeQuantizeTransformation(const Params& params = Params());
+ bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
+ bool canBeTransformed(const TransformationContext& context, std::shared_ptr<Node> layer) const override;
+ bool isPrecisionPreserved(std::shared_ptr<Node> layer) const noexcept override;
+};
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/fuse_convert.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/fuse_convert.hpp
index 866a2633cb04a7..441734310a89ad 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/fuse_convert.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/fuse_convert.hpp
@@ -14,9 +14,8 @@ namespace low_precision {
class LP_TRANSFORMATIONS_API FuseConvertTransformation : public LayerTransformation {
public:
- FuseConvertTransformation(const Params& params) : LayerTransformation(params) {}
- ~FuseConvertTransformation() override {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ FuseConvertTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr<Node> layer) const override;
bool isPrecisionPreserved(std::shared_ptr<Node> layer) const noexcept override;
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/fuse_fake_quantize.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/fuse_fake_quantize.hpp
index b3e263d3200d21..3b29ea02cf0bc0 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/fuse_fake_quantize.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/fuse_fake_quantize.hpp
@@ -14,9 +14,8 @@ namespace low_precision {
class LP_TRANSFORMATIONS_API FuseFakeQuantizeTransformation : public LayerTransformation {
public:
- FuseFakeQuantizeTransformation(const Params& params) : LayerTransformation(params) {}
- ~FuseFakeQuantizeTransformation() override {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ FuseFakeQuantizeTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool isPrecisionPreserved(std::shared_ptr<Node> layer) const noexcept override;
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/fuse_multiply_to_fake_quantize.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/fuse_multiply_to_fake_quantize.hpp
index 012bfda2ed309d..6e6e4011db9759 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/fuse_multiply_to_fake_quantize.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/fuse_multiply_to_fake_quantize.hpp
@@ -14,9 +14,8 @@ namespace low_precision {
class LP_TRANSFORMATIONS_API FuseMultiplyToFakeQuantizeTransformation : public LayerTransformation {
public:
- FuseMultiplyToFakeQuantizeTransformation(const Params& params) : LayerTransformation(params) {}
- ~FuseMultiplyToFakeQuantizeTransformation() override {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ FuseMultiplyToFakeQuantizeTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr<Node> layer) const override;
bool isPrecisionPreserved(std::shared_ptr<Node> layer) const noexcept override;
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/fuse_subtract_to_fake_quantize.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/fuse_subtract_to_fake_quantize.hpp
index 907361298b8af4..06da1b56c40ba4 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/fuse_subtract_to_fake_quantize.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/fuse_subtract_to_fake_quantize.hpp
@@ -14,9 +14,8 @@ namespace low_precision {
class LP_TRANSFORMATIONS_API FuseSubtractToFakeQuantizeTransformation : public LayerTransformation {
public:
- FuseSubtractToFakeQuantizeTransformation(const Params& params) : LayerTransformation(params) {}
- ~FuseSubtractToFakeQuantizeTransformation() override {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ FuseSubtractToFakeQuantizeTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr<Node> layer) const override;
bool isPrecisionPreserved(std::shared_ptr<Node> layer) const noexcept override;
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/group_convolution.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/group_convolution.hpp
index 5a5d96b990b617..2ab5766bc13673 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/group_convolution.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/group_convolution.hpp
@@ -13,8 +13,8 @@ namespace low_precision {
class LP_TRANSFORMATIONS_API GroupConvolutionTransformation : public ConvolutionTransformation {
public:
- GroupConvolutionTransformation(const Params& params);
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ GroupConvolutionTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool isQuantized(std::shared_ptr<Node> layer) const noexcept override;
};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/interpolate.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/interpolate.hpp
index 05f69229ecc843..840d09b6a4106a 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/interpolate.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/interpolate.hpp
@@ -12,10 +12,9 @@ namespace low_precision {
class LP_TRANSFORMATIONS_API InterpolateTransformation : public LayerTransformation {
public:
- InterpolateTransformation(const Params& params) : LayerTransformation(params) {}
- ~InterpolateTransformation() override {}
+ NGRAPH_RTTI_DECLARATION;
+ InterpolateTransformation(const Params& params = Params());
bool transform(TransformationContext &context, ngraph::pattern::Matcher &m) const override;
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
bool isPrecisionPreserved(std::shared_ptr<Node> layer) const noexcept override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr<Node> layer) const override;
};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/layer_transformation.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/layer_transformation.hpp
index 06a37ab8b22015..2200f986795904 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/layer_transformation.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/layer_transformation.hpp
@@ -41,7 +41,7 @@ namespace ngraph {
namespace pass {
namespace low_precision {
-class TRANSFORMATIONS_API DataPrecision {
+class LP_TRANSFORMATIONS_API DataPrecision {
public:
DataPrecision() : precision(element::undefined), min(0.f), max(0.f), hasZeroPoint(false) {}
@@ -148,20 +148,27 @@ inline std::ostream &operator << (std::ostream &os, const DataPrecision& value)
}
// Base class for all LP transformations, holds some common data structures
-class TRANSFORMATIONS_API LayerTransformation {
+class LP_TRANSFORMATIONS_API LayerTransformation : public ngraph::pass::MatcherPass {
public:
enum QuantizedTensorAlignment {
None,
UpdateLevel
};
+ // TODO: LPT: not implemented: clean up ngraph::pass::low_precision::LayerTransformation::Params,
+ // use LayerTestsUtils::LayerTransformation::Params type instead:
+ // - quantizedTensorAlignmentOnActivations
+ // - quantizedTensorAlignmentOnWeights
+ // - supportAsymmetricQuantization
+ // - precisionsOnActivations
+ // - precisionsOnWeights
class Params {
public:
Params(
const bool updatePrecisions = true,
const QuantizedTensorAlignment quantizedTensorAlignmentOnActivations = QuantizedTensorAlignment::UpdateLevel,
const QuantizedTensorAlignment quantizedTensorAlignmentOnWeights = QuantizedTensorAlignment::None,
- bool supportAsymmetricQuantization = false,
+ bool supportAsymmetricQuantization = true,
std::vector<element::Type> precisionsOnActivations = { element::u8, element::i8 },
std::vector<element::Type> precisionsOnWeights = { element::i8 },
element::Type deqPrecision = element::f32,
@@ -250,11 +257,12 @@ class TRANSFORMATIONS_API LayerTransformation {
LayerTransformation(const Params& params);
virtual ~LayerTransformation() = default;
- virtual void registerMatcherIn(ngraph::pass::GraphRewrite& pass, TransformationContext& context) const = 0;
virtual bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const = 0;
+ void setParams(const Params& params);
void setParamsManager(IParamsManager* paramsManager) noexcept;
void setLayerTransformationsManager(ILayerTransformationsManager* layerTransformationsManager) noexcept;
+ void setContext(TransformationContext* context) noexcept;
void setUpdatePrecisions(const bool updatePrecisions);
void setQuantizedTensorAlignmentOnActivations(const QuantizedTensorAlignment quantizedTensorAlignmentOnActivations);
@@ -264,16 +272,13 @@ class TRANSFORMATIONS_API LayerTransformation {
void setZeroThreshold(const float value);
void setMinQuantizationLevels(const size_t levels);
- const std::vector<element::Type>& getPrecisionsOnActivations() const;
- const std::vector<element::Type>& getPrecisionsOnWeights() const;
-
virtual bool canBeTransformed(const TransformationContext& context, std::shared_ptr<Node> layer) const;
bool canSubtractBeHandled(const std::shared_ptr<Node>& op, const size_t parentIndex = 0ul) const;
bool canSubtractBeHandled(const std::shared_ptr<Node>& op, const FakeQuantizeDequantization& dequantization) const;
- PrecisionDetails getPrecisionDetails(const QuantizationDetails& quantizationDetails) const;
+ static PrecisionDetails getPrecisionDetails(const QuantizationDetails& quantizationDetails);
// return true if operation can be quantized and false otherwise
// for example: if convolution operation weights are not quantized, then isQuantize returns false and true otherwise
@@ -284,10 +289,11 @@ class TRANSFORMATIONS_API LayerTransformation {
// note: dequantization operations on activations are absent during method execution
virtual bool isPrecisionPreserved(std::shared_ptr<Node> layer) const noexcept = 0;
+ // TODO: LPT: not completed: remove whole method
DataPrecision getDataPrecision(
- std::shared_ptr<Node> layer,
+ const std::shared_ptr<Node>& layer,
const QuantizationDetails& quantizationDetails,
- const bool onWeights) const;
+ const std::vector<element::Type>& precisions) const;
void fillAvailablePrecisions(std::shared_ptr<Node> layer, std::vector<element::Type>& availablePrecisions) const;
@@ -306,8 +312,6 @@ class TRANSFORMATIONS_API LayerTransformation {
QuantizedTensorAlignment quantizedTensorAlignmentOnActivations;
QuantizedTensorAlignment quantizedTensorAlignmentOnWeights;
bool supportAsymmetricQuantization;
- std::vector<element::Type> precisionsOnActivations;
- std::vector<element::Type> precisionsOnWeights;
element::Type deqPrecision;
bool support3DTensorOnActivations;
bool deconvolutionSpecificChannelsRatio;
@@ -321,6 +325,7 @@ class TRANSFORMATIONS_API LayerTransformation {
static const char originalLayerPostfix[];
IParamsManager* paramsManager;
ILayerTransformationsManager* layerTransformationsManager;
+ TransformationContext* context;
protected:
std::shared_ptr<ngraph::Node> moveDequantizationAfter(
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/low_precision.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/low_precision.hpp
new file mode 100644
index 00000000000000..82e026b39cb7cd
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/low_precision.hpp
@@ -0,0 +1,59 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include <memory>
+#include <vector>
+
+// one place to include all Low Precision Transformations from ngraph::pass::low_precision
+#include <low_precision/rt_info/intervals_alignment_attribute.hpp>
+#include <low_precision/rt_info/quantization_alignment_attribute.hpp>
+#include <low_precision/rt_info/precisions_attribute.hpp>
+#include <low_precision/rt_info/precision_preserved_attribute.hpp>
+
+#include <low_precision/markup_precisions.hpp>
+#include <low_precision/markup_avg_pool_precision_preserved.hpp>
+#include <low_precision/propagate_precisions.hpp>
+#include <low_precision/align_quantization_intervals.hpp>
+
+
+#include <ngraph/pass/graph_rewrite.hpp>
+#include <ngraph/pass/manager.hpp>
+#include <ngraph/pass/pass.hpp>
+#include "low_precision/layer_transformation.hpp"
+#include "low_precision/markup_precisions.hpp"
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+class LP_TRANSFORMATIONS_API LowPrecision;
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
+
+class LP_TRANSFORMATIONS_API ngraph::pass::low_precision::LowPrecision : public ngraph::pass::FunctionPass {
+public:
+ class LP_TRANSFORMATIONS_API TypeRelaxedReplacer : public GraphRewrite {
+ public:
+ TypeRelaxedReplacer();
+ };
+
+ NGRAPH_RTTI_DECLARATION;
+ LowPrecision(
+ const std::vector<OperationPrecisionRestriction>& precisionRestrictions = {},
+ const std::vector<OperationPerTensorQuantizationRestriction>& quantizationRestrictions = {},
+ const LayerTransformation::Params = LayerTransformation::Params());
+ bool run_on_function(std::shared_ptr<ngraph::Function> f) override;
+
+ static bool isFunctionQuantized(const std::shared_ptr<const ngraph::Function>& function);
+
+protected:
+ std::vector<OperationPrecisionRestriction> precisionRestrictions;
+ std::vector<OperationPerTensorQuantizationRestriction> quantizationRestrictions;
+ // remove
+ LayerTransformation::Params params;
+};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/lpt_itt.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/lpt_itt.hpp
new file mode 100644
index 00000000000000..3b207c1bf8f0c0
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/lpt_itt.hpp
@@ -0,0 +1,27 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+/**
+ * @brief Defines openvino domains for tracing
+ * @file lpt_itt.hpp
+ */
+
+#pragma once
+
+#include <openvino/itt.hpp>
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+namespace itt {
+namespace domains {
+
+OV_ITT_DOMAIN(LPT);
+OV_ITT_DOMAIN(LPT_LT);
+
+} // namespace domains
+} // namespace itt
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
\ No newline at end of file
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/markup_avg_pool_precision_preserved.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/markup_avg_pool_precision_preserved.hpp
new file mode 100644
index 00000000000000..07ed60e929c11d
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/markup_avg_pool_precision_preserved.hpp
@@ -0,0 +1,29 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include <memory>
+
+#include <ngraph/ngraph.hpp>
+#include <ngraph/pass/pass.hpp>
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+class LP_TRANSFORMATIONS_API MarkupAvgPoolPrecisionPreserved;
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
+
+class ngraph::pass::low_precision::MarkupAvgPoolPrecisionPreserved : public ngraph::pass::FunctionPass {
+public:
+ NGRAPH_RTTI_DECLARATION;
+ bool run_on_function(std::shared_ptr<ngraph::Function> f) override;
+};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/markup_per_tensor_quantization.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/markup_per_tensor_quantization.hpp
new file mode 100644
index 00000000000000..73168b700ef8ba
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/markup_per_tensor_quantization.hpp
@@ -0,0 +1,48 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include <memory>
+#include <string>
+#include <unordered_map>
+#include <vector>
+
+#include <ngraph/ngraph.hpp>
+#include <ngraph/pass/pass.hpp>
+
+#include "common/operation_per_tensor_quantization_restriction.hpp"
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+class LP_TRANSFORMATIONS_API MarkupPerTensorQuantization;
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
+
+class ngraph::pass::low_precision::MarkupPerTensorQuantization : public ngraph::pass::FunctionPass {
+public:
+ class PerTensorQuantization {
+ public:
+ PerTensorQuantization() = default;
+ PerTensorQuantization(const bool versionIsRequired) : versionIsRequired(versionIsRequired) {}
+ void add(const uint64_t version, const std::vector<ngraph::element::Type>& precisions) {
+ precisionsByVersion.emplace(version, precisions);
+ }
+
+ bool versionIsRequired;
+ std::unordered_map<uint64_t, std::vector<ngraph::element::Type>> precisionsByVersion;
+ };
+
+ NGRAPH_RTTI_DECLARATION;
+ MarkupPerTensorQuantization(const std::vector<OperationPerTensorQuantizationRestriction>& restrictions = {});
+ bool run_on_function(std::shared_ptr<ngraph::Function> f) override;
+
+private:
+ std::unordered_map<std::string, PerTensorQuantization> restrictionsByOperation;
+};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/markup_precisions.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/markup_precisions.hpp
new file mode 100644
index 00000000000000..a3c31168cb73b6
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/markup_precisions.hpp
@@ -0,0 +1,52 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include <memory>
+#include <string>
+#include <unordered_map>
+#include <vector>
+
+#include <ngraph/ngraph.hpp>
+#include <ngraph/pass/pass.hpp>
+
+#include "common/operation_precision_restriction.hpp"
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+class LP_TRANSFORMATIONS_API MarkupPrecisions;
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
+
+// Transformation is used to add customization options at runtime
+class ngraph::pass::low_precision::MarkupPrecisions : public ngraph::pass::FunctionPass {
+public:
+ class Restriction {
+ public:
+ Restriction() = default;
+ Restriction(const bool versionIsRequired) : versionIsRequired(versionIsRequired) {}
+ void add(const uint64_t version, const std::vector<std::pair<size_t, std::vector<ngraph::element::Type>>>& precisions) {
+ precisionsByVersion.emplace(version, precisions);
+ }
+
+ bool versionIsRequired;
+ std::unordered_map<uint64_t, std::vector<std::pair<size_t, std::vector<ngraph::element::Type>>>> precisionsByVersion;
+ };
+
+ NGRAPH_RTTI_DECLARATION;
+ MarkupPrecisions(const std::vector<OperationPrecisionRestriction>& restrictions = {});
+ bool run_on_function(std::shared_ptr<ngraph::Function> f) override;
+
+private:
+ static bool isPrecisionPreserved(const std::shared_ptr<ngraph::Node>& node);
+ static bool isQuantized(const std::shared_ptr<ngraph::Node>& node);
+
+ std::unordered_map<std::string, Restriction> restrictionsByOperation;
+};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/mat_mul.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/mat_mul.hpp
index dc41cdcfdc2b2b..a294d348518ce0 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/mat_mul.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/mat_mul.hpp
@@ -13,10 +13,9 @@ namespace low_precision {
class LP_TRANSFORMATIONS_API MatMulTransformation : public LayerTransformation {
public:
- MatMulTransformation(const Params& params) : LayerTransformation(params) {}
- ~MatMulTransformation() override {}
+ NGRAPH_RTTI_DECLARATION;
+ MatMulTransformation(const Params& params = Params());
bool transform(TransformationContext &context, ngraph::pattern::Matcher &m) const override;
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
bool isPrecisionPreserved(std::shared_ptr layer) const noexcept override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr layer) const override;
};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/max_pool.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/max_pool.hpp
index 96a0a6c3e7278d..40d01d1997f330 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/max_pool.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/max_pool.hpp
@@ -14,8 +14,8 @@ namespace low_precision {
class LP_TRANSFORMATIONS_API MaxPoolTransformation : public LayerTransformation {
public:
- MaxPoolTransformation(const Params& params);
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ MaxPoolTransformation(const Params& params = Params());
bool canBeTransformed(const TransformationContext& context, std::shared_ptr op) const override;
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool isPrecisionPreserved(std::shared_ptr layer) const noexcept override;
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/multiply.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/multiply.hpp
index 23dfeb482e392b..157d4251e83a90 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/multiply.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/multiply.hpp
@@ -13,9 +13,8 @@ namespace low_precision {
class LP_TRANSFORMATIONS_API MultiplyTransformation : public EltwiseBaseTransformation {
public:
- MultiplyTransformation(const Params& params) : EltwiseBaseTransformation(params) {}
- ~MultiplyTransformation() override {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ MultiplyTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/multiply_to_group_convolution.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/multiply_to_group_convolution.hpp
index 2e262e7bed5613..442deef0d5e958 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/multiply_to_group_convolution.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/multiply_to_group_convolution.hpp
@@ -7,6 +7,7 @@
#include
#include
#include "low_precision/layer_transformation.hpp"
+#include "common/operation_precision_restriction.hpp"
namespace ngraph {
namespace pass {
@@ -14,9 +15,11 @@ namespace low_precision {
class LP_TRANSFORMATIONS_API MultiplyToGroupConvolutionTransformation : public LayerTransformation {
public:
- MultiplyToGroupConvolutionTransformation(const Params& params) : LayerTransformation(params), groupSize(1ul) {}
+ NGRAPH_RTTI_DECLARATION;
+ MultiplyToGroupConvolutionTransformation(
+ const Params& params = Params(),
+ const OperationPrecisionRestriction::PrecisionsByPort& restrictions = {});
~MultiplyToGroupConvolutionTransformation() override {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr layer) const override;
bool isPrecisionPreserved(std::shared_ptr layer) const noexcept override;
@@ -25,6 +28,7 @@ class LP_TRANSFORMATIONS_API MultiplyToGroupConvolutionTransformation : public L
void setGroupSize(const size_t groupSize);
size_t getGroupSize() const;
private:
+ OperationPrecisionRestriction::PrecisionsByPort restrictions;
size_t groupSize;
};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/mvn.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/mvn.hpp
index b5dcedd2e5e2bd..dc93cdf61f2cdb 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/mvn.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/mvn.hpp
@@ -12,8 +12,8 @@ namespace low_precision {
class LP_TRANSFORMATIONS_API MVNTransformation : public LayerTransformation {
public:
- MVNTransformation(const Params& params) : LayerTransformation(params) {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ MVNTransformation(const Params& params = Params());
bool transform(TransformationContext &context, ngraph::pattern::Matcher &m) const override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr layer) const override;
bool isPrecisionPreserved(std::shared_ptr layer) const noexcept override;
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/network_helper.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/network_helper.hpp
index 8a6744007511f6..ddf4e317526dcd 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/network_helper.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/network_helper.hpp
@@ -16,6 +16,9 @@
#include "ngraph_ops/type_relaxed.hpp"
#include
+#include "rt_info/shared_value_attribute.hpp"
+#include "rt_info/precisions_attribute.hpp"
+#include "rt_info/per_tensor_quantization_attribute.hpp"
#include "transformation_context.hpp"
#include "quantization_details.hpp"
#include "transformations/utils/utils.hpp"
@@ -76,6 +79,10 @@ class LP_TRANSFORMATIONS_API NetworkHelper {
static std::shared_ptr swapMultiplyAndAdd(std::shared_ptr addAfterMultiply, const int multiplyBranch);
+ static void copyInfo(const std::vector<std::shared_ptr<ngraph::Node>>& sources, const std::vector<std::shared_ptr<ngraph::Node>>& targets);
+
+ static void copyInfo(const std::vector<std::shared_ptr<ngraph::Node>>& sources, const std::shared_ptr<ngraph::Node>& target);
+
static void copyInfo(const std::shared_ptr<ngraph::Node>& source, const std::shared_ptr<ngraph::Node>& target);
static void cleanRunTimeInfo(const std::shared_ptr<ngraph::Node>& layer);
@@ -116,7 +123,8 @@ class LP_TRANSFORMATIONS_API NetworkHelper {
std::shared_ptr<opset1::FakeQuantize> fq,
element::Type precision,
float min,
- float max);
+ float max,
+ const bool replace = true);
static FakeQuantizeDequantization makeDequantization(
const float dequantizationMul,
@@ -124,7 +132,8 @@ class LP_TRANSFORMATIONS_API NetworkHelper {
const ngraph::element::Type originalPrecision,
const ngraph::Shape dataNodeOutputShape,
element::Type precision,
- const element::Type deqPrecision = element::f32);
+ const element::Type deqPrecision = element::f32,
+ std::shared_ptr<ngraph::Node> input = nullptr);
static FakeQuantizeDequantization createDequantizationFromFakeQuantize(
std::shared_ptr<opset1::FakeQuantize> fq,
@@ -196,6 +205,105 @@ class LP_TRANSFORMATIONS_API NetworkHelper {
const std::vector& v1,
const std::vector& v2) noexcept;
+ static bool isPrecisionPreserved(const std::shared_ptr<ngraph::Node>& node);
+
+ static void replaceAttributeInNodes(
+ std::shared_ptr<ngraph::Function> f,
+ const std::string& name,
+ const std::shared_ptr<ngraph::Variant> newAttribute,
+ const std::shared_ptr<ngraph::Variant> oldAttribute,
+ const std::shared_ptr<ngraph::Node>& initialNode) {
+ std::set<std::shared_ptr<ngraph::Node>> visited;
+ std::deque<std::shared_ptr<ngraph::Node>> nodes;
+ nodes.emplace_back(initialNode);
+
+ while (!nodes.empty()) {
+ auto node = nodes.front();
+ nodes.pop_front();
+
+ if (visited.count(node) || is_type<opset1::Constant>(node)) {
+ continue;
+ }
+
+ visited.insert(node);
+
+ bool handleConnectedNodes = false;
+ if (NetworkHelper::isPrecisionPreserved(node) || is_type<opset1::FakeQuantize>(node)) {
+ auto& rt = node->get_rt_info();
+
+ if (node == initialNode) {
+ rt[name] = newAttribute;
+ handleConnectedNodes = true;
+ } else {
+ auto it = rt.find(name);
+ if (it != rt.end()) {
+ const auto currentAttribute = it->second;
+ if (oldAttribute.get() == currentAttribute.get()) {
+ rt[name] = newAttribute;
+ }
+ handleConnectedNodes = true;
+ }
+ }
+ }
+
+ if (!handleConnectedNodes) {
+ continue;
+ }
+
+ if (!is_type<opset1::FakeQuantize>(node)) {
+ for (size_t index = 0ul; index < node->get_input_size(); ++index) {
+ auto getInput = [](const std::shared_ptr<ngraph::Node>& node, const size_t index) {
+ const auto dequantization = NetworkHelper::getDequantization(node, index);
+ if (!dequantization.empty() &&
+ (is_type