diff --git a/docs/low_precision_transformations/quantization/pipelines/fake_quantize_decomposition.md b/docs/low_precision_transformations/quantization/pipelines/fake_quantize_decomposition.md
new file mode 100644
index 00000000000000..fd10859c20abe3
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/fake_quantize_decomposition.md
@@ -0,0 +1,151 @@
+# OpenVINO™ Low Precision Transformations: FakeQuantizeDecompositionTransformation pipelines
+## Table of Contents
+1. [Introduction](#introduction)
+2. [Pipeline #1: FakeQuantize decomposition](#pipeline-1-fakequantize-decomposition)
+3. [Pipeline #2: Concat per-tensor quantization](#pipeline-2-concat-per-tensor-quantization)
+4. [Pipeline #3: Concat multi-channels quantization](#pipeline-3-concat-multi-channels-quantization)
+5. [Pipeline #4: FakeQuantize connects neighbor cascade Concat operations](#pipeline-4-fakequantize-connects-neighbor-cascade-concat-operations)
+6. [Pipeline #5: AvgPool precision propagation](#pipeline-5-avgpool-precision-propagation)
+
+## Introduction
+`FakeQuantizeDecompositionTransformation` decomposes a `FakeQuantize` operation into a quantize operation (`FakeQuantize` with low precision output) and dequantization operations (`Convert`, `Subtract` and `Multiply`), as sketched below. The resulting `FakeQuantize` output precision depends on:
+1. The input precisions supported by the next operation. The customizable `precisionsOnActivations` parameter identifies the supported input precisions.
+2. The operation output intervals.
+
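+The following is a minimal sketch of the decomposed subgraph built with plain `opset1` operations. The interval, scale and zero-point values are illustrative only; in the real transformation the quantize `FakeQuantize` output type is relaxed to a low precision type (for example `u8`), while here all element types stay `f32` so the example builds with the public opset.
+
+```cpp
+#include <memory>
+#include <ngraph/ngraph.hpp>
+#include <ngraph/opsets/opset1.hpp>
+
+using namespace ngraph;
+
+// Decomposed form of a per-tensor FakeQuantize: quantize, then Convert -> Subtract -> Multiply.
+std::shared_ptr<Function> buildDecomposedExample() {
+    auto input = std::make_shared<opset1::Parameter>(element::f32, Shape{1, 3, 16, 16});
+
+    // Quantize part: FakeQuantize whose output interval is the integer range [0, 255].
+    auto inLow   = opset1::Constant::create(element::f32, Shape{}, {0.f});
+    auto inHigh  = opset1::Constant::create(element::f32, Shape{}, {2.55f});
+    auto outLow  = opset1::Constant::create(element::f32, Shape{}, {0.f});
+    auto outHigh = opset1::Constant::create(element::f32, Shape{}, {255.f});
+    auto quantize = std::make_shared<opset1::FakeQuantize>(input, inLow, inHigh, outLow, outHigh, 256);
+
+    // Dequantization part: Convert to the inference precision, subtract the zero point, apply the scale.
+    auto convert   = std::make_shared<opset1::Convert>(quantize, element::f32);
+    auto zeroPoint = opset1::Constant::create(element::f32, Shape{}, {0.f});
+    auto subtract  = std::make_shared<opset1::Subtract>(convert, zeroPoint);
+    auto scale     = opset1::Constant::create(element::f32, Shape{}, {0.01f});
+    auto multiply  = std::make_shared<opset1::Multiply>(subtract, scale);
+
+    return std::make_shared<Function>(NodeVector{multiply}, ParameterVector{input});
+}
+```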
+## Pipeline #1: FakeQuantize decomposition
+[NOT UPDATED]
+Features:
+1. The output intervals of the `FakeQuantize` on activations are signed, so the default precision would be `signed int8`, which is not supported by the `Convolution` below (see the restriction sketch after this list).
+2. Quantize and dequantize operations on activations are represented by one `FakeQuantize` operation.
+3. Quantize and dequantize operations on weights are represented by one `FakeQuantize` operation.
+4. There is no `FakeQuantize` between `AvgPool` and `Convolution`.
+5. `Convolution` weights are quantized.
+
+> TODO: if `Convolution` is not quantized then [input [port]] requirements are not set. <= WIP
+> TODO: if an operation is not precision preserved then the `PRECISION_PRESERVED` attribute can be skipped. <= WIP: right now it is created everywhere
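+
+A minimal sketch of how a plugin can declare the input precisions it supports per port, following the `OperationPrecisionRestriction` helper introduced in this patch. The operation types and precision sets below (restricting `Convolution` activations to `u8`/`i8` and weights to `i8`) are illustrative assumptions, not the exact plugin configuration:
+
+```cpp
+#include <vector>
+#include <ngraph/ngraph.hpp>
+#include <ngraph/opsets/opset1.hpp>
+#include <ngraph/pass/manager.hpp>
+#include <low_precision/common/operation_precision_restriction.hpp>
+#include <low_precision/low_precision.hpp>
+
+using namespace ngraph::pass::low_precision;
+
+// Assumed usage: port 0 (activations) and port 1 (weights) of Convolution get explicit precision lists.
+void registerLowPrecision(ngraph::pass::Manager& manager) {
+    const std::vector<OperationPrecisionRestriction> supportedPrecisionsOnActivation = {
+        OperationPrecisionRestriction::create<ngraph::opset1::Convolution>({
+            {0, {ngraph::element::u8, ngraph::element::i8}},
+            {1, {ngraph::element::i8}},
+        })
+    };
+
+    // The LowPrecision function pass consumes the restrictions and runs the markup and main transformations.
+    manager.register_pass<LowPrecision>(supportedPrecisionsOnActivation);
+}
+```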
+
+### Original model
+![Original model](img/pipeline1/actual.svg)
+
+### Markup precisions
+![Markup precisions result](img/pipeline1/step1_markup_precisions.svg)
+
+### Markup AvgPool precisions (CPU/GPU specific)
+![Markup AvgPool precisions (CPU/GPU specific) result](img/pipeline1/step2_markup_avg_pool_precisions.svg)
+
+### Propagate precisions
+![Propagate precisions result](img/pipeline1/step3_propagate_precisions.svg)
+
+### Transformations
+![Transformations result](img/pipeline1/transformed.svg)
+
+## Pipeline #2: Concat per-tensor quantization
+[NOT UPDATED]
+Features:
+1. The output intervals of the `FakeQuantize` on activations operations are signed, so the default precision would be `signed int8`, which is not supported by the `Convolution` below.
+2. The `FakeQuantize` on activations operations have different output intervals, which will be aligned (see the numeric sketch after this list).
+3. Quantize and dequantize operations on activations are represented by one `FakeQuantize` operation.
+4. Quantize and dequantize operations on weights are represented by one `FakeQuantize` operation.
+5. There is no `FakeQuantize` between `AvgPool` and `Convolution`.
+6. `Convolution` weights are quantized.
+
+> TODO: the `Convolution` operation defines `ConcatTransformation` behavior for each plugin, and the behavior is not configurable.
+
+> TODO: if `Convolution` is not quantized then `FakeQuantize` operations are not aligned <= WIP: the `MarkupPrecisions` transformation checks whether each operation is quantized and adds empty [input [port]] requirements if the operation is not quantized.
+> TODO: if `ConvolutionTransformation` is skipped ([input [port]] requirements are empty) then `FakeQuantize` operations are not aligned <= WIP
+> TODO: if the `Convolution` operation doesn't exist then `FakeQuantize` operations are not aligned <= WIP
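+
+A hedged numeric sketch of the alignment in feature 2, assuming the aligned interval simply covers both input intervals so that the whole `Concat` output can be dequantized with a single per-tensor scale:
+
+```cpp
+#include <algorithm>
+#include <cstdio>
+
+int main() {
+    // Illustrative output intervals of the two FakeQuantize operations before Concat.
+    const float low1 = 0.f, high1 = 2.55f;
+    const float low2 = 0.f, high2 = 5.1f;
+
+    // Common (aligned) interval covering both branches.
+    const float lowAligned  = std::min(low1, low2);
+    const float highAligned = std::max(high1, high2);
+
+    // Per-tensor u8 quantization over the aligned interval: one scale for the whole Concat output.
+    const float scale = (highAligned - lowAligned) / 255.f;
+    std::printf("aligned interval: [%g, %g], dequantization scale: %g\n", lowAligned, highAligned, scale);
+    return 0;
+}
+```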
+
+### Original model
+![Original model](img/pipeline2/actual.svg)
+
+### Markup precisions
+![Markup precisions result](img/pipeline2/step1_markup_precisions.svg)
+
+### Markup AvgPool precisions (CPU/GPU specific)
+![Markup AvgPool precisions (CPU/GPU specific) result](img/pipeline2/step2_markup_avg_pool_precisions.svg)
+
+### Propagate precisions
+![Propagate precisions result](img/pipeline2/step3_propagate_precisions.svg)
+
+### Align concatenation quantization
+![Align concatenation quantization result](img/pipeline2/step4_align_concat_quantization.svg)
+
+### Transformations
+![Transformations result](img/pipeline2/transformed.svg)
+
+## Pipeline #3: Concat multi-channels quantization
+[NOT UPDATED]
+Features:
+1. Quantize and dequantize operations on activations are represented by one `FakeQuantize` operation.
+2. There is no `FakeQuantize` between `AvgPool` and `Result` (a multi-channel dequantization sketch follows below).
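+
+A minimal sketch, assuming multi-channel `Concat` quantization keeps one dequantization scale per channel after `Concat`, so the two branches do not need a shared interval. The shapes and scale values are illustrative only:
+
+```cpp
+#include <memory>
+#include <ngraph/ngraph.hpp>
+#include <ngraph/opsets/opset1.hpp>
+
+using namespace ngraph;
+
+// Per-channel dequantization after Concat: each branch keeps its own scale.
+std::shared_ptr<Function> buildMultiChannelDequantizationExample() {
+    auto branch1 = std::make_shared<opset1::Parameter>(element::f32, Shape{1, 3, 8, 8});
+    auto branch2 = std::make_shared<opset1::Parameter>(element::f32, Shape{1, 3, 8, 8});
+    auto concat = std::make_shared<opset1::Concat>(OutputVector{branch1, branch2}, 1);
+
+    // The first three channels keep branch 1's scale, the remaining three keep branch 2's scale.
+    auto scales = opset1::Constant::create(
+        element::f32, Shape{1, 6, 1, 1},
+        {0.01f, 0.01f, 0.01f, 0.02f, 0.02f, 0.02f});
+    auto multiply = std::make_shared<opset1::Multiply>(concat, scales);
+
+    return std::make_shared<Function>(NodeVector{multiply}, ParameterVector{branch1, branch2});
+}
+```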
+
+### Original model
+![Original model](img/pipeline3/actual.svg)
+
+### Markup precisions
+![Markup precisions result](img/pipeline3/step1_markup_precisions.svg)
+
+### Markup AvgPool precisions (CPU/GPU specific)
+![Markup AvgPool precisions (CPU/GPU specific) result](img/pipeline3/step2_markup_avg_pool_precisions.svg)
+
+### Propagate precisions
+![Propagate precisions result](img/pipeline3/step3_propagate_precisions.svg)
+
+### Align concatenation quantization
+![Align concatenation quantization result](img/pipeline3/step4_align_concat_quantization.svg)
+
+### Transformations
+![Transformations result](img/pipeline3/transformed.svg)
+
+## Pipeline #4: FakeQuantize connects neighbor cascade Concat operations
+Features:
+1. Quantize and dequantize operations on activations are represented by one `FakeQuantize` operation.
+2. There is a `FakeQuantize` between two `Concat` subgraphs: the first subgraph uses multi-channel quantization, the second uses per-tensor quantization.
+
+> Source: `ConcatWithNeighborsWithConvolutionTransformation` functional test.
+
+### Original model
+![Original model](img/pipeline4/actual.svg)
+
+### Markup precisions
+![Markup precisions result](img/pipeline4/step1_markup_precisions.svg)
+
+### Markup AvgPool precisions (CPU/GPU specific)
+![Markup AvgPool precisions (CPU/GPU specific) result](img/pipeline4/step2_markup_avg_pool_precisions.svg)
+
+### Propagate precisions
+![Propagate precisions result](img/pipeline4/step3_propagate_precisions.svg)
+
+### Align concatenation intervals
+![Align concatenation intervals result](img/pipeline4/step4_align_concat_intervals.svg)
+
+### Align concatenation quantization
+![Align concatenation quantization result](img/pipeline4/step5_align_concat_quantization.svg)
+
+### Transformations
+![Transformations result](img/pipeline4/transformed.svg)
+
+## Pipeline #5: AvgPool precision propagation
+
+Features:
+1. There is a `FakeQuantize` after `AvgPool`.
+
+> Source: `MarkupAvgPoolPrecisionsTransformation` functional test.
+
+### Original model
+![Original model](img/pipeline5/actual.svg)
+
+### Markup precisions
+![Markup precisions result](img/pipeline5/step1_markup_precisions.svg)
+
+### Markup AvgPool precisions (CPU/GPU specific)
+![Markup AvgPool precisions (CPU/GPU specific) result](img/pipeline5/step2_markup_avg_pool_precisions.svg)
+
+### Propagate precisions
+![Propagate precisions result](img/pipeline5/step3_propagate_precisions.svg)
+
+### Align concatenation quantization
+![Align concatenation quantization result](img/pipeline5/step4_align_concat_quantization.svg)
+
+### Transformations
+![Transformations result](img/pipeline5/transformed.svg)
\ No newline at end of file
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/actual.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/actual.svg
new file mode 100644
index 00000000000000..f3b33203eb1fab
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/actual.svg
@@ -0,0 +1,277 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/step1_markup_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/step1_markup_precisions.svg
new file mode 100644
index 00000000000000..8dc984562eab82
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/step1_markup_precisions.svg
@@ -0,0 +1,283 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/step2_markup_avg_pool_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/step2_markup_avg_pool_precisions.svg
new file mode 100644
index 00000000000000..a4aa50854eed7a
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/step2_markup_avg_pool_precisions.svg
@@ -0,0 +1,283 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/step3_propagate_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/step3_propagate_precisions.svg
new file mode 100644
index 00000000000000..2a7e6bc8b7661e
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/step3_propagate_precisions.svg
@@ -0,0 +1,283 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/transformed.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/transformed.svg
new file mode 100644
index 00000000000000..116422af1c3623
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline1/transformed.svg
@@ -0,0 +1,266 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/actual.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/actual.svg
new file mode 100644
index 00000000000000..b59be6f9db9103
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/actual.svg
@@ -0,0 +1,433 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step1_markup_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step1_markup_precisions.svg
new file mode 100644
index 00000000000000..39726b470b6f5d
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step1_markup_precisions.svg
@@ -0,0 +1,436 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step2_markup_avg_pool_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step2_markup_avg_pool_precisions.svg
new file mode 100644
index 00000000000000..e14309c2223cdb
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step2_markup_avg_pool_precisions.svg
@@ -0,0 +1,438 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step3_propagate_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step3_propagate_precisions.svg
new file mode 100644
index 00000000000000..4c7e2aa42dbfa3
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step3_propagate_precisions.svg
@@ -0,0 +1,438 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step4_align_concat_quantization.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step4_align_concat_quantization.svg
new file mode 100644
index 00000000000000..2251aed4549453
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/step4_align_concat_quantization.svg
@@ -0,0 +1,446 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/transformed.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/transformed.svg
new file mode 100644
index 00000000000000..5a1c6f9410e202
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline2/transformed.svg
@@ -0,0 +1,428 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/actual.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/actual.svg
new file mode 100644
index 00000000000000..1e41e6b4bfec31
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/actual.svg
@@ -0,0 +1,308 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step1_markup_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step1_markup_precisions.svg
new file mode 100644
index 00000000000000..e040f632bdb24c
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step1_markup_precisions.svg
@@ -0,0 +1,311 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step2_markup_avg_pool_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step2_markup_avg_pool_precisions.svg
new file mode 100644
index 00000000000000..e9eb45b0573758
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step2_markup_avg_pool_precisions.svg
@@ -0,0 +1,313 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step3_propagate_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step3_propagate_precisions.svg
new file mode 100644
index 00000000000000..e9eb45b0573758
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step3_propagate_precisions.svg
@@ -0,0 +1,313 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step4_align_concat_quantization.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step4_align_concat_quantization.svg
new file mode 100644
index 00000000000000..74bd0d8bdef982
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/step4_align_concat_quantization.svg
@@ -0,0 +1,320 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/transformed.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/transformed.svg
new file mode 100644
index 00000000000000..7a134cb8347eb0
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline3/transformed.svg
@@ -0,0 +1,374 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/actual.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/actual.svg
new file mode 100644
index 00000000000000..0c17e5a9f17f06
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/actual.svg
@@ -0,0 +1,480 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step1_markup_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step1_markup_precisions.svg
new file mode 100644
index 00000000000000..34dee1151691f9
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step1_markup_precisions.svg
@@ -0,0 +1,483 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step2_markup_avg_pool_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step2_markup_avg_pool_precisions.svg
new file mode 100644
index 00000000000000..34dee1151691f9
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step2_markup_avg_pool_precisions.svg
@@ -0,0 +1,483 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step3_propagate_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step3_propagate_precisions.svg
new file mode 100644
index 00000000000000..a0c2c786afbc3f
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step3_propagate_precisions.svg
@@ -0,0 +1,486 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step4_align_concat_intervals.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step4_align_concat_intervals.svg
new file mode 100644
index 00000000000000..986e7a58fb21aa
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step4_align_concat_intervals.svg
@@ -0,0 +1,493 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step5_align_concat_quantization.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step5_align_concat_quantization.svg
new file mode 100644
index 00000000000000..6444edbc0be3c2
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/step5_align_concat_quantization.svg
@@ -0,0 +1,496 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/transformed.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/transformed.svg
new file mode 100644
index 00000000000000..29cda07e2bddef
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline4/transformed.svg
@@ -0,0 +1,574 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/actual.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/actual.svg
new file mode 100644
index 00000000000000..37ea9e70967c45
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/actual.svg
@@ -0,0 +1,391 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step1_markup_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step1_markup_precisions.svg
new file mode 100644
index 00000000000000..e439081e90242e
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step1_markup_precisions.svg
@@ -0,0 +1,395 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step2_markup_avg_pool_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step2_markup_avg_pool_precisions.svg
new file mode 100644
index 00000000000000..4a7df2b0890243
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step2_markup_avg_pool_precisions.svg
@@ -0,0 +1,400 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step3_propagate_precisions.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step3_propagate_precisions.svg
new file mode 100644
index 00000000000000..112c0d17f5fe91
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step3_propagate_precisions.svg
@@ -0,0 +1,400 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step4_align_concat_quantization.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step4_align_concat_quantization.svg
new file mode 100644
index 00000000000000..95fde030f25459
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/step4_align_concat_quantization.svg
@@ -0,0 +1,413 @@
+
+
+
+
+
diff --git a/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/transformed.svg b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/transformed.svg
new file mode 100644
index 00000000000000..a4482fa54523f8
--- /dev/null
+++ b/docs/low_precision_transformations/quantization/pipelines/img/pipeline5/transformed.svg
@@ -0,0 +1,397 @@
+
+
+
+
+
diff --git a/inference-engine/src/cldnn_engine/cldnn_engine.cpp b/inference-engine/src/cldnn_engine/cldnn_engine.cpp
index 0bea81efacea19..717c24b01901f0 100644
--- a/inference-engine/src/cldnn_engine/cldnn_engine.cpp
+++ b/inference-engine/src/cldnn_engine/cldnn_engine.cpp
@@ -69,7 +69,7 @@
#include
#include
#include
-#include
+#include
#include
#include
#include
@@ -146,7 +146,7 @@ InferenceEngine::CNNNetwork clDNNEngine::CloneAndTransformNetwork(const Inferenc
bool enableInt8;
{
ngraph::pass::Manager manager;
- enableInt8 = config.enableInt8 && ngraph::pass::low_precision::LowPrecisionTransformer::isFunctionQuantized(nGraphFunc);
+ enableInt8 = config.enableInt8 && ngraph::pass::low_precision::LowPrecision::isFunctionQuantized(nGraphFunc);
if (enableInt8) {
manager.register_pass(
std::vector{ ngraph::element::i8, ngraph::element::u8, ngraph::element::i4, ngraph::element::u4 });
@@ -366,25 +366,22 @@ InferenceEngine::CNNNetwork clDNNEngine::CloneAndTransformNetwork(const Inferenc
if (!config.enable_fp16_for_quantized_models) {
manager.register_pass(precisions_array {{ ngraph::element::f16, ngraph::element::f32 }});
}
- auto lptPrerequisites = manager.register_pass();
- const std::vector supportedTypes = { ngraph::element::i8, ngraph::element::u8 };
- lptPrerequisites->add_matcher(supportedTypes);
- lptPrerequisites->add_matcher(supportedTypes);
- lptPrerequisites->add_matcher();
- manager.run_passes(nGraphFunc);
- auto params = LayerTransformation::Params(true, // updatePrecisions
- LayerTransformation::QuantizedTensorAlignment::UpdateLevel, // quantizedTensorAlignmentOnActivations
- LayerTransformation::QuantizedTensorAlignment::None, // quantizedTensorAlignmentOnWeights
- true); // supportAsymmetricQuantization
- LowPrecisionTransformer transformer(LowPrecisionTransformer::getAllTransformations(params)
- .add(LayerTransformation::Params(params)
- .setSupportAsymmetricQuantization(false)
- .setSupport3DTensorOnActivations(false))
- // INT8 StridedSlice not supported
- .remove());
-
- transformer.transform(nGraphFunc);
+ auto supportedPrecisionsOnActivation = std::vector({
+ OperationPrecisionRestriction::create({
+ {0, {ngraph::element::u8, ngraph::element::i8}},
+ {1, {ngraph::element::i8}},
+ }),
+ OperationPrecisionRestriction::create({
+ {0, {ngraph::element::u8, ngraph::element::i8}},
+ {1, {ngraph::element::i8}}
+ }),
+ OperationPrecisionRestriction::create({})
+ });
+
+ manager.register_pass(supportedPrecisionsOnActivation);
+ auto lpt_pass_config = manager.get_pass_config();
+ lpt_pass_config->disable();
}
{
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/add.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/add.hpp
index fa64037797a384..94568ca236ffec 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/add.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/add.hpp
@@ -13,9 +13,8 @@ namespace low_precision {
class TRANSFORMATIONS_API AddTransformation : public EltwiseBaseTransformation {
public:
- AddTransformation(const Params& params) : EltwiseBaseTransformation(params) {}
- ~AddTransformation() override {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ AddTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr layer) const override;
};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/align_quantization_intervals.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/align_quantization_intervals.hpp
new file mode 100644
index 00000000000000..657e2d9482a20c
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/align_quantization_intervals.hpp
@@ -0,0 +1,34 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include
+
+#include
+#include
+
+#include
+#include
+#include
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+class TRANSFORMATIONS_API AlignQuantizationIntervals;
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
+
+class ngraph::pass::low_precision::AlignQuantizationIntervals : public ngraph::pass::FunctionPass {
+public:
+ NGRAPH_RTTI_DECLARATION;
+ AlignQuantizationIntervals(LayerTransformation::Params params = LayerTransformation::Params());
+ bool run_on_function(std::shared_ptr f) override;
+
+protected:
+ LayerTransformation::Params params;
+};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/align_quantization_parameters.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/align_quantization_parameters.hpp
new file mode 100644
index 00000000000000..75ac78de6a142e
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/align_quantization_parameters.hpp
@@ -0,0 +1,34 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include
+
+#include
+#include
+
+#include
+#include
+#include
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+class TRANSFORMATIONS_API AlignQuantizationParameters;
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
+
+class ngraph::pass::low_precision::AlignQuantizationParameters : public ngraph::pass::FunctionPass {
+public:
+ NGRAPH_RTTI_DECLARATION;
+ AlignQuantizationParameters(LayerTransformation::Params params = LayerTransformation::Params());
+ bool run_on_function(std::shared_ptr f) override;
+
+protected:
+ LayerTransformation::Params params;
+};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/avg_pool.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/avg_pool.hpp
index 823c8990110904..26e49f954777c0 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/avg_pool.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/avg_pool.hpp
@@ -13,8 +13,8 @@ namespace low_precision {
class TRANSFORMATIONS_API AvgPoolTransformation : public LayerTransformation {
public:
- AvgPoolTransformation(const Params& params);
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ AvgPoolTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool isPrecisionPreserved(std::shared_ptr layer) const noexcept override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr layer) const override;
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/base_matcher_pass.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/base_matcher_pass.hpp
new file mode 100644
index 00000000000000..4463c791537139
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/base_matcher_pass.hpp
@@ -0,0 +1,24 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+#include
+#include
+#include "rt_info/attribute_parameters.hpp"
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+class TRANSFORMATIONS_API BaseMatcherPass;
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
+
+class TRANSFORMATIONS_API ngraph::pass::low_precision::BaseMatcherPass : public ngraph::pass::MatcherPass {
+public:
+ BaseMatcherPass(const AttributeParameters& params = AttributeParameters());
+ AttributeParameters params;
+};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/clamp.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/clamp.hpp
index 7698cf5b6da3ca..89b9f1d2181f02 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/clamp.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/clamp.hpp
@@ -14,8 +14,8 @@ namespace low_precision {
class TRANSFORMATIONS_API ClampTransformation : public LayerTransformation {
public:
- ClampTransformation(const Params& params);
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ ClampTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher& m) const override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr op) const override;
bool isPrecisionPreserved(std::shared_ptr layer) const noexcept override;
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/common/fake_quantize_dequantization.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/common/fake_quantize_dequantization.hpp
index 67c522bb7e3fcf..62b7d28d802ddf 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/common/fake_quantize_dequantization.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/common/fake_quantize_dequantization.hpp
@@ -8,6 +8,7 @@
#include
#include
#include
+#include
namespace ngraph {
namespace pass {
@@ -15,7 +16,7 @@ namespace low_precision {
typedef std::tuple, std::shared_ptr> FakeQuantizeDequantizationValues;
-class FakeQuantizeDequantization {
+class TRANSFORMATIONS_API FakeQuantizeDequantization {
public:
FakeQuantizeDequantization();
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/common/operation_per_tensor_quantization_restriction.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/common/operation_per_tensor_quantization_restriction.hpp
new file mode 100644
index 00000000000000..cbf316ab082085
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/common/operation_per_tensor_quantization_restriction.hpp
@@ -0,0 +1,56 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include
+
+#include
+#include
+
+#include
+#include
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+class OperationPerTensorQuantizationRestriction {
+public:
+ using RestrictedPorts = std::vector;
+
+ ngraph::Node::type_info_t operationType;
+ bool specifyVersion;
+ std::vector restrictedPorts;
+
+ OperationPerTensorQuantizationRestriction() = default;
+ OperationPerTensorQuantizationRestriction(
+ const ngraph::Node::type_info_t operationType,
+ const bool specifyVersion,
+ const RestrictedPorts& restrictedPorts) :
+ operationType(operationType),
+ specifyVersion(specifyVersion),
+ restrictedPorts(restrictedPorts) {}
+
+ template
+ static OperationPerTensorQuantizationRestriction create(
+ const RestrictedPorts& restrictedPorts = {},
+ const bool specifyVersion = false) {
+ return OperationPerTensorQuantizationRestriction(T::get_type_info_static(), specifyVersion, restrictedPorts);
+ }
+
+ template
+ static RestrictedPorts getPrecisionsByOperationType(std::vector& restrictions) {
+ for (const auto& restriction : restrictions) {
+ if (restriction.operationType == T::get_type_info_static()) {
+ return restriction.restrictedPorts;
+ }
+ }
+ return {};
+ }
+};
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/common/operation_precision_restriction.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/common/operation_precision_restriction.hpp
new file mode 100644
index 00000000000000..870577a2775b20
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/common/operation_precision_restriction.hpp
@@ -0,0 +1,59 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include
+#include
+#include
+#include
+
+#include
+#include
+
+#include
+#include
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+class OperationPrecisionRestriction {
+public:
+ using PrecisionsByPort = std::vector>>;
+
+ ngraph::Node::type_info_t operationType;
+ bool specifyVersion;
+ std::vector>> precisionsByPort;
+
+ OperationPrecisionRestriction() = default;
+ OperationPrecisionRestriction(
+ const ngraph::Node::type_info_t operationType,
+ const bool specifyVersion,
+ const PrecisionsByPort& precisionsByPort) :
+ operationType(operationType),
+ specifyVersion(specifyVersion),
+ precisionsByPort(precisionsByPort) {}
+
+ template
+ static OperationPrecisionRestriction create(
+ const PrecisionsByPort& precisionsByPort,
+ const bool specifyVersion = false) {
+ return OperationPrecisionRestriction(T::get_type_info_static(), specifyVersion, precisionsByPort);
+ }
+
+ template
+ static PrecisionsByPort getPrecisionsByOperationType(std::vector& restrictions) {
+ for (const auto& restriction : restrictions) {
+ if (restriction.operationType == T::get_type_info_static()) {
+ return restriction.precisionsByPort;
+ }
+ }
+ return {};
+ }
+};
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/concat.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/concat.hpp
index e381fd5d0a0401..ae29282b5aecd9 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/concat.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/concat.hpp
@@ -22,9 +22,8 @@ namespace low_precision {
class TRANSFORMATIONS_API ConcatTransformation : public LayerTransformation {
public:
- ConcatTransformation(const Params& params) : LayerTransformation(params) {}
- ~ConcatTransformation() override {};
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ ConcatTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool isPrecisionPreserved(std::shared_ptr layer) const noexcept override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr layer) const override;
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/concat_multi_channels.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/concat_multi_channels.hpp
deleted file mode 100644
index 48c0a0ef9eaa5f..00000000000000
--- a/inference-engine/src/low_precision_transformations/include/low_precision/concat_multi_channels.hpp
+++ /dev/null
@@ -1,51 +0,0 @@
-// Copyright (C) 2018-2021 Intel Corporation
-// SPDX-License-Identifier: Apache-2.0
-//
-
-#pragma once
-
-#include
-#include
-#include
-
-#include
-
-#include "concat.hpp"
-#include "common/subgraph.hpp"
-#include "common/fake_quantize_dequantization.hpp"
-
-namespace ngraph {
-namespace pass {
-namespace low_precision {
-
-class TRANSFORMATIONS_API ConcatMultiChannelsTransformation : public ConcatTransformation {
-public:
- ConcatMultiChannelsTransformation(const Params& params) : ConcatTransformation(params) {}
- ~ConcatMultiChannelsTransformation() override {};
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
- bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
- bool isPrecisionPreserved(std::shared_ptr layer) const noexcept override;
-
-private:
- // Go through the parent elements of the layer and fill dequantization collection
- // with Dq operations that should be inserted before the layer.
- void fillDequantization(
- const std::shared_ptr layer,
- const std::unordered_map& dequantizationByFakeQuantize,
- std::vector& dequantization) const;
-
- FakeQuantizeDequantization getConcatenatedDequantization(
- const std::shared_ptr concat,
- const std::vector& dequantization) const;
-
- static FakeQuantizeDequantization getFoldedDequantization(
- const std::shared_ptr operation,
- const FakeQuantizeDequantization& dequantization,
- const size_t sourceOutputIdx);
-
- bool isMultiChannel(const std::vector>& concatLayers) const noexcept;
-};
-
-} // namespace low_precision
-} // namespace pass
-} // namespace ngraph
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/convert.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/convert.hpp
index ca860903420873..a84deaed912789 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/convert.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/convert.hpp
@@ -13,9 +13,8 @@ namespace low_precision {
class TRANSFORMATIONS_API ConvertTransformation : public LayerTransformation {
public:
- ConvertTransformation(const Params& params) : LayerTransformation(params) {}
- ~ConvertTransformation() override {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ ConvertTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool isPrecisionPreserved(std::shared_ptr layer) const noexcept override;
};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/convolution.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/convolution.hpp
index e3041a0b08f2c1..9bd0e2cda3adea 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/convolution.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/convolution.hpp
@@ -13,8 +13,8 @@ namespace low_precision {
class TRANSFORMATIONS_API ConvolutionTransformation : public WeightableLayerTransformation {
public:
- ConvolutionTransformation(const Params& params);
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ ConvolutionTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool isQuantized(std::shared_ptr layer) const noexcept override;
};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/create_attribute.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/create_attribute.hpp
new file mode 100644
index 00000000000000..ea6a48e9cdd1d9
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/create_attribute.hpp
@@ -0,0 +1,63 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include
+#include
+
+#include
+#include
+#include
+
+#include
+#include
+#include "base_matcher_pass.hpp"
+#include "network_helper.hpp"
+#include "lpt_itt.hpp"
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+template
+class CreateAttribute;
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
+
+enum class AttributeSource {
+ Node,
+ OutputPort
+};
+
+template
+class ngraph::pass::low_precision::CreateAttribute : public ngraph::pass::low_precision::BaseMatcherPass {
+public:
+ CreateAttribute(const AttributeSource source = AttributeSource::Node) {
+ assert((source == AttributeSource::Node) || (source == AttributeSource::OutputPort));
+ auto operation = std::is_same::value ?
+ std::make_shared(element::f32, Shape{}, [](std::shared_ptr n) { return true; }) :
+ pattern::wrap_type();
+
+ ngraph::graph_rewrite_callback callback = [&](pattern::Matcher& m) {
+ auto op = m.get_match_root();
+ if (!op || transformation_callback(op)) {
+ return false;
+ }
+ {
+ OV_ITT_SCOPED_TASK(itt::domains::LPT_LT, "CreateAttribute");
+ auto attribute = ngraph::VariantWrapper>::create(op, params);
+ if (attribute == nullptr) {
+ return false;
+ }
+ }
+ return true;
+ };
+
+ auto matcher = std::make_shared(operation, "CreateAttribute");
+ this->register_matcher(matcher, callback);
+ }
+};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/create_precisions_dependent_attribute.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/create_precisions_dependent_attribute.hpp
new file mode 100644
index 00000000000000..e8d921967e2807
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/create_precisions_dependent_attribute.hpp
@@ -0,0 +1,70 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include
+#include
+
+#include
+#include
+#include
+
+#include
+#include
+#include
+#include "rt_info/precision_preserved_attribute.hpp"
+#include "network_helper.hpp"
+#include "lpt_itt.hpp"
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+template
+class CreatePrecisionsDependentAttribute;
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
+
+template
+class ngraph::pass::low_precision::CreatePrecisionsDependentAttribute : public ngraph::pass::MatcherPass {
+public:
+ CreatePrecisionsDependentAttribute() {
+ auto operation = pattern::wrap_type();
+
+ ngraph::graph_rewrite_callback callback = [&](pattern::Matcher& m) {
+ auto node = m.get_match_root();
+ if (!node || transformation_callback(node)) {
+ return false;
+ }
+
+ {
+ OV_ITT_SCOPED_TASK(itt::domains::LPT_LT, "CreatePrecisionsDependentAttribute");
+ auto &rt = node->get_rt_info();
+
+ const auto precisionPreservedAttribute = std::make_shared>(
+ std::make_shared(false));
+ rt[ngraph::VariantWrapper::type_info.name] = precisionPreservedAttribute;
+ const auto &targetSharedValue = precisionPreservedAttribute->get()->sharedValue;
+
+ const auto attribute = std::make_shared>>(
+ std::make_shared());
+ rt[ngraph::VariantWrapper>::type_info.name] = attribute;
+
+ ngraph::pass::low_precision::NetworkHelper::reassign(
+ targetSharedValue,
+ {
+ std::dynamic_pointer_cast(attribute->get()),
+ std::dynamic_pointer_cast(precisionPreservedAttribute->get())
+ });
+ }
+ return true;
+ };
+
+ auto matcher = std::make_shared(operation, "CreateAttribute");
+ this->register_matcher(matcher, callback);
+ }
+};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/depth_to_space.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/depth_to_space.hpp
index 0fc9d6446897d1..004b6c6c915c9c 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/depth_to_space.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/depth_to_space.hpp
@@ -12,10 +12,9 @@ namespace low_precision {
class TRANSFORMATIONS_API DepthToSpaceTransformation : public TransparentBaseTransformation {
public:
- DepthToSpaceTransformation(const Params& params) : TransparentBaseTransformation(params) {}
- ~DepthToSpaceTransformation() override {}
+ NGRAPH_RTTI_DECLARATION;
+ DepthToSpaceTransformation(const Params& params = Params());
bool transform(TransformationContext &context, ngraph::pattern::Matcher &m) const override;
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
bool isPrecisionPreserved(std::shared_ptr layer) const noexcept override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr layer) const override;
};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/fake_quantize.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/fake_quantize.hpp
index ac75f406a2be98..c0419455d1b86c 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/fake_quantize.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/fake_quantize.hpp
@@ -15,8 +15,8 @@ namespace low_precision {
class TRANSFORMATIONS_API FakeQuantizeTransformation : public LayerTransformation {
public:
- FakeQuantizeTransformation(const Params& params) : LayerTransformation(params) {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ FakeQuantizeTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool isPrecisionPreserved(std::shared_ptr layer) const noexcept override;
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/fake_quantize_decomposition.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/fake_quantize_decomposition.hpp
index 0c6da56592e334..e857e7d080bf11 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/fake_quantize_decomposition.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/fake_quantize_decomposition.hpp
@@ -15,8 +15,8 @@ namespace low_precision {
class TRANSFORMATIONS_API FakeQuantizeDecompositionTransformation : public LayerTransformation {
public:
- FakeQuantizeDecompositionTransformation(const Params& params) : LayerTransformation(params) {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ FakeQuantizeDecompositionTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool isPrecisionPreserved(std::shared_ptr layer) const noexcept override;
};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/fold_convert.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/fold_convert.hpp
index d41706f920579b..879fbb090beb08 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/fold_convert.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/fold_convert.hpp
@@ -14,9 +14,8 @@ namespace low_precision {
class TRANSFORMATIONS_API FoldConvertTransformation : public LayerTransformation {
public:
- FoldConvertTransformation(const Params& params) : LayerTransformation(params) {}
- ~FoldConvertTransformation() override {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ FoldConvertTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr layer) const override;
bool isPrecisionPreserved(std::shared_ptr layer) const noexcept override;
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/fold_fake_quantize.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/fold_fake_quantize.hpp
new file mode 100644
index 00000000000000..cae73503721587
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/fold_fake_quantize.hpp
@@ -0,0 +1,25 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include
+#include "low_precision/layer_transformation.hpp"
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+class TRANSFORMATIONS_API FoldFakeQuantizeTransformation : public LayerTransformation {
+public:
+ NGRAPH_RTTI_DECLARATION;
+ FoldFakeQuantizeTransformation(const Params& params = Params());
+ bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
+ bool canBeTransformed(const TransformationContext& context, std::shared_ptr layer) const override;
+ bool isPrecisionPreserved(std::shared_ptr layer) const noexcept override;
+};
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/fuse_convert.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/fuse_convert.hpp
index e8f2e864e46e29..62f1292172f615 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/fuse_convert.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/fuse_convert.hpp
@@ -14,9 +14,8 @@ namespace low_precision {
class TRANSFORMATIONS_API FuseConvertTransformation : public LayerTransformation {
public:
- FuseConvertTransformation(const Params& params) : LayerTransformation(params) {}
- ~FuseConvertTransformation() override {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ FuseConvertTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr layer) const override;
bool isPrecisionPreserved(std::shared_ptr layer) const noexcept override;
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/fuse_fake_quantize.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/fuse_fake_quantize.hpp
index 8d46c68f3d77d1..661b4e729c91b9 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/fuse_fake_quantize.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/fuse_fake_quantize.hpp
@@ -14,9 +14,8 @@ namespace low_precision {
class TRANSFORMATIONS_API FuseFakeQuantizeTransformation : public LayerTransformation {
public:
- FuseFakeQuantizeTransformation(const Params& params) : LayerTransformation(params) {}
- ~FuseFakeQuantizeTransformation() override {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ FuseFakeQuantizeTransformation(const Params& params);
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool isPrecisionPreserved(std::shared_ptr layer) const noexcept override;
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/fuse_multiply_to_fake_quantize.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/fuse_multiply_to_fake_quantize.hpp
index dea0fa340551b3..d1cfb935864da0 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/fuse_multiply_to_fake_quantize.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/fuse_multiply_to_fake_quantize.hpp
@@ -14,9 +14,8 @@ namespace low_precision {
class TRANSFORMATIONS_API FuseMultiplyToFakeQuantizeTransformation : public LayerTransformation {
public:
- FuseMultiplyToFakeQuantizeTransformation(const Params& params) : LayerTransformation(params) {}
- ~FuseMultiplyToFakeQuantizeTransformation() override {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ FuseMultiplyToFakeQuantizeTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr layer) const override;
bool isPrecisionPreserved(std::shared_ptr layer) const noexcept override;
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/fuse_subtract_to_fake_quantize.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/fuse_subtract_to_fake_quantize.hpp
index 2c67aebfcf186a..16a6a810bd723f 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/fuse_subtract_to_fake_quantize.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/fuse_subtract_to_fake_quantize.hpp
@@ -14,9 +14,8 @@ namespace low_precision {
class TRANSFORMATIONS_API FuseSubtractToFakeQuantizeTransformation : public LayerTransformation {
public:
- FuseSubtractToFakeQuantizeTransformation(const Params& params) : LayerTransformation(params) {}
- ~FuseSubtractToFakeQuantizeTransformation() override {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ FuseSubtractToFakeQuantizeTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr layer) const override;
bool isPrecisionPreserved(std::shared_ptr layer) const noexcept override;
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/group_convolution.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/group_convolution.hpp
index 0372f0173d9d87..dfde7003109c10 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/group_convolution.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/group_convolution.hpp
@@ -13,8 +13,8 @@ namespace low_precision {
class TRANSFORMATIONS_API GroupConvolutionTransformation : public ConvolutionTransformation {
public:
- GroupConvolutionTransformation(const Params& params);
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ GroupConvolutionTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool isQuantized(std::shared_ptr layer) const noexcept override;
};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/interpolate.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/interpolate.hpp
index 184d1c159fe615..fb8169af85c844 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/interpolate.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/interpolate.hpp
@@ -12,10 +12,9 @@ namespace low_precision {
class TRANSFORMATIONS_API InterpolateTransformation : public LayerTransformation {
public:
- InterpolateTransformation(const Params& params) : LayerTransformation(params) {}
- ~InterpolateTransformation() override {}
+ NGRAPH_RTTI_DECLARATION;
+ InterpolateTransformation(const Params& params = Params());
bool transform(TransformationContext &context, ngraph::pattern::Matcher &m) const override;
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
bool isPrecisionPreserved(std::shared_ptr layer) const noexcept override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr layer) const override;
};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/layer_transformation.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/layer_transformation.hpp
index 36b1293cd425b3..cf46dfa0a41f4a 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/layer_transformation.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/layer_transformation.hpp
@@ -164,7 +164,7 @@ inline std::ostream &operator << (std::ostream &os, const DataPrecision& value)
}
// Base class for all LP transformations, holds some common data structures
-class TRANSFORMATIONS_API LayerTransformation {
+class TRANSFORMATIONS_API LayerTransformation : public ngraph::pass::MatcherPass {
public:
enum QuantizedTensorAlignment {
None,
@@ -177,7 +177,7 @@ class TRANSFORMATIONS_API LayerTransformation {
const bool updatePrecisions = true,
const QuantizedTensorAlignment quantizedTensorAlignmentOnActivations = QuantizedTensorAlignment::UpdateLevel,
const QuantizedTensorAlignment quantizedTensorAlignmentOnWeights = QuantizedTensorAlignment::None,
- bool supportAsymmetricQuantization = false,
+ bool supportAsymmetricQuantization = true,
std::vector precisionsOnActivations = { element::u8, element::i8 },
std::vector precisionsOnWeights = { element::i8 },
element::Type deqPrecision = element::f32,
@@ -258,11 +258,12 @@ class TRANSFORMATIONS_API LayerTransformation {
LayerTransformation(const Params& params);
virtual ~LayerTransformation() = default;
- virtual void registerMatcherIn(ngraph::pass::GraphRewrite& pass, TransformationContext& context) const = 0;
virtual bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const = 0;
+ void setParams(const Params& params);
void setParamsManager(IParamsManager* paramsManager) noexcept;
void setLayerTransformationsManager(ILayerTransformationsManager* layerTransformationsManager) noexcept;
+ void setContext(TransformationContext* context) noexcept;
void setUpdatePrecisions(const bool updatePrecisions);
void setQuantizedTensorAlignmentOnActivations(const QuantizedTensorAlignment quantizedTensorAlignmentOnActivations);
@@ -281,7 +282,7 @@ class TRANSFORMATIONS_API LayerTransformation {
bool canSubtractBeHandled(const std::shared_ptr<Node>& op, const FakeQuantizeDequantization& dequantization) const;
- PrecisionDetails getPrecisionDetails(const QuantizationDetails& quantizationDetails) const;
+ static PrecisionDetails getPrecisionDetails(const QuantizationDetails& quantizationDetails);
// return true if operation can be quantized and false otherwise
// for example: if convolution operation weights are not quantized, then isQuantized returns false, and true otherwise
@@ -328,6 +329,7 @@ class TRANSFORMATIONS_API LayerTransformation {
static const char originalLayerPostfix[];
IParamsManager* paramsManager;
ILayerTransformationsManager* layerTransformationsManager;
+ TransformationContext* context;
protected:
std::shared_ptr<ngraph::Node> moveDequantizationAfter(
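With `LayerTransformation` now derived from `ngraph::pass::MatcherPass`, the removed `registerMatcherIn` hook is replaced by matcher registration in each transformation's constructor (which is also why the constructors gain a default `Params()` argument). Below is a minimal sketch of what such a constructor might look like; `SomeTransformation` is a hypothetical `LayerTransformation` subclass, and the matched pattern and callback body are illustrative assumptions, not code from this patch.

```cpp
#include <memory>

#include <ngraph/opsets/opset1.hpp>
#include <ngraph/pattern/op/wrap_type.hpp>

#include "low_precision/layer_transformation.hpp"

// Sketch only: MatcherPass-style registration assumed by the new base class.
SomeTransformation::SomeTransformation(const Params& params) : LayerTransformation(params) {
    auto pattern = ngraph::pattern::wrap_type<ngraph::opset1::MaxPool>();

    ngraph::graph_rewrite_callback callback = [this](ngraph::pattern::Matcher& m) {
        auto op = m.get_match_root();
        if (op == nullptr) {
            return false;
        }
        // The TransformationContext is assumed to be supplied through setContext() before the pass runs.
        return transform(*context, m);
    };

    auto matcher = std::make_shared<ngraph::pattern::Matcher>(pattern, "SomeTransformation");
    this->register_matcher(matcher, callback);
}
```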
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/low_precision.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/low_precision.hpp
new file mode 100644
index 00000000000000..1d8dab4877b87a
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/low_precision.hpp
@@ -0,0 +1,59 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include
+#include
+
+// one place to include all Low Precision Transformations from ngraph::pass::low_precision
+#include
+#include
+#include
+#include
+
+#include
+#include
+#include
+#include
+
+
+#include
+#include
+#include
+#include "low_precision/layer_transformation.hpp"
+#include "low_precision/markup_precisions.hpp"
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+class TRANSFORMATIONS_API LowPrecision;
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
+
+class TRANSFORMATIONS_API ngraph::pass::low_precision::LowPrecision : public ngraph::pass::FunctionPass {
+public:
+ class TRANSFORMATIONS_API TypeRelaxedReplacer : public GraphRewrite {
+ public:
+ TypeRelaxedReplacer();
+ };
+
+ NGRAPH_RTTI_DECLARATION;
+ LowPrecision(
+ const std::vector<OperationPrecisionRestriction>& precisionRestrictions = {},
+ const std::vector<OperationPerTensorQuantizationRestriction>& quantizationRestrictions = {},
+ const LayerTransformation::Params = LayerTransformation::Params());
+ bool run_on_function(std::shared_ptr<ngraph::Function> f) override;
+
+ static bool isFunctionQuantized(const std::shared_ptr<const ngraph::Function>& function);
+
+protected:
+ std::vector<OperationPrecisionRestriction> precisionRestrictions;
+ std::vector<OperationPerTensorQuantizationRestriction> quantizationRestrictions;
+ // remove
+ LayerTransformation::Params params;
+};
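A rough sketch of how a plugin might run the new `LowPrecision` function pass follows; the `runLPT` helper and the empty restriction lists are illustrative, and only `LowPrecision`, `isFunctionQuantized` and the standard pass-manager API are taken as given.

```cpp
#include <memory>

#include <ngraph/function.hpp>
#include <ngraph/pass/manager.hpp>

#include "low_precision/low_precision.hpp"

// Illustrative helper: run LPT on a model with default restrictions and parameters.
void runLPT(const std::shared_ptr<ngraph::Function>& model) {
    using namespace ngraph::pass;

    // Skip the pipeline entirely when the model contains no quantized (FakeQuantize) subgraphs.
    if (!low_precision::LowPrecision::isFunctionQuantized(model)) {
        return;
    }

    Manager manager;
    // Real plugins pass device-specific precision / per-tensor restriction lists here.
    manager.register_pass<low_precision::LowPrecision>();
    manager.run_passes(model);
}
```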
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/lpt_itt.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/lpt_itt.hpp
new file mode 100644
index 00000000000000..3b207c1bf8f0c0
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/lpt_itt.hpp
@@ -0,0 +1,27 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+/**
+ * @brief Defines openvino domains for tracing
+ * @file lpt_itt.hpp
+ */
+
+#pragma once
+
+#include <openvino/itt.hpp>
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+namespace itt {
+namespace domains {
+
+OV_ITT_DOMAIN(LPT);
+OV_ITT_DOMAIN(LPT_LT);
+
+} // namespace domains
+} // namespace itt
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
\ No newline at end of file
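These domains feed the ITT instrumentation used for profiling the LPT passes. A hedged sketch of how a pass body might tag a region, assuming the standard `OV_ITT_SCOPED_TASK` macro from the openvino ITT API; the function and task names are illustrative:

```cpp
#include "low_precision/lpt_itt.hpp"

void someLptStep() {
    // Marks this scope as a task in the "LPT" domain for ITT-aware profilers (e.g. VTune).
    OV_ITT_SCOPED_TASK(ngraph::pass::low_precision::itt::domains::LPT, "someLptStep");

    // ... actual markup / transformation work would go here ...
}
```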
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/markup_avg_pool_precision_preserved.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/markup_avg_pool_precision_preserved.hpp
new file mode 100644
index 00000000000000..6f6ad5e11d6be3
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/markup_avg_pool_precision_preserved.hpp
@@ -0,0 +1,29 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include
+
+#include
+#include
+
+#include
+#include
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+class TRANSFORMATIONS_API MarkupAvgPoolPrecisionPreserved;
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
+
+class ngraph::pass::low_precision::MarkupAvgPoolPrecisionPreserved : public ngraph::pass::FunctionPass {
+public:
+ NGRAPH_RTTI_DECLARATION;
+ bool run_on_function(std::shared_ptr<ngraph::Function> f) override;
+};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/markup_per_tensor_quantization.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/markup_per_tensor_quantization.hpp
new file mode 100644
index 00000000000000..870779f37235d8
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/markup_per_tensor_quantization.hpp
@@ -0,0 +1,48 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include
+#include
+#include
+
+#include
+#include
+#include
+#include
+
+#include "common/operation_per_tensor_quantization_restriction.hpp"
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+class TRANSFORMATIONS_API MarkupPerTensorQuantization;
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
+
+class ngraph::pass::low_precision::MarkupPerTensorQuantization : public ngraph::pass::FunctionPass {
+public:
+ class PerTensorQuantization {
+ public:
+ PerTensorQuantization() = default;
+ PerTensorQuantization(const bool versionIsRequired) : versionIsRequired(versionIsRequired) {}
+ void add(const uint64_t version, const std::vector<ngraph::element::Type>& precisions) {
+ precisionsByVersion.emplace(version, precisions);
+ }
+
+ bool versionIsRequired;
+ std::unordered_map<uint64_t, std::vector<ngraph::element::Type>> precisionsByVersion;
+ };
+
+ NGRAPH_RTTI_DECLARATION;
+ MarkupPerTensorQuantization(const std::vector<OperationPerTensorQuantizationRestriction>& restrictions = {});
+ bool run_on_function(std::shared_ptr<ngraph::Function> f) override;
+
+private:
+ std::unordered_map<std::string, PerTensorQuantization> restrictionsByOperation;
+};
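For illustration, a small sketch of how a `PerTensorQuantization` entry could be filled using only the interface declared above; the version number and precision list are assumptions, not values from the patch.

```cpp
#include <ngraph/type/element_type.hpp>

#include "low_precision/markup_per_tensor_quantization.hpp"

void buildExampleEntry() {
    using ngraph::pass::low_precision::MarkupPerTensorQuantization;

    // Hypothetical entry: operation version 1 is required and restricted to u8/i8 per-tensor quantization.
    MarkupPerTensorQuantization::PerTensorQuantization entry(true /* versionIsRequired */);
    entry.add(1ul, { ngraph::element::u8, ngraph::element::i8 });
}
```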
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/markup_precisions.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/markup_precisions.hpp
new file mode 100644
index 00000000000000..9f86b8e454f8a7
--- /dev/null
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/markup_precisions.hpp
@@ -0,0 +1,52 @@
+// Copyright (C) 2021 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include
+#include
+#include
+
+#include
+#include
+
+#include
+#include
+#include "common/operation_precision_restriction.hpp"
+
+namespace ngraph {
+namespace pass {
+namespace low_precision {
+
+class TRANSFORMATIONS_API MarkupPrecisions;
+
+} // namespace low_precision
+} // namespace pass
+} // namespace ngraph
+
+// This transformation is used to add customization options at run time
+class ngraph::pass::low_precision::MarkupPrecisions : public ngraph::pass::FunctionPass {
+public:
+ class Restriction {
+ public:
+ Restriction() = default;
+ Restriction(const bool versionIsRequired) : versionIsRequired(versionIsRequired) {}
+ void add(const uint64_t version, const std::vector<std::pair<size_t, std::vector<ngraph::element::Type>>>& precisions) {
+ precisionsByVersion.emplace(version, precisions);
+ }
+
+ bool versionIsRequired;
+ std::unordered_map<uint64_t, std::vector<std::pair<size_t, std::vector<ngraph::element::Type>>>> precisionsByVersion;
+ };
+
+ NGRAPH_RTTI_DECLARATION;
+ MarkupPrecisions(const std::vector<OperationPrecisionRestriction>& restrictions = {});
+ bool run_on_function(std::shared_ptr<ngraph::Function> f) override;
+
+private:
+ static bool isPrecisionPreserved(const std::shared_ptr<Node>& node);
+ static bool isQuantized(const std::shared_ptr<Node>& node);
+
+ std::unordered_map<std::string, Restriction> restrictionsByOperation;
+};
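Similarly, a `Restriction` entry maps an operation version to per-input-port precision lists, and `restrictionsByOperation` keys these entries by operation type name. A minimal sketch consistent with the declaration above; the port indices and precisions are illustrative.

```cpp
#include <ngraph/type/element_type.hpp>

#include "low_precision/markup_precisions.hpp"

void buildExampleRestriction() {
    using ngraph::pass::low_precision::MarkupPrecisions;

    // Hypothetical restriction for version 1 of some operation:
    // input port 0 may be u8 or i8, input port 1 is limited to i8.
    MarkupPrecisions::Restriction entry(true /* versionIsRequired */);
    entry.add(1ul, {
        { 0, { ngraph::element::u8, ngraph::element::i8 } },
        { 1, { ngraph::element::i8 } }
    });
}
```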
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/mat_mul.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/mat_mul.hpp
index 332d28b934b44e..9eefaacc05d7b7 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/mat_mul.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/mat_mul.hpp
@@ -13,10 +13,9 @@ namespace low_precision {
class TRANSFORMATIONS_API MatMulTransformation : public LayerTransformation {
public:
- MatMulTransformation(const Params& params) : LayerTransformation(params) {}
- ~MatMulTransformation() override {}
+ NGRAPH_RTTI_DECLARATION;
+ MatMulTransformation(const Params& params = Params());
bool transform(TransformationContext &context, ngraph::pattern::Matcher &m) const override;
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
bool isPrecisionPreserved(std::shared_ptr<Node> layer) const noexcept override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr<Node> layer) const override;
};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/max_pool.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/max_pool.hpp
index 2cf1d54eda7f44..6855f347a9ef76 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/max_pool.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/max_pool.hpp
@@ -14,8 +14,8 @@ namespace low_precision {
class TRANSFORMATIONS_API MaxPoolTransformation : public LayerTransformation {
public:
- MaxPoolTransformation(const Params& params);
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ MaxPoolTransformation(const Params& params = Params());
bool canBeTransformed(const TransformationContext& context, std::shared_ptr<Node> op) const override;
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool isPrecisionPreserved(std::shared_ptr<Node> layer) const noexcept override;
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/multiply.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/multiply.hpp
index 30f1cff5444d37..a0d038c0b654ac 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/multiply.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/multiply.hpp
@@ -13,9 +13,8 @@ namespace low_precision {
class TRANSFORMATIONS_API MultiplyTransformation : public EltwiseBaseTransformation {
public:
- MultiplyTransformation(const Params& params) : EltwiseBaseTransformation(params) {}
- ~MultiplyTransformation() override {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ MultiplyTransformation(const Params& params = Params());
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/multiply_to_group_convolution.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/multiply_to_group_convolution.hpp
index d4a575f4d9a9de..422e1fc17b1919 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/multiply_to_group_convolution.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/multiply_to_group_convolution.hpp
@@ -7,6 +7,7 @@
#include
#include
#include "low_precision/layer_transformation.hpp"
+#include "common/operation_precision_restriction.hpp"
namespace ngraph {
namespace pass {
@@ -14,9 +15,11 @@ namespace low_precision {
class TRANSFORMATIONS_API MultiplyToGroupConvolutionTransformation : public LayerTransformation {
public:
- MultiplyToGroupConvolutionTransformation(const Params& params) : LayerTransformation(params), groupSize(1ul) {}
+ NGRAPH_RTTI_DECLARATION;
+ MultiplyToGroupConvolutionTransformation(
+ const Params& params = Params(),
+ const OperationPrecisionRestriction::PrecisionsByPort& restrictions = {});
~MultiplyToGroupConvolutionTransformation() override {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
bool transform(TransformationContext& context, ngraph::pattern::Matcher &m) const override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr<Node> layer) const override;
bool isPrecisionPreserved(std::shared_ptr<Node> layer) const noexcept override;
@@ -25,6 +28,7 @@ class TRANSFORMATIONS_API MultiplyToGroupConvolutionTransformation : public Laye
void setGroupSize(const size_t groupSize);
size_t getGroupSize() const;
private:
+ OperationPrecisionRestriction::PrecisionsByPort restrictions;
size_t groupSize;
};
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/mvn.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/mvn.hpp
index 37244a3aa74c0b..59844e0201411c 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/mvn.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/mvn.hpp
@@ -12,8 +12,8 @@ namespace low_precision {
class TRANSFORMATIONS_API MVNTransformation : public LayerTransformation {
public:
- MVNTransformation(const Params& params) : LayerTransformation(params) {}
- void registerMatcherIn(GraphRewrite& pass, TransformationContext& context) const override;
+ NGRAPH_RTTI_DECLARATION;
+ MVNTransformation(const Params& params = Params());
bool transform(TransformationContext &context, ngraph::pattern::Matcher &m) const override;
bool canBeTransformed(const TransformationContext& context, std::shared_ptr<Node> layer) const override;
bool isPrecisionPreserved(std::shared_ptr<Node> layer) const noexcept override;
diff --git a/inference-engine/src/low_precision_transformations/include/low_precision/network_helper.hpp b/inference-engine/src/low_precision_transformations/include/low_precision/network_helper.hpp
index 9846ef50d6aa2d..e718abf65e7cab 100644
--- a/inference-engine/src/low_precision_transformations/include/low_precision/network_helper.hpp
+++ b/inference-engine/src/low_precision_transformations/include/low_precision/network_helper.hpp
@@ -16,6 +16,9 @@
#include "ngraph_ops/type_relaxed.hpp"
#include
+#include "rt_info/shared_value_attribute.hpp"
+#include "rt_info/precisions_attribute.hpp"
+#include "rt_info/per_tensor_quantization_attribute.hpp"
#include "transformation_context.hpp"
#include "quantization_details.hpp"
#include "transformations/utils/utils.hpp"
@@ -76,6 +79,10 @@ class TRANSFORMATIONS_API NetworkHelper {
static std::shared_ptr<Node> swapMultiplyAndAdd(std::shared_ptr<opset1::Multiply> addAfterMultiply, const int multiplyBranch);
+ static void copyInfo(const std::vector<std::shared_ptr<Node>>& sources, const std::vector<std::shared_ptr<Node>>& targets);
+
+ static void copyInfo(const std::vector<std::shared_ptr<Node>>& sources, const std::shared_ptr<Node>& target);
+
static void copyInfo(const std::shared_ptr<Node>& source, const std::shared_ptr<Node>& target);
static void cleanRunTimeInfo(const std::shared_ptr<Node>& layer);
@@ -115,7 +122,8 @@ class TRANSFORMATIONS_API NetworkHelper {
std::shared_ptr<opset1::FakeQuantize> fq,
element::Type precision,
float min,
- float max);
+ float max,
+ const bool replace = true);
static FakeQuantizeDequantization makeDequantization(
const float dequantizationMul,
@@ -123,7 +131,8 @@ class TRANSFORMATIONS_API NetworkHelper {
const ngraph::element::Type originalPrecision,
const ngraph::Shape dataNodeOutputShape,
element::Type precision,
- const element::Type deqPrecision = element::f32);
+ const element::Type deqPrecision = element::f32,
+ std::shared_ptr<Node> input = nullptr);
static FakeQuantizeDequantization createDequantizationFromFakeQuantize(
std::shared_ptr<opset1::FakeQuantize> fq,
@@ -191,6 +200,105 @@ class TRANSFORMATIONS_API NetworkHelper {
static std::shared_ptr<opset1::FakeQuantize> fuseConvert(const std::shared_ptr<opset1::FakeQuantize>& fakeQuantize);
+ static bool isPrecisionPreserved(const std::shared_ptr<Node>& node);
+
+ static void replaceAttributeInNodes(
+ std::shared_ptr<ngraph::Function> f,
+ const std::string& name,
+ const std::shared_ptr<ngraph::Variant> newAttribute,
+ const std::shared_ptr<ngraph::Variant> oldAttribute,
+ const std::shared_ptr<ngraph::Node>& initialNode) {
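+ // Breadth-first walk over the graph starting from initialNode: on every reachable
+ // precision-preserved node the old attribute instance is swapped for the new one.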
+ std::set<std::shared_ptr<Node>> visited;
+ std::deque<std::shared_ptr<Node>> nodes;
+ nodes.emplace_back(initialNode);
+
+
+ while (!nodes.empty()) {
+ auto node = nodes.front();
+ nodes.pop_front();
+
+ if (visited.count(node) || is_type<opset1::Constant>(node)) {
+ continue;
+ }
+
+ visited.insert(node);
+
+ bool handleConnectedNodes = false;
+ if (NetworkHelper::isPrecisionPreserved(node) || is_type<opset1::FakeQuantize>(node)) {
+ auto& rt = node->get_rt_info();
+
+ if (node == initialNode) {
+ rt[name] = newAttribute;
+ handleConnectedNodes = true;
+ } else {
+ auto it = rt.find(name);
+ if (it != rt.end()) {
+ const auto currentAttribute = it->second;
+ if (oldAttribute.get() == currentAttribute.get()) {
+ rt[name] = newAttribute;
+ }
+ handleConnectedNodes = true;
+ }
+ }
+ }
+
+ if (!handleConnectedNodes) {
+ continue;
+ }
+
+ if (!is_type<opset1::FakeQuantize>(node)) {
+ for (size_t index = 0ul; index < node->get_input_size(); ++index) {
+ auto getInput = [](const std::shared_ptr<Node>& node, const size_t index) {
+ const auto dequantization = NetworkHelper::getDequantization(node, index);
+ if (!dequantization.empty() &&
+ (is_type<opset1::Convert>(dequantization.data.get_node())) &&
+ is_type<opset1::FakeQuantize>(dequantization.data.get_node()->get_input_node_ptr(0))) {
+ const auto input = dequantization.data.get_node()->input(0);
+ return input;
+ }
+ return node->input(index);
+ };
+
+ const auto& input = getInput(node, index);
+ const auto& input_node = input.get_source_output().get_node_shared_ptr();
+
+ if (visited.count(input_node) || is_type<opset1::Constant>(input_node)) {
+ continue;
+ }
+
+ nodes.push_front(input_node);
+ }
+ }
+
+ for (auto& output : node->outputs()) {
+ for (auto& input_value : output.get_target_inputs()) {
+ const auto& output_node = input_value.get_node()->shared_from_this();
+ if (visited.count(output_node) || is_type<opset1::Constant>(output_node)) {
+ continue;
+ }
+
+ nodes.push_front(output_node);
+ }
+ }
+ }
+ }
+
+ template <typename SharedValueType, typename SharedAttributeType>
+ static void reassign(
+ const std::shared_ptr<SharedValueType>& sharedValue,
+ const std::vector<std::weak_ptr<SharedAttributeType>>& attributes) {
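+ // Re-points each still-live attribute to the given shared value and registers it back on that value.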
+ for (const auto attributeWeakPtr : attributes) {
+ auto attribute = attributeWeakPtr.lock();
+ if (attribute == nullptr) {
+ continue;
+ }
+ attribute->sharedValue = sharedValue;
+ sharedValue->attributes.push_back(attribute);
+ }
+ }
+
private:
static std::shared_ptr<Node> foldFakeQuantize(const std::shared_ptr<opset1::FakeQuantize>& fq, const bool roundValues, const bool roundValuesWasSet);
@@ -273,6 +381,81 @@ std::shared_ptr<Node> fold_reshape(Args&&... args) {
return node;
}
+template <typename T>
+std::shared_ptr<ngraph::VariantWrapper<T>> getAttribute(const std::shared_ptr<Node>& inputNode) {
+ auto& rt = inputNode->get_rt_info();
+ auto it = rt.find(ngraph::VariantWrapper<T>::type_info.name);
+ if (it == rt.end()) {
+ return nullptr;
+ }
+
+ auto attribute = std::dynamic_pointer_cast<ngraph::VariantWrapper<T>>(it->second);
+ assert(attribute != nullptr);
+ return attribute;
+}
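+// Usage sketch (hypothetical attribute type): e.g. getAttribute<std::shared_ptr<SomeAttribute>>(node)
+// returns the VariantWrapper stored in the node's rt_info, or nullptr when it is absent.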
+
+template <typename T>
+std::shared_ptr<ngraph::VariantWrapper<T>> getAttribute(const Input