
DequantizeLinear.py can't compile layer when x_scale_rank == 1 #733

Open
ReschakRyan opened this issue Jan 21, 2025 · 2 comments
Labels: Bug, OP:DequantizeLinear

Comments

ReschakRyan commented Jan 21, 2025

Issue Type

Others

OS

Linux

onnx2tf version number

1.22.3

onnx version number

1.17.0

onnxruntime version number

1.20.0

onnxsim (onnx_simplifier) version number

0.4.36

tensorflow version number

2.18.0

Download URL for ONNX

https://drive.google.com/file/d/1V67C1yzkjCejLkR5vykay3MTsN7BXFPH/view?usp=drivesdk

Parameter Replacement JSON

None

Description

Hello to those involved (let me know if the link to the file is broken)

  1. I am trying to convert a QAT YOLO model from ONNX to TensorFlow for research purposes.
  2. The error occurs on this line: https://github.com/PINTO0309/onnx2tf/blob/main/onnx2tf/ops/DequantizeLinear.py#L106, where tf.reshape is called with tensor=<tf.Tensor: shape=(1,), dtype=float32, numpy=array([0.14131849], dtype=float32)> and shape=[1, 1, 1, 3]. In other words, a one-element tensor cannot be reshaped into a three-element shape, so the call is impossible (see the sketch after this list).
  3. Currently I am modifying the DequantizeLinear.py file manually to see if I can fix the problem.
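
For reference, here is a minimal, self-contained sketch (not onnx2tf code; the scale value and target shape are copied from the error above) of why that tf.reshape fails, and why plain broadcasting of the size-1 scale would not need it:

```python
import tensorflow as tf

# The per-tensor scale that gets reshaped: one element, shape (1,).
x_scale = tf.constant([0.14131849], dtype=tf.float32)

# Reshaping 1 element into a 3-element shape can never succeed.
try:
    tf.reshape(tensor=x_scale, shape=[1, 1, 1, 3])
except tf.errors.InvalidArgumentError as e:
    print("reshape fails:", e)

# A size-1 scale already broadcasts against any tensor, so no reshape is needed.
x = tf.zeros([16, 3, 3, 3], dtype=tf.float32)
print((x * x_scale).shape)  # (16, 3, 3, 3)
```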

The problem is happening with this simple part of the model:
```python
class Conv(nn.Module):
    """Standard convolution with args(ch_in, ch_out, kernel, stride, padding, groups, dilation, activation)."""

    default_act = nn.SiLU()  # default activation

    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, d=1, act=True):
        """Initialize Conv layer with given arguments including activation."""
        super().__init__()
        self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p, d), groups=g, dilation=d, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = self.default_act if act is True else act if isinstance(act, nn.Module) else nn.Identity()

    def forward(self, x):
        """Apply convolution, batch normalization and activation to input tensor."""
        return self.act(self.bn(self.conv(x)))

    def forward_fuse(self, x):
        """Perform transposed convolution of 2D data."""
        return self.act(self.conv(x))
```

It is fused together, and the model is wrapped with Dequantize and Quantize layers for QAT.

These are all the translated layers for this model:

%/0/conv/Cast_output_0 = Cast[to = 2](%/quant/QuantizeLinear_output_0)
  %/0/conv/Constant_output_0 = Constant[value = <Scalar Tensor []>]()
  %/0/conv/Constant_1_output_0 = Constant[value = <Scalar Tensor []>]()
  %/0/conv/DequantizeLinear_output_0 = DequantizeLinear(%/0/conv/Cast_output_0, %/0/conv/Constant_output_0, %/0/conv/Constant_1_output_0)
  %/0/conv/Constant_2_output_0 = Constant[value = <Tensor>]()
  %/0/conv/Constant_3_output_0 = Constant[value = <Tensor>]()
  %/0/conv/Constant_4_output_0 = Constant[value = <Tensor>]()
  %/0/conv/DequantizeLinear_1_output_0 = DequantizeLinear(%/0/conv/Constant_2_output_0, %/0/conv/Constant_3_output_0, %/0/conv/Constant_4_output_0)
  %/0/conv/Constant_5_output_0 = Constant[value = <Tensor>]()
  %/0/conv/ConstantOfShape_output_0 = ConstantOfShape[value = <Tensor>](%/0/conv/Constant_5_output_0)
  %/0/conv/Constant_6_output_0 = Constant[value = <Tensor>]()
  %/0/conv/Constant_7_output_0 = Constant[value = <Tensor>]()
  %/0/conv/Cast_1_output_0 = Cast[to = 6](%/0/conv/ConstantOfShape_output_0)
  %/0/conv/DequantizeLinear_2_output_0 = DequantizeLinear(%/0/conv/Constant_6_output_0, %/0/conv/Constant_7_output_0, %/0/conv/Cast_1_output_0)
  %/0/conv/Conv_output_0 = Conv[dilations = [1, 1], group = 1, kernel_shape = [3, 3], pads = [1, 1, 1, 1], strides = [2, 2]](%/0/conv/DequantizeLinear_output_0, %/0/conv/DequantizeLinear_1_output_0, %/0/conv/DequantizeLinear_2_output_0)
  %/0/conv/Relu_output_0 = Relu(%/0/conv/Conv_output_0)
  %/0/conv/Constant_8_output_0 = Constant[value = <Scalar Tensor []>]()
  %/0/conv/Constant_9_output_0 = Constant[value = <Scalar Tensor []>]()
  %/0/conv/QuantizeLinear_output_0 = QuantizeLinear(%/0/conv/Relu_output_0, %/0/conv/Constant_8_output_0, %/0/conv/Constant_9_output_0)

The issue arises with DequantizeLinear_1_output_0.
These are its input layers:

(Pdb) graph_node_input_1
Variable (wa/0/conv/Constant_2_output_0): (shape=[16, 3, 3, 3], dtype=int8)
(Pdb) graph_node_input_2
Variable (wa/0/conv/Constant_3_output_0): (shape=[1], dtype=float32)
(Pdb) graph_node_input_3
Variable (wa/0/conv/Constant_4_output_0): (shape=[1], dtype=int8)

Because Constant_2 has shape [16, 3, 3, 3], the program tries to reshape Constant_3_output_0 (a single value) to [1, 1, 1, 3], which is impossible.
I know the conversion should be possible, because onnx2tf can compile and translate the same model when it is exported without quantization (no QAT), and I can run inference on the ONNX model itself.
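
To illustrate why the reshape should not be needed here, a small NumPy sketch of the ONNX DequantizeLinear semantics, y = (x - x_zero_point) * x_scale, applied to the shapes reported above (the concrete values are placeholders):

```python
import numpy as np

x = np.zeros([16, 3, 3, 3], dtype=np.int8)           # Constant_2_output_0
x_scale = np.array([0.14131849], dtype=np.float32)   # Constant_3_output_0, shape (1,)
x_zero_point = np.array([0], dtype=np.int8)          # Constant_4_output_0, shape (1,)

# Per-tensor dequantization: a size-1 scale / zero point broadcasts over the
# whole weight tensor without any explicit reshape.
y = (x.astype(np.float32) - x_zero_point.astype(np.float32)) * x_scale
print(y.shape)  # (16, 3, 3, 3)
```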
This is the command I run to execute onnx2tf (I have tried it from a Python script too, with the same result):

onnx2tf -i "runs/detect/train6/weights/best.onnx" -o "runs/detect/train6/weights/best_saved_model" -nuo --verbosity info -oiqt -qt per-tensor -cind images "runs/detect/train6/weights/best_saved_model/tmp_tflite_int8_calibration_images.npy" "[[[[0, 0, 0]]]]" "[[[[255, 255, 255]]]]"

Any help would be greatly appreciated :) Thank you in advance.

ReschakRyan (Author) commented

Alright, I got it to create the saved model by commenting out these lines:

```python
subed_tensor = input_tensor
"""
if x_scale_rank == 1:
    shape_broadcast = list([1 for _ in range(axis)] + [input_tensor_shape[axis]] + [1 for _ in range(axis + 1, input_tensor_rank)])
    x_scale = tf.reshape(
        tensor=x_scale,
        shape=shape_broadcast,
    )
if len(graph_node.inputs) >= 3 and input_tensor.dtype != tf.int32:
    x_zero_point = tf.cast(
        x=x_zero_point,
        dtype=tf.float32,
    )
    x_zero_point = tf.reshape(
        tensor=x_zero_point,
        shape=shape_broadcast,
    ) if x_scale_rank == 1 else x_zero_point
    subed_tensor = tf.subtract(
        x=input_tensor,
        y=x_zero_point,
    )
"""
```

Perhaps there could be a check that the shapes of subed_tensor and x_scale are compatible for multiplication before doing this reshape, since it does not seem to be necessary in my case. A rough sketch of such a guard is below.
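
Something like the following might express that check (just a rough, untested sketch against the variables already present in DequantizeLinear.py, not the actual fix): only build the per-axis broadcast shape when the rank-1 scale really has one entry per channel.

```python
# Hypothetical guard: skip the per-axis reshape when the rank-1 scale has a
# single element, because a size-1 scale / zero point broadcasts as-is.
x_scale_size = x_scale.shape.num_elements()
if x_scale_rank == 1 and x_scale_size is not None and x_scale_size > 1:
    shape_broadcast = list(
        [1 for _ in range(axis)]
        + [input_tensor_shape[axis]]
        + [1 for _ in range(axis + 1, input_tensor_rank)]
    )
    x_scale = tf.reshape(
        tensor=x_scale,
        shape=shape_broadcast,
    )
```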

However, I am now getting a different issue, and I wonder whether it is related.
Here is the output from my terminal:

Automatic generation of each OP name started ========================================
Automatic generation of each OP name complete!

Model loaded ========================================================================

Model conversion started ============================================================
WARNING: Tensorflow incompatible padding detected. Extra pad layer is inserted automatically. 
WARNING: Tensorflow incompatible padding detected. Extra pad layer is inserted automatically. 
WARNING: Tensorflow incompatible padding detected. Extra pad layer is inserted automatically. 
saved_model output started ==========================================================
saved_model output complete!
I0000 00:00:1737487745.111560  899625 devices.cc:67] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1737487745.111711  899625 single_machine.cc:361] Starting new session
W0000 00:00:1737487747.527733  899625 tf_tfl_flatbuffer_helpers.cc:365] Ignored output_format.
W0000 00:00:1737487747.527770  899625 tf_tfl_flatbuffer_helpers.cc:368] Ignored drop_control_dependency.
Float32 tflite output complete!
I0000 00:00:1737487749.026345  899625 devices.cc:67] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
I0000 00:00:1737487749.026484  899625 single_machine.cc:361] Starting new session
W0000 00:00:1737487751.422271  899625 tf_tfl_flatbuffer_helpers.cc:365] Ignored output_format.
W0000 00:00:1737487751.422295  899625 tf_tfl_flatbuffer_helpers.cc:368] Ignored drop_control_dependency.
Float16 tflite output complete!
I0000 00:00:1737487752.595559  899625 devices.cc:67] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
I0000 00:00:1737487752.595669  899625 single_machine.cc:361] Starting new session
W0000 00:00:1737487754.914976  899625 tf_tfl_flatbuffer_helpers.cc:365] Ignored output_format.
W0000 00:00:1737487754.915001  899625 tf_tfl_flatbuffer_helpers.cc:368] Ignored drop_control_dependency.
Dynamic Range Quantization tflite output complete!
Input signature information for quantization
signature_name: serving_default
input_name.0: images shape: (None, None, None, 3) dtype: <dtype: 'float32'>
W0000 00:00:1737487779.179262  899625 tf_tfl_flatbuffer_helpers.cc:365] Ignored output_format.
W0000 00:00:1737487779.179285  899625 tf_tfl_flatbuffer_helpers.cc:368] Ignored drop_control_dependency.
I0000 00:00:1737487779.229588  899625 mlir_graph_optimization_pass.cc:401] MLIR V1 optimization pass is not enabled
INFO: Created TensorFlow Lite delegate for select TF ops.
INFO: TfLiteFlexDelegate delegate: 9 nodes delegated out of 811 nodes with 9 partitions.

Traceback (most recent call last):
  File "/opt/conda/lib/python3.11/site-packages/onnx2tf/onnx2tf.py", line 1495, in convert
    tflite_model = converter.convert()
                   ^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/tensorflow/lite/python/lite.py", line 1238, in wrapper
    return self._convert_and_export_metrics(convert_func, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/tensorflow/lite/python/lite.py", line 1190, in _convert_and_export_metrics
    result = convert_func(self, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/tensorflow/lite/python/lite.py", line 1572, in convert
    return self._convert_from_saved_model(graph_def)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/tensorflow/lite/python/lite.py", line 1431, in _convert_from_saved_model
    return self._optimize_tflite_model(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/tensorflow/lite/python/convert_phase.py", line 215, in wrapper
    raise error from None  # Re-throws the exception.
    ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/tensorflow/lite/python/convert_phase.py", line 205, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/tensorflow/lite/python/lite.py", line 1134, in _optimize_tflite_model
    model = self._quantize(
            ^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/tensorflow/lite/python/lite.py", line 751, in _quantize
    calibrated = calibrate_quantize.calibrate(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/tensorflow/lite/python/convert_phase.py", line 215, in wrapper
    raise error from None  # Re-throws the exception.
    ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/tensorflow/lite/python/convert_phase.py", line 205, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/tensorflow/lite/python/optimize/calibrator.py", line 254, in calibrate
    self._feed_tensors(dataset_gen, resize_input=True)
  File "/opt/conda/lib/python3.11/site-packages/tensorflow/lite/python/optimize/calibrator.py", line 152, in _feed_tensors
    self._calibrator.FeedTensor(input_array)
RuntimeError: tensorflow/lite/kernels/concatenation.cc:202 t->dims->data[d] != t0->dims->data[d] (40 != 20)Node number 316 (CONCATENATION) failed to prepare.

WARNING: Full INT8 Quantization tflite output failed.


PINTO0309 commented Jan 22, 2025

What's Changed

Full Changelog: 1.26.6...1.26.7

PINTO0309 added the Bug label Jan 22, 2025