
QKeras to QONNX converter #7

Closed · wants to merge 23 commits

Conversation

@vloncar (Contributor) commented Nov 10, 2021:

Long-overdue, incomplete, horribly-coded first stab at the QKeras to QONNX converter.

It relies on tf2onnx to perform the actual conversion. The conversion is done in several steps:

  1. The model is stripped of QKeras-specific attributes (kernel, bias and activation quantizers), and the QKeras layers are replaced with generic Keras layers. A map of layer names to their quantizers is saved (a rough sketch of this step is shown after this list).
  2. The stripped model is fed into tf2onnx.
  3. We add hooks into the standard tf2onnx handlers for Keras layers and insert Quant nodes based on the quantizer map.
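
As an illustration of step 1, here is a minimal sketch of the strip-and-remember idea. This is not the PR's actual code: the helper name, the map format, and the reliance on get_quantizers()/get_config() are assumptions based on the public QKeras/Keras APIs, and the config cleanup is only hinted at.

    # Hypothetical sketch of stripping QKeras layers while remembering their quantizers.
    import tensorflow as tf

    def strip_qkeras_model(model):
        quantizer_map = {}
        for layer in model.layers:
            # QKeras layers (QDense, QConv2D, QActivation, ...) expose their quantizers
            if hasattr(layer, "get_quantizers"):
                quantizer_map[layer.name] = layer.get_quantizers()

        # Rebuild the architecture with plain Keras layer classes and copy the
        # (unquantized) weights over. In practice the quantizer-specific
        # constructor arguments also have to be dropped from each layer config.
        config = model.get_config()
        for layer_cfg in config["layers"]:
            if layer_cfg["class_name"].startswith("Q"):
                layer_cfg["class_name"] = layer_cfg["class_name"][1:]  # e.g. QDense -> Dense
        stripped = tf.keras.Model.from_config(config)
        stripped.set_weights(model.get_weights())
        return stripped, quantizer_map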

The third step could in theory be replaced with a ModelWrapper-based approach applied after the initial conversion; however, ModelWrapper would first need to be extended with graph-manipulation routines (insert before/after a specific node, remove a node, etc.). Also, the name mangling that happens when the model comes out of tf2onnx makes it harder to map the ONNX node names back to the saved quantizer map. This was the original idea, but I couldn't find a reliable way to deduce the final layer names. It makes sense to revisit it once I understand the internals of tf2onnx better.
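
For reference, a Quant node of the kind that step 3 inserts (or that a future ModelWrapper-based pass would splice in) could be built with the plain onnx helper API roughly as below; the tensor names, quantization parameters and the custom-op domain string are placeholders and depend on the QONNX/finn-base version.

    # Sketch of constructing a QONNX Quant node with the onnx helper API.
    import numpy as np
    from onnx import helper, numpy_helper

    # Quantization parameters stored as initializers (placeholder values).
    scale = numpy_helper.from_array(np.array(0.125, dtype=np.float32), "act0_scale")
    zeropt = numpy_helper.from_array(np.array(0.0, dtype=np.float32), "act0_zeropt")
    bitwidth = numpy_helper.from_array(np.array(4.0, dtype=np.float32), "act0_bitwidth")

    quant_node = helper.make_node(
        "Quant",
        inputs=["act0_out", "act0_scale", "act0_zeropt", "act0_bitwidth"],
        outputs=["act0_out_quant"],
        name="act0_quant",
        domain="qonnx.custom_op.general",  # placeholder; the domain name has changed over time
        signed=1,               # quantize to a signed integer range
        narrow=0,               # full (non-narrow) range
        rounding_mode="ROUND",  # round to nearest
    )
    # The three parameter tensors go into graph.initializer, the node into graph.node,
    # and every consumer of "act0_out" is rewired to read "act0_out_quant".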

Some known caveats:

  • The current version (1.9.2) of tf2onnx has an API bug, so until a new version is released we call the API incorrectly as a workaround; once 1.9.3 is out, this should be updated and the project should depend on that version.
  • Overriding tf2onnx handlers may not be foolproof. Some layers are handled in unexpected ways, which requires a deeper understanding of the conversion process to figure out what actually needs to be overridden.
  • The included test doesn't really check that the conversion is numerically correct, only that it completes.
  • The list of supported Q layers and quantizers is rather incomplete.
  • pre-commit doesn't like the code.

@heborras (Member) commented:

Hi @vloncar,
the converter already looks very interesting and the approach of using hooks into tf2onnx is quite nice.
I've had a closer look at the code, and in particular at the test cases. I made some additions to get them to run in tox on my machine and to pass the pre-commit hooks; you may notice that some lines of unused code were simply commented out to make this work, so at a later stage we should take another look at the commented-out code.
After this I had a look at the conversion itself and fixed some smaller things to get the network through the QONNX cleanup procedure:

  • Some inputs of the Quant nodes were set as attributes, so I moved them to the node inputs.
  • tf2onnx defines the input and output shape of the network as [?,...] and finn-base can only handle known batch sizes, so I replaced this with [1,...].
  • One really puzzling thing was that for some reason the output datatype of the Quant node was set to literally nothing, i.e. not even defined, so I manually set these types to float32. There might be some deeper underlying reason for this behavior in the tf2onnx API which I couldn't spot, so we should keep an eye on it. (A sketch of these last two fixes follows below.)
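
For reference, the last two fixes amount to roughly the following when expressed with the plain onnx API (a minimal sketch, not the actual converter code; the file names are placeholders):

    # Sketch of the batch-size and missing-datatype fixes at the ONNX graph level.
    import onnx
    from onnx import TensorProto

    model = onnx.load("qkeras_converted.onnx")  # placeholder path
    graph = model.graph

    # Replace the unknown batch dimension ("?") of the graph inputs/outputs with 1.
    for vi in list(graph.input) + list(graph.output):
        dim0 = vi.type.tensor_type.shape.dim[0]
        if not dim0.HasField("dim_value"):
            dim0.dim_value = 1

    # Give tensors whose datatype ended up undefined an explicit float32 type.
    for vi in graph.value_info:
        if vi.type.tensor_type.elem_type == TensorProto.UNDEFINED:
            vi.type.tensor_type.elem_type = TensorProto.FLOAT

    onnx.save(model, "qkeras_converted_fixed.onnx")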

I have yet to run actual execution tests comparing the QKeras and ONNX execution, but with these changes that should become easier to implement in the next few days.

As an aside: Do you know if the tf2onnx API has some auto-generated documentation somewhere? I noticed that there are comments in the code for such a process, but on their project page I could only find their general documentation for the tool, not the API.

@maltanar (Collaborator) commented:

This is a great start, thanks @vloncar! A couple of comments and questions from a joint discussion with @HenniOVP, @volcacius and myself that came up while reviewing:

  • It would be beneficial to have some single-layer export/conversion tests with execution comparison, to easily catch and debug issues at a smaller granularity. We have tests like this in FINN that define a single quantized layer in Brevitas, export it to FINN-ONNX, and then compare the execution against PyTorch/Brevitas; they helped discover plenty of bugs. To start with, I'd suggest single-layer tests for the layers of interest so far, i.e. quantized convs, ReLU and FC layers (a rough sketch of such a test follows after this list).
  • It seems that quantized ReLU nodes are currently converted as Quant -> ReLU, while in Brevitas they are exported as ReLU -> Quant. Does QKeras implement quantized activations in a different way, or is this an error?
  • The core idea of extracting/stripping the quantization info and then re-introducing it on a node-name basis in ONNX seems to work, but we wondered whether the tf2onnx hooks could be used earlier, without stripping the quant layers. Is this because exporting QKeras layer types directly with tf2onnx is not possible?
  • If a QKeras layer will turn into more than one ONNX node (e.g. FC/conv layer with quantized bias, where the bias turns into a separate Add node), will the current mechanism for tracking the quantization information handle this correctly? If not, I would propose to add assertions to catch this if the user tries to convert such a model.
  • Minor: @HenniOVP's comment above re. "the output datatype of the Quant node got set to literally nothing" may warrant a closer look, since it seems an ONNX ValueInfoProto is created for these tensors but without any datatype. I'm not sure exactly where these get created, but I suspect that one of the insertion functions from tf2onnx (insert_new_node_on_output or insert_new_node_on_input) may take an extra argument to specify the ONNX datatype correctly.
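
As an illustration of the single-layer test idea from the first point above, such a test could look roughly like the sketch below. The converter entry point (from_keras), the execution helper (execute_onnx) and the graph input name are assumptions; the corresponding helpers lived in finn-base at the time of this PR, so the actual module paths may differ.

    # Hypothetical single-layer QKeras -> QONNX test with execution comparison.
    import numpy as np
    import tensorflow as tf
    from qkeras import QDense, quantized_bits

    from qonnx.converters.keras import from_keras        # assumed entry point
    from qonnx.core.modelwrapper import ModelWrapper
    from qonnx.core.onnx_exec import execute_onnx         # assumed execution helper

    def test_single_qdense_conversion():
        inp = tf.keras.Input(shape=(16,), name="input")
        out = QDense(
            8,
            kernel_quantizer=quantized_bits(4, 0, alpha=1),
            bias_quantizer=quantized_bits(4, 0, alpha=1),
            name="qdense0",
        )(inp)
        keras_model = tf.keras.Model(inp, out)

        onnx_model, _ = from_keras(keras_model)            # assumed return signature
        qonnx_model = ModelWrapper(onnx_model)

        x = np.random.uniform(-1, 1, size=(1, 16)).astype(np.float32)
        y_keras = keras_model.predict(x)
        out_name = qonnx_model.graph.output[0].name
        y_qonnx = execute_onnx(qonnx_model, {"input": x})[out_name]

        np.testing.assert_allclose(y_qonnx, y_keras, rtol=1e-5, atol=1e-5)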

@vloncar (Contributor, Author) commented Nov 24, 2021:

A brief summary of discussion on @maltanar's questions:

  • Totally agree on the single-layer tests. Will be included in the next few commits.
  • The observed issue comes from the quirkiness of the model; the real bug here is that the Quant node is missing after the activation. We can explore special cases where the Quant node before the activation can be removed with a transformation pass post-conversion (a rough sketch of such a pass follows after this list).
  • The approach with hooks for the QKeras layers themselves was attempted, but there were issues. It will be revisited now that the internals of tf2onnx are better understood.
  • A 1-to-many mapping between QKeras nodes and ONNX nodes can be handled in the tf2onnx handler.
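
Regarding the transformation pass mentioned in the first point, a rough sketch could look like the code below, assuming the Transformation base class and ModelWrapper helpers from qonnx (finn-base at the time). The matching condition is deliberately simplistic and ignores the question of when removing the Quant node is actually safe.

    # Hypothetical cleanup pass: drop a Quant node that feeds directly into a Relu.
    from qonnx.transformation.base import Transformation

    class RemoveQuantBeforeRelu(Transformation):
        def apply(self, model):
            graph = model.graph
            graph_modified = False
            for node in list(graph.node):
                if node.op_type != "Quant":
                    continue
                consumers = model.find_consumers(node.output[0])
                if consumers and len(consumers) == 1 and consumers[0].op_type == "Relu":
                    # Rewire the Relu to read the Quant node's input directly,
                    # then remove the now-dangling Quant node.
                    consumers[0].input[0] = node.input[0]
                    graph.node.remove(node)
                    graph_modified = True
            return model, graph_modified

Applying it would then just be model = model.transform(RemoveQuantBeforeRelu()).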

@jmduarte (Member) commented:

Just checking in on this PR, and doing some housekeeping. I also checked which tests fail:

======================================================= short test summary info ========================================================
FAILED tests/test_keras_convert.py::test_qkeras_qactivation[4] - AssertionError: 
FAILED tests/test_keras_convert.py::test_qkeras_qactivation[5] - AssertionError: 
FAILED tests/test_keras_convert.py::test_qkeras_qactivation[6] - AssertionError: 
FAILED tests/test_keras_convert.py::test_qkeras_qactivation[7] - AssertionError: 
FAILED tests/test_keras_convert.py::test_qkeras_qactivation[8] - AssertionError: 
FAILED tests/test_keras_convert.py::test_qkeras_qactivation[9] - AssertionError: 
FAILED tests/test_keras_convert.py::test_qkeras_qactivation[10] - AssertionError: 
FAILED tests/test_keras_convert.py::test_qkeras_qactivation[11] - AssertionError: 
FAILED tests/test_keras_convert.py::test_qkeras_qactivation[12] - AssertionError: 
FAILED tests/test_keras_convert.py::test_qkeras_qactivation[13] - AssertionError: 
FAILED tests/test_keras_convert.py::test_qkeras_qactivation[14] - AssertionError: 
FAILED tests/test_keras_convert.py::test_qkeras_qactivation[15] - AssertionError: 
FAILED tests/test_keras_convert.py::test_qkeras_qactivation[16] - AssertionError: 
FAILED tests/test_keras_convert.py::test_qkeras_conv2d_conversion - onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeErr...
============================================ 14 failed, 32 passed, 1073 warnings in 27.01s =============================================

Only test_qkeras_conv2d_conversion seems very concerning at the moment, in that inference fails entirely for the ONNX model. Looking at the converted ONNX model in Netron, I wonder if it has to do with shape inference not working properly.

Error message:

E           onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running Conv node. Name:'Conv_1' Status Message: Input channels C is not equal to kernel channels * group. C: 14 kernel channels: 3 group: 1

../miniconda3/envs/qonnx/lib/python3.8/site-packages/onnxruntime/capi/session.py:110: Fail
--------------------------------------------------------- Captured stderr call ---------------------------------------------------------
2022-04-21 18:45:32.772521537 [E:onnxruntime:, sequential_executor.cc:309 Execute] Non-zero status code returned while running Conv node. Name:'Conv_1' Status Message: Input channels C is not equal to kernel channels * group. C: 14 kernel channels: 3 group: 1
---------------------------------------------------------- Captured log call -----------------------------------------------------------
ERROR    tf2onnx.tfonnx:tfonnx.py:302 Failed to convert node 'model_45/conv2d_1_m/BiasAdd' (fct=<function process_parsed_graph.<locals>.compat_handler at 0x7fe330156790>)
'OP=Conv\nName=model_45/conv2d_1_m/BiasAdd\nInputs:\n\tmodel_45/act0_m/Relu_activation_quantizer:0=Quant, None, 1\n\tmodel_45/conv2d_1_m/Conv2D/ReadVariableOp:0=Const, [3, 3, 32, 64], 1\n\tmodel_45/conv2d_1_m/BiasAdd/ReadVariableOp:0=Const, [64], 1\nOutpus:\n\tmodel_45/conv2d_1_m/BiasAdd:0=[-1, 6, 6, 64], 1'
Traceback (most recent call last):
  File "/crucial/1/jduarte/miniconda3/envs/qonnx/lib/python3.8/site-packages/tf2onnx/tfonnx.py", line 292, in tensorflow_onnx_mapping
    func(g, node, **kwargs, initialized_tables=initialized_tables, dequantize=dequantize)
  File "/crucial/1/jduarte/miniconda3/envs/qonnx/lib/python3.8/site-packages/tf2onnx/tfonnx.py", line 534, in compat_handler
    return func(ctx, node, name, args)
  File "/crucial/1/jduarte/qonnx/src/qonnx/converters/qkeras/onnx.py", line 109, in conv2d_handler
    ConvOp.any_version(11, ctx, node)
  File "/crucial/1/jduarte/miniconda3/envs/qonnx/lib/python3.8/site-packages/tf2onnx/onnx_opset/nn.py", line 367, in any_version
    shape_dim = ctx.get_shape(node.input[0])[3]
TypeError: 'NoneType' object is not subscriptable

model: (Netron view of the converted model attached in the original comment)

@heborras (Member) commented:

Hi @jmduarte,
the missing shapes are definitely an issue. The ?x64 shape after the second Reshape node might also be confusing the shape inference. I can maybe take a look at what's happening next week.

@thesps (Contributor) commented Apr 25, 2022:

To add some info on the other failing tests: they mostly have the pattern of an activation layer like QActivation(activation=quantized_bits(...)), which should just become a single Quant node in the 'activation path'. I think what happens is that _strip_qkeras_model returns that layer as Activation(activation='linear'), which is then optimized away by tf2onnx during the tf2onnx.convert.from_keras call, before we're able to reinsert the Quant node (a minimal example of the pattern is shown below). I haven't come up with a fix yet.
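
For reference, a minimal model exhibiting the problematic pattern (layer and tensor names are just for illustration):

    # Minimal example of the pattern that currently gets lost in conversion:
    # a standalone QActivation with a quantized_bits quantizer, which should map
    # to a single Quant node in the exported graph.
    import tensorflow as tf
    from qkeras import QActivation, quantized_bits

    inp = tf.keras.Input(shape=(16,), name="input")
    out = QActivation(activation=quantized_bits(4, 0, alpha=1), name="act0")(inp)
    model = tf.keras.Model(inp, out)
    # After stripping, act0 becomes Activation("linear"), which tf2onnx folds away
    # before the Quant node can be reinserted.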

@heborras (Member) commented:

So the shape inference seems to be doing fine, in the sense that it works as expected. It simply propagates the ? shape, though, which we should probably avoid, since the FINN execution engine likely can't work with that.
This is what the network looks like after shape inference (Netron view attached in the original comment).

However, the actual error comes from somewhere else. It appears to have to do with the spurious Transpose in the graph. I'm not sure where it comes from, but it converts the data to channels-last, and interestingly the weights of the following Conv node are also in a channels-last shape. Since ONNX doesn't support execution in the channels-last format, the attempt at inference then results in the error we see in the test.
Interestingly, the third and fourth Conv nodes also have weights in a channels-last shape. The shape inference itself is somewhat oblivious to the issue and just predicts shapes, which later won't work with the Dense layer.

Any idea where this partial channels-last conversion and the Transpose come from?

@jmduarte (Member) commented:

I think it comes from the fact that the shapes are not correct for the quantized relu. I added some debug statements in ConvOp here: https://github.com/onnx/tensorflow-onnx/blob/310949ef8246d73c0392311826ac9050e292412f/tf2onnx/onnx_opset/nn.py#L378

        print("debug")
        print("node inputs", node.input)
        print("node input 0", node.input[0])
        print("shape", ctx.get_shape(node.input[0]))

output for test_keras_conv2d_conversion:

tests/test_keras_convert.py::test_keras_conv2d_conversion debug
node inputs ['input', 'model/conv2d_0_m/Conv2D/ReadVariableOp:0']
node input 0 input
shape [-1, 28, 28, 1]
debug
node inputs ['model/act0_m/Relu:0', 'model/conv2d_1_m/Conv2D/ReadVariableOp:0']
node input 0 model/act0_m/Relu:0
shape [-1, 14, 14, 32]
debug
node inputs ['model/conv2d_1_m/Relu:0', 'model/conv2d_2_m/Conv2D/ReadVariableOp:0']
node input 0 model/conv2d_1_m/Relu:0
shape [-1, 6, 6, 64]

output for test_qkeras_conv2d_conversion:

tests/test_keras_convert.py::test_qkeras_conv2d_conversion debug
node inputs ['input', 'model/conv2d_0_m/Conv2D/ReadVariableOp:0', 'model/conv2d_0_m/BiasAdd/ReadVariableOp:0']
node input 0 input
shape [-1, 28, 28, 1]
debug
node inputs ['model/act0_m/Relu_activation_quantizer:0', 'model/conv2d_1_m/Conv2D/ReadVariableOp:0', 'model/conv2d_1_m/BiasAdd/ReadVariableOp:0']
node input 0 model/act0_m/Relu_activation_quantizer:0
shape None
debug
node inputs ['model/conv2d_1_m/Relu_activation_quantizer:0', 'model/conv2d_2_m/Conv2D/ReadVariableOp:0']
node input 0 model/conv2d_1_m/Relu_activation_quantizer:0
shape None
debug
node inputs ['model/act2_m/Relu_activation_quantizer:0', 'model/conv2d_3_m/Conv2D/ReadVariableOp:0']
node input 0 model/act2_m/Relu_activation_quantizer:0
shape None

@jmduarte (Member) commented:

To do:

@maltanar mentioned this pull request Aug 12, 2022
@jmduarte added the "enhancement" (new feature or request) label Nov 22, 2022
@jmduarte (Member) commented:

Discussion can continue in #28.

@jmduarte closed this Dec 22, 2022