fix_float64 not passed if cleanup is run #63

jmitrevs opened this issue May 16, 2022 · 5 comments

@jmitrevs

In the transform member function of ModelWrapper:

    def transform(
        self, transformation, make_deepcopy=True, cleanup=True, fix_float64=True
    ):
        """Applies given Transformation repeatedly until no more changes can be made
        and returns a transformed ModelWrapper instance.
        - make_deepcopy : operates on a new (deep)copy of model.
        - fix_float64 : DoubleToSingleFloat correction before starting
        - cleanup : execute cleanup transformations before returning
        """
        transformed_model = self
        if make_deepcopy:
            transformed_model = copy.deepcopy(self)
        if fix_float64:
            (transformed_model, model_was_changed) = DoubleToSingleFloat().apply(
                transformed_model
            )
        model_was_changed = True
        while model_was_changed:
            (transformed_model, model_was_changed) = transformation.apply(
                transformed_model
            )
        if cleanup:
            transformed_model.cleanup()
        return transformed_model

If cleanup=True, the cleanup transformations internally call transform again with the default fix_float64=True, no matter what value the original transform call was given. I think either fix_float64 should default to False, or the original fix_float64 value should be passed through to the cleanup (a sketch below).
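
A minimal sketch of the second option, assuming cleanup() were extended to accept a (currently hypothetical) fix_float64 keyword:

    if cleanup:
        # Hypothetical signature: thread the caller's fix_float64 value
        # through so the cleanup pass does not silently re-enable the fix.
        transformed_model.cleanup(fix_float64=fix_float64)
    return transformed_model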

@jmitrevs (Author)

A simple onnx file that shows this behavior is here:

https://drive.google.com/file/d/16Opd4K3xjEEp9jgmmkZgqW5fXCFJ84lA/view?usp=sharing

(GitHub doesn't allow me to upload onnx files)

The standard way to trigger this problem is:

    qonnx.util.cleanup.cleanup("MLP.onnx", out_file="MLP_clean.onnx")
    model = ModelWrapper("MLP_clean.onnx")
    X = create_predict_data(model)
    idict = {model.graph.input[0].name: X}
    y_qonnx = oxe.execute_onnx(model, idict)[model.graph.output[0].name]

where create_predict_data basically creates x = np.random.uniform(-1, 1, (1, 259)).
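
For a self-contained repro, a sketch of that helper (hypothetical reconstruction; the float32 cast is an assumption about the model's expected input dtype):

    import numpy as np

    def create_predict_data(model):
        # Hypothetical reconstruction: random input matching the (1, 259)
        # shape mentioned above; float32 dtype is assumed.
        return np.random.uniform(-1, 1, (1, 259)).astype(np.float32)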

@jmitrevs (Author)

I can get the above version of this model to work by setting the default of fix_float64 to False, but interestingly, the same steps do not work for a float version of this model:

https://drive.google.com/file/d/1Ogjh7H3X8hpT3qktXs-11lVlZjpDCh_Y/view?usp=sharing

For some reason, I get the error "Found unspecified tensor shapes, try infer_shapes" even though I run the model cleanup first.

@maltanar (Collaborator)

Thanks for providing the example models, I can reproduce the issue and I'm looking into a clean solution for this. The origin of fix_float64 (which indiscriminately replaces float64 initializers with float32 ones) was some erroneous ONNX models that had standard ONNX nodes with mismatched input data types, e.g. an Add node with one float32 and one float64 input. This violates the ONNX standard and onnxruntime complains when trying to execute them, so we put in a quick-and-dirty fix that replaces float64 initializers with float32 ones.
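
For illustration, a sketch of the invalid pattern described above, built with the standard onnx helper API (hypothetical tensor names):

    import numpy as np
    from onnx import TensorProto, helper, numpy_helper

    # An Add node whose dynamic input is float32 but whose initializer is
    # float64 -- the kind of type mismatch that onnxruntime rejects.
    inp = helper.make_tensor_value_info("x", TensorProto.FLOAT, [1, 4])
    out = helper.make_tensor_value_info("y", TensorProto.FLOAT, [1, 4])
    bias = numpy_helper.from_array(np.zeros((1, 4), dtype=np.float64), "bias")
    add = helper.make_node("Add", ["x", "bias"], ["y"])
    graph = helper.make_graph([add], "bad_add", [inp], [out], initializer=[bias])
    model = helper.make_model(graph)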

If I remove the workaround completely, we get numerous test failures in FINN. However, for models that legitimately use float64 for both inputs of a standard node, the workaround itself creates the original problem :-) so a cleaner solution could be one of:

  1. Identify the source of the invalid ONNX models, fix it, and remove the fix_float64 workaround completely. This is what I'm looking into currently. At least one source of float64s seems to be the QONNX-to-FINN-ONNX conversion by @HenniOVP, e.g. https://github.com/Xilinx/finn/blob/dev/src/finn/transformation/qonnx/qonnx_activation_handlers.py#L193-L194 - @HenniOVP could you comment on this? ONNX float attributes actually seem to be allowed to be float32 only, according to the protobuf spec: https://github.com/onnx/onnx/blob/main/onnx/onnx.proto#L118

  2. Do the float64 fixing more intelligently instead of changing all initializers, e.g. only performing the conversion if the input types are mismatched (a rough sketch follows after this list). However, this will likely require paying attention to the node type (there may be ONNX nodes that are actually valid with mixed precision).

  3. Propagate the fix_float64 setting across multiple calls, including the ModelWrapper cleanup and the qonnx cleanup.
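
A rough sketch of option 2 (fix_mismatched_float64 is a hypothetical helper, not an existing QONNX API; it only downcasts a float64 initializer when the consuming node also has a known float32 input):

    import numpy as np
    from onnx import TensorProto, numpy_helper

    def fix_mismatched_float64(model):
        # Hypothetical sketch: only downcast float64 initializers on nodes
        # whose other inputs are annotated as float32, i.e. where the types
        # are actually mismatched.
        init_map = {init.name: init for init in model.graph.initializer}
        f32_tensors = {
            vi.name
            for vi in list(model.graph.input) + list(model.graph.value_info)
            if vi.type.tensor_type.elem_type == TensorProto.FLOAT
        }
        for node in model.graph.node:
            has_f32_input = any(i in f32_tensors for i in node.input)
            for name in node.input:
                init = init_map.get(name)
                if (
                    init is not None
                    and init.data_type == TensorProto.DOUBLE
                    and has_f32_input
                ):
                    arr = numpy_helper.to_array(init).astype(np.float32)
                    init.CopyFrom(numpy_helper.from_array(arr, init.name))
        return model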

@maltanar (Collaborator)

I think I found the answer to the question I asked @HenniOVP above: it's actually not the case that ONNX only accepts 64-bit floats as attributes; rather, the protobuf wrapper for ONNX doesn't recognize numpy float32 (as opposed to native Python float). Working on a fix for that part.
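
A minimal illustration of that behavior, using the standard onnx.helper API (the LeakyRelu node and its alpha attribute are just an example):

    import numpy as np
    from onnx import helper

    scale = np.float32(0.5)
    # Passing a numpy float32 directly as a float attribute may be rejected
    # inside the protobuf layer; a native Python float is accepted:
    node = helper.make_node("LeakyRelu", ["x"], ["y"], alpha=float(scale))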

@heborras (Contributor)

I was just about to comment something similar. It seems that the issue I mentioned in the code only appears with the attributes and not the initializers.
So in line 194 everything works fine when using float32 instead, but in lines 168 to 170 the ONNX numpy helper will complain when using float32, which is why float64 was required there.

So, in fact, line 194 should probably be changed to float32.
