
Using TransposeOptimizer breaks model #1809

Closed
CarlPoirier opened this issue Dec 22, 2021 · 7 comments · Fixed by #1918
@CarlPoirier

CarlPoirier commented Dec 22, 2021

Describe the bug
I have an ONNX model on which I want to run the onnx-optimize.py script. All the optimizers work fine except the TransposeOptimizer. When I use it, my ONNX model no longer works, at least with onnxruntime. I get this error:

[E:onnxruntime:, sequential_executor.cc:346 onnxruntime::SequentialExecutor::Execute] Non-zero status code returned while running Split node. Name:'Split_1297' Status Message: Cannot split using values in 'split' attribute. Axis=1 Input shape={1,1710} NumOutputs=1 Num entries in 'split' (must equal number of outputs) was 1 Sum of sizes in 'split' (must equal size of selected axis) was 1
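For context (a minimal sketch, not part of the original report): the ONNX Split operator requires that the sizes listed in its 'split' attribute sum to the size of the selected axis. A small NumPy analogue of the failing check, reusing the shapes from the error message (`onnx_split` is a hypothetical helper, not an onnxruntime API):

```python
import numpy as np

def onnx_split(x, axis, sizes):
    # Mimic the ONNX Split constraint: 'sizes' must sum to x.shape[axis].
    if sum(sizes) != x.shape[axis]:
        raise ValueError(
            f"Sum of sizes in 'split' ({sum(sizes)}) must equal "
            f"size of selected axis ({x.shape[axis]})")
    # np.split takes cut indices, so convert sizes to cumulative offsets
    return np.split(x, np.cumsum(sizes)[:-1], axis=axis)

x = np.zeros((1, 1710))                    # input shape from the error message
ok = onnx_split(x, axis=1, sizes=[1710])   # valid: one output covering the full axis
assert ok[0].shape == (1, 1710)

try:
    onnx_split(x, axis=1, sizes=[1])       # the failing case: sum is 1, not 1710
except ValueError as e:
    print(e)
```

This matches the runtime message: after optimization, the Split node's axis points at a dimension of size 1710 while its 'split' sizes still sum to 1.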

Urgency
None for now; I've deactivated this optimizer and I'm using all the others.

System information

  • OS Platform and Distribution: Windows 10
  • tf2onnx version: 1.9.3
  • TensorFlow version: 2.5.2
  • Python version: 3.9.9

To Reproduce
I modified the onnx-optimize.py script to specify only the TransposeOptimizer. Then I run it this way:

python onnx-optimize.py --input testmodel6.onnx --output testmodel6_broken.onnx
2021-12-22 21:27:52.255558: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudart64_110.dll
2021-12-22 21:27:58,174 - INFO - Optimizing ONNX model
2021-12-22 21:27:59,100 - INFO - After optimization: Transpose -4 (15->11)

If you need any more information, let me know!

@fatcat-z
Collaborator

Could you please share the related code that causes this issue, so we can reproduce it locally for further investigation?

@fatcat-z added the "pending on user response" label Mar 19, 2022
@CarlPoirier
Author

CarlPoirier commented Mar 21, 2022

Sure, I put the initial and broken models in a zip file, which you can download from here(link removed).

I had initially modified the optimize script just to narrow the problem down to the transpose optimizer, but for reproduction purposes I use the original script this way:
$ python onnx-optimize.py --input test_transpose.onnx --output test_transpose_broken.onnx
2022-03-21 14:00:16.180182: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudart64_110.dll
2022-03-21 14:00:23,577 - INFO - Optimizing ONNX model
2022-03-21 14:00:30,831 - INFO - After optimization: Add -1 (114->113), Cast -35 (118->83), Concat -4 (91->87), Const -374 (616->242), Gather -10 (138->128), Identity -13 (16->3), Reshape -5 (87->82), Shape -47 (110->63), Transpose -4 (15->11), Unsqueeze -55 (153->98)

To test a model, I use ONNX Runtime with the following script and any image of width 500 by height 224. Just replace the model filename with the broken one to get the error:

import onnxruntime as rt
import numpy as np
from PIL import Image

def process_image(image):
    rgb_image = image.convert("RGB")
    np_image = np.asarray(rgb_image)
    # HWC -> CHW, since the model expects channels first
    np_image = np.transpose(np_image, (2, 0, 1))
    return np_image

sess = rt.InferenceSession("test_transpose.onnx")

img_file = "500x224test.bmp"
image = Image.open(img_file)
img = process_image(image)

# Wrap the image in a list to add the batch dimension
ort_inputs = {sess.get_inputs()[0].name: [img]}
(boxes, labels, scores) = sess.run(["2949", "2926", "2925"], ort_inputs)

If this is not enough, let me know!

@hwangdeyu removed the "pending on user response" label Mar 31, 2022
@hwangdeyu self-assigned this Mar 31, 2022
@hwangdeyu
Contributor

hwangdeyu commented Apr 1, 2022

If we comment out the _split_handler code, the optimizer runs successfully.
The transpose optimizer removes the Transpose op and adjusts the tensor shapes in place.
It looks like there is nothing wrong with the graph itself.
(screenshot of the optimized graph)
More investigation is needed to find where the error comes from.

@hwangdeyu
Contributor

Hi @CarlPoirier
It's an issue with the transpose optimizer's handling of the Split axis. Once we delete the Transpose op, the Split op's axis should be updated at the same time.
#1918 would be helpful.
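To illustrate the idea behind the fix (a hypothetical sketch, not the actual tf2onnx code from #1918): axis `a` of `transpose(x, perm)` corresponds to axis `perm[a]` of `x`, so when the optimizer removes a Transpose feeding a Split, the Split axis must be remapped through the permutation. The names below are illustrative only:

```python
import numpy as np

def remap_split_axis(axis, perm):
    # Axis `axis` of transpose(x, perm) is axis perm[axis] of x, so
    # after removing the Transpose, the Split must use perm[axis].
    return perm[axis]

x = np.zeros((1, 3, 4, 5))            # e.g. NCHW
perm = (0, 2, 3, 1)                   # NCHW -> NHWC
y = np.transpose(x, perm)             # shape (1, 4, 5, 3)

axis_after = 3                        # split channels in the NHWC tensor
axis_before = remap_split_axis(axis_after, perm)   # 1: channels in NCHW

parts_after = np.split(y, 3, axis=axis_after)      # each (1, 4, 5, 1)
parts_before = np.split(x, 3, axis=axis_before)    # each (1, 1, 4, 5)

# Each pre-transpose piece, transposed, matches the post-transpose piece
assert all(np.transpose(b, perm).shape == a.shape
           for a, b in zip(parts_after, parts_before))
```

Without this remapping, the Split keeps its old axis and its 'split' sizes no longer match the dimension it points at, which is exactly the runtime error above.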

@CarlPoirier
Author

CarlPoirier commented Apr 21, 2022

Hi @hwangdeyu,

Nice! I'm looking forward to the next release.

Thanks for the time you spent on this.

@hwangdeyu
Contributor

hwangdeyu commented May 9, 2022

> Hi @hwangdeyu,
>
> Nice! I'm looking forward to the next release.
>
> Thanks for the time you spent on this.

TF2ONNX 1.10.1 has been released. ☺

@CarlPoirier
Author

Hi @hwangdeyu, it does work now for the models I linked earlier using tf2onnx 1.10.1, but it seems there are still issues with some other models. I opened a new issue, #1941.
