Update IREE onnx import to be in sync with Torch-MLIR #17476
Conversation
Signed-off-by: saienduri <saimanas.enduri@amd.com>
# Make a temp dir for all the temp files we'll be generating as a side
# effect of inferring shapes. For now, the only file is a new .onnx holding
# the revised model with shapes.
#
# TODO: If the program temp_dir is None, we should be using an ephemeral
# temp directory instead of a hard-coded path in order to avoid data races
# by default.
input_dir = os.path.dirname(os.path.abspath(args.input_file))
temp_dir = (
    Path(input_dir if args.temp_dir is None else args.temp_dir)
    / "onnx-importer-temp"
)
shutil.rmtree(temp_dir, ignore_errors=True)
temp_dir.mkdir(exist_ok=True)
This is sketchy code to be copying as-is. If the --temp-dir path is not specified, this looks like it will rm -rf the directory that the input file is in??
I don't see why https://docs.python.org/3/library/tempfile.html wouldn't work here? No need to have the user create their own temp dir and pass it in via a flag, and then there's no risk of deleting an existing directory.
Since this is a user tool for IREE, I also don't see much value in adding options around configurable temp paths that would mostly facilitate torch-mlir debugging.
If the user does not specify --temp-dir, I think it is creating an onnx-importer-temp directory in the directory that the input file is in.
Ah. I'd prefer to use tempfile. That's what the TODO there is suggesting anyways. It should be less code :P
I don't think tempfile works here, because our one temporary file is generated by onnx.shape_inference.infer_shapes_path(args.input_file, <path_to_output_file>, data_prop=args.data_prop). We have to pass a path where the output will be stored, and we can't change the file being generated by onnx into a tempfile object. But if we don't want to deal with temp_dir in IREE, I can remove that arg, have the tool create a file at a fixed path in the input dir, and delete it at the end when the remove-temp-files arg is true, so we never have to delete any directories. What do you think?
Won't this work?
import tempfile

# create a temporary directory using the context manager
with tempfile.TemporaryDirectory() as tmpdirname:
    print("created temporary directory", tmpdirname)
Yeah, a temp directory would work. But at the moment it's only one file. Do we want to create a directory for it, or just delete the file created by the onnx call at the end? Either way works.
Keep it simple - use standard library functions with predictable behavior (create a file or directory with tempfile, use that for scratch space). I don't want to risk a bug like https://www.theregister.com/2015/01/17/scary_code_of_the_week_steam_cleans_linux_pcs/ :P
Got it, went with tempfile.TemporaryDirectory.
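For reference, a minimal sketch of that approach (the helper name and its integration with the importer's existing args are assumptions, not the merged code):

import tempfile
from pathlib import Path

import onnx

def infer_shapes_via_temp_dir(input_file: str, data_prop: bool) -> onnx.ModelProto:
    # The scratch directory exists only inside the context manager, so the
    # cleanup can never touch a pre-existing user directory.
    with tempfile.TemporaryDirectory() as temp_dir:
        temp_inferred_file = str(Path(temp_dir) / "inferred.onnx")
        # infer_shapes_path writes the shape-inferred model to the given path,
        # which is why a real filesystem path (not a file object) is needed.
        onnx.shape_inference.infer_shapes_path(
            input_file, temp_inferred_file, data_prop=data_prop
        )
        # Read the result back before the directory is deleted; external
        # data, if any, can be loaded separately afterwards.
        return onnx.load(temp_inferred_file, load_external_data=False)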
parser.add_argument(
    "--data-dir",
    help="Path between CWD and the base directory of the data,"
    " excluding the directories given in the 'location' argument of "
    " convert_model_to_external_data. For example, if 'location' was"
    ' "data/data.bin" and the relative path from CWD to that .bin file is'
    ' a/b/data/data.bin, then set data-dir to "a/b".'
    " Defaults to the directory of the input file.",
    type=Path,
)
This docstring is confusing. Why not just take a regular path?
https://onnx.ai/onnx/api/external_data_helper.html#convert-model-to-external-data takes a relative path, sure, but that's for creating external data files. All we need to do here is load from an external path, which has no requirements on being in the same directory tree as the base model file: https://onnx.ai/onnx/api/external_data_helper.html#load-external-data-for-model
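For illustration, a minimal sketch of the plain-path version being suggested (the helper function is hypothetical; onnx.load and onnx.load_external_data_for_model are the real onnx APIs linked above):

from pathlib import Path

import onnx

def load_model_with_external_data(model_path: str, data_dir: Path) -> onnx.ModelProto:
    # Load the graph structure only; tensor payloads stay on disk for now.
    model = onnx.load(model_path, load_external_data=False)
    # The base directory can be anywhere; there is no requirement that it
    # sit in the same directory tree as the model file.
    onnx.load_external_data_for_model(model, str(data_dir))
    return model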
Yeah, I don't see anything that suggests this has to be a relative path either. Made the change.
Signed-off-by: saienduri <saimanas.enduri@amd.com>
Signed-off-by: saienduri <saimanas.enduri@amd.com>
# Load the temp file and the external data.
inferred_model = onnx.load(temp_inferred_file, load_external_data=False)
data_dir = Path(input_dir if args.temp_dir is None else args.data_dir)
onnx.load_external_data_for_model(inferred_model, data_dir)
This data_dir code looks wrong. The temp_dir arg is no longer used, so what is this actually trying to do?
I'm not really following the logic here. New test cases would help: https://github.com/iree-org/iree/blob/main/compiler/bindings/python/test/tools/import_onnx_test.py.
(Note that those aren't running on CI: #16624.)
I think this is a typo from whoever wrote the torch-mlir importer. They probably meant args.data_dir. I don't see how these two could be related.
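For clarity, the presumed one-line fix (assuming the intent is to fall back to the input file's directory when --data-dir is not given):

# As copied from torch-mlir (checks the wrong flag):
data_dir = Path(input_dir if args.temp_dir is None else args.data_dir)

# Presumed intent:
data_dir = Path(input_dir if args.data_dir is None else args.data_dir)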
Ah I see, I can add CI for this in a follow-up PR. I've been testing locally that it works. Also, for all e2eshark testing we don't use the shape inference with a file, so I'll have to figure out which cases require this. We might get more models passing.
Yeah no need to get the CI running as part of this. I'm just hesitant to accept clearly buggy code from upstream blindly :P. If the shape inference with a file path hasn't been exercised yet, we could also remove it until needed... smaller surface area that way.
Okay, I'll just remove the shape inference with a file path then. I haven't seen it used anywhere.
Or actually, I'll leave it in case it is required somewhere. Logic seems to be correct from what I can tell, but I'll try to get some tests for all these cases.
Signed-off-by: saienduri <saimanas.enduri@amd.com>
Signed-off-by: saienduri <saimanas.enduri@amd.com>
# Run the checker to test whether the file is above the threshold for
# in-memory shape inference. If not, go ahead and do the shape inference.
try:
    onnx.checker.check_model(raw_model)
    inferred_model = onnx.shape_inference.infer_shapes(
        raw_model, data_prop=args.data_prop
    )
    return inferred_model
except ValueError:
    pass
I saw 159 new test import failures on Windows after pulling this code in.
D:\dev\projects\SHARK-TestSuite\iree_tests (main -> upstream)
(nightly_pip.venv) λ iree-import-onnx D:\dev\projects\SHARK-TestSuite\third_party\onnx\onnx\backend\test\data\node\test_cast_DOUBLE_to_FLOAT\model.onnx -o D:\dev\projects\iree-tmp\double_to_float.mlir
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "D:\dev\projects\SHARK-TestSuite\iree_tests\nightly_pip.venv\Scripts\iree-import-onnx.exe\__main__.py", line 7, in <module>
File "D:\dev\projects\SHARK-TestSuite\iree_tests\nightly_pip.venv\Lib\site-packages\iree\compiler\tools\import_onnx\__main__.py", line 140, in _cli_main
sys.exit(main(parse_arguments()))
^^^^^^^^^^^^^^^^^^^^^^^
File "D:\dev\projects\SHARK-TestSuite\iree_tests\nightly_pip.venv\Lib\site-packages\iree\compiler\tools\import_onnx\__main__.py", line 44, in main
model_proto = load_onnx_model(args)
^^^^^^^^^^^^^^^^^^^^^
File "D:\dev\projects\SHARK-TestSuite\iree_tests\nightly_pip.venv\Lib\site-packages\iree\compiler\tools\import_onnx\__main__.py", line 84, in load_onnx_model
onnx.checker.check_model(raw_model)
File "D:\dev\projects\SHARK-TestSuite\iree_tests\nightly_pip.venv\Lib\site-packages\onnx\checker.py", line 148, in check_model
C.check_model(protobuf_string, full_check, skip_opset_compatibility_check)
onnx.onnx_cpp2py_export.checker.ValidationError: Your model ir_version 10 is higher than the checker's (9).
Updating my onnx Python package from 1.15.0 to 1.16.1 helped. We should update some of these or make this code tolerate different versions:
Line 470 in 6c75aa1: "onnx>=1.15.0",
Line 624 in 6c75aa1: pip install onnx>=1.15.0
With the newer onnx Python package, there are now 2 new import successes and 0 new import failures.
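One possible shape for a version-tolerance fallback (a sketch only, keyed off the error in the traceback above; nothing like this is confirmed to be in the PR):

import onnx
import onnx.checker

def check_model_tolerant(raw_model: onnx.ModelProto) -> None:
    # Older onnx packages raise ValidationError when the model's ir_version
    # is newer than the checker supports; treat that case as non-fatal so
    # the import can still be attempted.
    try:
        onnx.checker.check_model(raw_model)
    except onnx.checker.ValidationError as e:
        if "ir_version" in str(e):
            print(f"warning: skipping model check: {e}")
        else:
            raise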
This pulls in iree-org/iree#17476, which now runs shape inference: `onnx.shape_inference.infer_shapes(model, data_prop=True)`. Generated from an already configured venv with:
python -m pip install --upgrade --find-links https://iree.dev/pip-release-links.html iree-compiler
python -m pip install --upgrade onnx
python .\onnx\import_tests.py

As discovered on #17476 (comment), this is needed to avoid an error running `iree-import-onnx` on certain test files:
> onnx.onnx_cpp2py_export.checker.ValidationError: Your model ir_version 10 is higher than the checker's (9).
If we care about supporting older onnx package versions, we could add some fallback code to `iree-import-onnx`. We may also want to update the version used in torch-mlir? https://github.com/llvm/torch-mlir/blob/e0a5adb1db38b0072c44b87570bc530eb3b324ad/setup.py#L264
Signed-off-by: Lubo Litchev <lubol@google.com>

This commit updates `iree-import-onnx` so that it behaves the same as torch-mlir's version (https://github.com/llvm/torch-mlir/blob/main/python/torch_mlir/tools/import_onnx/__main__.py). Specifically, enabling data propagation improves shape inference and is leading to more models passing.
Related to #17021
Signed-off-by: saienduri <saimanas.enduri@amd.com>
Signed-off-by: Lubo Litchev <lubol@google.com>