Resolve GH issue 12706 #12815
// Adjust this restriction once the other EPs' Resize
// kernel(s) supports NHWC input.
if (args.node.GetExecutionProviderType() != "CPUExecutionProvider") {
  return false;
This needs some more thought: the TransposeOptimizer is an L1 optimizer, so there is no partitioning info at this point. But we do need to handle the CUDA EP case, as the CUDA Resize kernel is implemented assuming the provided input is NCHW.
CC: @yihonglyu. Any comments? This causes a regression for CUDA EP users in 1.12. Ideally, a fix similar to the CPU Resize op should be applied to the CUDA kernel as well.
EDIT: Since the EP info can't be ascertained at runtime, the Resize handler has to be temporarily dropped from the handler map in CUDA builds until the CUDA Resize kernel gets a fix similar to the one the CPU Resize kernel got in #10824.
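For illustration, a minimal sketch of what dropping a handler from the map at compile time can look like. The names `Handler`, `HandleTranspose`, `HandleResize`, `GetHandlerMap`, and `HasHandler` are hypothetical stand-ins, not the actual optimizer code; only the `USE_CUDA` / `USE_ROCM` guard mirrors the diff in this PR.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the optimizer's per-op handler type.
using Handler = bool (*)();

// Placeholder handlers; the real ones would rewrite the graph.
inline bool HandleTranspose() { return true; }
inline bool HandleResize() { return true; }

// Builds the op-type -> handler map once. The Resize entry is compiled
// out of CUDA and ROCm builds, so the optimizer never pushes a Transpose
// through Resize when one of those kernels might run the node.
inline const std::unordered_map<std::string, Handler>& GetHandlerMap() {
  static const std::unordered_map<std::string, Handler> handler_map = {
      {"Transpose", HandleTranspose},
#if !defined(USE_CUDA) && !defined(USE_ROCM)
      {"Resize", HandleResize},
#endif
  };
  return handler_map;
}

// Returns true when the optimizer has a handler for the given op type.
inline bool HasHandler(const std::string& op_type) {
  return GetHandlerMap().count(op_type) != 0;
}
```

In a build without `USE_CUDA` or `USE_ROCM` defined, `HasHandler("Resize")` is true; with either defined, the lookup fails and the node is left untouched. The trade-off is that CPU-only sessions in a CUDA build lose the optimization too, which is why this is described as temporary.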
// Per tests included in #10824, the ROCM EP also generates
// incorrect results when this handler is used, so the Resize
// handler is not enabled even for those builds.
#if !defined(USE_CUDA) && !defined(USE_ROCM)
Is it possible to give a warning or skip the tests instead of just commenting out the tests for CUDA or ROCM?
Given that the Resize handler is not part of such builds (CUDA or ROCM), we will have to skip it (we can't even continue with a warning). But I am curious: what value would that add over this approach? Can you elaborate, please?
@@ -498,6 +509,7 @@
/*opset_version*/ 13);
}

#endif
TEST(TransposeOptimizerTests, TestAdd) {
nit: #endif  // !defined(USE_CUDA) && !defined(USE_ROCM) would be nicer
Thanks. I will include the change in another PR to avoid running all the CIs again.
Description:
Only the CPU Resize kernel handles NHWC input. Adjust the Transpose optimizer's Resize handler accordingly. Without this, the CUDA Resize kernel runs the NCHW logic on NHWC input and produces garbage output.
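To make the failure mode concrete, here is a small sketch of the indexing mismatch. It is not the Resize kernel itself: `Dims`, `OffsetNCHW`, `OffsetNHWC`, and `ToNHWC` are hypothetical helpers that only show how NCHW offset arithmetic applied to an NHWC buffer lands on the wrong element.

```cpp
#include <cstddef>
#include <vector>

// Logical tensor dimensions (hypothetical helper type).
struct Dims {
  std::size_t n, c, h, w;
};

// Flat offset of logical element (n, c, h, w) in an NCHW buffer.
inline std::size_t OffsetNCHW(const Dims& d, std::size_t n, std::size_t c,
                              std::size_t h, std::size_t w) {
  return ((n * d.c + c) * d.h + h) * d.w + w;
}

// Flat offset of the same logical element in an NHWC buffer.
inline std::size_t OffsetNHWC(const Dims& d, std::size_t n, std::size_t c,
                              std::size_t h, std::size_t w) {
  return ((n * d.h + h) * d.w + w) * d.c + c;
}

// Re-lays out an NCHW buffer as NHWC (what a Transpose with
// perm = {0, 2, 3, 1} does to the data).
inline std::vector<float> ToNHWC(const std::vector<float>& nchw,
                                 const Dims& d) {
  std::vector<float> nhwc(nchw.size());
  for (std::size_t n = 0; n < d.n; ++n)
    for (std::size_t c = 0; c < d.c; ++c)
      for (std::size_t h = 0; h < d.h; ++h)
        for (std::size_t w = 0; w < d.w; ++w)
          nhwc[OffsetNHWC(d, n, c, h, w)] = nchw[OffsetNCHW(d, n, c, h, w)];
  return nhwc;
}
```

For a 1x2x2x2 tensor holding the values 0 through 7 in NCHW order, the element at (n=0, c=1, h=0, w=1) is 5. After `ToNHWC`, reading it with `OffsetNHWC` still returns 5, while reading the same buffer with `OffsetNCHW` returns 6. A kernel that assumes NCHW makes that second kind of read for every element it samples.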
Motivation and Context
Fix regression in 1.12 (Resolve #12706)
Relevant PR - #10824