[WebNN EP] Support int64 output data type for CoreML backend #21401
Comments
Indeed, I really wonder, given that all indices are int64 in ONNX.
The CoreML EP converts all int64 attribute and initializer values to int32 when creating the CoreML model (checking for overflow errors as it does so). It also tracks whether it needs to convert specific inputs/outputs between int64 and int32 when executing the CoreML model. Once the attributes, initializers, and CoreML model inputs are int32, the internals of the CoreML model will produce int32 values, and we just need to convert the output from the CoreML model back to int64 where applicable.
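A minimal sketch of the narrowing step described above, assuming a plain contiguous buffer; this is illustrative only and not the actual CoreML EP code:

```cpp
// Illustrative sketch: narrow int64 values to int32, rejecting overflow
// rather than silently truncating (as the CoreML EP's overflow check does).
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

std::vector<int32_t> NarrowInt64ToInt32(const std::vector<int64_t>& src) {
  std::vector<int32_t> dst;
  dst.reserve(src.size());
  for (int64_t v : src) {
    // Any value outside int32 range cannot be represented in the
    // converted model, so treat it as an error.
    if (v < std::numeric_limits<int32_t>::min() ||
        v > std::numeric_limits<int32_t>::max()) {
      throw std::range_error("int64 value out of int32 range");
    }
    dst.push_back(static_cast<int32_t>(v));
  }
  return dst;
}
```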
Thank you @skottmckay, that's really helpful!
This PR adds some workarounds to enable int64 support for WebNN backends that don't support the int64 data type:
- Do not fall back ops solely because of the int64 limitation.
- Convert all int64 initializer and input values to int32 and handle potential overflow errors.
- Register all int64 model inputs and outputs as int32 ml-tensors.
- Handle ONNX ops whose inputs or outputs need conversion between int64 and int32, e.g. ArgMax, ArgMin, Cast, etc.
- Convert int32 output data back to int64 (see the sketch after this list).
- Disallow int64 outputs as 'ml-tensor' preferredOutputLocation.

Fixed microsoft#21401
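A hypothetical sketch of the reverse conversion mentioned in the last conversion bullet: widening the int32 data the backend produced back into the int64 buffer the ONNX model output expects. The function name and raw-pointer signature are assumptions for illustration:

```cpp
// Illustrative sketch: widen backend int32 output into an int64 buffer.
#include <cstddef>
#include <cstdint>

void WidenInt32ToInt64(const int32_t* src, int64_t* dst, size_t count) {
  // Widening int32 -> int64 is always lossless, so no range check is needed.
  for (size_t i = 0; i < count; ++i) {
    dst[i] = static_cast<int64_t>(src[i]);
  }
}
```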
Describe the feature request
The WebNN CoreML backend doesn't support the int64 data type, but some ONNX ops produce int64 output, e.g. ArgMax, ArgMin, etc., whereas CoreML's ArgMax produces int32 output.
That means we should check that the dimension size being reduced is within int32 range, and then cast the output from int32 to int64.
Such an op's node must be an output of the partitioned subgraph, because its next node takes an int64 input, which the CoreML backend doesn't support, so that node falls back, unless it is a special case: ArgMax followed by a Cast (from int64 to int32).
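A hypothetical pre-check for the range condition above: before mapping an ONNX ArgMax/ArgMin (int64 output) onto CoreML's int32 variant, verify that every index along the reduced axis fits in int32. The function name and shape representation are assumptions for illustration:

```cpp
// Illustrative sketch: the largest index ArgMax/ArgMin can produce along
// `axis` is shape[axis] - 1, so that value must fit in int32.
#include <cstdint>
#include <limits>
#include <vector>

bool ReducedAxisFitsInt32(const std::vector<int64_t>& shape, size_t axis) {
  if (axis >= shape.size()) return false;
  return shape[axis] - 1 <=
         static_cast<int64_t>(std::numeric_limits<int32_t>::max());
}
```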
The following actions can be taken into account:
Besides, how the CoreML EP handles the int64 data type would be a good reference.
Describe scenario use case
N/A