Softmax axis absent #466

Closed · fdwr opened this issue Sep 29, 2023 · 2 comments · Fixed by #649
@fdwr (Collaborator) commented Sep 29, 2023

(raised by @Honry in review https://github.com/microsoft/onnxruntime/pull/17665/files)

TF/PT/ONNX all take an axis parameter:

...but WebNN's softmax does not, making it challenging to implement a caller's softmax in terms of the WebNN function of the same name. It is possible (see here) by bracketing it with transposes and reshapes, but those contortions are unfortunate, and they could be implemented more efficiently in the backend than in each framework.

  • ✅ Apple Metal Performance Shaders softMax has an axis.
  • Apple MIL activation.softmax supports an axis.
  • ✅ DirectML's DML_ACTIVATION_SOFTMAX1_OPERATOR_DESC supports an arbitrary axis list and dimensions, just like reduce. The older DML_ACTIVATION_SOFTMAX_OPERATOR_DESC can achieve it via reshapes/transpose/strides.
  • ☑ XNNPack - limited to 2D input currently. Axis support could be added to XNNPack directly, or emulated with the existing XNNPack operator plus a reshape (in the simple case where the axis is the last dimension) or a transpose (when the axis comes before the last dimension).

So it's achievable in each backend, even without any changes to the DML/XNNPack APIs, but adding the parameter would move the pain from the caller down to where it can be handled efficiently. The sketch below shows roughly what each caller has to do today.
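
For concreteness, here is a minimal sketch of that caller-side emulation, assuming the current 2-D-only builder.softmax(). The helper name softmaxAlongAxis and the explicit inputShape parameter are illustrative, not part of the WebNN API:

// Hypothetical helper: softmax along `axis`, bracketed by transpose/reshape.
function softmaxAlongAxis(builder, input, inputShape, axis) {
  const rank = inputShape.length;
  let x = input;
  let permutation = null;
  if (axis !== rank - 1) {
    // Move the softmax axis to the last position.
    permutation = [...Array(rank).keys()].filter(d => d !== axis).concat([axis]);
    x = builder.transpose(input, { permutation });
  }
  const permutedShape = permutation ? permutation.map(d => inputShape[d]) : inputShape;
  // Flatten to 2-D: [product of leading dims, softmax axis size].
  const axisSize = permutedShape[rank - 1];
  const batch = permutedShape.slice(0, -1).reduce((a, b) => a * b, 1);
  let y = builder.softmax(builder.reshape(x, [batch, axisSize]));
  // Undo the reshape, then the transpose.
  y = builder.reshape(y, permutedShape);
  if (permutation) {
    const inverse = [];
    permutation.forEach((src, dst) => { inverse[src] = dst; });
    y = builder.transpose(y, { permutation: inverse });
  }
  return y;
}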


https://www.w3.org/TR/webnn/#api-mlgraphbuilder-softmax-method

partial interface MLGraphBuilder {
-  MLOperand softmax(MLOperand input);
+  MLOperand softmax(MLOperand input, unsigned long axis);
-  MLActivation softmax();
+  MLActivation softmax(unsigned long axis);
};
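With the proposed overload, a caller could then apply softmax directly along the desired dimension (hypothetical snippet; logits is assumed to be an MLOperand of shape [batch, classes]):

// Softmax along axis 1 (the class dimension), with no bracketing
// transpose/reshape needed.
const probabilities = builder.softmax(logits, /*axis*/ 1);
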
The behavior of this operation can be generically emulated using other operations as follows. However, user agents typically have a more efficient implementation, so direct usage is encouraged for performance.
// This sample deploys a well-known implementation trick [1]: compute the
// exponentials of the distances to the max value rather than the
// exponentials of the input values themselves, in order to increase the
// numerical stability of the result.
// [1]: https://cs231n.github.io/linear-classify/#softmax
- const maxX = builder.reduceMax(x, { axes: [1], keepDimensions: true });
+ const maxX = builder.reduceMax(x, { axes: [axis], keepDimensions: true });
const expX = builder.exp(builder.sub(x, maxX));
- return builder.div(expX, builder.reduceSum(expX, { axes: [1], keepDimensions: true }));
+ return builder.div(expX, builder.reduceSum(expX, { axes: [axis], keepDimensions: true }));
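
Applying the diff, the updated decomposition reads as follows (wrapped in a hypothetical helper purely for readability; builder, x, and axis are as in the spec sample above):

// Emulation of softmax(x, axis): shift by the per-axis max for numerical
// stability, exponentiate, then normalize by the per-axis sum.
function softmaxDecomposition(builder, x, axis) {
  const maxX = builder.reduceMax(x, { axes: [axis], keepDimensions: true });
  const expX = builder.exp(builder.sub(x, maxX));
  return builder.div(expX, builder.reduceSum(expX, { axes: [axis], keepDimensions: true }));
}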
@huningxin (Contributor) commented

Should it be:

-  MLOperand softmax(MLOperand input);
+  MLOperand softmax(MLOperand input, unsigned long axis);

@fdwr (Collaborator, Author) commented Nov 2, 2023

Ningxin: Indeed, I fixed my typo right before you wrote your comment 😅.

@inexorabletash self-assigned this Apr 18, 2024
inexorabletash added a commit to inexorabletash/webnn that referenced this issue Apr 18, 2024
Frameworks (TensorFlow, PyTorch, ONNX) all accept an axis parameter.

Most backends also support an axis, or it can be emulated with a
reshape. As @fdwr wrote: So it's achievable in each backend... but it
would move the pain from the caller down to where it can be handled
efficiently.

Fixes webmachinelearning#466
@fdwr closed this as completed in #649 Apr 25, 2024
fdwr added a commit that referenced this issue Apr 25, 2024
* Add axis argument to softmax()

Frameworks (TensorFlow, PyTorch, ONNX) all accept an axis parameter.

Most backends also support an axis, or it can be emulated with a
reshape. As @fdwr wrote: So it's achievable in each backend... but it
would move the pain from the caller down to where it can be handled
efficiently.

Fixes #466

* revert activation example to softmax

* validate softmax axis against inputs rank

* update TOC headers

* Update index.bs

Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>

* camelCase not snake_case

* Remove unnecessary condition

* Update index.bs

Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>

* Update index.bs

Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>

* Update index.bs

Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>

* Sketch of validation for activations

* For gru() and lstm(), calculate gate descriptor, validate activations with it

* fix some copy/pasta

---------

Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>