Add axis argument to softmax() #649

Merged (17 commits, Apr 25, 2024)
Changes from 5 commits
7 changes: 4 additions & 3 deletions docs/SpecCodingConventions.md
@@ -78,8 +78,8 @@ Example:
* The spec is encoded with UTF-8.
* For non-ASCII characters, prefer to use characters directly, rather than [character references](https://html.spec.whatwg.org/multipage/syntax.html#character-references) (a.k.a. entities), except when necessary for escaping e.g. `sequence<DOMString>`. These commonly occur in names in the Acknowledgements and References sections.
* Commonly used punctuation and symbol characters include:
-* « » (U+00AB / U+00BB Left/Right Pointing Double Angle Quotation Marks) used for [list literals](https://infra.spec.whatwg.org/#lists)
-* → (U+2192 Rightwards Arrow) used for [map iteration](https://infra.spec.whatwg.org/#map-iterate)
+* « » (U+00AB / U+00BB Left/Right Pointing Double Angle Quotation Marks) used for [list literals](https://infra.spec.whatwg.org/#lists) and [map literals](https://infra.spec.whatwg.org/#maps).
+* → (U+2192 Rightwards Arrow) used for [map iteration](https://infra.spec.whatwg.org/#map-iterate) and [map literals](https://infra.spec.whatwg.org/#maps).
* In expressions:
* Use * (U+002A Asterisk) for multiplication, / (U+002F Solidus) for division, and - (U+002D Hyphen-Minus) for subtraction, to reduce friction for implementers. Don't use × (U+00D7 Multiplication Sign), ∗ (U+2217 Asterisk Operator), ÷ (U+00F7 Division Sign), or − (U+2212 Minus Sign).
* Use named functions like _floor(x)_ and _ceil(x)_ rather than syntax like ⌊_x_⌋ and ⌈_x_⌉.
@@ -108,7 +108,8 @@ Example:
* Use `[=list/For each=] |item| of |list|` when iterating over a list, but use more specific terms for the item (e.g. _For each dimension of dimensions:_)
* Use `[=list/For each=] |index| in [=the range=] X to Y, inclusive` when iterating over a numeric range; a range is implicitly an ordered set which is a type of list. Specify _inclusive_ or _exclusive_ regarding the upper bound, for clarity.
* Use "let" to introduce a variable and "set" to update a variable or assign to a property.
-* Use « » notation for literal lists, which helps make it clear that they are not JavaScript arrays.
+* Use « » notation for literal [lists](https://infra.spec.whatwg.org/#lists), which helps make it clear that they are not JavaScript arrays.
+* Use «[ _k_ → _v_ ]» notation for literal [maps](https://infra.spec.whatwg.org/#maps).
* When referring to abstract properties, use the short possessive form `|object|'s [=property=]`. Avoid the wordier `the [=property=] of |object|` form.
* Use "rank" when describing the number of dimensions of a tensor (e.g. in variable names) rather than the ambiguous "size".
* Only use single capital letters as variable names when referring to tensors; i.e. prefer `|shapeA|` to `|A|`, but tensor `|T|` is okay.
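
As an illustration of the new map notation (this example is not text from the file): a literal map pairing the key "axis" with a value is written «[ "axis" → |axis| ]», which is the exact form the softmax() change below adopts.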
24 changes: 13 additions & 11 deletions index.bs
@@ -640,7 +640,7 @@ The {{MLGraphBuilder}} interface serves as a builder (factory) to construct a [=

In WebNN, a [=computational graph=] is composed of <dfn>operators</dfn> which act on data, and are the nodes of the graph. {{MLOperand}}s are a representation of data that flows within the computational graph, and are the edges of the graph. {{MLOperand}}s include a [=computational graph=]'s <dfn for="computational graph">input</dfn> values for inference, <dfn for="computational graph">constants</dfn> (including trained weights) used for inference, intermediate values (often referred to as activations) computed during inference, as well as the output values of inference. An [=operator=]'s <dfn for=operator>input</dfn> is one or more {{MLOperand}}s. An [=operator=]'s <dfn for=operator>output</dfn> is one or more {{MLOperand}}s. [=Operators=] have operator-specific parameters that control their behavior, which can include zero or more <dfn for=operator lt="activation|activation function">activation functions</dfn>, which are {{MLActivation}}s.

-A key part of the {{MLGraphBuilder}} interface is the set of methods such as {{MLGraphBuilder/gemm()}} and {{MLGraphBuilder/softmax()}} which create an [=operator=] which represents the actual operation to perform on the input data when the computation is run, and return a new {{MLOperand}} or {{MLActivation}} holding the operator. Methods that create an {{MLOperand}} connect any [=operator/inputs=] and [=operator/activations=] to the operator. Each method invocation returns a distinct new value, without changing the value of any other {{MLOperand}}.
+A key part of the {{MLGraphBuilder}} interface is the set of methods such as {{MLGraphBuilder/gemm()}} and {{MLGraphBuilder/softmax(axis)|softmax()}} which create an [=operator=] which represents the actual operation to perform on the input data when the computation is run, and return a new {{MLOperand}} or {{MLActivation}} holding the operator. Methods that create an {{MLOperand}} connect any [=operator/inputs=] and [=operator/activations=] to the operator. Each method invocation returns a distinct new value, without changing the value of any other {{MLOperand}}.

At inference time, every {{MLOperand}} will be bound to a tensor (the actual data), which are essentially multidimensional arrays. The representation of the tensors is implementation dependent, but it typically includes the array data stored in some buffer (memory) and some metadata describing the array data (such as its shape).
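
As a rough illustration of this builder pattern (the shapes, names, and the gemm call here are assumptions for the sketch, not taken from this diff):

```js
// Build a tiny graph: operands are the edges, the gemm operator is a node.
const context = await navigator.ml.createContext();
const builder = new MLGraphBuilder(context);

const a = builder.input('a', { dataType: 'float32', dimensions: [2, 3] }); // graph input
const w = builder.constant(
    { dataType: 'float32', dimensions: [3, 4] },
    new Float32Array(12).fill(0.5));                                       // trained weight
const c = builder.gemm(a, w);      // returns a new MLOperand; `a` and `w` are unchanged
const graph = await builder.build({ c });                                  // outputs by name
```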

@@ -5278,8 +5278,8 @@ Compute the [softmax](https://en.wikipedia.org/wiki/Softmax_function) values of
the 2-D input tensor along axis 1.
<script type=idl>
partial interface MLGraphBuilder {
-  MLOperand softmax(MLOperand input);
-  MLActivation softmax();
+  MLOperand softmax(MLOperand input, unsigned long axis);
+  MLActivation softmax(unsigned long axis);
};
</script>
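
A sketch of how the two new overloads would be called (assuming `builder`, a rank-2 `input`, and conv2d operands defined elsewhere):

```js
// Immediate form: returns an MLOperand computing softmax along axis 1.
const out = builder.softmax(input, 1);

// Activation form: returns an MLActivation to be fused into another op,
// e.g. via MLConv2dOptions::activation (discussed further below).
const act = builder.softmax(1);
const y = builder.conv2d(x, filter, { activation: act });
```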

@@ -5294,38 +5294,40 @@ partial interface MLGraphBuilder {
// of the input values itself, in order to increase the numerical stability of
// the result.
// [1]: https://cs231n.github.io/linear-classify/#softmax
-const max_x = builder.reduceMax(x, { axes: [1], keepDimensions: true });
+const max_x = builder.reduceMax(x, { axes: [axis], keepDimensions: true });
 const exp_x = builder.exp(builder.sub(x, max_x));
-return builder.div(exp_x, builder.reduceSum(exp_x, { axes: [1], keepDimensions: true }));
+return builder.div(exp_x, builder.reduceSum(exp_x, { axes: [axis], keepDimensions: true }));
</pre>
</details>
</div>
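
To make the axis semantics concrete, here is a self-contained plain-JavaScript version of the decomposition above (no WebNN; a sketch for rank-2 inputs only):

```js
// Softmax over a 2-D array along the given axis (0 = down columns, 1 = along rows).
function softmax2d(x, axis) {
  const result = x.map(row => row.slice());
  const outer = axis === 1 ? x.length : x[0].length;
  for (let i = 0; i < outer; i++) {
    // Gather the 1-D slice that softmax normalizes.
    const slice = axis === 1 ? x[i] : x.map(row => row[i]);
    const max = Math.max(...slice);                 // subtract max for stability
    const exps = slice.map(v => Math.exp(v - max));
    const sum = exps.reduce((a, b) => a + b, 0);
    exps.forEach((e, j) => {
      if (axis === 1) result[i][j] = e / sum;
      else result[j][i] = e / sum;
    });
  }
  return result;
}

softmax2d([[1, 2], [3, 4]], 1); // ≈ [[0.269, 0.731], [0.269, 0.731]] (per row)
softmax2d([[1, 2], [3, 4]], 0); // ≈ [[0.119, 0.119], [0.881, 0.881]] (per column)
```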

-#### {{MLGraphBuilder/softmax(input)}} #### {#api-mlgraphbuilder-softmax-input}
+#### {{MLGraphBuilder/softmax(input, axis)}} #### {#api-mlgraphbuilder-softmax-input-axis}
<div>
**Arguments:**
- *input*: an {{MLOperand}}. The input 2-D tensor.
+- *axis*: an {{unsigned long}} scalar. The dimension the softmax will be performed on.

**Returns:**
- an {{MLOperand}}. The output 2-D tensor that contains the softmax results, of the same shape as *input*.
</div>

<details open algorithm>
<summary>
-The <dfn method for=MLGraphBuilder>softmax(|input|)</dfn> method steps are:
+The <dfn method for=MLGraphBuilder>softmax(|input|, |axis|)</dfn> method steps are:
</summary>
1. If [=MLGraphBuilder/validating operand=] with [=this=] and |input| returns false, then [=exception/throw=] a {{TypeError}}.
1. If |input|'s [=MLOperand/rank=] is not 2, then [=exception/throw=] a {{TypeError}}.
**Contributor:** The axis needs to be validated according to input's rank.

**Member Author:** Added for the operator version in 99e8773.

For the activation version - would that be done synchronously when the activation is provided in a graph builder call, or at build time? Do we have other cases of activation validation?

**Collaborator (@fdwr, Apr 20, 2024):**

> For the activation version - would that be done synchronously when the activation is provided in a graph builder call,

All the information should be available at construction time (pre-build), since WebNN requires explicit shapes during construction.

> Do we have other cases of activation validation?

None come to mind, except possibly softplus steepness, but that's not impacted by the tensor shape constraints.

**Contributor:**

> > For the activation version - would that be done synchronously when the activation is provided in a graph builder call,
>
> All the information should be available at construction time (pre-build), since WebNN requires explicit shapes during construction.

Agreed. In particular, I suppose the validation should be done at the build method of operators that accept a fused activation function, like conv2d.

**Member Author:** Here's a sketch of what this could look like:

- In "create an MLActivation", add an optional validation steps parameter, and store the validation steps in an internal slot. The default validation steps are to return true.
- When creating the softmax activation, pass these steps as the validation steps:
  1. If axis is greater than or equal to input's rank, then return false.
  2. Otherwise, return true.
  - Note that if this were C++, this would be a lambda function, capturing axis.
- In all 7 methods that take activation functions (singular or multiple), add steps like this:
  1. If options.activation exists, and running its validation steps with input returns false, then throw a TypeError.

Thoughts:

- This could also be bundled into the validate activation steps - make those take an MLOperand in addition to or instead of MLGraphBuilder.
- Is passing a single MLOperand (i.e. input) sufficient for validation?

WDYT?
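
As an editorial aside, a runnable JavaScript sketch of this closure idea (the helper names here are hypothetical, not spec text):

```js
// Stand-in for "create an MLActivation" with optional validation steps.
function makeActivation(name, options, validationSteps = () => true) {
  return { name, options, validationSteps };  // default steps: always valid
}

// The softmax activation captures `axis` in its validation closure,
// like the C++ lambda mentioned above.
function makeSoftmaxActivation(axis) {
  return makeActivation('softmax', { axis },
      (descriptor) => axis < descriptor.dimensions.length);  // axis < rank
}

// A fusing method (e.g. conv2d) would run the steps before wiring the graph:
function validateFusedActivation(activation, outputDescriptor) {
  if (activation && !activation.validationSteps(outputDescriptor)) {
    throw new TypeError('activation not valid for this operand');
  }
}

// softmax along axis 2 cannot fuse with a rank-2 output descriptor:
validateFusedActivation(
    makeSoftmaxActivation(2),
    { dataType: 'float32', dimensions: [3, 4] });  // throws TypeError
```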

**Contributor:**

> Is passing a single MLOperand (i.e. input) sufficient for validation?

I suppose it should pass output MLOperand since the activation follows the operator that fuses it.

**Member Author:** I guess we'll need to pass the output descriptor.

**Collaborator (@fdwr, Apr 23, 2024):**

> In particular, I suppose the validation should be done at the build method of operators that accept a fused activation function, like conv2d.

Yep, since stand-alone activations adopt whatever the size is of the operator they are joined to or used inside. So full validation must be deferred for these.

- `MLConv2dOptions::activation`
- `MLConvTranspose2dOptions::activation`
- `MLBatchNormalizationOptions::activation`
- `MLGruOptions::activations`
- `MLGruCellOptions::activations`
- `MLLstmOptions::activations`
- `MLLstmCellOptions::activations`

> Is passing a single MLOperand (i.e. input) sufficient for validation?

Sounds fine, the MLActivation and at least either the MLOperand.shape() or the MLOperand itself for more generality.

**Member Author:** Okay - 7981fa5 sketches out the validation. Notes:

- I went with passing an MLOperandDescriptor, because earlier we decided that we'd do all validation before creating the operator and output MLOperand to pass.
- The phrasing around creating/passing a lambda is based on Fetch.
- The GRU/LSTM outputs are complicated. I'm not sure I'm passing the right thing, so please double check.

+1. If |axis| is greater than or equal to |input|'s [=MLOperand/rank=], then [=exception/throw=] a {{TypeError}}.
1. *Make graph connections:*
1. Let |output| be the result of [=copying an MLOperand=] given |input|.
-1. Let |operator| be an [=operator=] for the softmax operation.
+1. Let |operator| be an [=operator=] for the softmax operation, given |axis|.
1. Set |output|.{{MLOperand/[[operator]]}} to |operator|.
1. Set |operator|'s [=operator/input=] to |input|.
1. Set |operator|'s [=operator/output=] to |output|.
1. Return |output|.
</details>
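
The net effect of the new |axis| step, sketched in terms of calls (assuming a rank-2 operand `x`, as elsewhere in this section):

```js
const x = builder.input('x', { dataType: 'float32', dimensions: [3, 4] });
builder.softmax(x, 0);  // ok: 0 < rank (2)
builder.softmax(x, 1);  // ok: 1 < rank (2)
builder.softmax(x, 2);  // throws TypeError: 2 >= rank (2)
```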

-#### {{MLGraphBuilder/softmax()}} #### {#api-mlgraphbuilder-softmax}
+#### {{MLGraphBuilder/softmax(axis)}} #### {#api-mlgraphbuilder-softmax-axis}
<div>
**Arguments:**
- None.
@@ -5336,9 +5338,9 @@ partial interface MLGraphBuilder {

<details open algorithm>
<summary>
-The <dfn method for=MLGraphBuilder id=softmax-noargs>softmax()</dfn> method steps are:
+The <dfn method for=MLGraphBuilder>softmax(|axis|)</dfn> method steps are:
</summary>
-1. Let |op| be the result of [=creating an MLActivation=] given [=this=] and "softmax".
+1. Let |op| be the result of [=creating an MLActivation=] given [=this=], "softmax", and «[ "axis" → |axis| ]».
1. Return |op|.
</details>
