Operator set wave 3 #805 (Draft)

fdwr wants to merge 22 commits into main
Conversation

@fdwr (Collaborator) commented on Jan 16, 2025

Adds the following operators, per #375 (comment):

Resolves #93
Resolves #467
Resolves #767
Resolves #772
Resolves #773
Resolves #779

Some TODOs remain:

  • finish algorithm steps in gatherElements/scatterElements
  • finish algorithm steps in gatherNd/scatterNd
  • update slice output shape
  • add blockwise-compatible definition
  • show emulation of DQ/Q (see the sketch after this list)
  • add examples to the scattering/gathering operators
  • add data type tables
  • ⏳ u/int4 split into separate change
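For reference, a minimal sketch of the textbook emulation the DQ/Q item refers to (illustrative only: scalar per-tensor scale/zeroPoint, whereas the proposed operators take broadcastable tensors):

// Illustrative scalar emulation; the clamp range (qmin, qmax) depends on
// the quantized data type, e.g. [-128, 127] for int8.
function dequantizeLinearScalar(x, scale, zeroPoint) {
  return (x - zeroPoint) * scale;
}
function quantizeLinearScalar(x, scale, zeroPoint, qmin, qmax) {
  return Math.min(qmax, Math.max(qmin, Math.round(x / scale) + zeroPoint));
}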
partial interface MLGraphBuilder
{
    ...
    MLOperand cumulativeSum(MLOperand input, unsigned long axis, optional MLCumulativeSumOptions options = {});
    MLOperand sign(MLOperand input, optional MLOperatorOptions options = {});
    MLOperand tile(MLOperand input, sequence<unsigned long> repetitions, optional MLOperatorOptions options = {});

    // Extends the family beyond the existing gather.
    MLOperand gatherElements(MLOperand input, MLOperand indices, optional MLGatherOptions options = {});
    MLOperand scatterElements(MLOperand input, MLOperand indices, MLOperand updates, optional MLScatterOptions options = {});
    MLOperand gatherND(MLOperand input, MLOperand indices, optional MLOperatorOptions options = {});
    MLOperand scatterND(MLOperand input, MLOperand indices, MLOperand updates, optional MLOperatorOptions options = {});

    MLOperand dequantizeLinear(MLOperand input, MLOperand scale, MLOperand zeroPoint, optional MLOperatorOptions options = {});
    MLOperand quantizeLinear(MLOperand input, MLOperand scale, MLOperand zeroPoint, optional MLOperatorOptions options = {});

    MLOperand logicalAnd(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
    MLOperand logicalOr(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
    MLOperand logicalXor(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
    MLOperand notEqual(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});

    MLOperand reverse(MLOperand input, optional MLReverseOptions options = {});
    MLOperand slice(
        MLOperand input,
        sequence<[EnforceRange] unsigned long> starts,
        sequence<[EnforceRange] unsigned long> sizes,
        optional MLSliceOptions options = {} // Now includes steps
    );
    ...
};

dictionary MLCumulativeSumOptions : MLOperatorOptions
{
    boolean exclusive = false; // Exclusive prefix sum (each output element excludes its own input element) rather than an inclusive one. https://en.wikipedia.org/wiki/Prefix_sum
    boolean reversed = false; // Reverse the summation direction.
};

// Already exists for `gather`. Reuse for `gatherElements` too.
dictionary MLGatherOptions : MLOperatorOptions
{
    unsigned long axis = 0;
};

dictionary MLScatterOptions : MLOperatorOptions
{
    unsigned long axis = 0;
};

dictionary MLReverseOptions : MLOperatorOptions {
  sequence<[EnforceRange] unsigned long> axes;
};

dictionary MLSliceOptions : MLOperatorOptions {
  sequence<[EnforceRange] unsigned long> strides;
};
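To make the shape of the proposal concrete, a hedged usage sketch (assuming builder is an MLGraphBuilder and input, indices, and updates are suitably shaped MLOperands; illustrative, not normative):

// Illustrative calls into the proposed additions.
const cumsum = builder.cumulativeSum(input, 0, {exclusive: true});
const tiled = builder.tile(input, [2, 1]);                  // double axis 0
const gathered = builder.gatherElements(input, indices, {axis: 1});
const scattered = builder.scatterND(input, indices, updates);
const sliced = builder.slice(input, [0, 0], [2, 4], {strides: [1, 2]});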


@inexorabletash (Member) commented:

Another "TODO" - the new ops need "constraints" tables

@fdwr changed the title from "Opset wave3" to "Operator set wave 3" on Jan 16, 2025
@inexorabletash (Member) commented:

Note from discussion w/ @a-sully - CoreML has restrictions on the dequantize op that we'll need to think about.

  • scale and bias must be constant
  • the input must be int8 or uint8 (note: CoreML has a different operator for (u)int4, but it requires everything - including the input - to be constant)
  • scale and bias must be either scalars or 1D
  • scale and bias must be the same rank (so, both scalars or both 1D)
  • scale must be the same data type as the output (note: the existing WPTs appear to assert the scale is always float32 (and the description of that test appears to have a typo))
  • scale must be positive

Re-emphasizing that dequantizing (u)int4 in CoreML is extremely limited (input must be const). @mwyrzykowski - any thoughts about how we can handle the proposed ops efficiently?

@inexorabletash (Member) left a review comment:

Initial pass.

@fdwr (Collaborator, Author) commented on Jan 17, 2025

Another "TODO" - the new ops need "constraints" tables

Added data type tables.

Initial pass.

Thanks - will address more tomorrow after the weekend.

@inexorabletash (Member) commented on Jan 17, 2025

Add "Resolves #779" to the summary, so that issue will get linked to this PR and auto-closed when this merges?

... and "Resolves #773"
... and "Resolves #772"
... and "Resolves #467"
... and "Resolves #93" (I think?)

Maybe #767 too

partial interface MLGraphBuilder {
    MLOperand dequantizeLinear(MLOperand input,
                               MLOperand scale,
                               MLOperand zeroPoint,

(Contributor) commented:

Should we consider allowing an optional zeroPoint? For example, moving it into MLDequantizeLinearOptions.zeroPoint, with the zero-point calculation skipped when it is absent. Frameworks that support an optional zero-point tensor, such as ONNX Runtime, could then simply leave it unset when no zero point is provided; that avoids creating a zero-valued tensor and computing input - zeroPoint when the underlying ML API supports an optional zero point.
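A sketch of what that might look like at the call site (MLDequantizeLinearOptions and its zeroPoint member are the commenter's hypothetical, not part of this PR):

// Hypothetical API shape, for discussion only.
// With a zero point:
const dq1 = builder.dequantizeLinear(input, scale, {zeroPoint});
// Without one - no `input - zeroPoint` subtraction needs to be emitted:
const dq2 = builder.dequantizeLinear(input, scale, {});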

@fdwr (Collaborator, Author) replied on Feb 12, 2025:

⏳ Maybe, as there are perf benefits. Some considerations:

  • If there are backends that require a ZP (don't support optionality), does that just push the burden down to the backend to invent a ZP on the fly?
  • Since there's no precedent for optional tensors in parameters, only in option dictionaries, should we create an MLQuantizationOptions just to hold the zeroPoint tensor, shared by both quantize and dequantize?
  • There is some ambiguity about appropriate defaults. For symmetric quantization (generally preferred), the expected defaults are {int8=0, int4=0, uint8=128, uint4=8}, and DirectML and (IIRC) PyTorch default to these, but ONNX defaults to 0 for all data types including uint8 (which makes little sense because then it's asymmetric and can't represent both positive and negative numbers). Passing an explicit ZP avoids this ambiguity (see the sketch below). 🤔
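A sketch of the symmetric-quantization defaults described above (the function name is illustrative; nothing in the PR defines it):

// Default zero points for symmetric quantization, per the list above.
function defaultZeroPoint(dataType) {
  switch (dataType) {
    case 'int8':
    case 'int4': return 0;    // signed types center on zero
    case 'uint8': return 128; // midpoint of [0, 255]
    case 'uint4': return 8;   // midpoint of [0, 15]
    default: throw new Error(`no symmetric default for ${dataType}`);
  }
}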

1. [=list/For each=] |index| in [=the range=] 0 to |input|'s [=MLOperand/rank=], exclusive:
1. If |sizes|[|index|] is 0, then [=exception/throw=] a {{TypeError}}.

Issue(391): If 0-size dimensions are allowed, revise these steps.

1. If |starts|[|index|] is greater than or equal to |input|'s [=MLOperand/shape=][|index|], then [=exception/throw=] a {{TypeError}}.
1. If |starts|[|index|] + |sizes|[|index|] is greater than |input|'s [=MLOperand/shape=][|index|], then [=exception/throw=] a {{TypeError}}.
1. If |options|.{{MLSliceOptions/strides}} [=map/exists=]:
1. If |options|.{{MLSliceOptions/strides}}[|index|] is less than 1, then [=exception/throw=] a {{TypeError}}.
1. *Make graph connections:*
1. Let |output| be the result of [=copying an MLOperand=] given |input|.
(Contributor) commented:
The output shape should be calculated from size and stride, as the Chromium prototype does: https://source.chromium.org/chromium/chromium/src/+/main:services/webnn/public/cpp/graph_validation_utils.cc;l=2450

(The existing steps copy the input shape for the output; the output shape should instead be derived from sizes.)
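For illustration, a sketch of that derivation (a hand-written restatement of the idea, not the linked prototype code):

// Each output dimension covers sizes[i] elements sampled every strides[i]
// elements, so it holds ceil(sizes[i] / strides[i]) elements.
function sliceOutputShape(sizes, strides) {
  return sizes.map((size, i) => Math.ceil(size / (strides ? strides[i] : 1)));
}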

Issue(391): If 0-size dimensions are allowed, revise these steps.

1. *Make graph connections:*
1. Let |output| be the result of [=copying an MLOperand=] given |input|.
(Contributor) commented:

The output shape should be calculated as the elementwise product of the input shape and repetitions, as the Chromium prototype does: https://source.chromium.org/chromium/chromium/src/+/main:services/webnn/public/cpp/graph_validation_utils.cc;l=2324
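A sketch of that computation (likewise a restatement, not the prototype code):

// tile's output shape: each dimension grows by its repetition count.
function tileOutputShape(inputShape, repetitions) {
  return inputShape.map((dim, i) => dim * repetitions[i]);
}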

@fdwr (Collaborator, Author) left a review comment:

Addressed most feedback (not yet the output shape calculation or the blockwise-compatible definition).


### scatterElements ### {#api-mlgraphbuilder-scatterelements}
Scatter values from the updates tensor along an axis according to the indices in place of the input tensor.
@fdwr (Collaborator, Author) commented on Feb 13, 2025:
🤔 What about "atop" in place of "in place of"? Or "in place of a copy of"?

@inexorabletash (Member) left a review comment:
Another batch of feedback; I didn't make it all the way through the PR though.

@@ -8398,6 +9377,28 @@ To <dfn data-lt="bidirectionally broadcasting">bidirectionally broadcast the sha
|shapeA| is <dfn>bidirectionally broadcastable</dfn> to |shapeB| if [=bidirectionally broadcasting=] |shapeA| and |shapeB| does not result in failure.
</p>

<details open algorithm>
<summary>
To <dfn data-lt="blockwise broadcasting">blockwise broadcast the shapes</dfn> |shapeFrom| and |shapeTo|, perform the following steps. |shapeFrom| and |shapeTo| are [=/lists=] of positive integers, representing the dimensions of tensors, and the steps return a new [=/list=] of positive integers, or failure.
(Member) commented:
Do we need to return anything other than success or failure?

If successful, the output is equivalent to shapeTo and the invocations don't capture it. Are you future-proofing or ...?

@fdwr (Collaborator, Author) replied:

Not future-proofing - just thought it consistent to make the return type similar to the other two functions. I'll delete the outputShape and append inside the loop (which, yes, is silly).

> and the invocations don't capture it

True, quantizeLinear and dequantizeLinear discard the return value (since it's just the inputShape anyway), but it is acceptable to discard a function's return value when one is mainly interested in the failure aspect. ⚖️🤷‍♂️ Would you prefer it return true/false?

(Member) replied:

Do you even need to append in the loop? Anyway, true/false is fine. The definition can be phrased as something like "shapeFrom is blockwise broadcastable to shapeTo if the following steps return true:"

@fdwr (Collaborator, Author) replied on Feb 18, 2025:

> Do you even need to append in the loop?

Nope, no appending needed. The most recent iteration was simplified to just return the "to" shape.

1. If |shapeFrom|'s [=list/size=] is not equal to |shapeTo|'s [=list/size=], then return failure.
1. [=list/For each=] |index| in [=the range=] 0 to |shapeTo|'s [=list/size=], exclusive:
    1. If |shapeFrom|[|index|] is not exactly divisible into |shapeTo|[|index|], then return failure.
1. Return |shapeTo|.
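For illustration, those steps expressed as a true/false predicate (a JS sketch, not spec text):

// shapeFrom is blockwise-broadcastable to shapeTo when ranks match and
// every shapeTo dimension is a whole multiple of the corresponding
// shapeFrom dimension.
function isBlockwiseBroadcastable(shapeFrom, shapeTo) {
  return shapeFrom.length === shapeTo.length &&
      shapeTo.every((dim, i) => dim % shapeFrom[i] === 0);
}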


This has the convenient properties that the sampling is evenly distributed, symmetric, robust to image mirroring, and the corner values are aligned.
</div>
### scatterND ### {#api-mlgraphbuilder-scatternd}
@fdwr (Collaborator, Author) commented on Feb 14, 2025:

scatterND or scatterNd?

A pretty good policy is to follow internal consistency and existing precedent within WebNN itself:

  • resample2d, conv2d, scatterNd
  • ☑️ resample2D, conv2D, scatterND (would have to rename existing functions)
  • resample2d, conv2d, scatterND (inconsistent casing of '#d')

So I'm updating this to scatterNd...

Looking at other JS Web guidelines, I see conflicting recommendations out there. e.g.

Favoring scatterNd:
https://google.github.io/styleguide/tsguide.html#descriptive-names

Treat abbreviations like acronyms in names as whole words, i.e. use loadHttpUrl, not loadHTTPURL, unless required by a platform name (e.g. XMLHttpRequest).

// Good identifiers:
dnsConnectionIndex // Most people know what "DNS" stands for.
referrerUrl // Ditto for "URL".
customerId // "Id" is both ubiquitous and unlikely to be misunderstood.

// Disallowed identifiers:
...
customerID // Incorrect camelcase of "ID".

Favoring scatterND:
https://github.com/airbnb/javascript?tab=readme-ov-file#naming--Acronyms-and-Initialisms

// bad
import SmsContainer from './containers/SmsContainer';
const HttpRequests = [ ... ];

// good
import SMSContainer from './containers/SMSContainer';
const HTTPRequests = [ ... ];

(Member) commented:

The canonical guidance for Web APIs should be https://w3ctag.github.io/design-principles/#casing-rules

So per "Repeated initialisms in APIs", scatterND would be preferred.

@inexorabletash (Member) commented on Feb 18, 2025:

I'm still on the fence about '2d'; CanvasRenderingContext2D comes to mind as an example contrary to what WebNN has done so far. Did conv2d etc go through TAG review in the past?

@fdwr (Collaborator, Author) replied:

Ooh, so then...

resample2D, conv2D, scatterND

...might be preferred. Thanks for the link and precedent. I was thinking of "2d" as an abbreviation of "2-dimensional", falling more into the "getElementById()" camp. If we change it, I'll rename it in a separate CR to keep it isolated (since it impacts existing operators).

> Did conv2d etc go through TAG review in the past?

I don't know. Maybe @huningxin does. I'll look at WebGPU to see any precedent there (since that's the most similarly related API).

@fdwr (Collaborator, Author) commented on Feb 18, 2025:

So in WebGPU...

Lowercase in enums:

enum GPUTextureDimension {
    "1d",
    "2d",
    "3d",
};

enum GPUTextureViewDimension {
    "1d",
    "2d",
    "2d-array",
    "cube",
    "cube-array",
    "3d",
};

Uppercase in class names:

CanvasRenderingContext2D
GPUOrigin2D
GPUOrigin2DDict
GPUExtent3D 
GPUOrigin3D
GPUOrigin3DDict

Uppercase in field names:

maxTextureDimension1D
maxTextureDimension2D
maxTextureDimension3D

So, a rename would be in order.

*Eventually I aim to chop off the 2D from conv2d/convTranspose2d/resample2d/averagePool2d/l2Pool2d/maxPool2d anyway to make them dimensionally generic, since all the backends support at least 1D-3D, but casing will still matter for gatherND/scatterND.

(Member) commented:

> Did conv2d etc go through TAG review in the past?

conv2d was in scope of the initial TAG review w3ctag/design-reviews#570 and no casing-related feedback was recorded: https://github.com/webmachinelearning/webnn/issues?q=label%3Atag-tracker

Also the later "delta" (aka changes since previous) TAG review w3ctag/design-reviews#771 did not comment on this.

I'm planning to kick off a new TAG review when we land this PR. We can explicitly consult the TAG about casing if https://w3ctag.github.io/design-principles/#casing-rules is not clear, to help improve the guidance with our example.

@fdwr (Collaborator, Author) commented on Feb 20, 2025:

@anssiko : Thanks for the background info and links. Then I'll defer any wider consistency renaming of existing operators and keep scatterND as ND for now (since the actual Chromium implementation, baseline implementation, and ORT WebNN EP use that casing).

#821

const unexpandedShape = [flattenedShape[0], flattenedShape[1], 1, flattenedShape[2]];
const expandedShape = [flattenedShape[0], flattenedShape[1], elementRepeatCount, flattenedShape[2]];
const reshapedInput = builder.reshape(expandedInput, unexpandedShape);
expandedInput = builder.expand(reshapedInput, expandedShape);
@fdwr (Collaborator, Author) commented:

(a realization) This whole function could be simplified to a single expand call if expand accepted any blockwise-compatible shape, rather than the current limitation that each input dimension length be either 1 or the corresponding target dimension length. Alternately, if resample2d accepted multiple dimensions to upsample rather than just 2 (like ONNX's resampling operator), it would also be a single call (see the sketch below). Anyway, that ponderance can be deferred to a later issue.
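A sketch of the hypothetical simplification, assuming a blockwise-compatible expand that the current spec does not define (flattenedInput is a hypothetical name for the tensor before the reshape above):

// Hypothetical: with a blockwise-compatible expand, the intermediate
// reshape becomes unnecessary and the flattened input expands directly:
// expandedInput = builder.expand(flattenedInput, expandedShape);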


@webmachinelearning deleted a comment from @huningxin on Feb 18, 2025
@fdwr mentioned this pull request on Feb 21, 2025