Operator set wave 3 #805 (Draft)

fdwr wants to merge 22 commits into main
Conversation

@fdwr (Collaborator) commented on Jan 16, 2025

Adds the following operators, per #375 (comment):

Resolves #93
Resolves #467
Resolves #767
Resolves #772
Resolves #773
Resolves #779

Some TODOs remain:

  • finish algorithm steps in gatherElements/scatterElements
  • finish algorithm steps in gatherNd/scatterNd
  • update slice output shape
  • add blockwise-compatible definition
  • show emulation of DQ/Q (see the sketch after this list)
  • add examples to the scattering/gathering operators
  • add data type tables
  • ⏳ u/int4 split into separate change
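For reference, a minimal sketch of the textbook emulation the DQ/Q item refers to (illustrative only: scalar per-tensor scale/zeroPoint, whereas the proposed operators take broadcastable tensors):

// Illustrative scalar emulation; the clamp range (qmin, qmax) depends on
// the quantized data type, e.g. [-128, 127] for int8.
function dequantizeLinearScalar(x, scale, zeroPoint) {
  return (x - zeroPoint) * scale;
}
function quantizeLinearScalar(x, scale, zeroPoint, qmin, qmax) {
  return Math.min(qmax, Math.max(qmin, Math.round(x / scale) + zeroPoint));
}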
partial interface MLGraphBuilder
{
    ...
    MLOperand cumulativeSum(MLOperand input, unsigned long axis, optional MLCumulativeSumOptions options = {});
    MLOperand sign(MLOperand input, optional MLOperatorOptions options = {});
    MLOperand tile(MLOperand input, sequence<unsigned long> repetitions, optional MLOperatorOptions options = {});

    // Extends the family beyond the existing gather.
    MLOperand gatherElements(MLOperand input, MLOperand indices, optional MLGatherOptions options = {});
    MLOperand scatterElements(MLOperand input, MLOperand indices, MLOperand updates, optional MLScatterOptions options = {});
    MLOperand gatherND(MLOperand input, MLOperand indices, optional MLOperatorOptions options = {});
    MLOperand scatterND(MLOperand input, MLOperand indices, MLOperand updates, optional MLOperatorOptions options = {});

    MLOperand dequantizeLinear(MLOperand input, MLOperand scale, MLOperand zeroPoint, optional MLOperatorOptions options = {});
    MLOperand quantizeLinear(MLOperand input, MLOperand scale, MLOperand zeroPoint, optional MLOperatorOptions options = {});

    MLOperand logicalAnd(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
    MLOperand logicalOr(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
    MLOperand logicalXor(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});
    MLOperand notEqual(MLOperand a, MLOperand b, optional MLOperatorOptions options = {});

    MLOperand reverse(MLOperand input, optional MLReverseOptions options = {});
    MLOperand slice(
        MLOperand input,
        sequence<[EnforceRange] unsigned long> starts,
        sequence<[EnforceRange] unsigned long> sizes,
        optional MLSliceOptions options = {} // Now includes steps
    );
    ...
};

dictionary MLCumulativeSumOptions : MLOperatorOptions
{
    boolean exclusive = false; // Exclusive prefix sum (each output element excludes its own input element) rather than an inclusive one. https://en.wikipedia.org/wiki/Prefix_sum
    boolean reversed = false; // Reverse the summation direction.
};

// Already exists for `gather`. Reuse for `gatherElements` too.
dictionary MLGatherOptions : MLOperatorOptions
{
    unsigned long axis = 0;
};

dictionary MLScatterOptions : MLOperatorOptions
{
    unsigned long axis = 0;
};

dictionary MLReverseOptions : MLOperatorOptions {
  sequence<[EnforceRange] unsigned long> axes;
};

dictionary MLSliceOptions : MLOperatorOptions {
  sequence<[EnforceRange] unsigned long> strides;
};
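To make the shape of the proposal concrete, a hedged usage sketch (assuming builder is an MLGraphBuilder and input, indices, and updates are suitably shaped MLOperands; illustrative, not normative):

// Illustrative calls into the proposed additions.
const cumsum = builder.cumulativeSum(input, 0, {exclusive: true});
const tiled = builder.tile(input, [2, 1]);                  // double axis 0
const gathered = builder.gatherElements(input, indices, {axis: 1});
const scattered = builder.scatterND(input, indices, updates);
const sliced = builder.slice(input, [0, 0], [2, 4], {strides: [1, 2]});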


@inexorabletash (Member) commented:

Another "TODO" - the new ops need "constraints" tables

@fdwr changed the title from "Opset wave3" to "Operator set wave 3" on Jan 16, 2025
@inexorabletash (Member) commented:

Note from discussion w/ @a-sully - CoreML has restrictions on the dequantize op that we'll need to think about.

  • scale and bias must be constant
  • the input must be int8 or uint8 (note: CoreML has a different operator for (u)int4, but it requires everything - including the input - to be constant)
  • scale and bias must be either scalars or 1D
  • scale and bias must be the same rank (so, both scalars or both 1D)
  • scale must be the same data type as the output (note: the existing WPTs appear to assert the scale is always float32 (and the description of that test appears to have a typo))
  • scale must be positive

Re-emphasizing that dequantizing (u)int4 in CoreML is extremely limited (input must be const). @mwyrzykowski - any thoughts about how we can handle the proposed ops efficiently?

@inexorabletash (Member) left a review comment:

Initial pass.

@fdwr (Collaborator, Author) commented on Jan 17, 2025

Another "TODO" - the new ops need "constraints" tables

Added data type tables.

Initial pass.

Thanks - will address more tomorrow after the weekend.

@inexorabletash (Member) commented on Jan 17, 2025

Add "Resolves #779" to the summary, so that issue will get linked to this PR and auto-closed when this merges?

... and "Resolves #773"
... and "Resolves #772"
... and "Resolves #467"
... and "Resolves #93" (I think?)

Maybe #767 too

partial interface MLGraphBuilder {
    MLOperand dequantizeLinear(MLOperand input,
                               MLOperand scale,
                               MLOperand zeroPoint,

(Contributor) commented:

Should we consider allowing an optional zeroPoint? For example, moving it into MLDequantizeLinearOptions.zeroPoint, with the zero-point calculation skipped when it is absent. Frameworks that support an optional zero-point tensor, such as ONNX Runtime, could then simply leave it unset when no zero point is provided; that avoids creating a zero-valued tensor and computing input - zeroPoint when the underlying ML API supports an optional zero point.
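A sketch of what that might look like at the call site (MLDequantizeLinearOptions and its zeroPoint member are the commenter's hypothetical, not part of this PR):

// Hypothetical API shape, for discussion only.
// With a zero point:
const dq1 = builder.dequantizeLinear(input, scale, {zeroPoint});
// Without one - no `input - zeroPoint` subtraction needs to be emitted:
const dq2 = builder.dequantizeLinear(input, scale, {});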

@fdwr (Collaborator, Author) replied on Feb 12, 2025:

⏳ Maybe, as there are perf benefits. Some considerations:

  • If there are backends that require a ZP (don't support optionality), does that just push the burden down to the backend to invent a ZP on the fly?
  • Since there's no precedent for optional tensors in parameters, only in option dictionaries, should we create an MLQuantizationOptions just to hold the zeroPoint tensor, shared by both quantize and dequantize?
  • There is some ambiguity about appropriate defaults. For symmetric quantization (generally preferred), the expected defaults are {int8=0, int4=0, uint8=128, uint4=8}, and DirectML and (IIRC) PyTorch default to these, but ONNX defaults to 0 for all data types including uint8 (which makes little sense because then it's asymmetric and can't represent both positive and negative numbers). Passing an explicit ZP avoids this ambiguity (see the sketch below). 🤔
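A sketch of the symmetric-quantization defaults described above (the function name is illustrative; nothing in the PR defines it):

// Default zero points for symmetric quantization, per the list above.
function defaultZeroPoint(dataType) {
  switch (dataType) {
    case 'int8':
    case 'int4': return 0;    // signed types center on zero
    case 'uint8': return 128; // midpoint of [0, 255]
    case 'uint4': return 8;   // midpoint of [0, 15]
    default: throw new Error(`no symmetric default for ${dataType}`);
  }
}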

1. [=list/For each=] |index| in [=the range=] 0 to |input|'s [=MLOperand/rank=], exclusive:
1. If |sizes|[|index|] is 0, then [=exception/throw=] a {{TypeError}}.

Issue(391): If 0-size dimensions are allowed, revise these steps.

1. If |starts|[|index|] is greater than or equal to |input|'s [=MLOperand/shape=][|index|], then [=exception/throw=] a {{TypeError}}.
1. If |starts|[|index|] + |sizes|[|index|] is greater than |input|'s [=MLOperand/shape=][|index|], then [=exception/throw=] a {{TypeError}}.
1. If |options|.{{MLSliceOptions/strides}} [=map/exists=]:
1. If |options|.{{MLSliceOptions/strides}}[|index|] is less than 1, then [=exception/throw=] a {{TypeError}}.
1. *Make graph connections:*
1. Let |output| be the result of [=copying an MLOperand=] given |input|.
(Contributor) commented:
The output shape should be calculated from size and stride, as the Chromium prototype does: https://source.chromium.org/chromium/chromium/src/+/main:services/webnn/public/cpp/graph_validation_utils.cc;l=2450

(The existing steps copy the input shape for the output; the output shape should instead be derived from sizes.)
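For illustration, a sketch of that derivation (a hand-written restatement of the idea, not the linked prototype code):

// Each output dimension covers sizes[i] elements sampled every strides[i]
// elements, so it holds ceil(sizes[i] / strides[i]) elements.
function sliceOutputShape(sizes, strides) {
  return sizes.map((size, i) => Math.ceil(size / (strides ? strides[i] : 1)));
}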

Issue(391): If 0-size dimensions are allowed, revise these steps.

1. *Make graph connections:*
1. Let |output| be the result of [=copying an MLOperand=] given |input|.
(Contributor) commented:

The output shape should be calculated as the elementwise product of the input shape and repetitions, as the Chromium prototype does: https://source.chromium.org/chromium/chromium/src/+/main:services/webnn/public/cpp/graph_validation_utils.cc;l=2324
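A sketch of that computation (likewise a restatement, not the prototype code):

// tile's output shape: each dimension grows by its repetition count.
function tileOutputShape(inputShape, repetitions) {
  return inputShape.map((dim, i) => dim * repetitions[i]);
}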

@fdwr (Collaborator, Author) left a review comment:

Addressed most feedback (not yet the output shape calculation or the blockwise-compatible definition).


### scatterElements ### {#api-mlgraphbuilder-scatterelements}
Scatter values from the updates tensor along an axis according to the indices in place of the input tensor.
@fdwr (Collaborator, Author) commented on Feb 13, 2025:
🤔 What about "atop" in place of "in place of"? Or "in place of a copy of"?

@inexorabletash (Member) left a review comment:
Another batch of feedback; I didn't make it all the way through the PR though.

@@ -8398,6 +9377,28 @@ To <dfn data-lt="bidirectionally broadcasting">bidirectionally broadcast the sha
|shapeA| is <dfn>bidirectionally broadcastable</dfn> to |shapeB| if [=bidirectionally broadcasting=] |shapeA| and |shapeB| does not result in failure.
</p>

<details open algorithm>
<summary>
To <dfn data-lt="blockwise broadcasting">blockwise broadcast the shapes</dfn> |shapeFrom| and |shapeTo|, perform the following steps. |shapeFrom| and |shapeTo| are [=/lists=] of positive integers, representing the dimensions of tensors, and the steps return a new [=/list=] of positive integers, or failure.
(Member) commented:
Do we need to return anything other than success or failure?

If successful, the output is equivalent to shapeTo and the invocations don't capture it. Are you future-proofing or ...?

@fdwr (Collaborator, Author) replied:

Not future-proofing - just thought it consistent to make the return type similar to the other two functions. I'll delete the outputShape and append inside the loop (which, yes, is silly).

> and the invocations don't capture it

True, quantizeLinear and dequantizeLinear discard the return value (since it's just the inputShape anyway), but it is acceptable to discard a function's return value when one is mainly interested in the failure aspect. ⚖️🤷‍♂️ Would you prefer it return true/false?

(Member) replied:

Do you even need to append in the loop? Anyway, true/false is fine. The definition can be phrased as something like "shapeFrom is blockwise broadcastable to shapeTo if the following steps return true:"

@fdwr (Collaborator, Author) replied on Feb 18, 2025:

> Do you even need to append in the loop?

Nope, no appending needed. The most recent iteration was simplified to just return the "to" shape.

1. If |shapeFrom|'s [=list/size=] is not equal to |shapeTo|'s [=list/size=], then return failure.
1. [=list/For each=] |index| in [=the range=] 0 to |shapeTo|'s [=list/size=], exclusive:
    1. If |shapeFrom|[|index|] is not exactly divisible into |shapeTo|[|index|], then return failure.
1. Return |shapeTo|.
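For illustration, those steps expressed as a true/false predicate (a JS sketch, not spec text):

// shapeFrom is blockwise-broadcastable to shapeTo when ranks match and
// every shapeTo dimension is a whole multiple of the corresponding
// shapeFrom dimension.
function isBlockwiseBroadcastable(shapeFrom, shapeTo) {
  return shapeFrom.length === shapeTo.length &&
      shapeTo.every((dim, i) => dim % shapeFrom[i] === 0);
}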


This has the convenient properties that the sampling is evenly distributed, symmetric, robust to image mirroring, and the corner values are aligned.
</div>
### scatterND ### {#api-mlgraphbuilder-scatternd}
@fdwr (Collaborator, Author) commented on Feb 14, 2025:

scatterND or scatterNd?

A pretty good policy is to follow internal consistency and existing precedent within WebNN itself:

  • resample2d, conv2d, scatterNd
  • ☑️ resample2D, conv2D, scatterND (would have to rename existing functions)
  • resample2d, conv2d, scatterND (inconsistent casing of '#d')

So I'm updating this to scatterNd...

Looking at other JS Web guidelines, I see conflicting recommendations out there. e.g.

Favoring scatterNd:
https://google.github.io/styleguide/tsguide.html#descriptive-names

Treat abbreviations like acronyms in names as whole words, i.e. use loadHttpUrl, not loadHTTPURL, unless required by a platform name (e.g. XMLHttpRequest).

// Good identifiers:
dnsConnectionIndex // Most people know what "DNS" stands for.
referrerUrl // Ditto for "URL".
customerId // "Id" is both ubiquitous and unlikely to be misunderstood.

// Disallowed identifiers:
...
customerID // Incorrect camelcase of "ID".

Favoring scatterND:
https://github.com/airbnb/javascript?tab=readme-ov-file#naming--Acronyms-and-Initialisms

// bad
import SmsContainer from './containers/SmsContainer';
const HttpRequests = [ ... ];

// good
import SMSContainer from './containers/SMSContainer';
const HTTPRequests = [ ... ];

(Member) commented:

The canonical guidance for Web APIs should be https://w3ctag.github.io/design-principles/#casing-rules

So per "Repeated initialisms in APIs", scatterND would be preferred.

@inexorabletash (Member) commented on Feb 18, 2025:

I'm still on the fence about '2d'; CanvasRenderingContext2D comes to mind as an example contrary to what WebNN has done so far. Did conv2d etc go through TAG review in the past?

@fdwr (Collaborator, Author) replied:

Ooh, so then...

resample2D, conv2D, scatterND

...might be preferred. Thanks for the link and precedent. I was thinking of "2d" as an abbreviation of "2-dimensional", falling more into the "getElementById()" camp. If we change it, I'll rename it in a separate CR to keep it isolated (since it impacts existing operators).

> Did conv2d etc go through TAG review in the past?

I don't know. Maybe @huningxin does. I'll look at WebGPU to see any precedent there (since that's the most similarly related API).

@fdwr (Collaborator, Author) commented on Feb 18, 2025:

So in WebGPU...

Lowercase in enums:

enum GPUTextureDimension {
    "1d",
    "2d",
    "3d",
};

enum GPUTextureViewDimension {
    "1d",
    "2d",
    "2d-array",
    "cube",
    "cube-array",
    "3d",
};

Uppercase in class names:

CanvasRenderingContext2D
GPUOrigin2D
GPUOrigin2DDict
GPUExtent3D 
GPUOrigin3D
GPUOrigin3DDict

Uppercase in field names:

maxTextureDimension1D
maxTextureDimension2D
maxTextureDimension3D

So, a rename would be in order.

*Eventually I aim to chop off the 2D from conv2d/convTranspose2d/resample2d/averagePool2d/l2Pool2d/maxPool2d anyway to make them dimensionally generic, since all the backends support at least 1D-3D, but casing will still matter for gatherND/scatterND.

(Member) commented:

> Did conv2d etc go through TAG review in the past?

conv2d was in scope of the initial TAG review w3ctag/design-reviews#570 and no casing-related feedback was recorded: https://github.com/webmachinelearning/webnn/issues?q=label%3Atag-tracker

Also the later "delta" (aka changes since previous) TAG review w3ctag/design-reviews#771 did not comment on this.

I'm planning to kick off a new TAG review when we land this PR. We can explicitly consult the TAG about casing if https://w3ctag.github.io/design-principles/#casing-rules is not clear, to help improve the guidance with our example.

@fdwr (Collaborator, Author) commented on Feb 20, 2025:

@anssiko : Thanks for the background info and links. Then I'll defer any wider consistency renaming of existing operators and keep scatterND as ND for now (since the actual Chromium implementation, baseline implementation, and ORT WebNN EP use that casing).

#821

const unexpandedShape = [flattenedShape[0], flattenedShape[1], 1, flattenedShape[2]];
const expandedShape = [flattenedShape[0], flattenedShape[1], elementRepeatCount, flattenedShape[2]];
const reshapedInput = builder.reshape(expandedInput, unexpandedShape);
expandedInput = builder.expand(reshapedInput, expandedShape);
@fdwr (Collaborator, Author) commented:

(a realization) This whole function could be simplified to a single expand call if expand accepted any blockwise-compatible shape, rather than the current limitation that each input dimension length be either 1 or the corresponding target dimension length. Alternately, if resample2d accepted multiple dimensions to upsample rather than just 2 (like ONNX's resampling operator), it would also be a single call (see the sketch below). Anyway, that ponderance can be deferred to a later issue.
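A sketch of the hypothetical simplification, assuming a blockwise-compatible expand that the current spec does not define (flattenedInput is a hypothetical name for the tensor before the reshape above):

// Hypothetical: with a blockwise-compatible expand, the intermediate
// reshape becomes unnecessary and the flattened input expands directly:
// expandedInput = builder.expand(flattenedInput, expandedShape);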


@webmachinelearning deleted a comment from @huningxin on Feb 18, 2025
@fdwr mentioned this pull request on Feb 21, 2025