diff --git a/.gitignore b/.gitignore deleted file mode 100644 index f9b3310846..0000000000 --- a/.gitignore +++ /dev/null @@ -1,6 +0,0 @@ -spec/index.html -spec/webgpu.idl -wgsl/index.html -explainer/index.html -tools/node_modules/ -.DS_Store diff --git a/.pr-preview.json b/.pr-preview.json deleted file mode 100644 index bd783695bb..0000000000 --- a/.pr-preview.json +++ /dev/null @@ -1,7 +0,0 @@ -{ - "src_file": "spec/index.bs", - "type": "bikeshed", - "params": { - "force": 1 - } -} diff --git a/LICENSE.md b/LICENSE.md deleted file mode 100644 index 0c80820210..0000000000 --- a/LICENSE.md +++ /dev/null @@ -1,12 +0,0 @@ -All Reports in this Repository are licensed by Contributors -under the -[W3C Software and Document License](http://www.w3.org/Consortium/Legal/2015/copyright-software-and-document). - -Contributions to Specifications are made under the -[W3C CLA](https://www.w3.org/community/about/agreements/cla/). - -Contributions to Test Suites are made under the -[W3C 3-clause BSD License](https://www.w3.org/Consortium/Legal/2008/03-bsd-license.html) - -Contributions to Software are made under the -[GPU for the Web 3-Clause BSD License](https://github.com/gpuweb/admin/blob/master/SourceCodeLicense/LICENSE.txt) diff --git a/README.md b/README.md deleted file mode 100644 index 32029ba505..0000000000 --- a/README.md +++ /dev/null @@ -1,40 +0,0 @@ -# W3C _GPU for the Web_ Community Group - -This is the repository for the W3C's [GPU for the Web](https://www.w3.org/community/gpu/) -Community Group. - -We'll use the [wiki](https://github.com/gpuweb/gpuweb/wiki) as the main source of information -related to the work. This repository will hold the actual specification, examples, etc. - -Work-in-progress specification: - -Work-in-progress WGSL specification: - -## Charter - -The [charter for this group](https://gpuweb.github.io/admin/cg-charter.html) is -maintained in a [separate repository](https://github.com/gpuweb/admin). 
- -## Membership - -[Membership in the Community Group](https://www.w3.org/community/gpu/) is open -to anyone. We especially encourage hardware vendors, browser engine developers, -3d software engineers and any Web Developers with expertise in graphics to -participate. You'll need a W3C account to join, and if you're affiliated with a -W3C member, your W3C representative will confirm your participation. If you're -not a W3C member, you're still welcome. All participants are required to agree -to the [Contributor License Agreement](https://www.w3.org/community/about/agreements/cla/). - -## Contributions - -You are not required to be a member of the group in order to -[file issues](https://github.com/gpuweb/gpuweb/issues), errors, fixes or make suggestions. -Just a github account. We simply require that any significant contribution of technology -come from members, so that we can ensure no IP complications down the line. - -All contributions must comply with the group's -[contribution guidelines](https://github.com/gpuweb/admin/blob/master/CONTRIBUTING.md). - -## Code of Conduct - -This group operates under [W3C's Code of Conduct Policy](http://www.w3.org/Consortium/cepc/). 
diff --git a/compile.sh b/compile.sh deleted file mode 100755 index c80fb6280e..0000000000 --- a/compile.sh +++ /dev/null @@ -1,27 +0,0 @@ -#!/bin/bash -set -e # Exit with nonzero exit code if anything fails - -echo 'Building spec' -make -C spec -echo 'Building wgsl' -make -C wgsl -echo 'Building explainer' -make -C explainer - -if [ -d out ]; then - mkdir out/wgsl out/explainer - - echo 'Copying wgsl/* -> out/wgsl/' - cp -r wgsl/* out/wgsl/ - rm out/wgsl/{Makefile,*.bs} - - echo 'Copying explainer/* -> out/explainer/' - cp -r explainer/* out/explainer/ - rm out/explainer/{Makefile,*.bs} - - echo 'Copying spec/* -> out/' - cp spec/* out/ - rm out/{README.md,Makefile,*.py,*.bs} - - echo '' > out/wgsl.html -fi diff --git a/design/APICorrespondence.md b/design/APICorrespondence.md deleted file mode 100644 index b0ba481524..0000000000 --- a/design/APICorrespondence.md +++ /dev/null @@ -1,12 +0,0 @@ -# API Term correspondence - -## Resources - -|D3D|Metal|OpenGL / Vulkan| -|---|-----|---------------| -|CBV (Constant Buffer View)|Constant memory buffer|Uniform Buffer| -|SRV (Shader Resource View)|Texture binding (textures and texture views are the same thing) with sample access|Sampled texture/image or texel buffers| -|UAV (Unordered Access View)|Device texture with read / write access, device memory buffers|Storage buffers, storage texel buffers and storage textures/images| -|Sub-resource index|Cube-face and array element|Cube-face and layer index| - - diff --git a/design/BufferOperations.md b/design/BufferOperations.md deleted file mode 100644 index 279cb329e3..0000000000 --- a/design/BufferOperations.md +++ /dev/null @@ -1,261 +0,0 @@ -# Buffer operations - -This explainer describes the operations that are available on the `GPUBuffer` object directly. -They are `mapWriteAsync`, `mapReadAsync` and `unmap` which are memory mapping operations. 
- -## Preliminaries: buffered / unbuffered commands - -Assuming there is a single queue, there are two types of commands in WebGPU: - - - "Buffered commands": any commands on a `GPUCommandBuffer`, `GPUComputePassEncoder` or `GPURenderPassEncoder`. - - "Unbuffered commands": all other commands. - -Assuming there is a single queue, there is a total order on the unbuffered commands: they all execute atomically in the order they were called. -`GPUQueue.submit` is special because it atomically executes all the commands stored in its `commands` argument. - -## Buffer mapping - -### `MAP_READ` and `MAP_WRITE` - -The `MAP_READ` and `MAP_WRITE` buffer creation usage flags need to be specified to create a buffer mappable for reading (resp. for writing). -An additional validation constraint is that `MAP_READ` and `MAP_WRITE` may not be used in combination. - -```webidl -partial interface GPUBufferUsage { - const u32 MAP_READ = 1; - const u32 MAP_WRITE = 2; -} -``` - -**TODO**: should `MAP_WRITE` be allowed only with read-only usages? -It would allow clearing the buffer only on creation and not on every map. - -### The `GPUBuffer` state machine - -Buffers have an internal state machine that has three states: - - - **Unmapped**: where the buffer can be used in queue submits - - **Mapped**: between a map operation and the subsequent `unmap`, where the buffer cannot be used in queue submits - - **Destroyed**: after a call to `GPUBuffer.destroy` where it is a validation error to do anything with the buffer. - -In the following, a buffer's state is shorthand for the state of the buffer's state machine. -Buffers created with `GPUDevice.createBuffer` start in the unmapped state. -Buffers created with `GPUDevice.createBufferMapped` start in the mapped state. - -State transitions are the following: - - - Unmapped to destroyed: with `GPUBuffer.destroy` - - Mapped to destroyed: with `GPUBuffer.destroy` - - Unmapped to mapped: with any successful `mapReadAsync` or `mapWriteAsync` call. 
- - Mapped to unmapped: with any successful `unmap` call. - -### Buffer mapping operations - -The mapping operations for buffer mapping are: - -```webidl -partial interface GPUBuffer { - Promise<ArrayBuffer> mapReadAsync(); - Promise<ArrayBuffer> mapWriteAsync(); -}; -``` - -These calls return a promise of a "mapping" that is an `ArrayBuffer` that represents the content of the buffer for reading (for `mapReadAsync`) or writing (for `mapWriteAsync`). -The promise will settle before signals for the completion of follow-up unbuffered commands. -Upon success the buffer is put in the mapped state. - -The following must be true or the call fails and returns a promise that rejects: - - - `buffer` must have been created with the `MAP_READ` usage flag for `mapReadAsync` and the `MAP_WRITE` flag for `mapWriteAsync` - - `buffer` must be in the unmapped state. - -A buffer can be unmapped with: - -```webidl -partial interface GPUBuffer { - void unmap(); -}; -``` - -Upon success the buffer is put in the unmapped state. Any associated `ArrayBuffer`s are neutered, and any pending mapping promises are rejected. - -The following must be true or the unmapping call on `buffer` fails: - - - `buffer` must have been created with the `MAP_READ` or the `MAP_WRITE` usage flags. - - `buffer` must not be in the destroyed state (this means it is ok to call `unmap` on an unmapped buffer). - -Calling `GPUBuffer.destroy` on a buffer with the `MAP_READ` or `MAP_WRITE` usage flags contains an implicit call to `GPUBuffer.unmap`. -Note that the mapping isn't detached when the `GPUBuffer` is garbage-collected, so this means that mappings keep a reference to their buffer. - -What happens with the content of a mapping depends on which function was used to create it: - - Mappings created with `mapReadAsync` represent the content of the buffer after all unbuffered operations issued before the call to `mapReadAsync` have completed. - Nothing happens when the mapping is detached. 
- - Mappings created with `mapWriteAsync` are filled with zeros. - When they are detached, it is as if `buffer.setSubData(0, mapping)` was called. - -### Creating an already mapped buffer - -A buffer can be created already mapped: - -```webidl -partial interface GPUDevice { - (GPUBuffer, ArrayBuffer) createBufferMapped(GPUBufferDescriptor descriptor); -}; -``` - -`GPUDevice.createBufferMapped` returns a buffer in the mapped state along with a write mapping representing the whole range of the buffer. - -This entry point does not require the `MAP_WRITE` usage to be specified. -The `MAP_WRITE` usage may be specified if the buffer needs to be re-mappable later on. - -The mapping starts filled with zeros. - -## Examples - -### `GPUBuffer.mapReadAsync` - -```js -const readPixelsBuffer = device.createBuffer({ - size: 4, - usage: GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST, -}); - -// Commands copying a pixel from a texture into readPixelsBuffer are submitted - -readPixelsBuffer.mapReadAsync().then((data) => { - checkPixelValue(data); - - // Unmap if we want to reuse the buffer - readPixelsBuffer.unmap(); -}); -``` - -### `GPUBuffer.mapWriteAsync` - -```js -// model is some 3D framework resource. -const size = model.computeVertexBufferSize(); - -const stagingVertexBuffer = device.createBuffer({ - size: size, - usage: GPUBufferUsage.MAP_WRITE | GPUBufferUsage.COPY_SRC, -}); - -stagingVertexBuffer.mapWriteAsync().then((stagingData) => { - model.decompressVerticesIn(stagingData); - - stagingVertexBuffer.unmap(); - - // Enqueue copy from the staging buffer to the real vertex buffer. 
-}); -``` - -### Updating data to an existing buffer (like WebGL's `bufferSubData`) - -```js -function bufferSubData(device, destBuffer, destOffset, srcArrayBuffer) { - const byteCount = srcArrayBuffer.byteLength; - const [srcBuffer, arrayBuffer] = device.createBufferMapped({ - size: byteCount, - usage: GPUBufferUsage.COPY_SRC - }); - new Uint8Array(arrayBuffer).set(new Uint8Array(srcArrayBuffer)); // memcpy - srcBuffer.unmap(); - - const encoder = device.createCommandEncoder(); - encoder.copyBufferToBuffer(srcBuffer, 0, destBuffer, destOffset, byteCount); - const commandBuffer = encoder.finish(); - const queue = device.defaultQueue; - queue.submit([commandBuffer]); - - srcBuffer.destroy(); -} - -``` - -As usual, batching per-frame uploads through fewer buffers (or a single buffer) reduces -overhead. - -Applications are free to implement their own heuristics for batching or reusing -upload buffers: - -```js -function AutoRingBuffer(device, chunkSize) { - const queue = device.defaultQueue; - let availChunks = []; - - function Chunk() { - const size = chunkSize; - const [buf, initialMap] = device.createBufferMapped({ - size: size, - usage: GPUBufferUsage.MAP_WRITE | GPUBufferUsage.COPY_SRC, - }); - - let mapTyped; - let pos; - let enc; - this.reset = function(mappedArrayBuffer) { - mapTyped = new Uint8Array(mappedArrayBuffer); - pos = 0; - enc = device.createCommandEncoder({}); - if (size == chunkSize) { - availChunks.push(this); - } - }; - this.reset(initialMap); - - this.push = function(destBuffer, destOffset, srcArrayBuffer) { - const byteCount = srcArrayBuffer.byteLength; - const end = pos + byteCount; - if (end > size) - return false; - mapTyped.set(new Uint8Array(srcArrayBuffer), pos); - enc.copyBufferToBuffer(buf, pos, destBuffer, destOffset, byteCount); - pos = end; - return true; - }; - - this.flush = async function() { - const cb = enc.finish(); - queue.submit([cb]); - const newMap = await buf.mapWriteAsync(); - this.reset(newMap); - }; - - this.destroy = 
function() { - buf.destroy(); - }; - }; - - this.push = function(destBuffer, destOffset, srcArrayBuffer) { - if (availChunks.length) { - const chunk = availChunks[0]; - if (chunk.push(destBuffer, destOffset, srcArrayBuffer)) - return; - chunk.flush(); - this.destroy(); - - while (true) { - chunkSize *= 2; - if (chunkSize >= srcArrayBuffer.byteLength) - break; - } - } - - new Chunk(); - availChunks[0].push(destBuffer, destOffset, srcArrayBuffer); - }; - - this.flush = function() { - if (availChunks.length) { - availChunks[0].flush(); - availChunks.shift(); - } - }; - - this.destroy = function() { - availChunks.forEach(x => x.destroy()); - availChunks = []; - }; -}; -``` diff --git a/design/CommandSubmission.md b/design/CommandSubmission.md deleted file mode 100644 index ac7e405be3..0000000000 --- a/design/CommandSubmission.md +++ /dev/null @@ -1,47 +0,0 @@ -# Command Submission - -Command buffers carry sequences of user commands on the CPU side. -They can be recorded independently of the work done on GPU, or each other. -They go through the following stages: - -creation -> "recording" -> "ready" -> "executing" -> done - -Command buffers are created from and submitted to a command queue. -Creation and submission do not have to follow the same order. -The queue is also used to signal fences, allowing the user to know when the command buffers are done. - -## Detailed Model - -Users issue rendering and compute commands (such as resource bindings, draw calls, etc) via command buffers. -The concept of `WebGPUCommandBuffer` matches the native graphics APIs. -Those command buffers go through the following stages in their life cycle. -It starts with creating a new `WebGPUCommandBuffer` from a `WebGPUCommandQueue` instance. -From this point, the command buffer is considered to be in "recording" state. - -Commands can be encoded independent of anything done on `WebGPUDevice` or the underlying GPU. 
-The recording is CPU-only operation, and multiple command buffers can be recorded independently on web workers. -(TODO: disallow recording multiple command buffers on the same thread/web worker?). -Recording usually consists of a number of passes, be it render or compute, with occasional copy operations inserted between them. - -Since a programmable pass defines the resource binding scope, synchronization rules, fixes the resource usage, and exposes a number of specific operations, we encapsulate the encoder of a pass into a separate object, such as `WebGPURenderPassEncoder` and `WebGPUComputePassEncoder`. -The pass encoder object can be obtained from a command buffer by calling `beginRenderPass` or `beginComputePass` correspondingly. -The command buffer is expected to be in "recording" state, or otherwise a synchronous error is triggered. -No operations may be done on the `WebGPUCommandBuffer` if there is an open pass being encoded to it. -Calling any methods on the command buffer with an open pass, or submitting it to the command queue, triggers a synchronous error. -A pass encoding consists of state setting code and draw/dispatch calls, which are all methods on the corresponding encoder object. -In order to close a pass, the user calls `WebGPUProgrammablePassEncoder::endPass`, which returns the owner `WebGPUCommandBuffer` object. -Passes cannot straddle command buffers, and a command buffer may contain multiple passes. - -In order to finish recording a command buffer, the user calls `WebGPUCommandBuffer::finish` method, which transitions it from "recording" to the "ready" state. -It is valid to transfer this object between web workers. -When "ready", a command buffer can only be submitted for execution via `WebGPUCommandQueue::submit`, and no recording operations are available. -This method gets a sequence of command buffers and submits them (in the given order) to the GPU driver. 
-There are a few hidden (from the user point of view) stages here before the command buffer actually reaches the GPU. - -Once submitted, the command buffer switches to "executing" state, which means the command buffer will execute (both on the CPU and GPU) in finite time. -If the WebGPU implementation fails to submit the command buffer due to a problem with recorded content (e.g. exceeding the limit for the instance count in a draw call), it is turned into an internally null object, and the asynchronous error is reported. -The feature to re-use command buffers for multiple submissions is still being discussed, and until this is clear, we consider the `WebGPUCommandBuffer` to be moved into submission. -Any operations on a command buffer in the "executing" state, other than dropping it (which is what the user is expected to do), would trigger a synchronous error. - -If the submission is successful, then at some point in time the GPU will be done processing it. -The WebGPU implementation takes the responsibility to detect this moment and gracefully recycle/destroy this command buffer, when it's safe to do so. diff --git a/design/ErrorConventions.md b/design/ErrorConventions.md deleted file mode 100644 index fff348123f..0000000000 --- a/design/ErrorConventions.md +++ /dev/null @@ -1,79 +0,0 @@ -# Error Synchronicity Conventions/Guidelines - -The behavior of every error case in WebGPU is defined by the spec on a -case-by-case basis. A given error has one of these behaviors: - -* Synchronously throws a JS exception. -* Occurs asynchronously - one of: - * Causes a WebGPU object to become internally null. Produces an error log entry. - * Causes an operation to no-op. Produces an error log entry. -* If the operation returns a Promise, it rejects (and maybe produces an error log entry). - -(If a "developer mode" is enabled, all validation errors are thrown -synchronously, as exceptions. 
Device loss may or may not be synchronous and -this behavior may be implementation-specific. -Out-of-memory errors should NOT be made synchronous if the application would -otherwise have an opportunity to recover from them.) - -**The guidelines below are meant to help choose the individual cases defined by -the spec, but every case must be specced. This does not allow for -"implementation-defined" behavior.** Note that an implementation can easily -surface a synchronous error to the application "as-if" it's asynchronous, but -it cannot do the opposite, so we prefer to err on the side of asynchronicity in -the spec. - -As a general rule, those error cases should follow the following guidelines, -but are allowed to deviate in individual cases. For WebGPU function call -`o.f(a, b, ...)`, let `A = {a: a, b: b, ...}` represent the object graph -passed into `o.f`. - -* If WebIDL's binding rules would throw an exception: Error **must** be synchronous. - E.g.: - * If a parameter is passed in which doesn't match the type declared by the WebIDL. - -* If the method `o.f` is part of a disabled feature: Error **must** be synchronous. - * If the feature is *known but disabled*, `o.f` can be called, but - throws an exception (in the implementation). - * If the feature is *unknown*, `o.f` is `undefined`; - calling `undefined` throws an exception. - -* If the method `o.f` is available, but would return an instance of an - interface defined in a disabled feature: Error **must** be synchronous. - * (We probably won't have this case anyway.) - -* If `o` **is** an interface (not an instance) that is defined in a disabled - feature: Error **must** be synchronous. However, note that the behavior - cannot match exactly: - * If the feature is *known but disabled*, `o.f` can be called, but - throws an exception (in the implementation). - * If the feature is *unknown*, `o` is not defined, so accessing `o` - throws an exception (and accessing `window.o` gives `undefined`). 
- -* If any object in `A` contains any key that is not core or part of an enabled - feature: Error **must** be synchronous. - * This is explicitly made more strict than the usual WebIDL dictionary - binding rules. - -* If any object in `A` is missing a required key (given current features): Error **must** be synchronous. - -* Validation which depends only on individual primitives (e.g. `Number`s) in - `A`, `device.limits`, and `device.features`: - Error **should (but may not)** be synchronous. - E.g.: - * A `Number` exceeds the associated entry in `limits`. - * Two arrays must match in length, but don't. - * A bitflag has two incompatible bits. - -* Validation which depends on state which *can be tracked* on the client-side: - Error **may (but usually won't)** be synchronous. - E.g.: - * `queue.signalFence(fence, 3); queue.signalFence(fence, 2);` - * Building an invalid command buffer (e.g. resource used in conflicting - ways inside a pass): Probably will not be synchronous. - -* Validation which depends on state which is *not synchronously known* on the client-side: - Error **must not** be synchronous. - E.g.: - * A WebGPU interface object argument is internally null. - * The device is lost. - * There is an out-of-memory condition. diff --git a/design/ErrorHandling.md b/design/ErrorHandling.md deleted file mode 100644 index c041c71be3..0000000000 --- a/design/ErrorHandling.md +++ /dev/null @@ -1,525 +0,0 @@ -# Error Handling - -The simplest design for error handling would be synchronous, for example with JavaScript exceptions. -However, this would introduce a lot of round-trip synchronization points for multi-threaded/multi-process WebGPU implementations, making it too slow to be useful. - -There are a number of cases that developers or applications need error handling for: - - - *Debugging*: Getting errors synchronously during development, to break into the debugger. 
- - *Fatal Errors*: Handling device/adapter loss, either by restoring WebGPU or by fallback to non-WebGPU content. - - *Fallible Allocation*: Making fallible resource allocations (detecting out-of-memory). - - *Testing*: Checking success of WebGPU calls, for conformance testing or application unit testing. - - *Telemetry*: Collecting error logs in deployment, for bug reporting and telemetry. - -There is one other use case that is closely related to error handling: - - - *Waiting for Completion*: Waiting for completion of off-queue GPU operations (like object creation). - -Meanwhile, error handling should not make the API clunky to use. - -## *Debugging*: Dev Tools - -Implementations should provide a way to enable synchronous validation, for example via a "break on WebGPU error" option in the developer tools. -The extra overhead needs to be low enough that applications can still run while being debugged. - -## *Fatal Errors*: requestAdapter, requestDevice, and device.lost - - - -```webidl -interface GPU { - Promise<GPUAdapter?> requestAdapter(optional GPURequestAdapterOptions options = {}); -}; -``` - -```webidl -interface GPUDeviceLostInfo { - readonly attribute DOMString message; -}; - -partial interface GPUDevice { - readonly attribute Promise<GPUDeviceLostInfo> lost; -}; -``` - -`GPU.requestAdapter` requests an adapter from the user agent. -It returns a Promise which resolves when an adapter is ready. -The Promise may not resolve for a long time - for example, the browser -could delay until a background tab is foregrounded, to make sure the right -adapter is chosen at the time the tab is foregrounded (in case the system -state, such as battery state, has changed). -If it resolves to `null`, the app knows for sure that its request could not be fulfilled -(at least, in the current system state...); it does not need to retry with the -same `GPURequestAdapterOptions`. -If the `options` are invalid (currently impossible), `requestAdapter()` rejects. 
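The three outcomes of `requestAdapter` described above (an adapter, `null`, or a rejection) can be sketched as a small decision helper. This is only an illustration under assumptions: `chooseBackend` and the `{ backend, reason }` result shape are hypothetical names that do not appear in the design, and `gpu` stands for any object exposing a Promise-returning `requestAdapter(options)`.

```js
// Sketch only: chooseBackend and its result shape are hypothetical.
// `gpu` is any object with a Promise-returning requestAdapter(options).
async function chooseBackend(gpu, options = {}) {
  let adapter;
  try {
    adapter = await gpu.requestAdapter(options);
  } catch (err) {
    // A rejection means the options were invalid, so retrying cannot help.
    return { backend: "fallback", reason: "invalid-options" };
  }
  if (adapter === null) {
    // null: the request cannot be fulfilled in the current system state,
    // so there is no point retrying with the same options.
    return { backend: "fallback", reason: "no-adapter" };
  }
  return { backend: "webgpu", adapter };
}
```

In both fallback cases the caller would move on to WebGL, 2D Canvas, or another non-WebGPU path instead of calling `requestAdapter` again with the same options.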
- -`GPUAdapter.requestDevice` requests a device from the adapter. -It returns a Promise which resolves when a device is ready. -The Promise may not resolve for a long time - for example, even if the -adapter is still valid, the browser could delay until a background tab is -foregrounded, to make sure that system resources are conserved until then. -If the adapter is lost and therefore unable to create a device, `requestDevice()` -returns an already-lost device. -If the `descriptor` is invalid (e.g. it exceeds the limits of the adapter), `requestDevice()` rejects. - -The `GPUDevice` may be lost if something goes fatally wrong on the device -(e.g. unexpected driver error, crash, or native device loss). -The `GPUDevice` provides a promise, `device.lost`, which resolves when the device is lost. -It will **never** reject and may be pending forever. - -Once `lost` resolves, the `GPUDevice` cannot be used anymore. -The device and all objects created from the device have become invalid. -All further operations on the device and its objects are errors. -The `"validationerror"` event will no longer fire. (This makes all further operations no-ops.) - -An app should never give up on getting WebGPU access due to `GPUDevice.lost` resolving. -Instead of giving up, the app should try again starting with `requestAdapter`. - -It *should* give up based on a `requestAdapter` returning `null` or rejecting. -(It should also give up on a `requestDevice` rejection, as that indicates an app -programming error - the request was invalid, e.g. not compatible with the adapter.) - -### Example Code - -**NOTE:** this example (and possibly the init API) still needs significant rework! - -```js -class MyRenderer { - constructor() { - this.adapter = null; - this.device = null; - } - async begin() { - const usingWebGPU = await this.initWebGPU(); - if (!usingWebGPU) { - this.initFallback(); - } - } - initFallback() { - // Try WebGL, 2D Canvas, or other fallback. 
- } - async initWebGPU() { - // Stop rendering. (If there was already a device, WebGPU calls made before - // the app notices the device is lost are okay - they are no-ops.) - this.device = null; - - // Keep current adapter (but make a new one if there isn't a current one.) - await this.tryEnsureDeviceOnCurrentAdapter(); - if (!this.adapter) return false; - // If the device is null, the adapter was lost. Try a new adapter. - // Continue doing this until one is found or an error is thrown. - while (!this.device) { - this.adapter = null; - await this.tryEnsureDeviceOnCurrentAdapter(); - if (!this.adapter) return false; - } - - // ... Upload resources, etc. - return true; - } - // TODO: This example should not retry on the current adapter, it should get a new adapter. - async tryEnsureDeviceOnCurrentAdapter() { - // If no adapter, get one. - // If we can't, rejects and the app falls back. - if (!this.adapter) { - // If no adapter, get one. - this.adapter = await gpu.requestAdapter({ /* options */ }); - // If requestAdapter resolves to null, no matching adapter is available. - // Exit to fallback. - if (!this.adapter) return; - } - - // Try to get a device. - // rejection => options were invalid (app programming error) - this.device = await this.adapter.requestDevice({ /* options */ }); - - // When the device is lost, just try to get a device again. - this.device.lost.then((info) => { - console.error("Device was lost.", info); - this.initWebGPU(); - }); - } -} -``` - -### Case Studies - -*What signals should the app get, and when?* - -Two independent applications are running on the same webpage against two devices on the same adapter. -The tab is in the background, and one device is using a lot of resources. - - The browser chooses to lose the heavier device. - - `device.lost` resolves, message = reclaiming device resources - - (If the app calls `requestDevice` on the same adapter, or `requestAdapter`, - it does not resolve until the tab is foregrounded.) 
- - Later, the browser might choose to lose the smaller device too. - - `device.lost` resolves, message = reclaiming device resources - - (If the app calls `requestDevice` on the same adapter, or `requestAdapter`, - it does not resolve until the tab is foregrounded.) - - The system configuration changes (e.g. laptop is unplugged). - - Since the adapter is no longer used, the UA may choose to lose it and - reject any outstanding `requestDevice` promises. - (Perhaps not until the tab is foregrounded.) - - (If the app calls `requestAdapter`, it does not resolve until the tab is foregrounded.) - -A page begins loading in a tab, but then the tab is backgrounded. - - On load, the page attempts creation of an adapter. - - The browser may or may not provide a WebGPU adapter yet - if it doesn't, - then when the page is foregrounded, the `requestAdapter` Promise will resolve. - (This allows the browser to choose an adapter based on the latest system state.) - -A device's adapter is physically unplugged from the system (but an integrated GPU is still available). - - The same adapter, or a new adapter, is plugged back in. - - A later `requestAdapter` call may return the new adapter. (In the future, it might fire a "gpuadapterschanged" event.) - -An app is running on an integrated adapter. - - A new, discrete adapter is plugged in. - - A later `requestAdapter` call may return the new adapter. (In the future, it might fire a "gpuadapterschanged" event.) - -An app is running on a discrete adapter. - - The adapter is physically unplugged from the system. An integrated GPU is still available. - - `device.lost` resolves, `requestDevice` on same adapter rejects, `requestAdapter` gives the new adapter. - - The same adapter, or a new adapter, is plugged back in. - - A later `requestAdapter` call may return the new adapter. (In the future, it might fire a "gpuadapterschanged" event.) - -The device is lost because of an unexpected error in the implementation. 
- - `device.lost` resolves, message = whatever the unexpected thing was. - -A TDR-like scenario occurs. - - The adapter is lost, which loses all devices on the adapter. - `device.lost` resolves on every device, message = adapter reset. Application must request adapter again. - - (TODO: alternatively, adapter could be retained, but all devices on it are lost.) - -All devices and adapters are lost (except for software?) because GPU access has been disabled by the browser (for this page or globally, e.g. due to unexpected GPU process crashes). - - `device.lost` resolves on every device, message = whatever - -WebGPU access has been disabled for the page. - - `requestAdapter` returns null (or a software adapter). - -The device is lost right as it's being returned by requestDevice, or otherwise couldn't be -created due to non-deterministic/internal reasons. - - `device.lost` resolves. - -## Operation Errors and Internal Nullability - -WebGPU objects are opaque handles. -On creation, such a handle is "pending" until the backing object is created by the implementation. -Asynchronously, a handle may refer to a successfully created object (called a "valid object"), or an internally-empty/unsuccessful object (called an "invalid object"). -The status of an object is opaque to JavaScript, except that any errors during object creation can be captured (see below). - -If a WebGPU object handle A is passed to a WebGPU API call C that requires a valid object, that API call opaquely accepts the object regardless of its status (pending, valid, or invalid). -However, internally and asynchronously, C will not be validated and executed until A's status has resolved. -If A resolves to invalid, C will fail (asynchronously). - -Errors in operations or creation will generate an error **into the current scope**. -An error may be captured by a surrounding Error Scope (described below). -If an error is not captured, it may fire the Device's "unhandlederror" event (below). 
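The pending/valid/invalid handle model described above can be illustrated with a toy simulation. This is not WebGPU API: `Handle`, `dependentCall`, and `reportError` are hypothetical names, and the real status of an object is opaque to JavaScript. The sketch only shows that a call accepts its operand immediately and that validation happens later, once the operand's status has resolved.

```js
// Toy model, not WebGPU API: Handle and dependentCall are hypothetical names.
class Handle {
  constructor(creation) {
    this.status = "pending"; // opaque to real applications
    // `settled` always fulfills; the outcome is recorded in `status`.
    this.settled = creation.then(
      () => { this.status = "valid"; },
      () => { this.status = "invalid"; }
    );
  }
}

// Call C accepts handle A regardless of its status and returns a new handle.
// C is validated only after A's status has resolved; if A turned out to be
// invalid, C fails asynchronously and reports an error via reportError.
function dependentCall(a, reportError) {
  return new Handle(a.settled.then(() => {
    if (a.status === "invalid") {
      reportError(new Error("operand handle is invalid"));
      throw new Error("dependent call failed");
    }
  }));
}
```

Here `reportError` stands in for "generate an error into the current scope"; how that error is then captured or surfaced is what the error scope mechanism defines.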
- -### Categories of WebGPU Calls - -#### Initialization - -Creation of the adapter and device. - - - `gpu.requestAdapter` - - `GPUAdapter.requestDevice` - -Handled by "Fatal Errors" above. - -#### Object-Returning - -WebGPU Object creation and getters. - - - `GPUDevice.createTexture` - - `GPUDevice.createBuffer` - - `GPUDevice.createBufferMapped` - - `GPUTexture.createView` - - `GPUTexture.createDefaultView` - - `GPUCommandEncoder.finish` - - `GPUDevice.getQueue` - - `GPUSwapChain.getCurrentTexture` - -If there is an error, the returned object is invalid, and an error is generated into the current scope. - -#### Encoding - -Recording of GPU commands in `GPUCommandEncoder`. - - - `GPUCommandEncoder.*` - - `GPURenderPassEncoder.*` - - `GPUComputePassEncoder.*` - -These commands do not report errors. -Instead, `GPUCommandEncoder.finish` returns an invalid object and generates an error into the current scope. - -#### Promise-Returning - - - `GPUDevice.createBufferMappedAsync` - - `GPUCanvasContext.getSwapChainPreferredFormat` - - `GPUFence.onCompletion` - - `GPUBuffer.mapReadAsync` - - `GPUBuffer.mapWriteAsync` - -If there is an error, the returned Promise rejects. - -#### Void-Returning - - - `GPUQueue.submit` - - `GPUQueue.signal` - - `GPUBuffer.unmap` - - `GPUBuffer.destroy` - - `GPUTexture.destroy` - -If there is an error, an error is generated into the current scope. - -#### Infallible - - - `GPUFence.getCompletedValue` - -This call cannot fail. - -## Error Scopes - -Each device\* maintains a persistent "error scope" stack state. -Initially, the device's error scope stack is empty. -`GPUDevice.pushErrorScope(filter)` creates an error scope and pushes it onto the stack. - -`GPUDevice.popErrorScope()` pops an error scope from the stack, and returns a `Promise`, which resolves once the enclosed operations are complete. 
-It resolves to null if no errors were captured, and otherwise resolves to the first error that occurred in the scope - -either a `GPUOutOfMemoryError` or a `GPUValidationError` object containing information about the validation failure. - -An error scope captures an error if its filter matches the type of the error: -`pushErrorScope('out-of-memory')` captures `GPUOutOfMemoryError`s; -`pushErrorScope('validation')` captures `GPUValidationError`s. -The filter mechanism prevents developers from, e.g., accidentally silencing validation errors when trying to do fallible allocation. - -If an error scope captures an error, the error is not passed out to the enclosing error scope. -Each error scope stores only the **first error** it captures, and returns that error when the scope is popped. -Any further errors it captures are **silently ignored**. - -If an error is not captured by an error scope, it is passed out to the enclosing error scope. - -If there are no error scopes on the stack, `popErrorScope()` throws OperationError. - -If the device is lost, `popErrorScope()` always rejects with OperationError. - -\* Error scope state is **per-device, per-execution-context**. -That is, when a `GPUDevice` is posted to a Worker for the first time, the new `GPUDevice` copy's error scope stack is empty. -(If a `GPUDevice` is copied *back* to an execution context it already existed on, it shares its error scope state with all other copies on that execution context.) - -```webidl -enum GPUErrorFilter { - "out-of-memory", - "validation" -}; - -interface GPUOutOfMemoryError {}; - -interface GPUValidationError { - readonly attribute DOMString message; -}; - -typedef (GPUOutOfMemoryError or GPUValidationError) GPUError; - -partial interface GPUDevice { - void pushErrorScope(GPUErrorFilter filter); - Promise popErrorScope(); -}; -``` - -### *Fallible Allocation* - -An `out-of-memory` error scope can be used to detect allocation failure. 
- -#### Example: tryCreateBuffer - -```js -async function tryCreateBuffer(device, desc) { - device.pushErrorScope('out-of-memory'); - const buffer = device.createBuffer(desc); - if (await device.popErrorScope() !== null) { - return null; - } - return buffer; -} -``` - -### *Waiting for Completion* - -Using a `validation` error scope can tell an application when validation has -completed, but is otherwise not intended to signal completion. - -(On-queue operation completion can be detected with `GPUFence`.) - -For pipeline creation, there are `createReadyComputePipeline` and -`createReadyRenderPipeline`. - -#### Example: createReadyRenderPipeline - -`createReadyRenderPipeline` is asynchronous. -Note `requestAnimationFrame`'s callback is not treated as asynchronous - only the first task is guaranteed to occur before the frame is displayed. - -```js -class Renderer { - init() { - const fastPipeline = createRenderPipeline(...); - this.pipeline = fastPipeline; - } - - prepareSlowPipeline() { - createReadyRenderPipeline(...).then((slowPipeline) => { - this.pipeline = slowPipeline; - }); - } - - draw() { - if (wantSlowPipeline) { - this.prepareSlowPipeline(); - } - // draw object with this.pipeline. - // It switches to the "slowPipeline" when it becomes available. - } -} - -renderer.init(); -const frame = () => { - requestAnimationFrame(frame); - renderer.draw(); -}; -requestAnimationFrame(frame); -``` - -### *Testing* - -Tests need to be able to reliably detect both expected and unexpected errors. - -#### Example - -```js -device.pushErrorScope('out-of-memory'); -device.pushErrorScope('validation'); - -{ - // Do stuff that shouldn't produce errors. - { - device.pushErrorScope('validation'); - device.doOperationThatErrors(); - device.popErrorScope().then(error => { assert(error !== null); }); - } - // More stuff that shouldn't produce errors -} - -// Detect unexpected errors. 
-device.popErrorScope().then(error => { assert(error === null); }); -device.popErrorScope().then(error => { assert(error === null); }); -``` - -## *Telemetry* - -If an error is not captured by an explicit error scope, it bubbles up to the device and **may** fire its `uncapturederror` event. - -This mechanism is like a programmatic way to access the warnings that appear in the developer tools. -Errors reported via the validation error event *should* also appear in the developer tools console as warnings (like in WebGL). -However, some developer tools warnings might not necessarily fire the event, and message strings could be different (e.g. some details omitted for security). - -The WebGPU implementation may choose not to fire the `uncapturederror` event for a given error, for example if it has fired too many times, too many times in a row, or with too many errors of the same kind. -This is similar to how console warnings would work, and work today for WebGL. -(In badly-formed applications, this mechanism can prevent the events from having a significant performance impact on the system.) - -**Unlike** error scoping, the `uncapturederror` event can only fire on the main thread (Window) event loop. - -```webidl -[ - Constructor(DOMString type, GPUUncapturedErrorEventInit gpuUncapturedErrorEventInitDict), - Exposed=Window -] -interface GPUUncapturedErrorEvent : Event { - readonly attribute GPUError error; -}; - -dictionary GPUUncapturedErrorEventInit : EventInit { - required DOMString message; -}; - -// TODO: is it possible to expose the EventTarget only on the main thread? 
-partial interface GPUDevice : EventTarget { - [Exposed=Window] - attribute EventHandler onuncapturederror; -}; -``` - -#### Example - -```js -const device = await adapter.requestDevice({}); -device.addEventListener('uncapturederror', (event) => { - appendToTelemetryReport(event.message); -}); -``` - -## Open Questions and Considerations - - - Is there a need for synchronous, programmatic capture of errors during development? - (E.g. an option to throw an exception on error instead of surfacing the error asynchronously. - Asynchronous error handling APIs are not enough to polyfill this.) - This would only be needed for printf-style debugging; a "break on WebGPU error" would be used for Dev Tools debugging. - - - How can a synchronous application (e.g. WASM port) handle all of these asynchronous errors? - A synchronous version of `popErrorScope` and other entry points would need to be exposed on Workers. - (A more general solution for using asynchronous APIs synchronously would also solve this.) - - - Should there be a maximum error scope depth? - - - Or should error scope balance be enforced by changing the API to e.g. `device.withErrorScope('validation', () => { device.stuff(); /*...*/ })`? - - - Should the error scope filter be a bitfield? - - - Should the error scope filter have a default value? - - - Should errors beyond the first in an error scope be silently ignored, bubble up to the parent error scope, or be immediately given to the `uncapturederror` event? - - (Currently, they are silently ignored.) - - - Should there be codes for different error types, to slightly improve testing fidelity? (e.g. `invalid-object`, `invalid-value`, `invalid-state`) - - - Should developers be able to self-impose a memory limit (in order to emulate lower-memory devices)? - Should implementations automatically impose a lower memory limit (to improve stability and portability)? 
- - - To help developers, should `GPUUncapturedErrorEvent.message` contain some sort of "stack trace" taking advantage of object debug labels? - For example: - - ``` - .submit failed: - - commands[0] () was invalid: - - in setIndexBuffer, indexBuffer () was invalid: - - in createBuffer, desc.usage was invalid (0x89) - ``` - - - How do applications handle the case where they've allocated a lot of optional memory, but want to make another required allocation (which could fail due to OOM)? - How do they know when to free an optional allocation first? - - For now, applications wanting to handle this kind of case must always use fallible allocations. - - (We will likely improve this with a `GPUResourceHeap`, once we figure out what that looks like.) - - - Should attempting to use a buffer or texture in the `"out-of-memory"` state (a) result in immediate device loss, (b) result in device loss when used in a device-level operation (submit, map, etc.), or (c) just produce a validation error? - - Currently described: none, implicitly (c) - -## Resolved Questions - - - In a world with persistent object "usage" state: - If an invalid command buffer is submitted, and its transitions becomes no-ops, the usage state won't update. - Will this cause future command buffer submits to become invalid because of a usage validation error? - - Tentatively resolved: WebGPU is expected not to require explicit usage transitions. - - - Should an object creation error immediately log an error to the error log? - Or should it only log if the error propagates to a device-level operation? - - Tentatively resolved: errors should be logged immediately. - - - Should applications be able to intentionally create graphs of potentially-invalid objects, and recover from this late? - E.g. create a large buffer, create a bind group from that, create a command buffer from that, then choose whether to submit based on whether the buffer was successfully allocated. 
- - For non-OOM, tentatively resolved: They can, inside of an error scope. Any subsequent errors can be suppressed. Not sure if it's useful. - - For OOM, see other questions about OOM. - - - Should there be an API that exposes object status? - - Resolved: No, but errors during object creation can be detected. - - - Should there be a way to capture out-of-memory errors without capturing validation errors? (And vice versa?) - - Resolved: Yes, so applications don't accidentally silence validation errors. diff --git a/design/ImageBitmapToTexture.md b/design/ImageBitmapToTexture.md deleted file mode 100644 index 83c571d002..0000000000 --- a/design/ImageBitmapToTexture.md +++ /dev/null @@ -1,39 +0,0 @@ -# WebGPU + ImageBitmap - -```webidl -dictionary GPUImageBitmapCopyView { - required ImageBitmap imageBitmap; - GPUOrigin2D origin; -}; - -partial interface GPUQueue { - void copyImageBitmapToTexture( - GPUImageBitmapCopyView source, - GPUTextureCopyView destination, - // For now, copySize.z must be 1. - GPUExtent3D copySize); -}; -``` - -`copyImageBitmapToTexture` submits a copy from a source sub-rectangle of an `ImageBitmap` into a destination sub-resource of a `GPUTexture`. -The `ImageBitmap` must not be detached, if it is, a validation error is generated. - -## Alternatives Considered - - * Creating a `GPUTexture` directly from an `ImageBitmap`, attempting to avoid copies, is impractical because it requires the GPUTexture's format to match the internal representation of the `ImageBitmap`, which is not exposed to the Web platform. - Additionally, `ImageBitmap`s may be GPU- or CPU-backed, and wrapping a CPU-backed `ImageBitmap` is a significant meta-operation that requires an additional copy to be submitted. - * Having `copyImageBitmapToTexture` on `GPUCommandEncoder`: this makes implementations much more complicated because they can't know when the copy will be effectively submitted. 
- It also allows having multiple `copyImageBitmapToTexture` at different spots in the `GPUCommandEncoder` which would require splicing the encoder and keeping track of all the chunks. - Realistically, copying `ImageBitmap`s will be during loading to copy from `` elements, or at most a couple times per frame for example to copy a camera frame, so an immediate copy is fine. - -## Issues - - * Some of the `ImageBitmap` creation options, such as `"flipY"`, have semantics that have to match the target graphics API where the data is intended to be used. - For WebGL, `imageOrientation: "flipY"` is necessary to ensure that the resulting `WebGLTexture` is oriented correctly. - For WebGPU, it may be the case that texture origins are defined differently from WebGL, necessitating `imageOrientation: "none"`. - These cases will have to be thoroughly tested. - - * The browser may choose an internal representation for an `ImageBitmap` which is not ideal for usage by WebGPU (or, for that matter, [by WebGL](https://crbug.com/831740)). - This could result in texture uploads being significantly more expensive than necessary due to per-pixel data swizzling during upload. - Providing any hint about the intended usage of the `ImageBitmap` during its construction (for example "for use with WebGL" or "for use with this WebGPU adapter") would require changes to the HTML specification. - Attempts to change Chrome's internal representation of `ImageBitmap` have not yet been successful; it's not clear how feasible it would be in other browsers. diff --git a/design/Limits.md b/design/Limits.md deleted file mode 100644 index 541305cbd7..0000000000 --- a/design/Limits.md +++ /dev/null @@ -1,43 +0,0 @@ -# GPULimits Explainer - -This document lists the citations for the "limits" in the WebGPU API that decide the minimum capabilities of a compliant WebGPU implementation. 
- -## The GPULimits Dictionary (last updated 2021-04-13) - -```javascript -dictionary GPULimits { - unsigned long maxBindGroups = 4; - unsigned long maxDynamicUniformBuffersPerPipelineLayout = 8; - unsigned long maxDynamicStorageBuffersPerPipelineLayout = 4; - unsigned long maxSampledTexturesPerShaderStage = 16; - unsigned long maxSamplersPerShaderStage = 16; - unsigned long maxStorageBuffersPerShaderStage = 4; - unsigned long maxStorageTexturesPerShaderStage = 4; - unsigned long maxUniformBuffersPerShaderStage = 12; - unsigned long maxVertexBuffers = 8; - unsigned long maxVertexAttributes = 16; - unsigned long maxVertexArrayStride = 2048; - unsigned long maxTextureDimension1D = 8192; - unsigned long maxTextureDimension2D = 8192; - unsigned long maxTextureDimension3D = 2048; - unsigned long maxTextureArrayLayers = 2048; -}; -``` - -Limit | API Doc | gpuweb issue/PR | Notes ---- | --- | --- | --- -`maxBindGroups = 4;` | [Vulkan](https://vulkan.lunarg.com/doc/view/1.2.170.0/linux/chunked_spec/chap42.html#limits) `maxBoundDescriptorSets` | | -`maxDynamicUniformBuffersPerPipelineLayout = 8;` | [Vulkan](https://vulkan.lunarg.com/doc/view/1.2.170.0/linux/chunked_spec/chap42.html#limits) `maxDescriptorSetUniformBuffersDynamic` | [#406](https://github.com/gpuweb/gpuweb/issues/406) | -`maxDynamicStorageBuffersPerPipelineLayout = 4;` | [Vulkan](https://vulkan.lunarg.com/doc/view/1.2.170.0/linux/chunked_spec/chap42.html#limits) `maxDescriptorSetStorageBuffersDynamic` | [#406](https://github.com/gpuweb/gpuweb/issues/406) | -`maxSampledTexturesPerShaderStage = 16;` | [Vulkan](https://vulkan.lunarg.com/doc/view/1.2.170.0/linux/chunked_spec/chap42.html#limits) `maxPerStageDescriptorSampledImages` | [#409](https://github.com/gpuweb/gpuweb/issues/409) | -`maxSamplersPerShaderStage = 16;` | [Vulkan](https://vulkan.lunarg.com/doc/view/1.2.170.0/linux/chunked_spec/chap42.html#limits) `maxPerStageDescriptorSamplers` | [#409](https://github.com/gpuweb/gpuweb/issues/409) | 
-`maxStorageBuffersPerShaderStage = 4;` | [Vulkan](https://vulkan.lunarg.com/doc/view/1.2.170.0/linux/chunked_spec/chap42.html#limits) `maxPerStageDescriptorStorageBuffers` | [#409](https://github.com/gpuweb/gpuweb/issues/409) | -`maxStorageTexturesPerShaderStage = 4;` | [Vulkan](https://vulkan.lunarg.com/doc/view/1.2.170.0/linux/chunked_spec/chap42.html#limits) `maxPerStageDescriptorStorageImages` | [#409](https://github.com/gpuweb/gpuweb/issues/409) | -`maxUniformBuffersPerShaderStage = 12;` | [Vulkan](https://vulkan.lunarg.com/doc/view/1.2.170.0/linux/chunked_spec/chap42.html#limits) `maxPerStageDescriptorUniformBuffers` | [#409](https://github.com/gpuweb/gpuweb/issues/409) | -`maxVertexBuffers = 8;` | [Vulkan](https://vulkan.lunarg.com/doc/view/1.2.170.0/linux/chunked_spec/chap42.html#limits) `maxVertexInputBindings` | [#693](https://github.com/gpuweb/gpuweb/issues/693) | -`maxVertexAttributes = 16;` | [Vulkan](https://vulkan.lunarg.com/doc/view/1.2.170.0/linux/chunked_spec/chap42.html#limits) `maxVertexInputAttributes` | [#693](https://github.com/gpuweb/gpuweb/issues/693) | -`maxVertexArrayStride = 2048;` | [Vulkan](https://vulkan.lunarg.com/doc/view/1.2.170.0/linux/chunked_spec/chap42.html#limits) `maxVertexInputBindingStride` | [#693](https://github.com/gpuweb/gpuweb/issues/693) | -`maxTextureDimension1D = 8192;` | [Vulkan](https://vulkan.lunarg.com/doc/view/1.2.170.0/linux/chunked_spec/chap42.html#limits) `maxImageDimension1D` | [#1327](https://github.com/gpuweb/gpuweb/issues/1327) | Vulkan's limit is 4096. We expand the limit to 8192 because [the vast majority of devices in market can support 8192 or a higher limit](https://vulkan.gpuinfo.org/displaydevicelimit.php?name=maxImageDimension1D). The devices that cannot support this limit are pretty rare and old. 
-`maxTextureDimension2D = 8192;` | [Vulkan](https://vulkan.lunarg.com/doc/view/1.2.170.0/linux/chunked_spec/chap42.html#limits) `maxImageDimension2D` | [#1327](https://github.com/gpuweb/gpuweb/issues/1327) | Vulkan's limit is 4096. We expand the limit to 8192 because [the vast majority of devices in market can support 8192 or a higher limit](https://vulkan.gpuinfo.org/displaydevicelimit.php?name=maxImageDimension2D). The devices that cannot support this limit are pretty rare and old. -`maxTextureDimension3D = 2048;` | [Vulkan](https://vulkan.lunarg.com/doc/view/1.2.170.0/linux/chunked_spec/chap42.html#limits) `maxImageDimension3D` | [#1327](https://github.com/gpuweb/gpuweb/issues/1327) | Vulkan's limit is 256. We expand the limit to 2048 because [the vast majority of devices in market can support 2048 or a higher limit](https://vulkan.gpuinfo.org/displaydevicelimit.php?name=maxImageDimension3D). The devices that cannot support this limit are pretty rare and old. -`maxTextureArrayLayers = 2048;` | [Vulkan](https://vulkan.lunarg.com/doc/view/1.2.170.0/linux/chunked_spec/chap42.html#limits) `maxImageArrayLayers` | [#1327](https://github.com/gpuweb/gpuweb/issues/1327) | Vulkan's limit is 256. We expand the limit to 2048 because [the vast majority of devices in market can support 2048 or a higher limit](https://vulkan.gpuinfo.org/displaydevicelimit.php?name=maxImageArrayLayers). The devices that cannot support this limit are pretty rare and old. diff --git a/design/Pipelines.md b/design/Pipelines.md deleted file mode 100644 index 59495866ba..0000000000 --- a/design/Pipelines.md +++ /dev/null @@ -1,295 +0,0 @@ -# Pipeline objects - -This is a document with pseudo-code for the parts of the API related to pipeline objects. -Naming and the UX of the API are just an example and could change, the important parts are what’s available and what goes where. 
-For example, this document uses C structures when a C++ API might want to use builder objects, and a JavaScript API could use dictionaries. - -## Creating pipeline objects - -For type safety, compute and graphics pipelines are separate types. -To create a pipeline, a structure containing all the relevant information is passed to `DeviceCreatePipeline`. - -## Compute pipeline creation - -To create a compute pipeline the only things needed are some shader code present in a `ShaderModule` object, and a `PipelineLayout` object describing how the pipeline interacts with the binding model. - -```cpp -struct ComputePipelineDescriptor { - ShaderModule module; - const char* entryPoint; - PipelineLayout layout; -}; - -ComputePipeline CreateComputePipeline(Device device, const ComputePipelineDescriptor* descriptor); -``` - -Translation to the backing APIs would be the following: - - **D3D12**: Translates to `ID3D12Device::CreateComputePipelineState`; a `D3D12_SHADER_BYTECODE` is created from the `(module, entryPoint)` pair, and the `ID3D12RootSignature` is equivalent to the `PipelineLayout`. - - **Metal**: Translates to `MTLDevice::makeComputePipelineState`; the `MTLFunction` is created from the `(module, entryPoint, layout)` tuple by adapting the generated MSL to the resource slot allocation done in `layout`. - - **Vulkan**: Translates to `vkCreateComputePipelines` with one pipeline. The `VkPipelineShaderStageCreateInfo` corresponds to `(module, entryPoint)` and the `VkPipelineLayout` corresponds to `layout`. - -Question: How do we take advantage of the pipeline caching present in D3D12 and Vulkan? Do we expose it to the application or is it done magically in the WebGPU implementation? - -Answer: deferred to post-MVP. 
- -## Render pipeline creation - -Render pipelines need `ShaderModule` and a `PipelineLayout` like compute pipelines and in addition require information about: - - Layout for vertex inputs - - Layout for fragment outputs - - All the fixed-function state - -For simplicity we assume most fixed-function state is created in separate objects. -For example a `DepthStencilState` object would be allocated and a pointer to it would be stored in the `RenderPipelineDescriptor`. This is part of the UX of the API and could be replaced with chained structures like Vulkan or member structures like D3D12. - -Mismatch: - - Metal has primitive restart always enabled. - - D3D12 needs to know whether the primitive restart index is `0xFFFF` or `0xFFFFFFFF` at pipeline creation time. - - Metal doesn’t have a sample mask. - - Vulkan can have some state like scissor and viewport set on the pipeline as an optimization on some GPUs. - - Vulkan allows creating pipelines in bulk; this is not only a UX thing but allows reusing some results for faster creation. - -```cpp -enum IndexFormat { - IndexFormatUint16, - IndexFormatUint32, -}; - -struct RenderPipelineDescriptor { - // Same translation as for compute pipelines - ShaderModule vsModule; - const char* vsEntryPoint; - ShaderModule fsModule; - const char* fsEntryPoint; - PipelineLayout layout; - - // Pipeline input / outputs - InputState* inputState; - IndexFormat indexFormat; - RenderPass* renderPass; - int subpassIndex; - - // Fixed function state - DepthStencilState* depthStencil; - BlendState* blend[kMaxColorAttachments]; - PrimitiveTopology topology; - // TODO other state: rasterizer state, “multisample state” -}; - - -RenderPipeline CreateRenderPipeline(Device device, const RenderPipelineDescriptor* descriptor); -``` - -Translation to the backing APIs would be the following: - - **D3D12**: Translates to `ID3D12Device::CreateGraphicsPipelineState`. `IBStripCutValue` will always be set with its value being chosen depending on `indexFormat`. 
- - **Metal**: Translates to `MTLDevice::makeRenderPipelineState`. - - **Vulkan**: Translates to `vkCreateGraphicsPipelines`. `VkPipelineInputAssemblyStateCreateInfo`'s `primitiveRestartEnable` is always set to true. All dynamic states are set on all pipelines. - -Question: Should the type of the indices be set in `RenderPipelineDescriptor`? If not, how is the D3D12 `IBStripCutValue` chosen? - -Answer: While `indexFormat` isn't necessary in any of the three APIs, we chose to include it in the pipeline state because primitive restart must always be enabled (because of Metal) and D3D12 needs to choose the correct `IBStripCutValue`. The alternative would have been to compile two D3D12 pipelines for every WebGPU pipeline, or defer compilation. - -The translation of individual members of RenderPipelineDescriptor is described below. - -### Input state - -This describes how the vertex buffers are stepped through (stride, instance vs. vertex, instance divisor), and how the attributes are extracted from the buffers (buffer index, format, offset). - -Mismatches: - - D3D12 takes the stride along with the vertex buffers in `ID3D12GraphicsCommandList::IASetVertexBuffers` whereas Vulkan and Metal take it at pipeline compilation time. - - Vulkan doesn’t support a divisor for its step rate. - -```cpp -enum StepRate { - StepRateVertex, - StepRateInstance, -}; - -enum VertexFormat { - // TODO make a list of portable vertex formats -}; - -struct InputStateDescriptor { - struct { - bool enabled; - VertexFormat format; - int offsetInBuffer; - int bufferIndex; - } attributes[MAX_ATTRIBUTES]; - - struct { - StepRate rate; - int stride; - } buffers[MAX_VERTEX_BUFFERS]; -}; - -InputState* CreateInputState(Device* device, InputStateDescriptor* descriptor); -``` - -Translation to the backing APIs would be the following: - - **D3D12**: Translates to a `D3D12_INPUT_LAYOUT_DESC`. - Each enabled attribute corresponds to a `D3D12_INPUT_ELEMENT_DESC` with `InputSlot` being the index of the attribute. 
- Other members of the `D3D12_INPUT_ELEMENT_DESC` are translated trivially. - The stride is looked up in the pipeline state before calls to `ID3D12GraphicsCommandList::IASetVertexBuffers`. - `IASetVertexBuffers` might be deferred until before a draw and vertex buffers might be invalidated by pipeline changes. - - **Metal**: Translates to a `MTLVertexDescriptor`, with attributes corresponding to `MTLVertexDescriptor::attributes` and buffers corresponding to `MTLVertexDescriptor::layouts`. - Attributes translate trivially to `MTLVertexAttributeDescriptor` structures and buffers to `MTLVertexBufferLayoutDescriptor` structures. - Extra care only needs to be taken to translate a zero stride to a constant step rate. - - **Vulkan**: Translates to a `VkPipelineVertexInputStateCreateInfo`. - Buffers translate trivially to `VkVertexInputBindingDescription` and attributes to `VkVertexInputAttributeDescription`. - -Question: Should the vertex attributes somehow be included in the PipelineLayout so vertex buffers are treated as other resources and changed in bulk with them? - -Answer: We decided against innovating in this area. - -### Render pass / render target format - -The `RenderPass` will contain for each subpass a list of the attachment formats for color attachments and depth-stencil attachments. -Information from the `RenderPass` is used to fill the following: - - **D3D12**: `RTVFormats`, `DSVFormats` and `NumRenderTargets` in `D3D12_GRAPHICS_PIPELINE_STATE_DESC`. - - **Metal**: `colorAttachments[N].pixelFormat`, `depthAttachmentPixelFormat` and `stencilAttachmentPixelFormat` in `MTLRenderPipelineDescriptor`. - - **Vulkan**: `renderPass` and `subpass` in `VkGraphicsPipelineCreateInfo`. - -Question: does the sample count of the pipeline state come from the RenderPass too? - -Answer: deferred post-MVP. - -### Primitive topology - -Mismatch: - - Metal and D3D12 only require “point vs. line vs. 
triangle” at pipeline compilation time, the exact topology is set via `ID3D12GraphicsCommandList::IASetPrimitiveTopology` or passed in the `MTLRenderCommandEncoder::draw*`. - Vulkan requires the exact topology at compilation time. - - Vulkan supports triangle fans but Metal and D3D12 don’t. - -```cpp -enum PrimitiveTopology { - PrimitiveTopologyPoints, - PrimitiveTopologyLineList, - PrimitiveTopologyLineStrip, - PrimitiveTopologyTriangleList, - PrimitiveTopologyTriangleStrip, -}; -``` - -Translation to the backing APIs would be the following: - - **D3D12 and Metal**: The primitive topology type is set on the `D3D12_GRAPHICS_PIPELINE_STATE_DESC` and `MTLRenderPipelineDescriptor`. - At draw-time, the exact topology is queried from the pipeline. - - **Vulkan**: The primitive topology type is set in the `VkGraphicsPipelineCreateInfo`. - -### Blend state - -Mismatch: - - In Vulkan per-attachment blending and dual source blending are exposed as optional features. - `independentBlend` is supported almost everywhere but Adreno 4XX while `dualSrcBlend` is also not supported on Mali GPUs. - - Metal doesn’t have logic ops. 
- -```cpp -enum BlendOperation { - BlendOperationAdd, - BlendOperationSubtract, - BlendOperationReverseSubtract, - BlendOperationMin, - BlendOperationMax, -}; - -enum BlendFactor { - BlendFactorOne, - BlendFactorSrcColor, - BlendFactorOneMinusSrcColor, - BlendFactorSrcAlpha, - BlendFactorOneMinusSrcAlpha, - BlendFactorDstColor, - BlendFactorOneMinusDstColor, - BlendFactorDstAlpha, - BlendFactorOneMinusDstAlpha, - BlendFactorSrcAlphaSaturated, - BlendFactorBlendColor, - BlendFactorOneMinusBlendColor, -}; - -struct BlendStateDescriptor { - bool enabled; - BlendFactor srcColorFactor; - BlendFactor dstColorFactor; - BlendFactor srcAlphaFactor; - BlendFactor dstAlphaFactor; - BlendOperation colorOperation; - BlendOperation alphaOperation; - int writeMask; -}; - -BlendState* CreateBlendState(Device* device, BlendStateDescriptor* descriptor); -``` - -Translation to backing APIs would be the following: - - **D3D12**: when filling the `D3D12_GRAPHICS_PIPELINE_STATE_DESC`, `BlendState` will be filled with data coming from the `BlendStates` referenced in the `RenderPipelineDescriptor`. - Translation from a `BlendState` to a `D3D12_RENDER_TARGET_BLEND_DESC` is trivial. - - **Metal**: the `BlendStates` will be used to fill all of the data for a `MTLRenderPipelineColorAttachmentDescriptor` but `pixelFormat`. - Translation of individual members is trivial. - - **Vulkan**: the `BlendStates` will be translated to elements of `pAttachments` in the `VkPipelineColorBlendStateCreateInfo`. - Translation of individual members is trivial. - -Open question: Should enablement of independent attachment blend state be explicit like in D3D12 or implicit? - -Open question: Should alpha to coverage be part of the multisample state or the blend state? - -### Depth stencil state - -Mismatch: - - D3D12 doesn’t have per-face stencil read and write masks. - - In Metal the depth stencil state is built and bound separately from the pipeline state. 
- -```cpp -enum CompareFunction { - CompareFunctionNever, - CompareFunctionLess, - CompareFunctionLessEqual, - CompareFunctionGreater, - CompareFunctionGreaterEqual, - CompareFunctionEqual, - CompareFunctionNotEqual, - CompareFunctionAlways, -}; - -enum StencilOperation { - StencilOperationKeep, - StencilOperationZero, - StencilOperationReplace, - StencilOperationInvert, - StencilOperationIncrementClamp, - StencilOperationDecrementClamp, - StencilOperationIncrementWrap, - StencilOperationDecrementWrap, -}; - -struct StencilFaceDescriptor { - CompareFunction stencilCompare; - StencilOperation stencilPass; - StencilOperation stencilFail; - StencilOperation depthFail; -}; - -struct DepthStencilStateDescriptor { - CompareFunction depthCompare; - StencilFaceDescriptor front; - StencilFaceDescriptor back; - int stencilReadMask; - int stencilWriteMask; -}; - -DepthStencilState* CreateDepthStencilState(Device* device, DepthStencilStateDescriptor* descriptor); -``` - -Translation to backing APIs would be the following: - - **D3D12**: `DepthStencilState` translates trivially to a `D3D12_DEPTH_STENCIL_DESC`. - `DepthEnable` would be set as `depthCompare != Always`. - - **Metal**: `DepthStencilState` translates trivially to `MTLDepthStencilDescriptor` except that front and back stencil masks have to be set to the single stencil mask value from WebGPU. - When a pipeline is bound, the corresponding depth-stencil state is bound at the same time. - - **Vulkan**: `DepthStencilState` translates trivially to `VkPipelineDepthStencilStateCreateInfo` except that front and back stencil masks have to be set to the single stencil mask value from WebGPU. - `depthTestEnable` would be set to `depthCompare != Always`. - -Question: What about Vulkan’s `VkPipelineDepthStencilStateCreateInfo::depthBoundsTestEnable` and D3D12's `D3D12_DEPTH_STENCIL_DESC1::DepthBoundsTestEnable`? - -Answer: deferred post-MVP. - -Open question: Should “depth test enable” be implicit or explicit? 
diff --git a/design/RejectedErrorHandling.md b/design/RejectedErrorHandling.md deleted file mode 100644 index a0e5b3698f..0000000000 --- a/design/RejectedErrorHandling.md +++ /dev/null @@ -1,61 +0,0 @@ -# Rejected Fatal Error Handling Revisions - -Appendix document for [ErrorHandling.md](ErrorHandling.md). - -Revisions in this document were rejected by the author (@kainino0x) before publishing. -They are kept for posterity, as examples of previous ideas. - -## Revision 3-ish - -The `GPUAdapter` and `GPUDevice` are event targets which receive events about adapter and device status. - -```webidl -partial interface GPUAdapter : EventTarget { - readonly attribute boolean isReady; -}; - -interface GPUAdapterLostEvent : Event { - readonly attribute DOMString reason; -}; - -interface GPUAdapterReadyEvent : Event {}; -``` - -```webidl -partial interface GPUDevice : EventTarget {}; - -interface GPUDeviceLostEvent : Event { - readonly attribute boolean recoverable; - readonly attribute DOMString reason; -}; -``` - -If `GPUAdapter`'s `isReady` attribute is false, `createDevice` will fail. -`isReady` may be set to `false` when a `"gpu-device-lost"` event fires. -It will always be set to `true` when a `"gpu-adapter-ready"` event fires. - - - `GPUAdapter` `"gpu-adapter-lost" -> GPUAdapterLostEvent`: - Signals that the `GPUAdapter` cannot be used anymore. - Sets the adapter's status to `"invalid"`. - Any further `createDevice` calls will return invalid objects. - - - `GPUAdapter` `"gpu-adapter-ready" -> GPUAdapterReadyEvent`: - Signals when it is okay to create new devices on this adapter. - It may fire only if: - - the adapter is still valid, - - the adapter's `isReady` attribute is `true`, and - - the adapter's `isReady` attribute was `false`. - - - `GPUDevice` `"gpu-device-lost" -> GPUDeviceLostEvent`: - Signals that the `GPUDevice` cannot be used anymore. - Sets the status of the device and its objects to `"invalid"`. 
- (The `"gpulogentry"` event will not fire after a device loss, so this makes all further operations on the device effectively no-ops.) - This may happen if something goes fatally wrong on the device (e.g. unexpected out-of-memory, crash, or native device loss). - When this event is handled, the adapter's `isReady` attribute may be `false`, which indicates the application cannot make new devices. - This event **may** cause the adapter's `isReady` attribute to become `false`. - - -### Rejected - -This scheme requires apps to do a spaghettical incantation in order to know what to do, and when. -It involves listening to all of these events, diligently checking flags in the event handlers, and understanding weird races (like an adapter became ready and then was immediately lost, or an adapter became ready and then vends an immediately lost device). diff --git a/design/TimelineFences.md b/design/TimelineFences.md deleted file mode 100644 index 22c67234b1..0000000000 --- a/design/TimelineFences.md +++ /dev/null @@ -1,95 +0,0 @@ -# Timeline fences - -Having fences store a number internally and wait / signal numbers is what we call numerical fences. -This is the design that D3D12 chose for `ID3D12Fence`. -D3D12 allows waiting on a fence before it is signaled, but Vulkan disallows doing this on `VkSemaphore` because some OSes lack kernel primitive to wait-before-signal. -This means that WebGPU will have to validate that any wait on a fence will be unblocked by a prior signal operation that has been enqueued. -To simplify the validation of signal-before-wait, we can force signaled number to be strictly increasing. - -Thus each fence has two pieces of internal state: - - The signaled value, the latest value passed to a signal to the fence, which is also the greatest thanks to the monotonicity - - The completed value, the value corresponding to the latest signal operation that has been executed. 
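The two pieces of fence state can be modeled with a small simulation. This is a sketch under assumptions: `FenceModel`, `enqueueSignal`, `canWaitOn`, and `executeNext` are hypothetical names used only for illustration, and the queue is reduced to a FIFO list of pending signal operations.

```ts
// Toy model of a numerical fence; not an implementation of any real API.
class FenceModel {
  private signaled: number;
  private completed: number;
  private pending: number[] = [];

  constructor(initialValue: number = 0) {
    // Both internal values start at the same initial value.
    this.signaled = initialValue;
    this.completed = initialValue;
  }

  // Enqueue a signal operation; signaled values must be strictly increasing.
  enqueueSignal(value: number): void {
    if (value <= this.signaled) {
      throw new RangeError("signal value must be greater than the signaled value");
    }
    this.signaled = value;
    this.pending.push(value);
  }

  // A wait is acceptable only if an already-enqueued signal will unblock it.
  canWaitOn(value: number): boolean {
    return value <= this.signaled;
  }

  // Simulate the queue executing its oldest pending signal operation.
  executeNext(): void {
    const next = this.pending.shift();
    if (next !== undefined) {
      this.completed = next;
    }
  }

  get signaledValue(): number {
    return this.signaled;
  }

  get completedValue(): number {
    return this.completed;
  }
}
```

Because signaled values only grow, checking `value <= signaledValue` is enough to guarantee that a wait will eventually be unblocked, which is the simplification the monotonicity rule is meant to buy.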
- -The fences will require additional restrictions and operations if WebGPU has multiple queues. -These changes will be tagged with [multi-queue]. - -# Creating fences - -To mirror native APIs, fences are created directly on the `WebGPUDevice`. - -```webidl -dictionary WebGPUFenceDescriptor { - u64 initialValue = 0; - - // [multi-queue] - WebGPUQueue signalQueue = null; -}; - -partial interface WebGPUFence {}; - -partial interface WebGPUDevice { - WebGPUFence createFence(WebGPUFenceDescriptor descriptor); -}; -``` - -The fence is created with both internal values set to `initialValue` and [multi-queue] is restricted to be signaled on `signalQueue`. -If `signalQueue` is set to `null`, it will act as if it was set to the "default queue" (pending multi-queue definition of what that is). - -# Signaling - -Signaling a fence is done via this method: - -```webidl -partial interface WebGPUQueue { - void signal(WebGPUFence fence, u64 signalValue); -}; -``` - -A JavaScript exception is generated if: - - `signalValue` is smaller than or equal to the signaled value of the fence. - - [multi-queue] the fence is signaled on a queue other than the one passed as the `signalQueue` of its descriptor. - This restriction is to make sure the signal operations are well-ordered, which would be more complicated if you could signal on multiple queues. - -After the call the signaled value is updated to `signalValue`. - -## Observing fences on the CPU - -Observing the state of a fence on the CPU can be done by the following synchronous and non-blocking call: - -```webidl -partial interface WebGPUFence { - u64 getCompletedValue(); -}; -``` - -Alternatively it is possible to wait for a given value to be completed: - -```webidl -partial interface WebGPUFence { - Promise onCompletion(u64 value); -}; -``` - -This call generates a JavaScript exception if `value` is greater than the fence's signaled value (to make sure the promise completes in finite time).
-There is no way to synchronously wait on a fence in this proposal (it is better handled via the requestMainLoop proposal). -The promise is completed as soon as the fence's completed value is higher than `value`. -The promise can be rejected for example if the device is lost. - -## Waiting on fences on the GPU [multi-queue] - -If there are multiple queues, it would be possible to wait on a fence on a different queue: - -```webidl -partial interface WebGPUQueue { - void wait(WebGPUFence fence, u64 value); -}; -``` - -This call generates a Javascript exception if `value` is greater than the fence's signaled value. -It makes further execution on the queue wait until the value is passed on the fence. - -## Questions - - - Should we call fences "timelines" and have them created on queues like so `queue.createTimeline()`? - - How do we wait synchronously on fences? - Maybe it could be similar to `Atomics.wait`? diff --git a/design/UsageValidationRules.md b/design/UsageValidationRules.md deleted file mode 100644 index d649a8f4c4..0000000000 --- a/design/UsageValidationRules.md +++ /dev/null @@ -1,82 +0,0 @@ -## Definition of "resources being used" - -Resources in WebGPU can be used in different ways that are declared at resource creation time and interact with various validation rules. - -Buffers have the following usages (and commands inducing that usage): - - `VERTEX` for the `buffers` in calls to `WebGPUCommandEncoder.setVertexBuffers` - - `INDEX` for `buffer` in calls to `WebGPUCommandEncoder.setIndexBuffer` - - `INDIRECT` for `indirectBuffer` in calls to `WebGPUCommandEncode.{draw|drawIndexed|dispatch}Indirect` - - `UBO` and `STORAGE` for buffers referenced by bindgroups passed to `setBindGroup`, with the usage corresponding to the binding's type. - - `COPY_SRC` for buffers used as the copy source of various commands. - - `COPY_DST` for buffers used as the copy destination of various commands. - - (Maybe `STORAGE_TEXEL` and `UNIFORM_TEXEL`?) 
- -Textures have the following usages: - - `RENDER_ATTACHMENT` for the subresources referenced by `WebGPURenderPassDescriptor` - - `SAMPLED` and `STORAGE` for subresources corresponding to the image views referenced by bindgroups passed to `setBindGroup`, with the usage corresponding to the binding's type. - - `COPY_SRC` for textures used as the copy source of various commands. - - `COPY_DST` for textures used as the copy destination of various commands. - -Read only usages are `VERTEX`, `INDEX`, `INDIRECT`, `UBO`, `COPY_SRC` and `SAMPLED`. - -## Render passes - -In render passes the only writable resources are textures used as `RENDER_ATTACHMENT` and resources used as `STORAGE`. - -To avoid data hazards the simplest validation rules would be to check for every subresource that it is used as either: - - `RENDER_ATTACHMENT` - - A combination of read-only usages - - `STORAGE` - -If there is no usage of `STORAGE` then there are no data-races as everything is read-only, except `RENDER_ATTACHMENT` which is well-ordered. -For `STORAGE` resources, the order of reads and writes to the same memory location is undefined and it is up to the application to ensure data-race freeness. - -## Compute passes - -They are similar to render passes, every subresource (alternatively every resource) must be used as either: - - A combination of read-only usages - - `STORAGE` - -In a single `dispatch` command, the order of reads and writes to the same memory location of a `STORAGE` resource is undefined. -It is up to the application to ensure data-race freeness. - -## Other operations - -Assuming we don't have "copy passes" and that other operations are top-level, then a command buffers looks like a sequence of: - - Compute passes - - Render passes - - Copies and stuff - -There are no particular constraints on the usage of the operations (apart from resources having been created with the proper usage flags). 
-Implementations ensure that there are no-data races between any of these operations. -There are validation rules to check for example that the source and destinations regions of buffer copies don't have overlap. - -## Discussion - -In this proposal the only sources of data races come from the `STORAGE` usage: - - In render passes inside a single drawcall and between drawcalls for accesses to memory written to as `STORAGE`. - - In any compute passes inside a single dispatch for accesses to memory written to as `STORAGE` - -There are cases where the resource usage tracking could get expensive for example if subresources of a texture with an odd index use `STORAGE` while even ones use `SAMPLED`. -That said such a case is unlikely to happen in real-world applications. - -The cost of the state tracking necessary to add memory barriers inside compute passes and at command buffer top-level seems manageable (though difficult to describe in the 30 minutes I have before the meeting). - -There is no read-only storage that could be used in combination with other read-only usages because some APIs don't support it. -For example D3D12 assumes UAV is always writeable and disallows transitioning to a combination of UAV and a read-only usage. - -## Open questions - -### More constrained texture usage validation - -Each layer and mip-level of textures can have an independent usage which means implementations might need to track usage per mip-level per layer of a resource. -If this is deemed too costly, we could only have two sub-resources tracked in textures: the part as `RENDER_ATTACHMENT` and the rest. -This would mean for example that a texture couldn't be used as both `STORAGE` and a read-only usage inside a render pass. -The state tracking required in implementation would become significantly simpler at the cost of flexibility for the applications. -When using the more constrained version of usage validation for textures, the cost of validation is O(commands). 
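The per-subresource rule from the "Render passes" and "Compute passes" sections could be sketched as a small predicate. This is an illustrative sketch only: `isValidPassUsage`, `Usage`, and `PassKind` are hypothetical names, usages are plain strings rather than bit flags, and the function looks at one subresource within one pass.

```ts
type Usage =
  | "VERTEX" | "INDEX" | "INDIRECT" | "UBO" | "STORAGE"
  | "COPY_SRC" | "COPY_DST" | "SAMPLED" | "RENDER_ATTACHMENT";
type PassKind = "render" | "compute";

// The read-only usages listed in the definition section.
const READ_ONLY_USAGES: readonly Usage[] = [
  "VERTEX", "INDEX", "INDIRECT", "UBO", "COPY_SRC", "SAMPLED",
];

// `usages` holds every way one subresource is used inside a single pass.
function isValidPassUsage(usages: readonly Usage[], passKind: PassKind): boolean {
  // Drop duplicates so that repeated uses of the same kind count once.
  const distinct = usages.filter((u, i) => usages.indexOf(u) === i);
  if (distinct.length === 1) {
    // A single writable usage is allowed on its own.
    if (distinct[0] === "STORAGE") return true;
    if (distinct[0] === "RENDER_ATTACHMENT") return passKind === "render";
  }
  // Otherwise every usage must be read-only (an unused subresource is fine).
  return distinct.every((u) => READ_ONLY_USAGES.indexOf(u) !== -1);
}
```

Running this check per subresource is what makes the fully general rule potentially expensive for textures; the constrained variant discussed above would reduce the number of subresources that need to be tracked.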
- -### Post MVP: Opt-in for cross-dispatch data-races - -Enforcing no data-races between dispatches could potentially result in a lot of ALU time being wasted on big GPUs. -We could allow applications to opt into cross-dispatch data-races to get some of the performance back by manually placing "UAV barriers". -A `WebGPUComputePassDescriptor.manualStorageBarriers` option could be added as well as a `WebGPUCommandEncoder.storageBarrier(...)` command. diff --git a/explainer/Makefile b/explainer/Makefile deleted file mode 100644 index 06cd95a772..0000000000 --- a/explainer/Makefile +++ /dev/null @@ -1,8 +0,0 @@ -all: index.html - -index.html: index.bs - bikeshed --die-on=everything spec index.bs - -online: - curl https://api.csswg.org/bikeshed/ -F file=@index.bs -F output=err - curl https://api.csswg.org/bikeshed/ -F file=@index.bs -F force=1 > index.html diff --git a/explainer/index.bs b/explainer/index.bs deleted file mode 100644 index 47369de203..0000000000 --- a/explainer/index.bs +++ /dev/null @@ -1,1143 +0,0 @@ - - -Issue(tabatkins/bikeshed#2006): Set up cross-linking into the WebGPU and WGSL specs. - - -# Motivation # {#motivation} - -See [Introduction](https://gpuweb.github.io/gpuweb/#introduction). - - -# Security/Privacy # {#security} - -See [Malicious use considerations](https://gpuweb.github.io/gpuweb/#malicious-use). - - -# Additional Background # {#background} - - -## Sandboxed GPU Processes in Web Browsers ## {#gpu-process} - -A major design constraint for WebGPU is that it must be implementable and efficient in browsers that use a GPU-process architecture. -GPU drivers need access to additional kernel syscalls than what's otherwise used for Web content, and many GPU drivers are prone to hangs or crashes. -To improve stability and sandboxing, browsers use a special process that contains the GPU driver and talks with the rest of the browser through asynchronous IPC. -GPU processes are (or will be) used in Chromium, Gecko, and WebKit. 
- -GPU processes are less sandboxed than content processes, and they are typically shared between multiple origins. -Therefore, they must validate all messages, for example to prevent a compromised content process from being able to look at the GPU memory used by another content process. -Most of WebGPU's validation rules are necessary to ensure it is secure to use, so all the validation needs to happen in the GPU process. - -Likewise, all GPU driver objects only live in the GPU process, including large allocations (like buffers and textures) and complex objects (like pipelines). -In the content process, WebGPU types (`GPUBuffer`, `GPUTexture`, `GPURenderPipeline`, ...) are mostly just "handles" that identify objects that live in the GPU process. -This means that the CPU and GPU memory used by WebGPU object isn't necessarily known in the content process. -A `GPUBuffer` object can use maybe 150 bytes of CPU memory in the content process but hold a 1GB allocation of GPU memory. - -See also the description of [the content and device timelines in the specification](https://gpuweb.github.io/gpuweb/#programming-model-timelines). - - -# JavaScript API # {#api} - - -## Adapter Selection and Device Init ## {#initialization} - -Issue: Some changes are expected here. - -A WebGPU "adapter" (`GPUAdapter`) is an object which provides a connection to a particular WebGPU -implementation on the system (e.g. a hardware accelerated implementation on an integrated or -discrete GPU, or software implementation). -To get a `GPUAdapter` is to select which implementation to use (if multiple are available). - -Each adapter may have different optional capabilities called "features" and "limits". -These are the maximum possible capabilities that can be requested when a device is created. - -To get an adapter, an application calls `navigator.gpu.requestAdapter()`, optionally passing -options which may influence what adapter is chosen. 
- -`requestAdapter()` always resolves, but may resolve to null if an adapter can't be returned with -the specified options. - -
-const adapter = await navigator.gpu.requestAdapter(options);
-if (!adapter) return goToFallback();
-
- -A WebGPU "device" (`GPUDevice`) represents a connection to a WebGPU implementation, as well as -an arena for all WebGPU objects created from it (textures, command buffers, etc.) -All WebGPU usage is done through a WebGPU "device" (`GPUDevice`) or objects created from it. -In this sense, it serves a subset of the purpose of `WebGLRenderingContext`; however, unlike -`WebGLRenderingContext`, it is not associated with a canvas object, and most commands are -issued through "child" objects. - -To get a device, an application calls `adapter.requestDevice()`, optionally passing a descriptor -which enables additional optional capabilities (features and limits). -When any work is issued to the device, it is strictly validated against the capabilities passed -to `requestDevice()` - not the capabilities of the adapter. - -`requestDevice()` will reject (only) if the request exceeds the capabilities of the adapter. -It may *not* resolve to `null`; instead, to simplify the number of different cases an app must -handle, it may resolve to a `GPUDevice` which has already been lost - see [[#device-loss]]. - -
-const device = await adapter.requestDevice(descriptor);
-device.lost.then(recoverFromDeviceLoss);
-
- -An adapter may become unavailable, e.g. if it is unplugged from the system, disabled to save -power, or marked "stale" (`[[current]]` becomes false). -Such an adapter can no longer vend valid devices, and always returns already-lost `GPUDevice`s. - - -## Object Validity and Destroyed-ness ## {#invalid-and-destroyed} - -### WebGPU's Error Monad ### {#error-monad} - -A.k.a. Contagious Internal Nullability. -A.k.a. transparent [promise pipelining](http://erights.org/elib/distrib/pipeline.html). - -WebGPU is a very chatty API, with some applications making tens of thousands of calls per frame to render complex scenes. -We have seen that the GPU processes needs to validate the commands to satisfy their security property. -To avoid the overhead of validating commands twice in both the GPU and content process, WebGPU is designed so Javascript calls can be forwarded directly to the GPU process and validated there. -See the error section for more details on what's validated where and how errors are reported. - -At the same time, during a single frame WebGPU objects can be created that depend on one another. -For example a `GPUCommandBuffer` can be recorded with commands that use temporary `GPUBuffer`s created in the same frame. -In this example, because of the performance constraint of WebGPU, it is not possible to send the message to create the `GPUBuffer` to the GPU process and synchronously wait for its processing before continuing Javascript execution. - -Instead, in WebGPU all objects (like `GPUBuffer`) are created immediately on the content timeline and returned to JavaScript. -The validation is almost all done asynchronously on the "device timeline". -In the good case, when no errors occur (validation or out-of-memory), everything looks to JS as if it is synchronous. -However, when an error occurs in a call, it becomes a no-op (aside from error reporting). 
-If the call returns an object (like `createBuffer`), the object is tagged as "invalid" on the GPU process side. - -All WebGPU calls validate that all their arguments are valid objects. -As a result, if a call takes one WebGPU object and returns a new one, the new object is also invalid (hence the term "contagious"). - -
-
- Timeline diagram of messages passing between processes, demonstrating how errors are propagated without synchronization. -
- -
- -
- Using the API when doing only valid calls looks like a synchronous API: - -
-        const srcBuffer = device.createBuffer({
-            size: 4,
-            usage: GPUBufferUsage.COPY_SRC
-        });
-
-        const dstBuffer = ...;
-
-        const encoder = device.createCommandEncoder();
-        encoder.copyBufferToBuffer(srcBuffer, 0, dstBuffer, 0, 4);
-
-        const commands = encoder.finish();
-        device.queue.submit([commands]);
-    
-
- -
- Errors propagate contagiously when creating objects: - -
-        // The size of the buffer is too big, this causes an OOM and srcBuffer is invalid.
-        const srcBuffer = device.createBuffer({
-            size: BIG_NUMBER,
-            usage: GPUBufferUsage.COPY_SRC
-        });
-
-        const dstBuffer = ...;
-
-        // The encoder starts as a valid object.
-        const encoder = device.createCommandEncoder();
-        // Special case: an invalid object is used when encoding commands so the encoder
-        // becomes invalid.
-        encoder.copyBufferToBuffer(srcBuffer, 0, dstBuffer, 0, 4);
-
-        // The encoder (the `this` argument to GPUCommandEncoder.finish) is invalid,
-        // so the call returns an invalid object.
-        const commands = encoder.finish();
-        // The command references an invalid object so it becomes a noop.
-        device.queue.submit([commands]);
-    
-
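The contagion in the example above can be imitated with a small synchronous model in which every object carries a validity flag that only the "GPU process" side can see. The sketch is purely illustrative: `Internal`, `createBufferModel`, `copyModel`, `finishModel`, `submitModel`, and `MAX_SIZE` are hypothetical names, and real implementations do this work asynchronously in another process.

```ts
// Toy stand-in for a GPU-process-side object.
interface Internal {
  readonly kind: string;
  isValid: boolean;
}

// Stand-in for "an allocation this large would run out of memory".
const MAX_SIZE = 1024;

function createBufferModel(size: number): Internal {
  // A failed allocation still returns an object; it is just tagged invalid.
  return { kind: "buffer", isValid: size <= MAX_SIZE };
}

function createEncoderModel(): Internal {
  return { kind: "encoder", isValid: true };
}

// Encoding a command that uses an invalid object invalidates the encoder.
function copyModel(encoder: Internal, src: Internal, dst: Internal): void {
  if (!src.isValid || !dst.isValid) {
    encoder.isValid = false;
  }
}

// finish() on an invalid encoder yields an invalid command buffer.
function finishModel(encoder: Internal): Internal {
  return { kind: "commandBuffer", isValid: encoder.isValid };
}

// submit() becomes a no-op (reported here as `false`) if anything is invalid.
function submitModel(commands: Internal[]): boolean {
  return commands.every((c) => c.isValid);
}
```

In this model the content side never reads `isValid`; it only passes handles along, which mirrors why the calls look synchronous to JavaScript.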
- -#### Mental Models #### {#error-monad-mental-model} - -One way to interpret WebGPU's semantics is that every WebGPU object is actually a `Promise` internally and that all WebGPU methods are `async` and `await` before using each of the WebGPU objects it gets as argument. -However the execution of the async code is outsourced to the GPU process (where it is actually done synchronously). - -Another way, closer to actual implementation details, is to imagine that each `GPUFoo` JS object maps to a `gpu::InternalFoo` C++/Rust object on the GPU process that contains a `bool isValid`. -Then during the validation of each command on the GPU process, the `isValid` are all checked and a new, invalid object is returned if validation fails. -On the content process side, the `GPUFoo` implementation doesn't know if the object is valid or not. - -### Early Destruction of WebGPU Objects ### {#early-destroy} - -Most of the memory usage of WebGPU objects is in the GPU process: it can be GPU memory held by objects like `GPUBuffer` and `GPUTexture`, serialized commands held in CPU memory by `GPURenderBundles`, or complex object graphs for the WGSL AST in `GPUShaderModule`. -The JavaScript garbage collector (GC) is in the renderer process and doesn't know about the memory usage in the GPU process. -Browsers have many heuristics to trigger GCs but a common one is that it should be triggered on memory pressure scenarios. -However a single WebGPU object can hold on to MBs or GBs of memory without the GC knowing and never trigger the memory pressure event. - -It is important for WebGPU applications to be able to directly free the memory used by some WebGPU objects without waiting for the GC. -For example applications might create temporary textures and buffers each frame and without the explicit `.destroy()` call they would quickly run out of GPU memory. -That's why WebGPU has a `.destroy()` method on those object types which can hold on to arbitrary amount of memory. 
-It signals that the application doesn't need the content of the object anymore and that it can be freed as soon as possible. -Of course, it becomes a validation error to use the object after the call to `.destroy()`. -
-
-        const dstBuffer = device.createBuffer({
-            size: 4,
-            usage: GPUBufferUsage.COPY_DST
-        });
-
-        // The buffer is not destroyed (and valid), success!
-        device.queue.writeBuffer(dstBuffer, 0, myData);
-
-        dstBuffer.destroy();
-
-        // The buffer is now destroyed; commands that would use its
-        // content produce validation errors.
-        device.queue.writeBuffer(dstBuffer, 0, myData);
-    
-
- -Note that, while this looks somewhat similar to the behavior of an invalid buffer, it is distinct. -Unlike invalidity, destroyed-ness can change after creation, is not contagious, and is validated only when work is actually submitted (e.g. `queue.writeBuffer()` or `queue.submit()`), not when creating dependent objects (like command encoders, see above). - - -## Errors ## {#errors} - -In a simple world, error handling in apps would be synchronous with JavaScript exceptions. -However, for multi-process WebGPU implementations, this is prohibitively expensive. - -See [[#invalid-and-destroyed]], which also explains how the *browser* handles errors. - -### Problems and Solutions ### {#errors-solutions} - -Developers and applications need error handling for a number of cases: - -- *Debugging*: - Getting errors synchronously during development, to break in to the debugger. -- *Fatal Errors*: - Handling device/adapter loss, either by restoring WebGPU or by fallback to non-WebGPU content. -- *Fallible Allocation*: - Making fallible GPU-memory resource allocations (detecting out-of-memory conditions). -- *Fallible Validation*: - Checking success of WebGPU calls, for applications' unit/integration testing, WebGPU - conformance testing, or detecting errors in data-driven applications (e.g. loading glTF - models that may exceed device limits). -- *Telemetry*: - Collecting error logs in deployment, for bug reporting and telemetry. - -The following sections go into more details on these cases and how they are solved. - -#### Debugging #### {#errors-cases-debugging} - -**Solution:** Dev Tools. - -Implementations should provide a way to enable synchronous validation, -for example via a "break on WebGPU error" option in the developer tools. - -This can be achieved with a content-process-gpu-process round-trip in every validated WebGPU -call, though in practice this would be very slow. 
-It can be optimized by running a second, approximate mirror of the validation steps in the -content process (it will not always have the same results since it cannot immediately know about -out-of-memory errors). - -#### Fatal Errors: Adapter and Device Loss #### {#errors-cases-fatalerrors} - -**Solution:** [[#device-loss]]. - -#### Fallible Allocation, Fallible Validation, and Telemetry #### {#errors-cases-other} - -**Solution:** *Error Scopes*. - -For important context, see [[#invalid-and-destroyed]]. In particular, all errors (validation and -out-of-memory) are detected asynchronously, in a remote process. -In the WebGPU spec, we refer to the thread of work for each WebGPU device as its "device timeline". - -As such, applications need a way to instruct the device timeline on what to do with any errors -that occur. To solve this, WebGPU uses *Error Scopes*. - -### Error Scopes ### {#errors-errorscopes} - -WebGL exposed errors using a `getError` function returning the first error since the last `getError` call. -This is simple, but has two problems. - -- It is synchronous, incurring a round-trip and requiring all previously issued work to be finished. - We solve this by returning errors asynchronously. -- Its flat state model composes poorly: errors can leak to/from unrelated code, possibly in - libraries/middleware, browser extensions, etc. We solve this with a stack of error "scopes", - allowing each component to hermetically capture and handle its own errors. - -Each device<sup>1</sup> maintains a persistent "error scope" stack state. -Initially, the device's error scope stack is empty. -`GPUDevice.pushErrorScope('validation')` or `GPUDevice.pushErrorScope('out-of-memory')` -begins an error scope and pushes it onto the stack. -This scope captures only errors of the given type, chosen according to the kind of error the application -wants to detect.
- -`GPUDevice.popErrorScope()` ends an error scope, popping it from the stack and returning a -`Promise`, which resolves once all enclosed fallible operations have completed and -reported back. -It resolves to `null` if no errors were captured, and otherwise resolves to an object describing -the first error that was captured by the scope - either a `GPUValidationError` or a -`GPUOutOfMemoryError`. - -Any device-timeline error from an operation is passed to the top-most error scope on the stack at -the time it was issued. - -- If an error scope captures an error, the error is not passed down the stack. - Each error scope stores only the **first** error it captures; any further errors it captures - are **silently ignored**. -- If not, the error is passed down the stack to the enclosing error scope. -- If an error reaches the bottom of the stack, it **may**<sup>2</sup> fire the `uncapturederror` - event on `GPUDevice`<sup>3</sup> (and could issue a console warning as well). - -<sup>1</sup> -In the plan to add [[#multithreading]], error scope state is expected to be **per-device, per-realm**. -That is, when a GPUDevice is posted to a Worker for the first time, the error scope stack for -that device+realm is always empty. -(If a GPUDevice is copied *back* to an execution context it already existed on, it shares its -error scope state with all other copies on that execution context.) - -<sup>2</sup> -The implementation may choose not to fire the event for every error, for example if it -has fired too many times, too many times rapidly, or with too many errors of the same kind. -This is similar to how Dev Tools console warnings work today for WebGL. -In poorly-formed applications, this mechanism can prevent the events from having a significant -performance impact on the system. - -<sup>3</sup> -More specifically, with [[#multithreading]], this event would only exist on the *originating* -`GPUDevice` (the one that came from `requestDevice`). -It doesn't exist on `GPUDevice`s produced by sending messages.
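The capture rules above can be imitated with a synchronous toy model. This is a sketch under assumptions, not the real API: `ErrorScopeStackModel`, `report`, and `uncaptured` are hypothetical names, `pop` returns the captured error directly instead of a promise, and throwing on an empty stack is this sketch's own choice.

```ts
type ErrorFilter = "validation" | "out-of-memory";

interface ModelError {
  type: ErrorFilter;
  message: string;
}

interface Scope {
  filter: ErrorFilter;
  first: ModelError | null;
}

class ErrorScopeStackModel {
  private stack: Scope[] = [];
  // Errors that reached the bottom of the stack without being captured.
  readonly uncaptured: ModelError[] = [];

  push(filter: ErrorFilter): void {
    this.stack.push({ filter: filter, first: null });
  }

  // Ends the top-most scope and returns its first captured error, if any.
  pop(): ModelError | null {
    const scope = this.stack.pop();
    if (scope === undefined) {
      throw new Error("error scope stack is empty");
    }
    return scope.first;
  }

  // Routes an error from the top of the stack downwards.
  report(error: ModelError): void {
    for (const scope of this.stack.slice().reverse()) {
      if (scope.filter === error.type) {
        // Captured: only the first error is stored, later ones are dropped.
        if (scope.first === null) {
          scope.first = error;
        }
        return;
      }
    }
    this.uncaptured.push(error);
  }
}
```

In the real design the routing decision is made for the scope stack as it was when the operation was issued, while the result arrives later; the model collapses both moments into one call to keep the stack behavior visible.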
- -```webidl -enum GPUErrorFilter { - "out-of-memory", - "validation" -}; - -interface GPUOutOfMemoryError { - constructor(); -}; - -interface GPUValidationError { - constructor(DOMString message); - readonly attribute DOMString message; -}; - -typedef (GPUOutOfMemoryError or GPUValidationError) GPUError; - -partial interface GPUDevice { - undefined pushErrorScope(GPUErrorFilter filter); - Promise popErrorScope(); -}; -``` - -#### How this solves *Fallible Allocation* #### {#errors-errorscopes-allocation} - -If a call that fallibly allocates GPU memory (e.g. `createBuffer` or `createTexture`) fails, the -resulting object is invalid (same as if there were a validation error), but an `'out-of-memory'` -error is generated. -An `'out-of-memory'` error scope can be used to detect it. - -**Example: tryCreateBuffer** - -```ts -async function tryCreateBuffer(device: GPUDevice, descriptor: GPUBufferDescriptor): Promise { - device.pushErrorScope('out-of-memory'); - const buffer = device.createBuffer(descriptor); - if (await device.popErrorScope() !== null) { - return null; - } - return buffer; -} -``` - -This interacts with buffer mapping in subtle ways, but they are not explained here. -The principle used to design the interaction is that app code should need to handle as few -different edge cases as possible, so multiple kinds of situations should result in the same -behavior. - -#### How this solves *Fallible Validation* #### {#errors-errorscopes-validation} - -A `'validation'` error scope can be used to detect validation errors, as above. - -**Example: Testing** - -```ts -device.pushErrorScope('out-of-memory'); -device.pushErrorScope('validation'); - -{ - // (Do stuff that shouldn't produce errors.) - - { - device.pushErrorScope('validation'); - device.doOperationThatIsExpectedToError(); - device.popErrorScope().then(error => { assert(error !== null); }); - } - - // (More stuff that shouldn't produce errors.) -} - -// Detect unexpected errors. 
-device.popErrorScope().then(error => { assert(error === null); }); -device.popErrorScope().then(error => { assert(error === null); }); -``` - -#### How this solves *Telemetry* #### {#errors-errorscopes-telemetry} - -As mentioned above, if an error is not captured by an error scope, it **may** fire the -originating device's `uncapturederror` event. -Applications can either watch for that event, or encapsulate parts of their application with -error scopes, to detect errors for generating error reports. - -`uncapturederror` is not strictly necessary to solve this, but has the benefit of providing a -single stream for uncaptured errors from all threads. - -#### Error Messages and Debug Labels #### {#errors-errorscopes-labels} - -Every WebGPU object has a read-write attribute, `label`, which can be set by the application to -provide information for debugging tools (error messages, native profilers like Xcode, etc.) -Every WebGPU object creation descriptor has a member `label` which sets the initial value of the -attribute. - -Additionally, parts of command buffers can be labeled with debug markers and debug groups. -See [[#command-encoding-debug]]. - -For both debugging (dev tools messages) and telemetry, implementations can choose to report some -kind of "stack trace" in their error messages, taking advantage of object debug labels. -For example: - -``` -.submit failed: -- commands[0] () was invalid: -- in the debug group : -- in the debug group : -- in setIndexBuffer, indexBuffer () was invalid: -- in createBuffer, desc.usage was invalid (0x89) -``` - - -## Device Loss ## {#device-loss} - -Any situation that prevents further use of a `GPUDevice`, regardless of whether it is caused by a -WebGPU call (e.g. `device.destroy()`, unrecoverable out-of-memory, GPU process crash, or GPU -reset) or happens externally (e.g. GPU unplugged), results in a device loss. - -**Design principle:** -There should be as few different-looking error behaviors as possible. 
-This makes it easier for developers to test their app's behavior in different situations, -improves robustness of applications in the wild, and improves portability between browsers. - -Issue: Finish this explainer (see [ErrorHandling.md](https://github.com/gpuweb/gpuweb/blob/main/design/ErrorHandling.md#fatal-errors-requestadapter-requestdevice-and-devicelost)). - - -## Buffer Mapping ## {#buffer-mapping} - -A `GPUBuffer` represents a memory allocations usable by other GPU operations. -This memory can be accessed linearly, contrary to `GPUTexture` for which the actual memory layout of sequences of texels are unknown. Think of `GPUBuffers` as the result of `gpu_malloc()`. - -**CPU→GPU:** When using WebGPU, applications need to transfer data from JavaScript to `GPUBuffer` very often and potentially in large quantities. -This includes mesh data, drawing and computations parameters, ML model inputs, etc. -That's why an efficient way to update `GPUBuffer` data is needed. `GPUQueue.writeBuffer` is reasonably efficient but includes at least an extra copy compared to the buffer mapping used for writing buffers. - -**GPU→CPU:** Applications also often need to transfer data from the GPU to Javascript, though usually less often and in lesser quantities. -This includes screenshots, statistics from computations, simulation or ML model results, etc. -This transfer is done with buffer mapping for reading buffers. - -### Background: Memory Visibility with GPUs and GPU Processes ### {#buffer-mapping-background} - -The two major types of GPUs are called "integrated GPUs" and "discrete GPUs". -Discrete GPUs are separate from the CPU; they usually come as PCI-e cards that you plug into the motherboard of a computer. -Integrated GPUs live on the same die as the CPU and don't have their own memory chips; instead, they use the same RAM as the CPU. 
- -When using a discrete GPU, it's easy to see that most GPU memory allocations aren't visible to the CPU because they are inside the GPU's RAM (or VRAM for Video RAM). -For integrated GPUs most memory allocations are in the same physical places, but not made visible to the CPU for various reasons (for example, the CPU and GPU can have separate caches for the same memory, so accesses are not cache-coherent). -Instead, for the CPU to see the content of a GPU buffer, it must be "mapped", making it available in the virtual memory space of the application (think of mapped as in `mmap()`). -GPUBuffers must be specially allocated in order to be mappable; this can make them less efficient to access from the GPU (for example if they need to be allocated in RAM instead of VRAM). - -This discussion has centered on native GPU APIs, but in browsers, the GPU driver is loaded in the _GPU process_, so native GPU buffers can be mapped only in the GPU process's virtual memory. -In general, it is not possible to map the buffer directly inside the _content process_ (though some systems can do this, providing optional optimizations). -To work with this architecture an extra "staging" allocation is needed in shared memory between the GPU process and the content process. - -The table below recapitulates which type of memory is visible where: - - - - - - - - - -
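-<table>
-    <tr><th></th><th>Regular `ArrayBuffer`</th><th>Shared Memory</th><th>Mappable GPU buffer</th><th>Non-mappable GPU buffer (or texture)</th></tr>
-    <tr><th>CPU, in the content process</th><td>**Visible**</td><td>**Visible**</td><td>Not visible</td><td>Not visible</td></tr>
-    <tr><th>CPU, in the GPU process</th><td>Not visible</td><td>**Visible**</td><td>**Visible**</td><td>Not visible</td></tr>
-    <tr><th>GPU</th><td>Not visible</td><td>Not visible</td><td>**Visible**</td><td>**Visible**</td></tr>
-</table>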
- -### CPU-GPU Ownership Transfer ### {#buffer-mapping-ownership} - -In native GPU APIs, when a buffer is mapped, its content becomes accessible to the CPU. -At the same time the GPU can keep using the buffer's content, which can lead to data races between the CPU and the GPU. -This means that the usage of mapped buffer is simple but leaves the synchronization to the application. - -On the contrary, WebGPU prevents almost all data races in the interest of portability and consistency. -In WebGPU there is even more risk of non-portability with races on mapped buffers because of the additional "shared memory" step that may be necessary on some drivers. -That's why `GPUBuffer` mapping is done as an ownership transfer between the CPU and the GPU. -At each instant, only one of the two can access it, so no race is possible. - -When an application requests to map a buffer, it initiates a transfer of the buffer's ownership to the CPU. -At this time, the GPU may still need to finish executing some operations that use the buffer, so the transfer doesn't complete until all previously-enqueued GPU operations are finished. -That's why mapping a buffer is an asynchronous operation (we'll discuss the other arguments below): - - -typedef [EnforceRange] unsigned long GPUMapModeFlags; -interface GPUMapMode { - const GPUFlagsConstant READ = 0x0001; - const GPUFlagsConstant WRITE = 0x0002; -}; - -partial interface GPUBuffer { - Promise<undefined> mapAsync(GPUMapModeFlags mode, - optional GPUSize64 offset = 0, - optional GPUSize64 size); -}; - - -
- Using it is done like so: - -
-        // Mapping a buffer for writing. Here offset and size are defaulted,
-        // so the whole buffer is mapped.
-        const myMapWriteBuffer = ...;
-        await myMapWriteBuffer.mapAsync(GPUMapMode.WRITE);
-
-        // Mapping a buffer for reading. Only the first four bytes are mapped.
-        const myMapReadBuffer = ...;
-        await myMapReadBuffer.mapAsync(GPUMapMode.READ, 0, 4);
-    
-
- -Once the application has finished using the buffer on the CPU, it can transfer ownership back to the GPU by unmapping it. -This is an immediate operation that makes the application lose all access to the buffer on the CPU (i.e. detaches `ArrayBuffers`): - - -partial interface GPUBuffer { - undefined unmap(); -}; - - -
- Using it is done like so: - -
-        const myMapReadBuffer = ...;
-        await myMapReadBuffer.mapAsync(GPUMapMode.READ, 0, 4);
-        // Do something with the mapped buffer.
-        myMapReadBuffer.unmap();
-    
-
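The ownership hand-off described in this section can be pictured as a small state machine. The following is only a toy model under simplifying assumptions, not the specification's actual buffer state tracking: the class `OwnedBufferModel` and all of its method names are hypothetical and exist purely for illustration.

```ts
// Toy model of the CPU/GPU ownership hand-off. All names are hypothetical;
// this is not how any WebGPU implementation is required to track state.
type Owner = "gpu" | "pending" | "cpu";

class OwnedBufferModel {
    private owner: Owner = "gpu";

    // Models mapAsync(): starts transferring ownership to the CPU.
    requestMap(): void {
        if (this.owner !== "gpu") {
            throw new Error("buffer is already mapped or being mapped");
        }
        this.owner = "pending";
    }

    // Models the GPU finishing all previously enqueued work that uses the
    // buffer, which is the point where the transfer can complete.
    gpuWorkFinished(): void {
        if (this.owner === "pending") {
            this.owner = "cpu";
        }
    }

    // Models unmap(): immediately hands the buffer back to the GPU.
    unmap(): void {
        this.owner = "gpu";
    }

    // The mapped contents are only reachable while the CPU is the owner.
    canAccessFromCpu(): boolean {
        return this.owner === "cpu";
    }

    // GPU work using the buffer may only be submitted while the GPU owns it.
    canSubmitGpuWork(): boolean {
        return this.owner === "gpu";
    }
}
```

At every point in this model exactly one side (or neither, while the transfer is pending) can touch the buffer, which is the property that rules out CPU/GPU data races.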
- -When transferring ownership to the CPU, a copy may be necessary from the underlying mapped buffer to shared memory visible to the content process. -To avoid copying more than necessary, the application can specify which range it is interested in when calling `GPUBuffer.mapAsync`. - -`GPUBuffer.mapAsync`'s `mode` argument controls which type of mapping operation is performed. -At the moment its values are redundant with the buffer creation's usage flags, but it is present for explicitness and future extensibility. - -While a `GPUBuffer` is owned by the CPU, it is not possible to submit any operations on the device timeline that use it; otherwise, a validation error is produced. -However it is valid (and encouraged!) to record `GPUCommandBuffer`s using the `GPUBuffer`. - -### Creation of Mappable Buffers ### {#buffer-mapping-creation} - -The physical memory location for a `GPUBuffer`'s underlying buffer depends on whether it should be mappable and whether it is mappable for reading or writing (native APIs give some control on the CPU cache behavior for example). -At the moment mappable buffers can only be used to transfer data (so they can only have the correct `COPY_SRC` or `COPY_DST` usage in addition to a `MAP_*` usage), -That's why applications must specify that buffers are mappable when they are created using the (currently) mutually exclusive `GPUBufferUsage.MAP_READ` and `GPUBufferUsage.MAP_WRITE` flags: - -
-
-        const myMapReadBuffer = device.createBuffer({
-            usage: GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST,
-            size: 1000,
-        });
-        const myMapWriteBuffer = device.createBuffer({
-            usage: GPUBufferUsage.MAP_WRITE | GPUBufferUsage.COPY_SRC,
-            size: 1000,
-        });
-    
-
- -### Accessing Mapped Buffers ### {#buffer-mapping-access} - -Once a `GPUBuffer` is mapped, it is possible to access its memory from JavaScript. -This is done by calling `GPUBuffer.getMappedRange`, which returns an `ArrayBuffer` called a "mapping". -These are available until `GPUBuffer.unmap` or `GPUBuffer.destroy` is called, at which point they are detached. -These `ArrayBuffer`s typically aren't new allocations, but instead pointers to some kind of shared memory visible to the content process (IPC shared memory, `mmap`ped file descriptor, etc.). - -When transferring ownership to the GPU, a copy may be necessary from the shared memory to the underlying mapped buffer. -`GPUBuffer.getMappedRange` takes an optional range of the buffer to map (for which `offset` 0 is the start of the buffer). -This way the browser knows which parts of the underlying `GPUBuffer` have been "invalidated" and need to be updated from the memory mapping. - -The range must be within the range requested in `mapAsync()`. - - -partial interface GPUBuffer { - ArrayBuffer getMappedRange(optional GPUSize64 offset = 0, - optional GPUSize64 size); -}; - - -
- Using it is done like so: - -
-        const myMapReadBuffer = ...;
-        await myMapReadBuffer.mapAsync(GPUMapMode.READ);
-        const data = myMapReadBuffer.getMappedRange();
-        // Do something with the data
-        myMapReadBuffer.unmap();
-    
-
- -### Mapping Buffers at Creation ### {#buffer-mapping-at-creation} - -A common need is to create a `GPUBuffer` that is already filled with some data. -This could be achieved by creating a final buffer, then a mappable buffer, filling the mappable buffer, and then copying from the mappable to the final buffer, but this would be inefficient. -Instead this can be done by making the buffer CPU-owned at creation: we call this "mapped at creation". -All buffers can be mapped at creation, even if they don't have the `MAP_WRITE` buffer usage. -The browser will just handle the transfer of data into the buffer for the application. - -Once a buffer is mapped at creation, it behaves like a regularly mapped buffer: `GPUBuffer.getMappedRange()` is used to retrieve `ArrayBuffer`s, and ownership is transferred to the GPU with `GPUBuffer.unmap()`. - -
- Mapping at creation is done by passing `mappedAtCreation: true` in the buffer descriptor on creation: - -
-        const buffer = device.createBuffer({
-            usage: GPUBufferUsage.UNIFORM,
-            size: 256,
-            mappedAtCreation: true,
-        });
-        const data = buffer.getMappedRange();
-        // write to data
-        buffer.unmap();
-    
-
- -When using advanced methods to transfer data to the GPU (with a rolling list of buffers that are mapped or being mapped), mapping a buffer at creation can be used to immediately create additional space in which to put the data to be transferred. - -### Examples ### {#buffer-mapping-examples} - -
- The optimal way to create a buffer with initial data, for example here a [Draco](https://google.github.io/draco/)-compressed 3D mesh: - -
-        const dracoDecoder = ...;
-
-        const buffer = device.createBuffer({
-            usage: GPUBufferUsage.VERTEX | GPUBufferUsage.INDEX,
-            size: dracoDecoder.decompressedSize,
-            mappedAtCreation: true,
-        });
-
-        dracoDecoder.decodeIn(buffer.getMappedRange());
-        buffer.unmap();
-    
-
- -
- Retrieving data from a texture rendered on the GPU: - -
-        const texture = getTheRenderedTexture();
-
-        const readbackBuffer = device.createBuffer({
-            usage: GPUBufferUsage.COPY_DST | GPUBufferUsage.MAP_READ,
-            size: 4 * textureWidth * textureHeight,
-        });
-
-        // Copy data from the texture to the buffer.
-        const encoder = device.createCommandEncoder();
-        encoder.copyTextureToBuffer(
-            { texture },
-            { buffer: readbackBuffer, rowPitch: textureWidth * 4 },
-            [textureWidth, textureHeight],
-        );
-        device.queue.submit([encoder.finish()]);
-
-        // Get the data on the CPU.
-        await readbackBuffer.mapAsync(GPUMapMode.READ);
-        saveScreenshot(readbackBuffer.getMappedRange());
-        readbackBuffer.unmap();
-    
-
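One detail the readback example glosses over: WebGPU requires the buffer-side row stride of a texture-to-buffer copy (`rowPitch` in the example, `bytesPerRow` in later revisions of the API) to be a multiple of 256 bytes, so a tightly packed `textureWidth * 4` only works for some widths. The following is a minimal sketch of how an application might pad the stride and strip the padding after mapping. The helper names and the `ROW_ALIGNMENT` constant are hypothetical, and the buffer size computed here is a conservative upper bound rather than the exact minimum.

```ts
// Assumed alignment requirement for the buffer row stride, in bytes.
const ROW_ALIGNMENT = 256;

// Rounds the tightly packed row size up to the next multiple of ROW_ALIGNMENT.
function alignedBytesPerRow(width: number, bytesPerTexel: number): number {
    const unpadded = width * bytesPerTexel;
    return Math.ceil(unpadded / ROW_ALIGNMENT) * ROW_ALIGNMENT;
}

// Conservative size for the readback buffer: one full padded row per texel row.
function readbackBufferSize(width: number, height: number, bytesPerTexel: number): number {
    return alignedBytesPerRow(width, bytesPerTexel) * height;
}

// Copies the tightly packed texel rows out of a padded mapping.
function removeRowPadding(
    padded: Uint8Array,
    width: number,
    height: number,
    bytesPerTexel: number,
): Uint8Array {
    const rowBytes = width * bytesPerTexel;
    const stride = alignedBytesPerRow(width, bytesPerTexel);
    const packed = new Uint8Array(rowBytes * height);
    for (let y = 0; y < height; y++) {
        const rowStart = y * stride;
        packed.set(padded.subarray(rowStart, rowStart + rowBytes), y * rowBytes);
    }
    return packed;
}
```

With these helpers, the readback buffer would be created with `readbackBufferSize(textureWidth, textureHeight, 4)` bytes, the copy would use `alignedBytesPerRow(textureWidth, 4)` as its row stride, and `removeRowPadding(new Uint8Array(mapping), ...)` would be applied to the mapped range before saving the screenshot.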
- -
- Updating a bunch of data on the GPU for a frame: - -
-        function frame() {
-            // Create a new buffer for our updates. In practice we would
-            // reuse buffers from frame to frame by re-mapping them.
-            const stagingBuffer = device.createBuffer({
-                usage: GPUBufferUsage.MAP_WRITE | GPUBufferUsage.COPY_SRC,
-                size: 16 * objectCount,
-                mappedAtCreation: true,
-            });
-            const stagingData = new Float32Array(stagingBuffer.getMappedRange());
-
-            // For each draw we are going to:
-            //  - Put the data for the draw in stagingData.
-            //  - Record a copy from the staging buffer to the uniform buffer for the draw
-            //  - Encode the draw
-            const copyEncoder = device.createCommandEncoder();
-            const drawEncoder = device.createCommandEncoder();
-            const renderPass = myCreateRenderPass(drawEncoder);
-            for (var i = 0; i < objectCount; i++) {
-                stagingData[i * 4 + 0] = ...;
-                stagingData[i * 4 + 1] = ...;
-                stagingData[i * 4 + 2] = ...;
-                stagingData[i * 4 + 3] = ...;
-
-                const {uniformBuffer, uniformOffset} = getUniformsForDraw(i);
-                copyEncoder.copyBufferToBuffer(
-                    stagingBuffer, i * 16,
-                    uniformBuffer, uniformOffset,
-                    16);
-
-                encodeDraw(renderPass, {uniformBuffer, uniformOffset});
-            }
-            renderPass.endPass();
-
-            // We are finished filling the staging buffer, unmap() it so
-            // we can submit commands that use it.
-            stagingBuffer.unmap();
-
-            // Submit all the copies and then all the draws. The copies
-            // will happen before the draw such that each draw will use
-            // the data that was filled inside the for-loop above.
-            device.queue.submit([
-                copyEncoder.finish(),
-                drawEncoder.finish()
-            ]);
-        }
-    
-
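The comment in the example mentions reusing staging buffers from frame to frame by re-mapping them. A minimal sketch of such a rolling pool follows, assuming nothing beyond the `mapAsync` and mapped-at-creation behavior described in this section. `StagingBufferPool`, `MappableBuffer`, and `MAP_MODE_WRITE` are hypothetical names; the interface is a stand-in for the small part of `GPUBuffer` used here so the sketch does not depend on WebGPU type definitions.

```ts
// Hypothetical stand-in for the part of GPUBuffer this sketch relies on.
interface MappableBuffer {
    mapAsync(mode: number): Promise<void>;
}

// Same numeric value as GPUMapMode.WRITE in the IDL shown earlier.
const MAP_MODE_WRITE = 0x0002;

class StagingBufferPool<B extends MappableBuffer> {
    private readonly create: () => B;
    private readonly ready: B[] = [];
    private pending = 0;

    // `create` is expected to return a buffer that is mapped at creation.
    constructor(create: () => B) {
        this.create = create;
    }

    // Returns a buffer that is currently mapped for writing: a recycled one
    // if any has finished re-mapping, otherwise a freshly created one.
    acquire(): B {
        const reused = this.ready.pop();
        return reused !== undefined ? reused : this.create();
    }

    // Call after the buffer was unmapped and the commands using it were
    // submitted. Starts re-mapping; the buffer becomes reusable on success.
    recycle(buffer: B): Promise<void> {
        this.pending++;
        return buffer.mapAsync(MAP_MODE_WRITE).then(
            () => {
                this.pending--;
                this.ready.push(buffer);
            },
            (error) => {
                // Mapping failed (for example after device loss): drop the buffer.
                this.pending--;
                throw error;
            },
        );
    }

    get pendingCount(): number {
        return this.pending;
    }

    get readyCount(): number {
        return this.ready.length;
    }
}
```

In the frame function, the pool would presumably be constructed once with a callback wrapping `device.createBuffer({ ..., mappedAtCreation: true })`, `acquire()` would replace the per-frame `createBuffer` call, and `recycle(stagingBuffer)` would be called after `device.queue.submit(...)`. Falling back to creation when nothing is ready keeps `acquire()` synchronous, at the cost of the pool growing while re-mapping lags behind.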
- -## Multithreading ## {#multithreading} - -Multithreading is a key part of modern graphics APIs. -Unlike OpenGL, newer APIs allow applications to encode commands, submit work, transfer data to the GPU, and -so on, from multiple threads at once, alleviating CPU bottlenecks. -This is especially relevant to WebGPU, since IDL bindings are generally much slower than C calls. - -WebGPU does not *yet* allow multithreaded use of a single `GPUDevice`, but the API has been -designed from the ground up with this in mind. -This section describes the tentative plan for how it will work. - -As described in [[#gpu-process]], most WebGPU objects are actually just "handles" that refer to -objects in the browser's GPU process. -As such, it is relatively straightforward to allow these to be shared among threads. -For example, a `GPUTexture` object can simply be `postMessage()`d to another thread, creating a -new `GPUTexture` JavaScript object containing a handle to the *same* GPU-process object. - -Several objects, like `GPUBuffer`, have client-side state. -Applications still need to use them from multiple threads without having to `postMessage` such -objects back and forth with `[Transferable]` semantics (which would also create new wrapper -objects, breaking old references). -Therefore, these objects will also be `[Serializable]` but have (content-side) **shared state**, -just like `SharedArrayBuffer`. -For example, for threads Main and Worker: - -- Main: createBuffer → B1. -- Main: postMessage to Worker. -- Worker: receive message → B2. -- Worker: `B2.mapAsync()` → successfully puts the buffer in the "map pending" state. -- Main: `B1.mapAsync()` → **throws an exception**. -- Main: Encode some command that uses `B1`, like: - - ```js - encoder.copyBufferToTexture(B1, T); - const commandBuffer = encoder.finish(); - ``` - - → succeeds, because this doesn't depend on the buffer's client side state. 
-- Main: `queue.submit(commandBuffer)` → **asynchronous WebGPU error**, - because the CPU currently owns the buffer. -- Worker: waits for the mapping, writes to it, then calls `B2.unmap()`. -- Main: `queue.submit(commandBuffer)` → succeeds -- Main: `B1.mapAsync()` → successfully puts the buffer in the "map pending" state - -Further discussion can be found in [#354](https://github.com/gpuweb/gpuweb/issues/354) -(note not all of it reflects current thinking). - -### Unsolved: Synchronous Object Transfer ### {#multithreading-transfer} - -Some application architectures require objects to be passed between threads without having to -asynchronously wait for a message to arrive on the receiving thread. - -The most crucial class of such architectures are in WebAssembly applications: -Programs using native C/C++/Rust/etc. bindings for WebGPU will want to assume object handles -are plain-old-data (e.g. `typedef struct WGPUBufferImpl* WGPUBuffer;`) -that can be passed between threads freely. -Unfortunately, this cannot be implemented in C-on-JS bindings (e.g. Emscripten) without complex, -hidden, and slow asynchronicity (yielding on the receiving thread, interrupting the sending -thread to send a message, then waiting for the object on the receiving thread). - -Some alternatives are mentioned in issue [#747](https://github.com/gpuweb/gpuweb/issues/747): - -- `SharedObjectTable`, an object with shared-state (like `SharedArrayBuffer`) containing a table of - `[Serializable]` values. Effectively, a store into the table would serialize once, and then any - thread with the `SharedObjectTable` could (synchronously) deserialize the object on demand. -- A synchronous `MessagePort.receiveMessage()` method. - This would be less ideal as it would require any thread that creates one of these objects to - eagerly send it to every thread, just in case they need it later. -- Allow "exporting" a numerical ID for an object that can be used to "import" the object on - another thread. 
This bypasses the garbage collector and makes it easy to leak memory. - - -## Command Encoding and Submission ## {#command-encoding} - -Many operations in WebGPU are purely GPU-side operations that don't use data from the CPU. -These operations are not issued directly; instead, they are encoded into `GPUCommandBuffer`s -via the builder-like `GPUCommandEncoder` interface, then later sent to the GPU with -`gpuQueue.submit()`. -This design is used by the underlying native APIs as well. It provides several benefits: - -- Command buffer encoding is independent of other state, allowing encoding (and command buffer - validation) work to utilize multiple CPU threads. -- Provides a larger chunk of work at once, allowing the GPU driver to do more global - optimization, especially in how it schedules work across the GPU hardware. - -### Debug Markers and Debug Groups ### {#command-encoding-debug} - -For error messages and debugging tools, it is possible to label work inside a command buffer. -(See [[#errors-errorscopes-labels]].) - -- `insertDebugMarker(markerLabel)` marks a point in a stream of commands. -- `pushDebugGroup(groupLabel)`/`popDebugGroup()` nestably demarcate sub-streams of commands. - This can be used e.g. to label which part of a command buffer corresponds to different objects - or parts of a scene. - -### Passes ### {#command-encoding-passes} - -Issue: Briefly explain passes? - - -## Pipelines ## {#pipelines} - - -## Image, Video, and Canvas input ## {#image-input} - -Issue: Exact API still in flux as of this writing. - -WebGPU is largely isolated from the rest of the Web platform, but has several interop points. -One of these is image data input into the API. -Aside from the general data read/write mechanisms (`writeTexture`, `writeBuffer`, and `mapAsync`), -data can also come from ``/`ImageBitmap`, canvases, and videos. -There are many use-cases that require these, including: - -- Initializing textures from encoded images (JPEG, PNG, etc.) 
-- Rendering text with 2D canvas for use in WebGPU. -- Video element and video camera input for image processing, ML, 3D scenes, etc. - -There are two paths: - -- `copyExternalImageToTexture()` copies color data from a sub-rectangle of an - image/video/canvas object into an equally-sized sub-rectangle of a `GPUTexture`. - The input data is captured at the moment of the call. -- `importTexture()` takes a video or canvas and creates a `GPUExternalTexture` object which *can* - provide direct read access to an underlying resource if it exists on the (same) GPU already, - avoiding unnecessary copies or CPU-GPU bandwidth. - This is typically true of hardware-decoded videos and most canvas elements. - -Issue: Update both names to whatever we settle on. - -### GPUExternalTexture ### {#image-input-external-texture} - -A `GPUExternalTexture` is a sampleable texture object which can be used in similar ways to normal -sampleable `GPUTexture` objects. -In particular, it can be bound as a texture resource to a shader and used directly from the GPU: -when it is bound, additional metadata is attached that allows WebGPU to "automagically" -transform the data from its underlying representation (e.g. YUV) to RGB sampled data. - -A `GPUExternalTexture` represents a particular imported image, so the underlying data must not -change after import, either from internal (WebGPU) or external (Web platform) access. - -Issue: -Describe how this is achieved for video element, VideoFrame, canvas element, and OffscreenCanvas. - - -## Canvas Output ## {#canvas-output} - -Historically, drawing APIs (2d canvas, WebGL) are initialized from canvases using `getContext()`. -However, WebGPU is more than a drawing API, and many applications do not need a canvas. -WebGPU is initialized without a canvas - see [[#initialization]]. - -Following this, WebGPU has no "default" drawing buffer. 
-Instead, a WebGPU device may be connected to *any number* of canvases (zero or more) -and render to any number of them each frame. - -Canvas context creation and WebGPU device creation are decoupled. -Any `GPUCanvasContext` may be dynamically used with any `GPUDevice`. -This makes device switches easy (e.g. after recovering from a device loss). -(In comparison, WebGL context restoration is done on the same `WebGLRenderingContext` object, -even though context state does not persist across loss/restoration.) - -In order to access a canvas, an app gets a `GPUTexture` from the `GPUCanvasContext` -and then writes to it, as it would with a normal `GPUTexture`. - -### Swap Chains ### {#canvas-output-swap-chains} - -Canvas `GPUTexture`s are vended in a very structured way: - -- `canvas.getContext('gpupresent')` provides a `GPUCanvasContext`. -- `GPUCanvasContext.configureSwapChain({ device, format, usage })` provides a `GPUSwapChain`, - invalidating any previous swapchains, attaching the canvas to the provided device, and - setting the `GPUTextureFormat` and `GPUTextureUsage` for vended textures. -- `GPUSwapChain.getCurrentTexture()` provides a `GPUTexture`. - -This structure provides maximal compatibility with optimized paths in native graphics APIs. -In these, typically, a platform-specific "surface" object can produce an API object called a -"swap chain" which provides, possibly up-front, a possibly-fixed list of 1-3 textures to render -into. - -### Current Texture ### {#canvas-output-current-texture} - -A `GPUSwapChain` provides a "current texture" via `getCurrentTexture()`. -For <{canvas}> elements, this returns a texture for the *current frame*: - -- On `getCurrentTexture()`, `[[currentTexture]]` is created if it doesn't exist, then returned. -- During the "[=Update the rendering=]" step, the browser compositor takes ownership of the - `[[currentTexture]]` for display, and that internal slot is cleared for the next frame. 
- -### `getSwapChainPreferredFormat()` ### {#canvas-output-preferred-format} - -Due to framebuffer hardware differences, different devices have different preferred byte layouts -for display surfaces. -Any allowed format is allowed on all systems, but applications may save power by using the -preferred format. -The exact format cannot be hidden, because the format is observable - e.g., -in the behavior of a `copyBufferToTexture` call and in compatibility rules with render pipelines -(which specify a format, see `GPUColorTargetState.format`). - -Desktop-lineage hardware usually prefers `bgra8unorm` (4 bytes in BGRA order), -while mobile-lineage hardware usually prefers `rgba8unorm` (4 bytes in RGBA order). - -For high-bit-depth, different systems may also prefer different formats, -like `rgba16float` or `rgb10a2unorm`. - -### Multiple Displays ### {#canvas-output-multiple-displays} - -Some systems have multiple displays with different capabilities (e.g. HDR vs non-HDR). -Browser windows can be moved between these displays. - -As today with WebGL, user agents can make their own decisions about how to expose these -capabilities, e.g. choosing the capabilities of the initial, primary, or most-capable display. - -In the future, an event might be provided that allows applications to detect when a canvas moves -to a display with different properties so they can call `getSwapChainPreferredFormat()` and -`configureSwapChain()` again. - -#### Multiple Adapters #### {#canvas-output-multiple-adapters} - -Some systems have multiple displays connected to different hardware adapters; for example, -laptops with switchable graphics might have the internal display connected to the integrated GPU -and the HDMI port connected to the discrete GPU. - -This can incur overhead, as rendering on one adapter and displaying on another typically incurs -a copy or direct-memory-access (DMA) over a PCI bus. 
- -Currently, WebGPU does not provide a way to detect which adapter is optimal for a given display. -In the future, applications may be able to detect this, and receive events when this changes. - - -## Bitflags ## {#bitflags} - -WebGPU uses C-style bitflags in several places. -(Search `GPUFlagsConstant` in the spec for instances.) -A typical bitflag definition looks like this: - - -typedef [EnforceRange] unsigned long GPUColorWriteFlags; -[Exposed=Window] -interface GPUColorWrite { - const GPUFlagsConstant RED = 0x1; - const GPUFlagsConstant GREEN = 0x2; - const GPUFlagsConstant BLUE = 0x4; - const GPUFlagsConstant ALPHA = 0x8; - const GPUFlagsConstant ALL = 0xF; -}; - - -This was chosen because there is no other particularly ergonomic way to describe -"enum sets" in JavaScript today. - -Bitflags are used in WebGL, which many WebGPU developers will be familiar with. -They also match closely with the API shape that would be used by many native-language bindings. - -The closest option is `sequence`, but it doesn't naturally describe -an unordered set of unique items and doesn't easily allow things like -`GPUColorWrite.ALL` above. -Additionally, `sequence` has significant overhead, so we would have to avoid it in any -APIs that are expected to be "hot paths" (like command encoder methods), causing inconsistency with -parts of the API that *do* use it. - -See also issue [#747](https://github.com/gpuweb/gpuweb/issues/747) -which mentions that strongly-typed bitflags in JavaScript would be useful. - - -# WebGPU Shading Language # {#wgsl} - - -# Security and Privacy self-review # {#questionnaire} - -## What information might this feature expose to Web sites or other parties, and for what purposes is that exposure necessary? ## {#questionnaire-1} - -The feature exposes information about the system's GPUs (or lack of). - -It allows determining if one of the GPUs in the system supports WebGPU by requesting a `GPUAdapter` without software fallback. 
-This is necessary for sites to be able to fallback to hardware-accelerated WebGL if the system doesn't support hardware-accelerated WebGPU. - -For requested adapters the feature exposes a name, set of optional WebGPU capabilities that the `GPUAdapter` supports, as well as a set of numeric limits that the `GPUAdapter` supports. -This is necessary because there is a lot of diversity in GPU hardware and while WebGPU target the lowest common denominator it is meant to scale to expose more powerful features when the hardware allows it. -The name can be surfaced to the user when choosing, for example to let it choose an adapter and can be used by sites to do GPU-specific workarounds (this was critical in the past for WebGL). - -Note that the user agent controls which name, optional features, and limits are exposed. -It is not possible for sites to differentiate between hardware not supporting a feature and the user agent choosing not to expose it. -User agents are expected to bucket the actual capabilities of the GPU and only expose a limited number of such buckets to the site. - -## Do features in your specification expose the minimum amount of information necessary to enable their intended uses? ## {#questionnaire-2} - -Yes. -WebGPU only requires exposing if hardware-accelerated WebGPU is available, not why, or if the browser chose to not expose it etc. - -For the name, optional features, and limits the information exposed is not specified to be minimal because each site might require a different subset of the limits and optional features. -Instead the information exposed is controlled by the user-agent that is expected to only expose a small number of buckets that all expose the same information. - -## How do the features in your specification deal with personal information, personally-identifiable information (PII), or information derived from them? 
## {#questionnaire-3} - -WebGPU doesn't deal with PII unless the site puts PII inside the API, which means that Javascript got access to the PII before WebGPU could. - -## How do the features in your specification deal with sensitive information? ## {#questionnaire-4} - -WebGPU doesn't deal with sensitive information. -However some of the information it exposes could be correlated with sensitive information: the presence of powerful optional features or a high speed of GPU computation would allow deducing access to "high-end" GPUs which itself correlates with other information. - -## Do the features in your specification introduce new state for an origin that persists across browsing sessions? ## {#questionnaire-5} - -The WebGPU specification doesn't introduce new state. -However implementations are expected to cache the result of compiling shaders and pipelines. -This introduces state that could be inspected by measuring how long compilation of a set of shaders and pipelines take. -Note that GPU drivers also have their own caches so user-agents will have to find ways to disable that cache (otherwise state could be leaked across origins). - -## Do the features in your specification expose information about the underlying platform to origins? ## {#questionnaire-6} - -Yes. -The specification exposes whether hardware-accelerated WebGPU is available and a user-agent controlled name and set of optional features and limits each `GPUAdapter` supports. -Different requests for adapters returning adapters with different capabilities would also indicate the system contains multiple GPUs. - -## Does this specification allow an origin to send data to the underlying platform? ## {#questionnaire-7} - -WebGPU allows sending data to the system's GPU. -The WebGPU specification prevents ill-formed GPU commands from being sent to the hardware. -It is also expected that user-agents will have work-arounds for bugs in the driver that could cause issue even with well-format GPU commands. 
- -## Do features in this specification allow an origin access to sensors on a user’s device? ## {#questionnaire-8} - -No. - -## What data do the features in this specification expose to an origin? Please also document what data is identical to data exposed by other features, in the same or different contexts. ## {#questionnaire-9} - -WebGPU exposes with hardware-accelerated WebGPU is available, which is a new piece of data. -The adapter's name, optional features, and limits has a large intersection with WebGL's RENDERER_STRING, limits and extensions: even limits not in WebGL can mostly be deduced from the other limits exposed by WebGL (by deducing what GPU model the system has). - -## Do features in this specification enable new script execution/loading mechanisms? ## {#questionnaire-10} - -Yes. -WebGPU allows running arbitrary GPU computations specified with the WebGPU Shading Language (WGSL). -WGSL is compiled into a `GPUShaderModule` objects that are then used to specify "pipelines" that run computations on the GPU. - -## Do features in this specification allow an origin to access other devices? ## {#questionnaire-11} - -No. -WebGPU allows access to PCI-e and external GPUs plugged into the system but these are just part of the system. - -## Do features in this specification allow an origin some measure of control over a user agent's native UI? ## {#questionnaire-12} - -No. -However WebGPU can be used to render to fullscreen or WebXR which does change the UI. -WebGPU can also run GPU computations that take too long and cause of device timeout and a restart of GPU (TDR), which can produce a couple system-wide black frames. -Note that this is possible with "just" HTML / CSS but WebGPU makes it easier to cause a TDR. - -## What temporary identifiers do the features in this specification create or expose to the web? ## {#questionnaire-13} - -None. - -## How does this specification distinguish between behavior in first-party and third-party contexts? 
## {#questionnaire-14} - -There are no specific behavior difference between first-party and third-party contexts. -However the user-agent can decide to limit the `GPUAdapters` returned to third-party contexts: by using less buckets, by using a single bucket, or by not exposing WebGPU. - -## How do the features in this specification work in the context of a browser’s Private Browsing or Incognito mode? ## {#questionnaire-15} - -There is no difference in Incognito mode, but the user-agent can decide to limit the `GPUAdapters` returned. -User-agents will need to be careful not to reuse the shader compilation caches when in Incognito mode. - -## Does this specification have both "Security Considerations" and "Privacy Considerations" sections? ## {#questionnaire-16} - -Yes. -They are both under the [Malicious use considerations](https://gpuweb.github.io/gpuweb/#malicious-use) section. - -## Do features in your specification enable origins to downgrade default security protections? ## {#questionnaire-17} - -No. -Except that WebGPU can be used to render to fullscreen or WebXR. - -## What should this questionnaire have asked? ## {#questionnaire-18} - -Does the specification allow interacting with cross-origin data? With DRM data? - -At the moment WebGPU cannot do that but it is likely that someone will request these features in the future. -It might be possible to introduce the concept of "protected queues" that only allow computations to end up on the screen, and not into Javascript. -However investigation in WebGL show that GPU timings can be used to leak from such protected queues. 
diff --git a/process/CheckinApproval.md b/process/CheckinApproval.md deleted file mode 100644 index 1f07d0d8fa..0000000000 --- a/process/CheckinApproval.md +++ /dev/null @@ -1,3 +0,0 @@ -A proposed change that is either considered to be noncontroversial or has already been agreed upon by the group should begin as a pull request to the GitHub repository: https://github.com/gpuweb/gpuweb Where it will require approval from an editor. If the pull request is initiated by an editor, another will still have to review it. That is not to say that other feedback on pull requests is unwelcome. Discussions can play out over these as they have previously. - -If the reviewer agrees that the pull request is uncontroversial or matches what has already been agreed upon, an editor can approve and merge it. To make sure this is conducted properly, interested parties are invited to *watch* the gpuweb repository and monitor the changes there. If something is approved as uncontroversial or resolved seems incorrect, the issue can be brought up and discussed with the larger group if needed. If necessary, merged changes can be reverted or altered if its determined they were in error. diff --git a/process/HomeworkForMeetings.md b/process/HomeworkForMeetings.md deleted file mode 100644 index eb4d1c132f..0000000000 --- a/process/HomeworkForMeetings.md +++ /dev/null @@ -1,24 +0,0 @@ -# Homework for meetings - -Discussions we have in the WebGPU meetings usually require a lot of context to be productive. -This is both because we need to know how all 3 native APIs work and because we try to find the right tradeoffs given many constraints. -Having members of the group produce investigations and proposals is the best way to share context to the group, but it needs to arrive in a timely manner to be useful. -It happened that multi-page investigations were posted 5 minutes before the start of the meeting in which they were to be discussed. 
-This obviously makes it impossible for other members to catch up and explaining the investigation step-by-step in the meeting is a waste of people's valuable time. -The reverse situation happened too were investigations were posted days (or weeks) in advance but members didn't take/have the time to read them in time, resulting in the same waste of time. - -This proposal tries to codify what the expectations are on the members to make meetings happen smoothly. - -## Producer homework - -In order for members to have time to review material for a meeting, the meat of investigations and proposals have to be posted at least 48 working day hours in advance. -For example for a meeting on Monday at 3PM, material has to be posted before the previous Thursday at 3PM. -Topics for which the material isn't ready by that deadline can be dropped from the meeting to leave space for topics with more shared knowledge. -Small-scale clarifications and discussions on the material don't have a deadline themselves though. - -To be determined is how these materials are advertised so people know what to read. - -## Consumer homework - -Members should become familiar with the material for the meeting in these 48 hours (or even before that). -Ideally they should ask for simple clarifications in advance too. 
 diff --git a/samples/hello-cube.html b/samples/hello-cube.html index 7ff8e3ae8c..78c61f7b90 100644 --- a/samples/hello-cube.html +++ b/samples/hello-cube.html @@ -16,153 +16,175 @@ const shader = ` struct FragmentData { - [[builtin(position)]] position: vec4<f32>; - [[location(${colorLocation})]] color: vec4<f32>; + @builtin(position) position: vec4<f32>; + @location(${colorLocation}) color: vec4<f32>; }; -[[block]] struct Uniforms { modelViewProjectionMatrix: mat4x4<f32>; }; -[[group(${bindGroupIndex}), binding(${transformBindingNum})]] var uniforms; +@group(${bindGroupIndex}) @binding(${transformBindingNum}) var<uniform> uniforms: Uniforms; -[[stage(vertex)]] -fn main( - [[location(${positionAttributeNum})]] position: vec4<f32> - [[location(${colorAttributeNum})]] color: vec4<f32>, +let pos = array<vec2<f32>, 3>( + vec2<f32>(0.0, 0.5), + vec2<f32>(-0.5, -0.5), + vec2<f32>(0.5, -0.5)); + +@stage(vertex) +fn vertex_main( + @builtin(vertex_index) VertexIndex : u32, + @location(${positionAttributeNum}) position: vec4<f32>, + @location(${colorAttributeNum}) color: vec4<f32> ) -> FragmentData { - FragmentData out; - out.position = mul(uniforms.modelViewProjectionMatrix[0], position); - out.color = color; - return out; + var outData: FragmentData; + outData.position = uniforms.modelViewProjectionMatrix * position; + outData.color = color; + return outData; } -[[stage(fragment)]] -fn main(data: FragmentData) -> [[location(0)]] vec4<f32> { +@stage(fragment) +fn fragment_main(data: FragmentData) -> @location(0) vec4<f32> { return data.color; } `; -let device, swapChain, verticesBuffer, bindGroupLayout, pipeline, renderPassDescriptor; -let projectionMatrix = mat4.create(); -let mappedGroups = []; +let device, context, verticesBuffer, renderPipeline, renderPassDescriptor; +let transformBuffer, bindGroup; +const projectionMatrix = mat4.create(); +const mappedGroups = []; const colorOffset = 4 * 4; const vertexSize = 4 * 8; const verticesArray = new Float32Array([ - // float4 position, float4 color - 1, -1, 1, 1, 1, 0, 1, 1, - -1, -1, 1, 1, 0, 0, 1, 1, - -1, -1, 
-1, 1, 0, 0, 0, 1, - 1, -1, -1, 1, 1, 0, 0, 1, - 1, -1, 1, 1, 1, 0, 1, 1, - -1, -1, -1, 1, 0, 0, 0, 1, - - 1, 1, 1, 1, 1, 1, 1, 1, - 1, -1, 1, 1, 1, 0, 1, 1, - 1, -1, -1, 1, 1, 0, 0, 1, - 1, 1, -1, 1, 1, 1, 0, 1, - 1, 1, 1, 1, 1, 1, 1, 1, - 1, -1, -1, 1, 1, 0, 0, 1, - - -1, 1, 1, 1, 0, 1, 1, 1, - 1, 1, 1, 1, 1, 1, 1, 1, - 1, 1, -1, 1, 1, 1, 0, 1, - -1, 1, -1, 1, 0, 1, 0, 1, - -1, 1, 1, 1, 0, 1, 1, 1, - 1, 1, -1, 1, 1, 1, 0, 1, - - -1, -1, 1, 1, 0, 0, 1, 1, - -1, 1, 1, 1, 0, 1, 1, 1, - -1, 1, -1, 1, 0, 1, 0, 1, - -1, -1, -1, 1, 0, 0, 0, 1, - -1, -1, 1, 1, 0, 0, 1, 1, - -1, 1, -1, 1, 0, 1, 0, 1, - - 1, 1, 1, 1, 1, 1, 1, 1, - -1, 1, 1, 1, 0, 1, 1, 1, - -1, -1, 1, 1, 0, 0, 1, 1, - -1, -1, 1, 1, 0, 0, 1, 1, - 1, -1, 1, 1, 1, 0, 1, 1, - 1, 1, 1, 1, 1, 1, 1, 1, - - 1, -1, -1, 1, 1, 0, 0, 1, - -1, -1, -1, 1, 0, 0, 0, 1, - -1, 1, -1, 1, 0, 1, 0, 1, - 1, 1, -1, 1, 1, 1, 0, 1, - 1, -1, -1, 1, 1, 0, 0, 1, - -1, 1, -1, 1, 0, 1, 0, 1, + // float32x4 position, float32x4 color + 1, -1, 1, 1, 1, 0, 1, 1, + -1, -1, 1, 1, 0, 0, 1, 1, + -1, -1, -1, 1, 0, 0, 0, 1, + 1, -1, -1, 1, 1, 0, 0, 1, + 1, -1, 1, 1, 1, 0, 1, 1, + -1, -1, -1, 1, 0, 0, 0, 1, + + 1, 1, 1, 1, 1, 1, 1, 1, + 1, -1, 1, 1, 1, 0, 1, 1, + 1, -1, -1, 1, 1, 0, 0, 1, + 1, 1, -1, 1, 1, 1, 0, 1, + 1, 1, 1, 1, 1, 1, 1, 1, + 1, -1, -1, 1, 1, 0, 0, 1, + + -1, 1, 1, 1, 0, 1, 1, 1, + 1, 1, 1, 1, 1, 1, 1, 1, + 1, 1, -1, 1, 1, 1, 0, 1, + -1, 1, -1, 1, 0, 1, 0, 1, + -1, 1, 1, 1, 0, 1, 1, 1, + 1, 1, -1, 1, 1, 1, 0, 1, + + -1, -1, 1, 1, 0, 0, 1, 1, + -1, 1, 1, 1, 0, 1, 1, 1, + -1, 1, -1, 1, 0, 1, 0, 1, + -1, -1, -1, 1, 0, 0, 0, 1, + -1, -1, 1, 1, 0, 0, 1, 1, + -1, 1, -1, 1, 0, 1, 0, 1, + + 1, 1, 1, 1, 1, 1, 1, 1, + -1, 1, 1, 1, 0, 1, 1, 1, + -1, -1, 1, 1, 0, 0, 1, 1, + -1, -1, 1, 1, 0, 0, 1, 1, + 1, -1, 1, 1, 1, 0, 1, 1, + 1, 1, 1, 1, 1, 1, 1, 1, + + 1, -1, -1, 1, 1, 0, 0, 1, + -1, -1, -1, 1, 0, 0, 0, 1, + -1, 1, -1, 1, 0, 1, 0, 1, + 1, 1, -1, 1, 1, 1, 0, 1, + 1, -1, -1, 1, 1, 0, 0, 1, + -1, 1, -1, 1, 0, 1, 0, 1, ]); async function 
init() { const adapter = await navigator.gpu.requestAdapter(); device = await adapter.requestDevice(); + // Canvas + const canvas = document.querySelector('canvas'); - let canvasSize = canvas.getBoundingClientRect(); + const canvasSize = canvas.getBoundingClientRect(); canvas.width = canvasSize.width; canvas.height = canvasSize.height; const aspect = Math.abs(canvas.width / canvas.height); mat4.perspective(projectionMatrix, (2 * Math.PI) / 5, aspect, 1, 100.0); - const context = canvas.getContext('gpu'); + context = canvas.getContext('webgpu'); + const canvasFormat = "bgra8unorm"; - const swapChainDescriptor = { + const contextConfiguration = { device: device, - format: "bgra8unorm" + format: canvasFormat, }; - swapChain = context.configureSwapChain(swapChainDescriptor); + context.configure(contextConfiguration); + + // Bind Group Layout + Pipeline Layout + + const transformBufferBindGroupLayoutEntry = { + binding: transformBindingNum, // @group(0) @binding(0) + visibility: GPUShaderStage.VERTEX, + buffer: { type: "uniform" }, + }; + const bindGroupLayoutDescriptor = { entries: [transformBufferBindGroupLayoutEntry] }; + const bindGroupLayout = device.createBindGroupLayout(bindGroupLayoutDescriptor); + + const pipelineLayoutDescriptor = { bindGroupLayouts: [bindGroupLayout] }; + const pipelineLayout = device.createPipelineLayout(pipelineLayoutDescriptor); + + // Shader Module const shaderModuleDescriptor = { code: shader }; const shaderModule = device.createShaderModule(shaderModuleDescriptor); - const verticesBufferDescriptor = { size: verticesArray.byteLength, usage: GPUBufferUsage.VERTEX }; - let verticesArrayBuffer; - [verticesBuffer, verticesArrayBuffer] = device.createBufferMapped(verticesBufferDescriptor); + // Vertex Buffer + + const verticesBufferDescriptor = { + size: verticesArray.byteLength, + usage: GPUBufferUsage.VERTEX, + mappedAtCreation: true, + }; + verticesBuffer = device.createBuffer(verticesBufferDescriptor) + const verticesArrayBuffer = 
verticesBuffer.getMappedRange(); const verticesWriteArray = new Float32Array(verticesArrayBuffer); verticesWriteArray.set(verticesArray); verticesBuffer.unmap(); - // Vertex Input + // Render Pipeline + + // Render Pipeline > Vertex Input const positionAttributeState = { - shaderLocation: positionAttributeNum, // [[attribute(0)]] + format: "float32x4", offset: 0, - format: "float4" + shaderLocation: positionAttributeNum, // @attribute(0) }; const colorAttributeState = { - shaderLocation: colorAttributeNum, + format: "float32x4", offset: colorOffset, - format: "float4" + shaderLocation: colorAttributeNum, // @attribute(1) } const vertexBufferState = { - attributeSet: [positionAttributeState, colorAttributeState], - stride: vertexSize, - stepMode: "vertex" + arrayStride: vertexSize, + stepMode: "vertex", + attributes: [positionAttributeState, colorAttributeState], }; - // Bind group binding layout - const transformBufferBindGroupLayoutEntry = { - binding: transformBindingNum, // id[[(0)]] - visibility: GPUShaderStageBit.VERTEX, - type: "uniform-buffer" - }; - - const bindGroupLayoutDescriptor = { entries: [transformBufferBindGroupLayoutEntry] }; - bindGroupLayout = device.createBindGroupLayout(bindGroupLayoutDescriptor); - - // Pipeline + // Render Pipeline > Depth/Stencil State + const depthFormat = "depth24plus"; const depthStateDescriptor = { + format: depthFormat, depthWriteEnabled: true, depthCompare: "less" }; - const pipelineLayoutDescriptor = { bindGroupLayouts: [bindGroupLayout] }; - const pipelineLayout = device.createPipelineLayout(pipelineLayoutDescriptor); const colorTargetState = { - format: "bgra8unorm", + format: canvasFormat, blend: { alpha: { srcFactor: "src-alpha", @@ -175,29 +197,30 @@ operation: "add" }, }, - writeMask: GPUColorWriteBits.ALL + writeMask: GPUColorWrite.ALL, }; - const pipelineDescriptor = { + const renderPipelineDescriptor = { layout: pipelineLayout, vertex: { buffers: [vertexBufferState], module: shaderModule, entryPoint: 
"vertex_main" }, + depthStencil: depthStateDescriptor, fragment: { module: shaderModule, entryPoint: "fragment_main", targets: [colorTargetState], }, - depthStencil: depthStateDescriptor, }; - pipeline = device.createRenderPipeline(pipelineDescriptor); + renderPipeline = device.createRenderPipeline(renderPipelineDescriptor); - let colorAttachment = { + // Render Pass Descriptor + + const colorAttachment = { // attachment is acquired in render loop. - loadOp: "clear", + loadValue: { r: 0.5, g: 1.0, b: 1.0, a: 1.0 }, // GPUColor storeOp: "store", - clearColor: { r: 0.5, g: 1.0, b: 1.0, a: 1.0 } // GPUColor }; // Depth stencil texture @@ -206,7 +229,7 @@ const depthSize = { width: canvas.width, height: canvas.height, - depth: 1 + depthOrArrayLayers: 1 }; const depthTextureDescriptor = { @@ -215,7 +238,7 @@ mipLevelCount: 1, sampleCount: 1, dimension: "2d", - format: "depth32float-stencil8", + format: depthFormat, usage: GPUTextureUsage.RENDER_ATTACHMENT }; @@ -223,10 +246,11 @@ // GPURenderPassDepthStencilAttachmentDescriptor const depthAttachment = { - view: depthTexture.createDefaultView(), - depthLoadOp: "clear", + view: depthTexture.createView(), + depthLoadValue: 1.0, depthStoreOp: "store", - clearDepth: 1.0 + stencilLoadValue: 0, + stencilStoreOp: "store", }; renderPassDescriptor = { @@ -234,28 +258,15 @@ depthStencilAttachment: depthAttachment }; - render(); -} - -// Transform Buffers and Bindings + // Transform Buffers and Bindings -const transformSize = 4 * 16; -const transformBufferDescriptor = { - size: transformSize, - usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.MAP_WRITE -}; - -function render() { - if (mappedGroups.length === 0) { - const [buffer, arrayBuffer] = device.createBufferMapped(transformBufferDescriptor); - const group = device.createBindGroup(createBindGroupDescriptor(buffer)); - let mappedGroup = { buffer: buffer, arrayBuffer: arrayBuffer, bindGroup: group }; - drawCommands(mappedGroup); - } else - drawCommands(mappedGroups.shift()); -} + 
const transformSize = 4 * 16; + const transformBufferDescriptor = { + size: transformSize, + usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST, + }; + transformBuffer = device.createBuffer(transformBufferDescriptor) -function createBindGroupDescriptor(transformBuffer) { const transformBufferBinding = { buffer: transformBuffer, offset: 0, @@ -265,50 +276,46 @@ binding: transformBindingNum, resource: transformBufferBinding }; - return { + const bindGroupDescriptor = { layout: bindGroupLayout, entries: [transformBufferBindGroupEntry] }; + bindGroup = device.createBindGroup(bindGroupDescriptor); + + render(); } -function drawCommands(mappedGroup) { - updateTransformArray(new Float32Array(mappedGroup.arrayBuffer)); - mappedGroup.buffer.unmap(); +function render() { + updateTransformArray(); const commandEncoder = device.createCommandEncoder(); - renderPassDescriptor.colorAttachments[0].view = swapChain.getCurrentTexture().createDefaultView(); + renderPassDescriptor.colorAttachments[0].view = context.getCurrentTexture().createView(); const passEncoder = commandEncoder.beginRenderPass(renderPassDescriptor); // Encode drawing commands - passEncoder.setPipeline(pipeline); + passEncoder.setPipeline(renderPipeline); // Vertex attributes - passEncoder.setVertexBuffers(0, [verticesBuffer], [0]); + passEncoder.setVertexBuffer(0, verticesBuffer); // Bind groups - passEncoder.setBindGroup(bindGroupIndex, mappedGroup.bindGroup); + passEncoder.setBindGroup(bindGroupIndex, bindGroup); // 36 vertices, 1 instance, 0th vertex, 0th instance. passEncoder.draw(36, 1, 0, 0); - passEncoder.endPass(); - - device.getQueue().submit([commandEncoder.finish()]); + passEncoder.end(); - // Ready the current buffer for update after GPU is done with it. 
- mappedGroup.buffer.mapWriteAsync().then((arrayBuffer) => { - mappedGroup.arrayBuffer = arrayBuffer; - mappedGroups.push(mappedGroup); - }); + device.queue.submit([commandEncoder.finish()]); requestAnimationFrame(render); } -function updateTransformArray(array) { - let viewMatrix = mat4.create(); +function updateTransformArray() { + const viewMatrix = mat4.create(); mat4.translate(viewMatrix, viewMatrix, vec3.fromValues(0, 0, -5)); - let now = Date.now() / 1000; + const now = Date.now() / 1000; mat4.rotate(viewMatrix, viewMatrix, 1, vec3.fromValues(Math.sin(now), Math.cos(now), 0)); - let modelViewProjectionMatrix = mat4.create(); + const modelViewProjectionMatrix = mat4.create(); mat4.multiply(modelViewProjectionMatrix, projectionMatrix, viewMatrix); - mat4.copy(array, modelViewProjectionMatrix); + device.queue.writeBuffer(transformBuffer, 0, modelViewProjectionMatrix); } window.addEventListener("load", init); diff --git a/samples/workload-simulator.html b/samples/workload-simulator.html index 6d8617a279..c8544477e6 100644 --- a/samples/workload-simulator.html +++ b/samples/workload-simulator.html @@ -170,8 +170,8 @@

Drag the logo, or choose "Animate".

-
- WebGPU swap chain options +
+ WebGPU canvas context options
@@ -298,7 +298,7 @@

Web graphics workload simulator

finish.parentElement.style.display = useGl.checked ? '' : 'none'; readPixels.parentElement.style.display = useGl.checked ? '' : 'none'; offscreenCanvasOptions.style.display = useGl.checked || use2D.checked ? '' : 'none'; - swapChainOptions.style.display = useWebGPU.checked ? '' : 'none'; + webgpuCanvasOptions.style.display = useWebGPU.checked ? '' : 'none'; adapterInfo.style.display = useWebGPU.checked ? '' : 'none'; fpsOptions.style.display = animate.checked && (useSetTimeout.checked || useSetInterval.checked) ? '' : 'none'; @@ -405,7 +405,7 @@

Web graphics workload simulator

let deviceRequested = false; let device; let gpuPresent; -let swapChain; +let canvasContext; let multisampleRenderAttachment; let sampleCount = 4; let pipeline; @@ -426,7 +426,7 @@

Web graphics workload simulator

let mapAsyncBufferSize = 0; let mapAsyncReady = []; let mapAsyncBuffersOutstanding = 0; -const swapChainFormat = 'bgra8unorm'; +const canvasFormat = 'bgra8unorm'; let bitmapRenderer; let offscreenCanvas; @@ -594,7 +594,7 @@

Web graphics workload simulator

featuresAndLimits.textContent = JSON.stringify( { name: adapter.name, - features: Array.from(adapter.features || []).slice().sort(), + features: Array.from(adapter.features || []).sort(), limits: adapter.limits }, 0, 2); const useTimestampQueries = adapter.features.has('timestamp-query'); @@ -602,28 +602,25 @@

Web graphics workload simulator

nonGuaranteedFeatures: useTimestampQueries ? ['timestamp-query'] : [], }); device.addEventListener('uncapturederror', (e)=>{ - console.log(e); if (!errorMessage.textContent) errorMessage.textContent = 'Uncaptured error: ' + e.error.message; }); - gpuPresent = webGPUCanvas.getContext('gpupresent'); - swapChain = gpuPresent.configureSwapChain({device, format: swapChainFormat}); + canvasContext = webGPUCanvas.getContext('webgpu'); + canvasContext.configure({device, format: canvasFormat}); const bindGroupLayout = device.createBindGroupLayout({ entries: [ { binding: 0, - // TODO: These shouldn't need to be visible to both stages. There's a bug in Dawn that - // incorrectly requires variables that are unused in one stage to be visible to all stages. - visibility: GPUShaderStage.VERTEX | GPUShaderStage.FRAGMENT, + visibility: GPUShaderStage.VERTEX, buffer: { type: 'uniform', }, }, { binding: 1, - visibility: GPUShaderStage.VERTEX | GPUShaderStage.FRAGMENT, + visibility: GPUShaderStage.FRAGMENT, sampler: { type: 'filtering', }, }, { binding: 2, - visibility: GPUShaderStage.VERTEX | GPUShaderStage.FRAGMENT, + visibility: GPUShaderStage.FRAGMENT, texture: { sampleType: 'float', }, }, ], @@ -635,49 +632,45 @@

Web graphics workload simulator

 multisampleRenderAttachment = device.createTexture({ size: { width: size, height: size }, sampleCount, - format: swapChainFormat, + format: canvasFormat, usage: GPUTextureUsage.RENDER_ATTACHMENT, }) if (!multisampleRenderAttachment) errorMessage.textContent = 'Failed to allocate multisample render attachment.'; } let shaderModule = device.createShaderModule({ code: ` - let pos : array<vec2<f32>, 6> = array<vec2<f32>, 6>( - vec2<f32>(-1., 1.), - vec2<f32>(-1., -1.), - vec2<f32>(1., -1.), - vec2<f32>(-1., 1.), - vec2<f32>(1., -1.), - vec2<f32>(1., 1.)); - - [[block]] struct Uniforms { - xy: vec2<f32>; - }; - [[binding(0), group(0)]] var<uniform> uniforms: Uniforms; - [[binding(1), group(0)]] var sampler1: sampler; - [[binding(2), group(0)]] var texture1: texture_2d<f32>; + let pos = array<vec2<f32>, 6>( + vec2<f32>(-1., 1.), + vec2<f32>(-1., -1.), + vec2<f32>(1., -1.), + vec2<f32>(-1., 1.), + vec2<f32>(1., -1.), + vec2<f32>(1., 1.)); + + @binding(0) @group(0) var<uniform> xy: vec2<f32>; + @binding(1) @group(0) var sampler1: sampler; + @binding(2) @group(0) var texture1: texture_2d<f32>; struct Output { - [[builtin(position)]] Position : vec4<f32>; - [[location(0)]] uv : vec2<f32>; + @builtin(position) Position : vec4<f32>; + @location(0) uv : vec2<f32>; }; - [[stage(vertex)]] - fn vs([[builtin(vertex_index)]] VertexIndex : u32) -> Output { + @stage(vertex) + fn vs(@builtin(vertex_index) VertexIndex : u32) -> Output { var output : Output; - output.Position = vec4<f32>(pos[VertexIndex] / vec2<f32>(2.,2.) + uniforms.xy, 0.0, 1.0); - output.uv = pos[VertexIndex] / vec2<f32>(2., -2.) + vec2<f32>(.5, .5); + output.Position = vec4<f32>(pos[VertexIndex] / vec2<f32>(2.,2.) + xy, 0.0, 1.0); + output.uv = pos[VertexIndex] / vec2<f32>(2., -2.) 
+ vec2<f32>(.5, .5); return output; } - [[stage(fragment)]] - fn fs([[location(0)]] uv: vec2<f32>) -> [[location(0)]] vec4<f32> { + @stage(fragment) + fn fs(@location(0) uv: vec2<f32>) -> @location(0) vec4<f32> { return textureSample(texture1, sampler1, uv); } `, }); if (shaderModule.compilationInfo) - shaderModule.compilationInfo().then((e)=>console.log(e)); pipeline = device.createRenderPipeline({ primitive: { topology: 'triangle-list' }, layout: pipelineLayout, @@ -688,7 +681,7 @@

Web graphics workload simulator

}, fragment: { entryPoint: 'fs', - targets: [ { format: swapChainFormat, }, ], + targets: [ { format: canvasFormat, }, ], module: shaderModule, }, }); @@ -699,12 +692,12 @@

Web graphics workload simulator

const texture = device.createTexture({ size: [256, 256, 1], format: 'rgba8unorm', - usage: GPUTextureUsage.SAMPLED | GPUTextureUsage.COPY_DST, + usage: GPUTextureUsage.TEXTURE_BINDING | GPUTextureUsage.COPY_DST | GPUTextureUsage.RENDER_ATTACHMENT, }); const loadImage = async () => { const imageBitmap = await createImageBitmap(img); - device.queue.copyImageBitmapToTexture( - { imageBitmap }, { texture }, [256, 256, 1]); + device.queue.copyExternalImageToTexture( + { source: imageBitmap }, { texture }, [256, 256, 1]); render(); }; if (img.complete) loadImage(); else img.addEventListener('load', loadImage); @@ -741,7 +734,7 @@

Web graphics workload simulator

return; } try { - const swapChainView = swapChain.getCurrentTexture().createView(); + const canvasView = canvasContext.getCurrentTexture().createView(); const commandBuffers = []; device.queue.writeBuffer(uniformBuffer, 0, new Float32Array([position[0]/size*2 - 0.5, -position[1]*2/size + 0.5])); let buffer = null; @@ -814,16 +807,16 @@

Web graphics workload simulator

// Record command buffer. const commandEncoder = device.createCommandEncoder(); - let renderAttachment = swapChainView; + let renderAttachment = canvasView; if (multisampling.checked) { renderAttachment = multisampleRenderAttachment.createView(); } const passEncoder = commandEncoder.beginRenderPass({ colorAttachments: [ { view: renderAttachment, - resolveTarget: multisampling.checked ? swapChainView : undefined, + resolveTarget: multisampling.checked ? canvasView : undefined, loadValue: { r: 1, g: 1, b: 1, a: 1 }, - storeOp: multisampling.checked ? 'clear' : 'store', + storeOp: multisampling.checked ? 'discard' : 'store', }, ], }); if (useRenderBundles.checked && renderBundle && instances == renderBundleInstances && renderBundleNumDrawCalls == numDrawCalls) { @@ -847,7 +840,7 @@

Web graphics workload simulator

renderBundleInstances = instances; renderBundleNumDrawCalls = numDrawCalls; passOrBundleEncoder = device.createRenderBundleEncoder({ - colorFormats: [swapChainFormat], + colorFormats: [canvasFormat], sampleCount, }); } @@ -869,7 +862,7 @@

Web graphics workload simulator

// one. renderBundle = passOrBundleEncoder.finish(); passOrBundleEncoder = device.createRenderBundleEncoder({ - colorFormats: [swapChainFormat], + colorFormats: [canvasFormat], sampleCount, }); passOrBundleEncoder.setPipeline(pipeline); @@ -898,7 +891,7 @@

Web graphics workload simulator

} } } - passEncoder.endPass(); + passEncoder.end(); if (buffer) buffer.destroy(); let queryBuffer; if (querySet) { diff --git a/spec/Makefile b/spec/Makefile deleted file mode 100644 index ad33e1b5d4..0000000000 --- a/spec/Makefile +++ /dev/null @@ -1,12 +0,0 @@ -all: index.html webgpu.idl - -index.html: index.bs - bikeshed --die-on=everything spec index.bs - -webgpu.idl: index.bs extract-idl.py - python extract-idl.py index.bs > webgpu.idl - -online: - curl https://api.csswg.org/bikeshed/ -F file=@index.bs -F output=err - curl https://api.csswg.org/bikeshed/ -F file=@index.bs -F force=1 > index.html - python extract-idl.py index.bs > webgpu.idl diff --git a/spec/README.md b/spec/README.md deleted file mode 100644 index e05bfd2a82..0000000000 --- a/spec/README.md +++ /dev/null @@ -1,20 +0,0 @@ -# WebGPU Specification - -## Generating the specification - -The specification is written using [Bikeshed](https://tabatkins.github.io/bikeshed). - -If you have bikeshed installed locally, you can generate the specification with: - -``` -prompt> make -``` - -This simply runs bikeshed on the `index.bs` file. - -Otherwise, you can use the bikeshed Web API: - -``` -prompt> make online -``` - diff --git a/spec/extract-idl.py b/spec/extract-idl.py deleted file mode 100755 index b878d30e1c..0000000000 --- a/spec/extract-idl.py +++ /dev/null @@ -1,44 +0,0 @@ -#!/usr/bin/python - -from datetime import date -from string import Template -import re -import sys - -HEADER = """ -// Copyright (C) [$YEAR] World Wide Web Consortium, -// (Massachusetts Institute of Technology, European Research Consortium for -// Informatics and Mathematics, Keio University, Beihang). -// All Rights Reserved. -// -// This work is distributed under the W3C (R) Software License [1] in the hope -// that it will be useful, but WITHOUT ANY WARRANTY; without even the implied -// warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
-// -// [1] http://www.w3.org/Consortium/Legal/copyright-software - -// **** This file is auto-generated. Do not edit. **** - -""".lstrip() - -inputfilename = sys.argv[1] -inputfile = open(inputfilename) -idlList = [] -recording = False -idlStart = re.compile("\ - -{{GPUObjectBase}} has the following attributes: - -
- : label - :: - A label which can be used by development tools (such as error/warning messages, - browser developer tools, or platform debugging utilities) to identify the underlying - [=internal object=] to the developer. - It has no specified format, and therefore cannot be reliably machine-parsed. - - In any given situation, the user agent may or may not choose to use this label. -
- -{{GPUObjectBase}} has the following internal slots: - -
- : \[[device]], of type [=device=], readonly - :: - An internal slot holding the [=device=] which owns the [=internal object=]. -
- -### Object Descriptors ### {#object-descriptors} - -An object descriptor holds the information needed to create an object, -which is typically done via one of the `create*` methods of {{GPUDevice}}. - - - -{{GPUObjectDescriptorBase}} has the following members: - -
- : label - :: - The initial value of {{GPUObjectBase/label|GPUObjectBase.label}}. -
- -## Invalid Internal Objects & Contagious Invalidity ## {#invalidity} - -If an object is successfully created, it is valid at that moment. -An [=internal object=] may be invalid. -It may become [=invalid=] during its lifetime, but it will never become valid again. - -Issue: Consider separating "invalid" from "destroyed". -This would let validity be immutable, and only operations involving devices, buffers, and textures -(e.g. submit, map) would check those objects' `[[destroyed]]` state (explicitly). - -
- [=Invalid=] objects result from a number of situations, including: - - - If there is an error in the creation of an object, it is immediately invalid. - This can happen, for example, if the [=object descriptor=] doesn't describe a valid - object, or if there is not enough memory to allocate a resource. - - If an object is explicitly destroyed (e.g. {{GPUBuffer/destroy()|GPUBuffer.destroy()}}), - it becomes invalid. - - If the [=device=] that owns an object is lost, the object becomes invalid. -
- -
- To determine if a given {{GPUObjectBase}} |object| is valid to use with - a |targetObject|, run the following steps: - - 1. If any of the following conditions are unsatisfied return `false`: -
- - |object| is [=valid=] - - If |targetObject| is a {{GPUDevice}} |object|.{{GPUObjectBase/[[device]]}} is |targetObject|. - - Otherwise |object|.{{GPUObjectBase/[[device]]}} is |targetObject|.{{GPUObjectBase/[[device]]}}. -
- 1. Return `true`. -
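The "valid to use with" steps above can be sketched in JavaScript as follows. This is only an illustration under stated assumptions: `SketchDevice`, `SketchObject`, and the `valid` and `device` fields are hypothetical stand-ins for the spec's validity state and `[[device]]` internal slot, which are not observable through the real WebGPU API.

```javascript
// Hypothetical stand-in for a GPUDevice; not part of the WebGPU API.
class SketchDevice {
  constructor() {
    this.valid = true;
  }
}

// Hypothetical stand-in for any GPUObjectBase (buffer, texture, ...).
class SketchObject {
  constructor(device) {
    this.device = device; // models the [[device]] internal slot
    this.valid = true;    // models whether the internal object is valid
  }
}

// Returns true only if every condition of the algorithm is satisfied.
function isValidToUseWith(object, targetObject) {
  // Condition 1: the object itself must be valid.
  if (!object.valid) {
    return false;
  }
  // Conditions 2 and 3: the owning device must match the target device,
  // which is the target itself when the target is a device.
  const targetDevice =
    targetObject instanceof SketchDevice ? targetObject : targetObject.device;
  return object.device === targetDevice;
}
```

Under this model, an object created on one device is rejected when used with a different device, or with an object owned by a different device, and an object that has become invalid is rejected everywhere.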
- -## Coordinate Systems ## {#coordinate-systems} - -WebGPU's coordinate systems match DirectX and Metal's coordinate systems in a graphics pipeline. - - Y-axis is up in normalized device coordinate (NDC): point(-1.0, -1.0) in NDC is located at the bottom-left corner of NDC. - In addition, x and y in NDC should be between -1.0 and 1.0 inclusive, while z in NDC should be between 0.0 and 1.0 inclusive. - Vertices out of this range in NDC will not introduce any errors, but they will be clipped. - - Y-axis is down in [=framebuffer=] coordinate, viewport coordinate and fragment/pixel coordinate: - origin(0, 0) is located at the top-left corner in these coordinate systems. - - Window/present coordinate matches [=framebuffer=] coordinate. - - UV of origin(0, 0) in texture coordinate represents the first texel (the lowest byte) in texture memory. - - -## Programming Model ## {#programming-model} - -### Timelines ### {#programming-model-timelines} - -*This section is non-normative.* - -A computer system with a user agent at the front-end and GPU at the back-end -has components working on different timelines in parallel: - -: Content timeline -:: Associated with the execution of the Web script. - It includes calling all methods described by this specification. - -
- Steps executed on the content timeline look like this. -
- -: Device timeline -:: Associated with the GPU device operations - that are issued by the user agent. - It includes creation of adapters, devices, and GPU resources - and state objects, which are typically synchronous operations from the point - of view of the user agent part that controls the GPU, - but can live in a separate OS process. - -
- Steps executed on the device timeline look like this. -
- -: Queue timeline -:: Associated with the execution of operations - on the compute units of the GPU. It includes actual draw, copy, - and compute jobs that run on the GPU. - -
- Steps executed on the queue timeline look like this. -
- -In this specification, asynchronous operations are used when the result value -depends on work that happens on any timeline other than the [=Content timeline=]. -They are represented by callbacks and promises in JavaScript. - -
-{{GPUComputePassEncoder/dispatch(x, y, z)|GPUComputePassEncoder.dispatch()}}: - - 1. User encodes a `dispatch` command by calling a method of the - {{GPUComputePassEncoder}} which happens on the [=Content timeline=]. - 2. User issues {{GPUQueue/submit(commandBuffers)|GPUQueue.submit()}} that hands over - the {{GPUCommandBuffer}} to the user agent, which processes it - on the [=Device timeline=] by calling the OS driver to do a low-level submission. - 3. The submit gets dispatched by the GPU invocation scheduler onto the - actual compute units for execution, which happens on the [=Queue timeline=]. - -
-
-{{GPUDevice/createBuffer(descriptor)|GPUDevice.createBuffer()}}: - - 1. User fills out a {{GPUBufferDescriptor}} and creates a {{GPUBuffer}} with it, - which happens on the [=Content timeline=]. - 2. User agent creates a low-level buffer on the [=Device timeline=]. - -
-
-{{GPUBuffer/mapAsync()|GPUBuffer.mapAsync()}}: - - 1. User requests to map a {{GPUBuffer}} on the [=Content timeline=] and - gets a promise in return. - 2. User agent checks if the buffer is currently used by the GPU - and makes a reminder to itself to check back when this usage is over. - 3. After the GPU operating on [=Queue timeline=] is done using the buffer, - the user agent maps it to memory and [=resolves=] the promise. - -
- -### Memory Model ### {#programming-model-memory} - -*This section is non-normative.* - -Once a {{GPUDevice}} has been obtained during an application initialization routine, -we can describe the WebGPU platform as consisting of the following layers: - 1. User agent implementing the specification. - 2. Operating system with low-level native API drivers for this device. - 3. Actual CPU and GPU hardware. - -Each layer of the [=WebGPU platform=] may have different memory types -that the user agent needs to consider when implementing the specification: - - The script-owned memory, such as an {{ArrayBuffer}} created by the script, - is generally not accessible by a GPU driver. - - A user agent may have different processes responsible for running - the content and communication to the GPU driver. - In this case, it uses inter-process shared memory to transfer data. - - Dedicated GPUs have their own memory with high bandwidth, - while integrated GPUs typically share memory with the system. - -Most [=physical resources=] are allocated in the memory of type -that is efficient for computation or rendering by the GPU. -When the user needs to provide new data to the GPU, -the data may first need to cross the process boundary in order to reach -the user agent part that communicates with the GPU driver. -Then it may need to be made visible to the driver, -which sometimes requires a copy into driver-allocated staging memory. -Finally, it may need to be transferred to the dedicated GPU memory, -potentially changing the internal layout into one -that is most efficient for GPUs to operate on. - -All of these transitions are done by the WebGPU implementation of the user agent. - -Note: This example describes the worst case, while in practice -the implementation may not need to cross the process boundary, -or may be able to expose the driver-managed memory directly to -the user behind an `ArrayBuffer`, thus avoiding any data copies. 
- -### Multi-Threading ### {#programming-model-multi-threading} - -### Resource Usages ### {#programming-model-resource-usages} - -A [=physical resource=] can be used on GPU with an internal usage: -
- : input - :: Buffer with input data for draw or dispatch calls. Preserves the contents. - Allowed by buffer {{GPUBufferUsage/INDEX}}, buffer {{GPUBufferUsage/VERTEX}}, or buffer {{GPUBufferUsage/INDIRECT}}. - : constant - :: Resource bindings that are constant from the shader point of view. Preserves the contents. - Allowed by buffer {{GPUBufferUsage/UNIFORM}} or texture {{GPUTextureUsage/SAMPLED}}. - : storage - :: Read-write storage resource binding. - Allowed by buffer {{GPUBufferUsage/STORAGE}}. - : storage-read - :: Read-only storage resource bindings. Preserves the contents. - Allowed by buffer {{GPUBufferUsage/STORAGE}} or texture {{GPUTextureUsage/STORAGE}}. - : storage-write - :: Write-only storage resource bindings. - Allowed by texture {{GPUTextureUsage/STORAGE}}. - : attachment - :: Texture used as an output attachment in a render pass. - Allowed by texture {{GPUTextureUsage/RENDER_ATTACHMENT}}. - : attachment-read - :: Texture used as a read-only attachment in a render pass. Preserves the contents. - Allowed by texture {{GPUTextureUsage/RENDER_ATTACHMENT}}. -
- -Textures may consist of separate [=mipmap levels=] and [=array layers=], -which can be used differently at any given time. -Each such texture subresource is uniquely identified by a -[=texture=], [=mipmap level=], and -(for {{GPUTextureDimension/2d}} textures only) [=array layer=], -and [=aspect=]. - -We define subresource to be either a whole buffer, or a [=texture subresource=]. - -
-Some [=internal usages=] are compatible with others. A [=subresource=] can be in a state -that combines multiple usages together. We consider a list |U| to be -a compatible usage list if (and only if) it satisfies any of the following rules: - - Each usage in |U| is [=internal usage/input=], [=internal usage/constant=], [=internal usage/storage-read=], or [=internal usage/attachment-read=]. - - Each usage in |U| is [=internal usage/storage=]. - - Each usage in |U| is [=internal usage/storage-write=]. - - |U| contains exactly one element: [=internal usage/attachment=]. -
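The four rules above can be sketched as a small predicate. This is an illustrative sketch only, not part of the specification: the function name `isCompatibleUsageList` and the plain-string usage names are assumptions made for the example.

```javascript
// Hypothetical helper: usages are plain strings naming the internal usages
// defined in this section.
const READ_ONLY_USAGES = new Set([
  "input",
  "constant",
  "storage-read",
  "attachment-read",
]);

function isCompatibleUsageList(usages) {
  // Note: an empty list vacuously satisfies the "each usage in U" rules.
  return (
    usages.every((u) => READ_ONLY_USAGES.has(u)) ||    // all read-only
    usages.every((u) => u === "storage") ||            // all read-write storage
    usages.every((u) => u === "storage-write") ||      // all write-only storage
    (usages.length === 1 && usages[0] === "attachment") // exactly one attachment
  );
}
```

Under this sketch, `["input", "constant"]` is compatible, while `["storage", "input"]` is not, because no single rule covers both usages.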
- -Enforcing that the usages are only combined into a [=compatible usage list=] -allows the API to limit when data races can occur in working with memory. -That property makes applications written against -WebGPU more likely to run without modification on different platforms. - -Generally, when an implementation processes an operation that uses a [=subresource=] -in a different way than its current usage allows, it schedules a transition of the resource -into the new state. In some cases, like within an open {{GPURenderPassEncoder}}, such a -transition is impossible due to hardware limitations. -We define these places as usage scopes. - -The **main usage rule** is, for any one [=subresource=], its list of [=internal usages=] -within one [=usage scope=] must be a [=compatible usage list=]. - -For example, binding the same buffer for [=internal usage/storage=] as well as for -[=internal usage/input=] within the same {{GPURenderPassEncoder}} would put the encoder -as well as the owning {{GPUCommandEncoder}} into the error state. -This combination of usages does not make a [=compatible usage list=]. - -Note: Race conditions between multiple writable storage buffer/texture usages in a single [=usage scope=] are allowed. - -The [=subresources=] of textures included in the views provided to -{{GPURenderPassColorAttachment/view|GPURenderPassColorAttachment.view}} and -{{GPURenderPassColorAttachment/resolveTarget|GPURenderPassColorAttachment.resolveTarget}} -are considered to be used as [=internal usage/attachment=] for the [=usage scope=] of this render pass. - -The physical size of a [=texture subresource=] is the dimension of the -[=texture subresource=] in texels that includes the possible extra padding -to form complete [=texel blocks=] in the [=subresource=]. - - - For pixel-based {{GPUTextureFormat|GPUTextureFormats}}, the [=physical size=] is always equal to the size of the [=texture subresource=] - used by the sampling hardware. 
- - [=Textures=] in block-based compressed {{GPUTextureFormat|GPUTextureFormats}} always have a [=mipmap level=] 0 whose {{GPUTexture/[[textureSize]]}} - is a multiple of the [=texel block size=], but the lower mipmap levels might not be - multiples of the [=texel block size=] and can have paddings. - -
-Considering a {{GPUTexture}} in BC format whose {{GPUTexture/[[textureSize]]}} is {60, 60, 1}, when sampling -the {{GPUTexture}} at [=mipmap level=] 2, the sampling hardware uses {15, 15, 1} as the size of the [=texture subresource=], -while its [=physical size=] is {16, 16, 1} as the block-compression algorithm can only operate on 4x4 [=texel blocks=]. -
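The arithmetic in this example can be reproduced with a short sketch. The helper names `mipLevelSize` and `physicalSize` are hypothetical. The rule that each mipmap level halves the previous one (rounding down, with a minimum of 1) is the conventional one and is assumed here rather than taken from this section.

```javascript
// Assumed convention: level N is the base size halved N times, floored, min 1.
function mipLevelSize(baseSize, level) {
  return Math.max(1, baseSize >> level);
}

// Round a logical size up to the next multiple of the texel block size.
function physicalSize(logicalSize, blockSize) {
  return Math.ceil(logicalSize / blockSize) * blockSize;
}

const logical = mipLevelSize(60, 2);        // 15
const physical = physicalSize(logical, 4);  // 16
```

For the 60x60 BC texture above, mipmap level 2 has a logical size of 15 texels per side, which is padded to 16 so that it consists of whole 4x4 texel blocks.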
- -### Synchronization ### {#programming-model-synchronization} - -For each [=subresource=] of a [=physical resource=], its set of -[=internal usage=] flags is tracked on the [=Queue timeline=]. - -Issue: This section will need to be revised to support multiple queues. - -On the [=Queue timeline=], there is an ordered sequence of [=usage scopes=]. -For the duration of each scope, the set of [=internal usage=] flags of any given -[=subresource=] is constant. -A [=subresource=] may transition to new usages at the boundaries between [=usage scope=]s. - -This specification defines the following [=usage scopes=]: - -- Outside of a pass (in {{GPUCommandEncoder}}), each (non-state-setting) command is one usage scope - (e.g. {{GPUCommandEncoder/copyBufferToTexture()}}). -- In a compute pass, each dispatch command ({{GPUComputePassEncoder/dispatch()}} or - {{GPUComputePassEncoder/dispatchIndirect()}}) is one usage scope. - A subresource is "used" in the usage scope if it's accessible by the command. - Within a dispatch, every subresource in every currently bound {{GPUBindGroup}} - is "used" in the usage scope. - State-setting compute pass commands, like - {{GPUProgrammablePassEncoder/setBindGroup(index, bindGroup, dynamicOffsets)}}, - do not contribute directly to a usage scope; they instead change the - state that is checked in dispatch commands. -- One render pass is one usage scope. - A subresource is "used" in the usage scope if it's referenced by any - (state-setting or non-state-setting) command. For example, in - {{GPUProgrammablePassEncoder/setBindGroup(index, bindGroup, dynamicOffsets)}}, - every subresource in `bindGroup` is "used" in the render pass's usage scope. - -Issue: The above should probably talk about [=GPU commands=]. But we don't have a way to -reference specific GPU commands (like dispatch) yet. - -
- The above rules mean the following example resource usages **are** - included in [=usage scope validation=]: - - - In a render pass, subresources used in any - {{GPUProgrammablePassEncoder/setBindGroup(index, bindGroup, dynamicOffsets)|setBindGroup()}} - call, regardless of whether the currently bound pipeline's - shader or layout actually depends on these bindings, - or the bind group is shadowed by another 'set' call. - - A buffer used in any {{GPURenderEncoderBase/setVertexBuffer()|setVertexBuffer()}} - call, regardless of whether any draw call depends on this buffer, - or this buffer is shadowed by another 'set' call. - - A buffer used in any {{GPURenderEncoderBase/setIndexBuffer()|setIndexBuffer()}} - call, regardless of whether any draw call depends on this buffer, - or this buffer is shadowed by another 'set' call. - - A texture subresource used as a color attachment, resolve attachment, or - depth/stencil attachment in {{GPURenderPassDescriptor}} by - {{GPUCommandEncoder/beginRenderPass()|beginRenderPass()}}, - regardless of whether the shader actually depends on these attachments. - - Resources used in bind group entries with visibility 0, or visible only - to the compute stage but used in a render pass (or vice versa). -
- -During command encoding, every usage of a subresource is recorded in one of the -[=usage scopes=] in the command buffer. -For each [=usage scope=], the implementation performs -usage scope validation by composing the list of all -[=internal usage=] flags of each [=subresource=] used in the [=usage scope=]. -If any of those lists is not a [=compatible usage list=], -{{GPUCommandEncoder/finish()|GPUCommandEncoder.finish()}} -generates a {{GPUValidationError}} in the current error scope. - - -## Core Internal Objects ## {#core-internal-objects} - -### Adapters ### {#adapters} - -An adapter represents an implementation of WebGPU on the system. -Each adapter identifies both an instance of a hardware accelerator (e.g. GPU or CPU) and -an instance of a browser's implementation of WebGPU on top of that accelerator. - -If an [=adapter=] becomes unavailable, it becomes [=invalid=]. -Once invalid, it never becomes valid again. -Any [=devices=] on the adapter, and [=internal objects=] owned by those devices, -also become invalid. - -Note: -An [=adapter=] may be a physical display adapter (GPU), but it could also be -a software renderer. -A returned [=adapter=] could refer to different physical adapters, or to -different browser codepaths or system drivers on the same physical adapters. -Applications can hold onto multiple [=adapters=] at once (via {{GPUAdapter}}) -(even if some are [=invalid=]), -and two of these could refer to different instances of the same physical -configuration (e.g. if the GPU was reset or disconnected and reconnected). - -An [=adapter=] has the following internal slots: - -
- : \[[features]], of type [=ordered set=]<{{GPUFeatureName}}>, readonly - :: - The [=features=] which can be used to create devices on this adapter. - - : \[[limits]], of type [=supported limits=], readonly - :: - The [=limit/better|best=] limits which can be used to create devices on this adapter. - - Each adapter limit must be the same or [=limit/better=] than its default value - in [=supported limits=]. - - : \[[current]], of type boolean - :: - Indicates whether the adapter is allowed to vend new devices at this time. - Its value may change at any time. - - It is initially set to `true` inside {{GPU/requestAdapter()}}. - It becomes `false` inside "[=lose the device=]" and "[=mark adapters stale=]". - Once set to `false`, it cannot become `true` again. - - Note: - This mechanism ensures that various adapter-creation scenarios look similar to applications, - so they can easily be robust to more scenarios with less testing: first initialization, - reinitialization due to an unplugged adapter, reinitialization due to a test - {{GPUDevice/destroy()|GPUDevice.destroy()}} call, etc. It also ensures applications use - the latest system state to make decisions about which adapter to use. -
- -[=Adapters=] are exposed via {{GPUAdapter}}. - -### Devices ### {#devices} - -A device is the logical instantiation of an [=adapter=], -through which [=internal objects=] are created. -It can be shared across multiple [=agents=] (e.g. dedicated workers). - -A [=device=] is the exclusive owner of all [=internal objects=] created from it: -when the [=device=] is [=lose the device|lost=], it and all objects created on it (directly, e.g. -{{GPUDevice/createTexture()}}, or indirectly, e.g. {{GPUTexture/createView()}}) become -[=invalid=]. - -Issue: Define "ownership". - -A [=device=] has the following internal slots: - -
- : \[[adapter]], of type [=adapter=], readonly - :: - The [=adapter=] from which this device was created. - - : \[[features]], of type [=ordered set=]<{{GPUFeatureName}}>, readonly - :: - The [=features=] which can be used on this device. - No additional features can be used, even if the underlying [=adapter=] can support them. - - : \[[limits]], of type [=supported limits=], readonly - :: - The limits which can be used on this device. - No [=limit/better=] limits can be used, even if the underlying [=adapter=] can support them. -
- -
- When a new device |device| is created from [=adapter=] |adapter| - with {{GPUDeviceDescriptor}} |descriptor|: - - - Set |device|.{{device/[[adapter]]}} to |adapter|. - - - Set |device|.{{device/[[features]]}} to the [=ordered set|set=] of values in - |descriptor|.{{GPUDeviceDescriptor/nonGuaranteedFeatures}}. - - - Let |device|.{{device/[[limits]]}} be a [=supported limits=] object with the default values. - For each (|key|, |value|) pair in |descriptor|.{{GPUDeviceDescriptor/nonGuaranteedLimits}}, set the - member corresponding to |key| in |device|.{{device/[[limits]]}} to the value |value|. -
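As a rough illustration of these steps, the sketch below models the internal slots as a plain object. The function name `createDeviceState` and the object shape are assumptions made for the example. `DEFAULT_LIMITS` lists only a few of the default values from the limits table in this document.

```javascript
// A subset of the default limit values from the limits table (illustrative).
const DEFAULT_LIMITS = {
  maxTextureDimension1D: 8192,
  maxTextureDimension2D: 8192,
  maxTextureDimension3D: 2048,
  maxBindGroups: 4,
};

// Hypothetical model of device creation: slots are plain properties.
function createDeviceState(adapter, descriptor = {}) {
  // Start from the defaults, then overwrite each requested limit.
  const limits = { ...DEFAULT_LIMITS };
  const requested = descriptor.nonGuaranteedLimits ?? {};
  for (const [key, value] of Object.entries(requested)) {
    limits[key] = value;
  }
  return {
    adapter,
    features: new Set(descriptor.nonGuaranteedFeatures ?? []),
    limits,
  };
}
```

A descriptor that requests only `maxBindGroups: 8` yields a device whose other limits keep their default values, even if the adapter could support better ones.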
- -Any time the user agent needs to revoke access to a device, it calls -[=lose the device=](device, `undefined`). - -
- To lose the device(|device|, |reason|): - - 1. Set |device|.{{device/[[adapter]]}}.{{adapter/[[current]]}} to `false`. - 1. Issue: explain how to get from |device| to its "primary" {{GPUDevice}}. - 1. Resolve {{GPUDevice/lost|GPUDevice.lost}} with a new {{GPUDeviceLostInfo}} with - {{GPUDeviceLostInfo/reason}} set to |reason| and - {{GPUDeviceLostInfo/message}} set to an implementation-defined value. - - Note: {{GPUDeviceLostInfo/message}} should not disclose unnecessary user/system - information and should never be parsed by applications. -
- -[=Devices=] are exposed via {{GPUDevice}}. - -## Optional Capabilities ## {#optional-capabilities} - -WebGPU [=adapters=] and [=devices=] have capabilities, which -describe WebGPU functionality that differs between different implementations, -typically due to hardware or system software constraints. -A [=capability=] is either a [=feature=] or a [=limit=]. - -### Features ### {#features} - -A feature is a set of optional WebGPU functionality that is not supported -on all implementations, typically due to hardware or system software constraints. - -Each {{GPUAdapter}} exposes a set of available features. -Only those features may be requested in {{GPUAdapter/requestDevice()}}. - -Functionality that is part of a feature may only be used if the feature -was requested at device creation. See the [[#feature-index|Feature Index]] -for a description of the functionality each feature enables. - -### Limits ### {#limits} - -Each limit is a numeric limit on the usage of WebGPU on a device. - -A supported limits object has a value for every defined limit. -Each [=adapter=] has a set of [=supported limits=], and -[=devices=] are {{GPUDeviceDescriptor/nonGuaranteedLimits|created}} with specific [=supported limits=] in place. -The device limits are enforced regardless of the adapter's limits. - -One limit value may be better than another. -A [=limit/better=] limit value always relaxes validation, enabling strictly -more programs to be valid. For each limit, "better" is defined. - -Note: -Setting "better" limits may not necessarily be desirable, as they may have a performance impact. -Because of this, and to improve portability across devices and implementations, -applications should generally request the "worst" limits that work for their content -(ideally, the default values). - -Each limit also has a default value. -Every [=adapter=] is guaranteed to support the default value or [=limit/better=]. 
-The default is used if a value is not explicitly specified in {{GPUDeviceDescriptor/nonGuaranteedLimits}}. - - - - - -
Limit name Type [=limit/Better=] [=limit/Default=] -
maxTextureDimension1D - {{GPUSize32}} Higher 8192 -
- The maximum allowed value for the {{GPUTextureDescriptor/size}}.[=Extent3D/width=] - of a [=texture=] created with {{GPUTextureDescriptor/dimension}} {{GPUTextureDimension/"1d"}}. - -
maxTextureDimension2D - {{GPUSize32}} Higher 8192 -
- The maximum allowed value for the {{GPUTextureDescriptor/size}}.[=Extent3D/width=] and {{GPUTextureDescriptor/size}}.[=Extent3D/height=] - of a [=texture=] created with {{GPUTextureDescriptor/dimension}} {{GPUTextureDimension/"2d"}}. - -
maxTextureDimension3D - {{GPUSize32}} Higher 2048 -
- The maximum allowed value for the {{GPUTextureDescriptor/size}}.[=Extent3D/width=], {{GPUTextureDescriptor/size}}.[=Extent3D/height=] and {{GPUTextureDescriptor/size}}.[=Extent3D/depthOrArrayLayers=] - of a [=texture=] created with {{GPUTextureDescriptor/dimension}} {{GPUTextureDimension/"3d"}}. - -
maxTextureArrayLayers - {{GPUSize32}} Higher 2048 -
- The maximum allowed value for the {{GPUTextureDescriptor/size}}.[=Extent3D/depthOrArrayLayers=] - of a [=texture=] created with {{GPUTextureDescriptor/dimension}} {{GPUTextureDimension/"1d"}} or {{GPUTextureDimension/"2d"}}. - -
maxBindGroups - {{GPUSize32}} Higher 4 -
- The maximum number of {{GPUBindGroupLayout|GPUBindGroupLayouts}} - allowed in {{GPUPipelineLayoutDescriptor/bindGroupLayouts}} - when creating a {{GPUPipelineLayout}}. - -
maxDynamicUniformBuffersPerPipelineLayout - {{GPUSize32}} Higher 8 -
- The maximum number of {{GPUBindGroupLayoutDescriptor/entries}} for which: - - - [$layout entry binding type$] is {{GPUBufferBindingType/"uniform"}}, and - - {{GPUBindGroupLayoutEntry/buffer}}.{{GPUBufferBindingLayout/hasDynamicOffset}} is `true`, - - across all {{GPUPipelineLayoutDescriptor/bindGroupLayouts}} - when creating a {{GPUPipelineLayout}}. - -
maxDynamicStorageBuffersPerPipelineLayout - {{GPUSize32}} Higher 4 -
- The maximum number of {{GPUBindGroupLayoutDescriptor/entries}} for which: - - - [$layout entry binding type$] is {{GPUBufferBindingType/"storage"}}, and - - {{GPUBindGroupLayoutEntry/buffer}}.{{GPUBufferBindingLayout/hasDynamicOffset}} is `true`, - - across all {{GPUPipelineLayoutDescriptor/bindGroupLayouts}} - when creating a {{GPUPipelineLayout}}. - -
maxSampledTexturesPerShaderStage - {{GPUSize32}} Higher 16 -
- For each possible {{GPUShaderStage}} `stage`, - the maximum number of {{GPUBindGroupLayoutDescriptor/entries}} for which: - - - {{GPUBindGroupLayoutEntry/texture}} is not `undefined`, and - - {{GPUBindGroupLayoutEntry/visibility}} includes `stage`, - - across all {{GPUPipelineLayoutDescriptor/bindGroupLayouts}} - when creating a {{GPUPipelineLayout}}. - -
maxSamplersPerShaderStage - {{GPUSize32}} Higher 16 -
- For each possible {{GPUShaderStage}} `stage`, - the maximum number of {{GPUBindGroupLayoutDescriptor/entries}} for which: - - - [=Binding member=] is {{GPUBindGroupLayoutEntry/sampler}}, and - - {{GPUBindGroupLayoutEntry/visibility}} includes `stage`, - - across all {{GPUPipelineLayoutDescriptor/bindGroupLayouts}} - when creating a {{GPUPipelineLayout}}. - -
maxStorageBuffersPerShaderStage - {{GPUSize32}} Higher 4 -
- For each possible {{GPUShaderStage}} `stage`, - the maximum number of {{GPUBindGroupLayoutDescriptor/entries}} for which: - - - [$layout entry binding type$] is {{GPUBufferBindingType/"storage"}}, and - - {{GPUBindGroupLayoutEntry/visibility}} includes `stage`, - - across all {{GPUPipelineLayoutDescriptor/bindGroupLayouts}} - when creating a {{GPUPipelineLayout}}. - -
maxStorageTexturesPerShaderStage - {{GPUSize32}} Higher 4 -
- For each possible {{GPUShaderStage}} `stage`, - the maximum number of {{GPUBindGroupLayoutDescriptor/entries}} for which: - - - [=Binding member=] is {{GPUBindGroupLayoutEntry/storageTexture}}, and - - {{GPUBindGroupLayoutEntry/visibility}} includes `stage`, - - across all {{GPUPipelineLayoutDescriptor/bindGroupLayouts}} - when creating a {{GPUPipelineLayout}}. - -
maxUniformBuffersPerShaderStage - {{GPUSize32}} Higher 12 -
- For each possible {{GPUShaderStage}} `stage`, - the maximum number of {{GPUBindGroupLayoutDescriptor/entries}} for which: - - - [$layout entry binding type$] is {{GPUBufferBindingType/"uniform"}}, and - - {{GPUBindGroupLayoutEntry/visibility}} includes `stage`, - - across all {{GPUPipelineLayoutDescriptor/bindGroupLayouts}} - when creating a {{GPUPipelineLayout}}. - -
maxUniformBufferBindingSize - {{GPUSize32}} Higher 16384 -
- The maximum {{GPUBufferBinding}}.{{GPUBufferBinding/size}} for bindings for which the - [$layout entry binding type$] is {{GPUBufferBindingType/"uniform"}}. - -
maxStorageBufferBindingSize - {{GPUSize32}} Higher 134217728 (128 MiB) -
- The maximum {{GPUBufferBinding}}.{{GPUBufferBinding/size}} for bindings for which the - [$layout entry binding type$] is {{GPUBufferBindingType/"storage"}} or {{GPUBufferBindingType/"read-only-storage"}}. - -
maxVertexBuffers - {{GPUSize32}} Higher 8 -
- The maximum number of {{GPUVertexState/buffers}} - when creating a {{GPURenderPipeline}}. - -
maxVertexAttributes - {{GPUSize32}} Higher 16 -
- The maximum number of {{GPUVertexBufferLayout/attributes}} - in total across {{GPUVertexState/buffers}} - when creating a {{GPURenderPipeline}}. - -
maxVertexBufferArrayStride - {{GPUSize32}} Higher 2048 -
- The maximum allowed {{GPUVertexBufferLayout/arrayStride}} - when creating a {{GPURenderPipeline}}. -
- -#### GPUAdapterLimits #### {#gpu-adapterlimits} - -{{GPUAdapterLimits}} exposes the [=limits=] supported by an adapter. -See {{GPUAdapter/limits|GPUAdapter.limits}}. - - - -#### GPUSupportedFeatures #### {#gpu-supportedfeatures} - -{{GPUSupportedFeatures}} is a [=setlike=] interface. Its [=set entries=] are -the {{GPUFeatureName}} values of the [=features=] supported by an adapter or -device. - - - - -# Initialization # {#initialization} - -## Examples ## {#initialization-examples} - -Issue: -Need a robust example like the one in ErrorHandling.md, which handles all situations. -Possibly also include a simple example with no handling. - -## navigator.gpu ## {#navigator-gpu} - -A {{GPU}} object is available in the {{Window}} and {{DedicatedWorkerGlobalScope}} contexts through the {{Navigator}} -and {{WorkerNavigator}} interfaces respectively and is exposed via `navigator.gpu`: - - - -## GPU ## {#gpu-interface} - -GPU is the entry point to WebGPU. - - - -{{GPU}} has the following methods: - -
- : requestAdapter(options) - :: - Requests an [=adapter=] from the user agent. - The user agent chooses whether to return an adapter, and, if so, - chooses according to the provided options. - -
- **Called on:** {{GPU}} |this|. - - **Arguments:** -
-                |options|: Criteria used to select the adapter.
-            
- - **Returns:** {{Promise}}<{{GPUAdapter}}?> - - 1. Let |promise| be [=a new promise=]. - 1. Issue the following steps on the [=Device timeline=] of |this|: -
- 1. If the user agent chooses to return an adapter, it should: - - 1. Create an [=adapter=] |adapter| with {{adapter/[[current]]}} set to `true`, - chosen according to the rules in - [[#adapter-selection]] and the criteria in |options|. - - 1. [=Resolve=] |promise| with a new {{GPUAdapter}} encapsulating |adapter|. - - 1. Otherwise, |promise| [=resolves=] with `null`. -
- 1. Return |promise|. - - -
-
- -{{GPU}} has the following internal slots: - -
- : \[[previously_returned_adapters]], of type [=ordered set=]<[=adapter=]> - :: - The set of [=adapters=] that have been returned via {{GPU/requestAdapter()}}. - It is used, then cleared, in [=mark adapters stale=]. -
- -Upon any change in the system's state that could affect the result of any {{GPU/requestAdapter()}} -call, the user agent *should* [=mark adapters stale=]. For example: - -- A physical adapter is added/removed (via plug, driver update, TDR, etc.) -- The system's power configuration has changed (laptop unplugged, power settings changed, etc.) - -Additionally, [=mark adapters stale=] may be scheduled at any time. -User agents may choose to do this often even when there has been no system state change (e.g. -several seconds after the last call to {{GPUAdapter/requestDevice()}}). -This has no effect on well-formed applications, obfuscates real system state changes, and makes -developers more aware that calling {{GPU/requestAdapter()}} again is always necessary before -calling {{GPUAdapter/requestDevice()}}. - -
- To mark adapters stale: - - 1. For each |adapter| in `navigator.gpu.`{{GPU/[[previously_returned_adapters]]}}: - 1. Set |adapter|.{{GPUAdapter/[[adapter]]}}.{{adapter/[[current]]}} to `false`. - 1. [=list/Empty=] `navigator.gpu.`{{GPU/[[previously_returned_adapters]]}}. - - Issue: Update here if an `adaptersadded`/`adapterschanged` event is introduced. -
- -
- Request a {{GPUAdapter}}: -
-        const adapter = await navigator.gpu.requestAdapter(/* ... */);
-        const features = adapter.features;
-        // ...
-    
-
- -### Adapter Selection ### {#adapter-selection} - -GPURequestAdapterOptions -provides hints to the user agent indicating what -configuration is suitable for the application. - - - - - -{{GPURequestAdapterOptions}} has the following members: - -
- : powerPreference - :: - Optionally provides a hint indicating what class of [=adapter=] should be selected from - the system's available adapters. - - The value of this hint may influence which adapter is chosen, but it must not - influence whether an adapter is returned or not. - - Note: - The primary utility of this hint is to influence which GPU is used in a multi-GPU system. - For instance, some laptops have a low-power integrated GPU and a high-performance - discrete GPU. - - Note: - Depending on the exact hardware configuration, such as battery status and attached displays - or removable GPUs, the user agent may select different [=adapters=] given the same power - preference. - Typically, given the same hardware configuration and state and - `powerPreference`, the user agent is likely to select the same adapter. - - It must be one of the following values: - -
- : `undefined` (or not present) - :: - Provides no hint to the user agent. - - : "low-power" - :: - Indicates a request to prioritize power savings over performance. - - Note: - Generally, content should use this if it is unlikely to be constrained by drawing - performance; for example, if it renders only one frame per second, draws only relatively - simple geometry with simple shaders, or uses a small HTML canvas element. - Developers are encouraged to use this value if their content allows, since it may - significantly improve battery life on portable devices. - - : "high-performance" - :: - Indicates a request to prioritize performance over power consumption. - - Note: - By choosing this value, developers should be aware that, for [=devices=] created on the - resulting adapter, user agents are more likely to force device loss, in order to save - power by switching to a lower-power adapter. - Developers are encouraged to only specify this value if they believe it is absolutely - necessary, since it may significantly decrease battery life on portable devices. -
-
- -## GPUAdapter ## {#gpu-adapter} - -A {{GPUAdapter}} encapsulates an [=adapter=], -and describes its capabilities ([=features=] and [=limits=]). - -To get a {{GPUAdapter}}, use {{GPU/requestAdapter()}}. - - - -{{GPUAdapter}} has the following attributes: - -
- : name - :: - A human-readable name identifying the adapter. - The contents are implementation-defined. - - : features - :: - The set of values in `this`.{{GPUAdapter/[[adapter]]}}.{{adapter/[[features]]}}. - - : limits - :: - The limits in `this`.{{GPUAdapter/[[adapter]]}}.{{adapter/[[limits]]}}. -
- -{{GPUAdapter}} has the following internal slots: - -
- : \[[adapter]], of type [=adapter=], readonly - :: - The [=adapter=] to which this {{GPUAdapter}} refers. -
- -{{GPUAdapter}} has the following methods: - -
- : requestDevice(descriptor) - :: - Requests a [=device=] from the [=adapter=]. - -
- **Called on:** {{GPUAdapter}} |this|. - - **Arguments:** -
-                |descriptor|: Description of the {{GPUDevice}} to request.
-            
- - **Returns:** {{Promise}}<{{GPUDevice}}?> - - 1. Let |promise| be [=a new promise=]. - 1. Let |adapter| be |this|.{{GPUAdapter/[[adapter]]}}. - 1. Issue the following steps to the [=Device timeline=]: -
- 1. If any of the following requirements are unmet, - [=reject=] |promise| with a {{TypeError}} and stop. - -
- - The set of values in |descriptor|.{{GPUDeviceDescriptor/nonGuaranteedFeatures}} - must be a subset of those in |adapter|.{{adapter/[[features]]}}. - - - Each key in |descriptor|.{{GPUDeviceDescriptor/nonGuaranteedLimits}} - must be the name of a member of [=supported limits=]. -
- - 1. If any of the following requirements are unmet, - [=reject=] |promise| with an {{OperationError}} and stop. - -
- - For each type of limit in [=supported limits=], the value of that - limit in |descriptor|.{{GPUDeviceDescriptor/nonGuaranteedLimits}} - must be no [=limit/better=] than the value of that limit in - |adapter|.{{adapter/[[limits]]}}. -
- - 1. If |adapter|.{{adapter/[[current]]}} is `false`, - or the user agent otherwise cannot fulfill the request: - - 1. Let |device| be a new [=device=]. - 1. [=Lose the device=](|device|, `undefined`). - - Note: - This makes |adapter|.{{adapter/[[current]]}} `false`, if it wasn't already. - - Note: - User agents should consider issuing developer-visible warnings in - most or all cases when this occurs. Applications should perform - reinitialization logic starting with {{GPU/requestAdapter()}}. - - 1. [=Resolve=] |promise| with a new {{GPUDevice}} encapsulating |device|, - and stop. - - 1. [=Resolve=] |promise| with a new {{GPUDevice}} object encapsulating - [=a new device=] with the capabilities described by |descriptor|. -
- 1. Return |promise|. - -
-
- -### GPUDeviceDescriptor ### {#gpudevicedescriptor} - -{{GPUDeviceDescriptor}} describes a device request. - - - -{{GPUDeviceDescriptor}} has the following members: - -
- : nonGuaranteedFeatures - :: - The set of {{GPUFeatureName}} values in this sequence defines the exact set of - [=features=] that must be enabled on the device. - - : nonGuaranteedLimits - :: - Defines the exact [=limits=] that must be enabled on the device. - Each key must be the name of a member of [=supported limits=]. - - -
- -#### GPUFeatureName #### {#gpufeaturename} - -Each {{GPUFeatureName}} identifies a set of functionality which, if available, -allows additional usages of WebGPU that would have otherwise been invalid. - - - -## GPUDevice ## {#gpu-device} - -A {{GPUDevice}} encapsulates a [=device=] and exposes -the functionality of that device. - -{{GPUDevice}} is the top-level interface through which [=WebGPU interfaces=] are created. - -To get a {{GPUDevice}}, use {{GPUAdapter/requestDevice()}}. - - - -{{GPUDevice}} has the following attributes: - -
- : features - :: - A set containing the {{GPUFeatureName}} values of the features - supported by the device (i.e. the ones with which it was created). - - : limits - :: - Exposes the limits supported by the device - (which are exactly the ones with which it was created). - - Issue: Should this be an `interface GPUSupportedLimits`? - - : queue - :: - The primary {{GPUQueue}} for this device. -
- -{{GPUDevice}} has the following internal slots: - -
- : \[[device]], of type [=device=], readonly - :: - The [=device=] that this {{GPUDevice}} refers to. -
- -{{GPUDevice}} has the methods listed in its WebIDL definition above. -Those not defined here are defined elsewhere in this document. - -
- : destroy() - :: - Destroys the [=device=], preventing further operations on it. - Outstanding asynchronous operations will fail. - -
- **Called on:** {{GPUDevice}} |this|. - - 1. [=Lose the device=](|this|.{{GPUDevice/[[device]]}}, - {{GPUDeviceLostReason/"destroyed"}}). -
- - Note: - Since no further operations can occur on this device, implementations can free resource - allocations and abort outstanding asynchronous operations immediately. -
- -{{GPUDevice}} objects are [=serializable objects=]. - -
- The steps to serialize a GPUDevice object, - given |value|, |serialized|, and |forStorage|, are: - 1. Set |serialized|.agentCluster to be the [=surrounding agent=]'s [=agent cluster=]. - 1. If |serialized|.agentCluster's [=cross-origin isolated=] is false, throw a "{{DataCloneError}}". - 1. If |forStorage| is `true`, throw a "{{DataCloneError}}". - 1. Set |serialized|.device to the value of |value|.{{GPUDevice/[[device]]}}. -
- -
- The steps to deserialize a GPUDevice object, - given |serialized| and |value|, are: - 1. If |serialized|.agentCluster is not the [=surrounding agent=]'s [=agent cluster=], throw a "{{DataCloneError}}". - 1. Set |value|.{{GPUDevice/[[device]]}} to |serialized|.device. -
- -Issue: `GPUDevice` doesn't really need the cross-origin policy restriction. -It should be usable from multiple agents regardless. Once we describe the serialization -of buffers, textures, and queues - the COOP+COEP logic should be moved in there. - -# Buffers # {#buffers} - -## GPUBuffer ## {#buffer-interface} - -Issue: define buffer (internal object) - -A {{GPUBuffer}} represents a block of memory that can be used in GPU operations. -Data is stored in linear layout, meaning that each byte of the allocation can be -addressed by its offset from the start of the {{GPUBuffer}}, subject to alignment -restrictions depending on the operation. Some {{GPUBuffer|GPUBuffers}} can be -mapped which makes the block of memory accessible via an {{ArrayBuffer}} called -its mapping. - -{{GPUBuffer|GPUBuffers}} are created via -{{GPUDevice/createBuffer(descriptor)|GPUDevice.createBuffer(descriptor)}} -that returns a new buffer in the [=buffer state/mapped=] or [=buffer state/unmapped=] state. - - - -{{GPUBuffer}} has the following internal slots: - -
- : \[[size]] of type {{GPUSize64}}. - :: - The length of the {{GPUBuffer}} allocation in bytes. - - : \[[usage]] of type {{GPUBufferUsageFlags}}. - :: - The allowed usages for this {{GPUBuffer}}. - - : \[[state]] of type [=buffer state=]. - :: - The current state of the {{GPUBuffer}}. - - : \[[mapping]] of type {{ArrayBuffer}} or {{Promise}} or `null`. - :: - The mapping for this {{GPUBuffer}}. The {{ArrayBuffer}} isn't directly accessible - and is instead accessed through views into it, called the mapped ranges, that are - stored in {{GPUBuffer/[[mapped_ranges]]}}. - - Issue(gpuweb/gpuweb#605): Specify {{GPUBuffer/[[mapping]]}} in terms of `DataBlock` similarly - to `AllocateArrayBuffer`? - - : \[[mapping_range]] of type [=list=]<[=Number=]> or `null`. - :: - The range of this {{GPUBuffer}} that is mapped. - - : \[[mapped_ranges]] of type [=list=]<{{ArrayBuffer}}> or `null`. - :: - The {{ArrayBuffer}}s returned via {{GPUBuffer/getMappedRange}} to the application. They are tracked - so they can be detached when {{GPUBuffer/unmap}} is called. - - : \[[map_mode]] of type {{GPUMapModeFlags}}. - :: - The {{GPUMapModeFlags}} of the last call to {{GPUBuffer/mapAsync()}} (if any). -
- -Issue: {{GPUBuffer/[[usage]]}} is differently named from {{GPUTexture/[[textureUsage]]}}. -We should make it consistent. - -Each {{GPUBuffer}} has a current buffer state on the [=Content timeline=] -which is one of the following: - - - "mapped" where the {{GPUBuffer}} is - available for CPU operations on its content. - - "mapped at creation" where the {{GPUBuffer}} was - just created and is available for CPU operations on its content. - - "mapping pending" where the {{GPUBuffer}} is - being made available for CPU operations on its content. - - "unmapped" where the {{GPUBuffer}} is - available for GPU operations. - - "destroyed" where the {{GPUBuffer}} is - no longer available for any operations except {{GPUBuffer/destroy}}. - -Note: -{{GPUBuffer/[[size]]}} and {{GPUBuffer/[[usage]]}} are immutable once the -{{GPUBuffer}} has been created. - -
- Note: {{GPUBuffer}} has a state machine with the following states. - ({{GPUBuffer/[[mapping]]}}, {{GPUBuffer/[[mapping_range]]}}, - and {{GPUBuffer/[[mapped_ranges]]}} are `null` when not specified.) - - - [=buffer state/unmapped=] and [=buffer state/destroyed=]. - - [=buffer state/mapped=] or [=buffer state/mapped at creation=] with an - {{ArrayBuffer}} typed {{GPUBuffer/[[mapping]]}}, a sequence of two - numbers in {{GPUBuffer/[[mapping_range]]}} and a sequence of {{ArrayBuffer}} - in {{GPUBuffer/[[mapped_ranges]]}} - - [=buffer state/mapping pending=] with a {{Promise}} typed {{GPUBuffer/[[mapping]]}}. -
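The state machine described above can be sketched as a small transition function. This is an illustrative TypeScript sketch, not part of the specification: the operation names (in particular `mapResolved`, which stands for the queue-timeline step that completes a pending map) and the function name `nextBufferState` are assumptions, and an operation that is invalid or has no effect in a given state is modeled as leaving the state unchanged.

```typescript
type BufferState =
  | "mapped"
  | "mapped at creation"
  | "mapping pending"
  | "unmapped"
  | "destroyed";

// Hypothetical operation names; "mapResolved" models the queued step of mapAsync().
type BufferOp = "mapAsync" | "mapResolved" | "unmap" | "destroy";

function nextBufferState(state: BufferState, op: BufferOp): BufferState {
  if (op === "destroy") {
    // destroy() always ends in the destroyed state.
    return "destroyed";
  }
  if (op === "mapAsync") {
    // mapAsync() is only valid on an unmapped buffer.
    return state === "unmapped" ? "mapping pending" : state;
  }
  if (op === "mapResolved") {
    // The queued operation only completes a mapping that is still pending.
    return state === "mapping pending" ? "mapped" : state;
  }
  // unmap(): valid in every state except unmapped and destroyed.
  return state === "unmapped" || state === "destroyed" ? state : "unmapped";
}
```

For example, under these assumptions `nextBufferState("mapping pending", "unmap")` yields `"unmapped"`, mirroring the step in which a pending mapping promise is rejected and the buffer returns to the unmapped state.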
- -{{GPUBuffer}} is {{Serializable}}. It is a reference to an internal buffer -object, and {{Serializable}} means that the reference can be *copied* between -realms (threads/workers), allowing multiple realms to access it concurrently. -Since {{GPUBuffer}} has internal state (mapped, destroyed), that state is -internally-synchronized - these state changes occur atomically across realms. - -## Buffer Creation ## {#buffer-creation} - -### {{GPUBufferDescriptor}} ### {#GPUBufferDescriptor} - -This specifies the options to use in creating a {{GPUBuffer}}. - - - -
- validating GPUBufferDescriptor(device, descriptor) - 1. If device is lost return `false`. - 1. If any of the bits of |descriptor|'s {{GPUBufferDescriptor/usage}} aren't present in this device's [[allowed buffer usages]] return `false`. - 1. If both the {{GPUBufferUsage/MAP_READ}} and {{GPUBufferUsage/MAP_WRITE}} bits of |descriptor|'s {{GPUBufferDescriptor/usage}} attribute are set, return `false`. - 1. Return `true`. -
- -## Buffer Usage ## {#buffer-usage} - - - -
- : createBuffer(descriptor) - :: - Creates a {{GPUBuffer}}. - -
- **Called on:** {{GPUDevice}} |this|. - - **Arguments:** -
-                |descriptor|: Description of the {{GPUBuffer}} to create.
-            
- - **Returns:** {{GPUBuffer}} - - 1. If any of the following conditions are unsatisfied, return an error buffer and stop. -
- - |this| is a [=valid=] {{GPUDevice}}. - - |descriptor|.{{GPUBufferDescriptor/usage}} is a subset of |this|.[[allowed buffer usages]]. - - If |descriptor|.{{GPUBufferDescriptor/usage}} contains {{GPUBufferUsage/MAP_READ}}: - - |descriptor|.{{GPUBufferDescriptor/usage}} contains no other flags - except {{GPUBufferUsage/COPY_DST}}. - - If |descriptor|.{{GPUBufferDescriptor/usage}} contains {{GPUBufferUsage/MAP_WRITE}}: - - |descriptor|.{{GPUBufferDescriptor/usage}} contains no other flags - except {{GPUBufferUsage/COPY_SRC}}. - - If |descriptor|.{{GPUBufferDescriptor/mappedAtCreation}} is `true`: - - |descriptor|.{{GPUBufferDescriptor/size}} is a multiple of 4. - - Issue(gpuweb/gpuweb#605): Explain that the resulting error buffer can still be mapped at creation. - - Issue(gpuweb/gpuweb#605): Explain what are a {{GPUDevice}}'s `[[allowed buffer usages]]`. -
- - 1. Let |b| be a new {{GPUBuffer}} object. - 1. Set |b|.{{GPUBuffer/[[size]]}} to |descriptor|.{{GPUBufferDescriptor/size}}. - 1. Set |b|.{{GPUBuffer/[[usage]]}} to |descriptor|.{{GPUBufferDescriptor/usage}}. - 1. If |descriptor|.{{GPUBufferDescriptor/mappedAtCreation}} is `true`: - - 1. Set |b|.{{GPUBuffer/[[mapping]]}} to a new {{ArrayBuffer}} of size |b|.{{GPUBuffer/[[size]]}}. - 1. Set |b|.{{GPUBuffer/[[mapping_range]]}} to `[0, descriptor.size]`. - 1. Set |b|.{{GPUBuffer/[[mapped_ranges]]}} to `[]`. - 1. Set |b|.{{GPUBuffer/[[state]]}} to [=buffer state/mapped at creation=]. - - Else: - - 1. Set |b|.{{GPUBuffer/[[mapping]]}} to `null`. - 1. Set |b|.{{GPUBuffer/[[mapping_range]]}} to `null`. - 1. Set |b|.{{GPUBuffer/[[mapped_ranges]]}} to `null`. - 1. Set |b|.{{GPUBuffer/[[state]]}} to [=buffer state/unmapped=]. - - 1. Set each byte of |b|'s allocation to zero. - 1. Return |b|. - - Note: it is valid to set {{GPUBufferDescriptor/mappedAtCreation}} to `true` without {{GPUBufferUsage/MAP_READ}} - or {{GPUBufferUsage/MAP_WRITE}} in {{GPUBufferDescriptor/usage}}. This can be used to set the buffer's - initial data. - -
- -
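The descriptor checks in the createBuffer() steps above can be sketched as a standalone predicate. This is a minimal TypeScript sketch under stated assumptions: the flag values in `BufferUsage`, the `BufferDescriptorSketch` type, and the function names are hypothetical, and the device validity check is left out so that the example stays self-contained.

```typescript
// Hypothetical flag values for illustration; they are not taken from the text above.
const BufferUsage = {
  MAP_READ: 0x0001,
  MAP_WRITE: 0x0002,
  COPY_SRC: 0x0004,
  COPY_DST: 0x0008,
  VERTEX: 0x0020,
} as const;

interface BufferDescriptorSketch {
  size: number;
  usage: number;
  mappedAtCreation?: boolean;
}

// True when `flags` contains `flag` plus at least one bit outside `flag | allowedPartner`.
function hasFlagsBeyond(flags: number, flag: number, allowedPartner: number): boolean {
  return (flags & flag) !== 0 && (flags & ~(flag | allowedPartner)) !== 0;
}

function isBufferDescriptorValid(
  descriptor: BufferDescriptorSketch,
  allowedUsages: number,
): boolean {
  const { size, usage, mappedAtCreation } = descriptor;
  // usage must be a subset of the allowed buffer usages.
  if ((usage & ~allowedUsages) !== 0) return false;
  // MAP_READ may only be combined with COPY_DST.
  if (hasFlagsBeyond(usage, BufferUsage.MAP_READ, BufferUsage.COPY_DST)) return false;
  // MAP_WRITE may only be combined with COPY_SRC.
  if (hasFlagsBeyond(usage, BufferUsage.MAP_WRITE, BufferUsage.COPY_SRC)) return false;
  // A buffer mapped at creation needs a size that is a multiple of 4.
  if (mappedAtCreation === true && size % 4 !== 0) return false;
  return true;
}
```

In the specification a failed check yields an error buffer rather than a boolean; the predicate only isolates the conditions themselves.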
- -## Buffer Destruction ## {#buffer-destruction} - -An application that no longer requires a {{GPUBuffer}} can choose to lose -access to it before garbage collection by calling {{GPUBuffer/destroy()}}. - -Note: This allows the user agent to reclaim the GPU memory associated with the {{GPUBuffer}} -once all previously submitted operations using it are complete. - -
- : destroy() - :: - Destroys the {{GPUBuffer}}. - -
- **Called on:** {{GPUBuffer}} |this|. - - **Returns:** {{undefined}} - - 1. If |this|.{{GPUBuffer/[[state]]}} is [=buffer state/mapped=] or [=buffer state/mapped at creation=]: - 1. Run the steps to unmap |this|. - - 1. Set |this|.{{GPUBuffer/[[state]]}} to [=buffer state/destroyed=]. - - Issue: Handle error buffers once we have a description of the error monad. -
-
- -## Buffer Mapping ## {#buffer-mapping} - -An application can request to map a {{GPUBuffer}} so that it can access the buffer's -content via {{ArrayBuffer}}s that represent part of the {{GPUBuffer}}'s -allocation. Mapping a {{GPUBuffer}} is requested asynchronously with -{{GPUBuffer/mapAsync()}} so that the user agent can ensure the GPU -has finished using the {{GPUBuffer}} before the application can access its content. -Once the {{GPUBuffer}} is mapped, the application can synchronously ask for access -to ranges of its content with {{GPUBuffer/getMappedRange}}. A mapped {{GPUBuffer}} -cannot be used by the GPU and must be unmapped using {{GPUBuffer/unmap}} before -work using it can be submitted to the [=Queue timeline=]. - -Issue(gpuweb/gpuweb#605): Add client-side validation that a mapped buffer can - only be unmapped and destroyed on the worker on which it was mapped. Likewise - {{GPUBuffer/getMappedRange}} can only be called on that worker. - - - -
- : mapAsync(mode, offset, size) - :: - Maps the given range of the {{GPUBuffer}} and resolves the returned {{Promise}} when the - {{GPUBuffer}}'s content is ready to be accessed with {{GPUBuffer/getMappedRange()}}. - -
- **Called on:** {{GPUBuffer}} |this|. - - **Arguments:** -
-                |mode|: Whether the buffer should be mapped for reading or writing.
-                |offset|: Offset in bytes into the buffer to the start of the range to map.
-                |size|: Size in bytes of the range to map.
-            
- - **Returns:** {{Promise}}<{{undefined}}> - - Issue(gpuweb/gpuweb#605): Handle error buffers once we have a description of the error monad. - - 1. If |size| is unspecified: - 1. Let |rangeSize| be max(0, |this|.{{GPUBuffer/[[size]]}} - |offset|). - - Otherwise, let |rangeSize| be |size|. - - 1. If any of the following conditions are unsatisfied: -
- - |this| is a [=valid=] {{GPUBuffer}}. - - |offset| is a multiple of 8. - - |rangeSize| is a multiple of 4. - - |offset| + |rangeSize| is less than or equal to |this|.{{GPUBuffer/[[size]]}}. - - |this|.{{GPUBuffer/[[state]]}} is [=buffer state/unmapped=]. - - |mode| contains exactly one of {{GPUMapMode/READ}} or {{GPUMapMode/WRITE}}. - - If |mode| contains {{GPUMapMode/READ}} then |this|.{{GPUBuffer/[[usage]]}} must contain {{GPUBufferUsage/MAP_READ}}. - - If |mode| contains {{GPUMapMode/WRITE}} then |this|.{{GPUBuffer/[[usage]]}} must contain {{GPUBufferUsage/MAP_WRITE}}. - - Issue: Do we validate that |mode| contains only valid flags? -
- - Then: - 1. Record a validation error on the current scope. - 1. Return [=a promise rejected with=] an {{OperationError}} on the [=Device timeline=]. - - 1. Let |p| be a new {{Promise}}. - 1. Set |this|.{{GPUBuffer/[[mapping]]}} to |p|. - 1. Set |this|.{{GPUBuffer/[[state]]}} to [=buffer state/mapping pending=]. - 1. Set |this|.{{GPUBuffer/[[map_mode]]}} to |mode|. - 1. Enqueue an operation on the default queue's [=Queue timeline=] that will execute the following: -
- 1. If |this|.{{GPUBuffer/[[state]]}} is [=buffer state/mapping pending=]: - - 1. Let |m| be a new {{ArrayBuffer}} of size |rangeSize|. - 1. Set the content of |m| to the content of |this|'s allocation starting at offset |offset| and for |rangeSize| bytes. - 1. Set |this|.{{GPUBuffer/[[mapping]]}} to |m|. - 1. Set |this|.{{GPUBuffer/[[state]]}} to [=buffer state/mapped=]. - 1. Set |this|.{{GPUBuffer/[[mapping_range]]}} to [|offset|, |offset| + |rangeSize|]. - 1. Set |this|.{{GPUBuffer/[[mapped_ranges]]}} to `[]`. - - 1. Resolve |p|. -
- 1. Return |p|. -
- - : getMappedRange(offset, size) - :: - Returns an {{ArrayBuffer}} with the contents of the {{GPUBuffer}} in the given mapped range. - -
- **Called on:** {{GPUBuffer}} |this|. - - **Arguments:** -
-                |offset|: Offset in bytes into the buffer to return buffer contents from.
-                |size|: Size in bytes of the {{ArrayBuffer}} to return.
-            
- - **Returns:** {{ArrayBuffer}} - - 1. If |size| is unspecified: - 1. Let |rangeSize| be max(0, |this|.{{GPUBuffer/[[size]]}} - |offset|). - - Otherwise, let |rangeSize| be |size|. - - 1. If any of the following conditions are unsatisfied, throw an {{OperationError}} and stop. -
- - |this|.{{GPUBuffer/[[state]]}} is [=buffer state/mapped=] or [=buffer state/mapped at creation=]. - - |offset| is a multiple of 8. - - |rangeSize| is a multiple of 4. - - |offset| is greater than or equal to |this|.{{GPUBuffer/[[mapping_range]]}}[0]. - - |offset| + |rangeSize| is less than or equal to |this|.{{GPUBuffer/[[mapping_range]]}}[1]. - - [|offset|, |offset| + |rangeSize|) does not overlap another range in |this|.{{GPUBuffer/[[mapped_ranges]]}}. - - Note: It is always valid to get mapped ranges of a {{GPUBuffer}} that is - [=buffer state/mapped at creation=], even if it is [=invalid=], because - the [=Content timeline=] might not know it is invalid. - - Issue: Consider aligning mapAsync offset to 8 to match this. -
- - 1. Let |m| be a new {{ArrayBuffer}} of size |rangeSize| pointing at the content - of |this|.{{GPUBuffer/[[mapping]]}} at offset |offset| - |this|.{{GPUBuffer/[[mapping_range]]}}[0]. - - 1. [=list/Append=] |m| to |this|.{{GPUBuffer/[[mapped_ranges]]}}. - - 1. Return |m|. -
- - : unmap() - :: - Unmaps the mapped range of the {{GPUBuffer}} and makes its contents available for use by the - GPU again. - -
- **Called on:** {{GPUBuffer}} |this|. - - **Returns:** {{undefined}} - - 1. If any of the following conditions are unsatisfied, generate a validation - error and stop. -
- - |this|.{{GPUBuffer/[[state]]}} is not [=buffer state/unmapped=] - - |this|.{{GPUBuffer/[[state]]}} is not [=buffer state/destroyed=] - - Note: It is valid to unmap an error {{GPUBuffer}} that is - [=buffer state/mapped at creation=] because the [=Content timeline=] - might not know it is an error {{GPUBuffer}}. -
- - 1. If |this|.{{GPUBuffer/[[state]]}} is [=buffer state/mapping pending=]: - - 1. [=Reject=] {{GPUBuffer/[[mapping]]}} with an {{AbortError}}. - 1. Set |this|.{{GPUBuffer/[[mapping]]}} to `null`. - - 1. If |this|.{{GPUBuffer/[[state]]}} is [=buffer state/mapped=] or [=buffer state/mapped at creation=]: - - 1. If one of the two following conditions holds: - - - |this|.{{GPUBuffer/[[state]]}} is [=buffer state/mapped at creation=] - - |this|.{{GPUBuffer/[[state]]}} is [=buffer state/mapped=] and |this|.{{GPUBuffer/[[map_mode]]}} contains {{GPUMapMode/WRITE}} - - Then: - 1. Enqueue an operation on the default queue's [=Queue timeline=] that updates the |this|.{{GPUBuffer/[[mapping_range]]}} - of |this|'s allocation to the content of |this|.{{GPUBuffer/[[mapping]]}}. - - 1. Detach each {{ArrayBuffer}} in |this|.{{GPUBuffer/[[mapped_ranges]]}} from its content. - 1. Set |this|.{{GPUBuffer/[[mapping]]}} to `null`. - 1. Set |this|.{{GPUBuffer/[[mapping_range]]}} to `null`. - 1. Set |this|.{{GPUBuffer/[[mapped_ranges]]}} to `null`. - - 1. Set |this|.{{GPUBuffer/[[state]]}} to [=buffer state/unmapped=]. - - Note: When a {{GPUBufferUsage/MAP_READ}} buffer (not currently mapped at creation) is - unmapped, any local modifications done by the application to the mapped ranges - {{ArrayBuffer}} are discarded and will not affect the content of follow-up mappings. -
-
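The range checks performed by getMappedRange() above (default size, alignment, containment in the mapped range, and no overlap with previously returned ranges) can be sketched as follows. This is an illustrative TypeScript sketch, not the specification's algorithm: it assumes that previously returned ranges are tracked as half-open `[start, end)` pairs rather than as {{ArrayBuffer}}s, and the names `resolveRangeSize` and `canGetMappedRange` are hypothetical.

```typescript
// A half-open byte range [start, end).
type ByteRange = readonly [number, number];

// Default size: everything from `offset` to the end of the buffer, never negative.
function resolveRangeSize(bufferSize: number, offset: number, size?: number): number {
  return size === undefined ? Math.max(0, bufferSize - offset) : size;
}

function canGetMappedRange(
  mappingRange: ByteRange,
  mappedRanges: readonly ByteRange[],
  offset: number,
  rangeSize: number,
): boolean {
  // Alignment requirements from the validation list.
  if (offset % 8 !== 0) return false;
  if (rangeSize % 4 !== 0) return false;
  // The requested range must lie inside the currently mapped range.
  const end = offset + rangeSize;
  if (offset < mappingRange[0]) return false;
  if (end > mappingRange[1]) return false;
  // Reject the request if it intersects a range that was already handed out:
  // two half-open ranges intersect when each starts before the other ends.
  return !mappedRanges.some(([start, stop]) => offset < stop && start < end);
}
```

With a mapping of `[0, 64)` and one range `[0, 16)` already returned, a request for offset 8 and size 16 is rejected because it overlaps, while a request for offset 16 and size 16 is accepted.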
- -# Textures and Texture Views # {#textures} - -Issue: define texture (internal object) - -Issue: define mipmap level, array layer, aspect, slice (concepts) - -## GPUTexture ## {#texture-interface} - - - -{{GPUTexture}} has the following internal slots: - -
- : \[[textureSize]] of type {{GPUExtent3D}}. - :: - The size of the {{GPUTexture}} in texels in [=mipmap level=] 0. - - : \[[mipLevelCount]] of type {{GPUIntegerCoordinate}}. - :: - The total number of the mipmap levels of the {{GPUTexture}}. - - : \[[sampleCount]] of type {{GPUSize32}}. - :: - The number of samples in each texel of the {{GPUTexture}}. - - : \[[dimension]] of type {{GPUTextureDimension}}. - :: - The dimension of the {{GPUTexture}}. - - : \[[format]] of type {{GPUTextureFormat}}. - :: - The format of the {{GPUTexture}}. - - : \[[textureUsage]] of type {{GPUTextureUsageFlags}}. - :: - The allowed usages for this {{GPUTexture}}. - -
- -
- compute render extent(baseSize, mipLevel) - - **Arguments:** - - {{GPUExtent3D}} |baseSize| - - {{GPUSize32}} |mipLevel| - - **Returns:** {{GPUExtent3DDict}} - - 1. Let |extent| be a new {{GPUExtent3DDict}} object. - 1. Set |extent|.{{GPUExtent3DDict/width}} to max(1, |baseSize|.[=Extent3D/width=] ≫ |mipLevel|). - 1. Set |extent|.{{GPUExtent3DDict/height}} to max(1, |baseSize|.[=Extent3D/height=] ≫ |mipLevel|). - 1. Set |extent|.{{GPUExtent3DDict/depthOrArrayLayers}} to 1. - 1. Return |extent|. -
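The compute render extent steps above can be sketched directly. This is a minimal TypeScript illustration, assuming non-negative integer inputs; the `Extent3DSketch` type and the function name are hypothetical stand-ins for {{GPUExtent3DDict}} and the algorithm.

```typescript
interface Extent3DSketch {
  width: number;
  height: number;
  depthOrArrayLayers: number;
}

function computeRenderExtent(baseSize: Extent3DSketch, mipLevel: number): Extent3DSketch {
  // Dividing by 2^mipLevel and flooring matches a right shift for non-negative
  // integers, without the 32-bit shift-count wraparound of JavaScript's >> operator.
  const scale = 2 ** mipLevel;
  return {
    width: Math.max(1, Math.floor(baseSize.width / scale)),
    height: Math.max(1, Math.floor(baseSize.height / scale)),
    // The render extent always covers a single layer.
    depthOrArrayLayers: 1,
  };
}
```

For a 256 by 128 texture, mip level 2 gives a 64 by 32 extent, and mip level 8 gives 1 by 1 because each dimension is clamped to at least one texel.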
- -Issue: share this definition with the part of the specification that describes sampling. - -### Texture Creation ### {#texture-creation} - - - - - - - -
- : createTexture(descriptor) - :: - Creates a {{GPUTexture}}. - -
- **Called on:** {{GPUDevice}} this. - - **Arguments:** -
-                descriptor: Description of the {{GPUTexture}} to create.
-            
- - **Returns:** {{GPUTexture}} - - Issue: Describe {{GPUDevice/createTexture()}} algorithm steps. -
-
- -### Texture Destruction ### {#texture-destruction} - -An application that no longer requires a {{GPUTexture}} can choose to lose access to it before -garbage collection by calling {{GPUTexture/destroy()}}. - -Note: This allows the user agent to reclaim the GPU memory associated with the {{GPUTexture}} once -all previously submitted operations using it are complete. - -
- : destroy() - :: - Destroys the {{GPUTexture}}. - -
- **Called on:** {{GPUTexture}} this. - - **Returns:** {{undefined}} - - Issue: Describe {{GPUTexture/destroy()}} algorithm steps. -
-
- -## GPUTextureView ## {#gpu-textureview} - - - -{{GPUTextureView}} has the following internal slots: - -
- : \[[texture]] - :: - The {{GPUTexture}} into which this is a view. - - : \[[descriptor]] - :: - The {{GPUTextureViewDescriptor}} describing this texture view. - - All optional fields of {{GPUTextureViewDescriptor}} are defined. - : \[[renderExtent]] - :: - For renderable views, this is the effective {{GPUExtent3DDict}} for rendering. - - Note: this extent depends on the {{GPUTextureViewDescriptor/baseMipLevel}}. - -
- -### Texture View Creation ### {#texture-view-creation} - - - - - - - -
- : createView(descriptor) - :: - Creates a {{GPUTextureView}}. - -
- **Called on:** {{GPUTexture}} |this|. - - **Arguments:** -
-                |descriptor|: Description of the {{GPUTextureView}} to create.
-            
- - **Returns:** |view|, of type {{GPUTextureView}}. - - 1. Set |descriptor| to the result of [$resolving GPUTextureViewDescriptor defaults$] with |descriptor|. - 1. Issue the following steps on the [=Device timeline=] of |this|: -
- 1. If any of the following requirements are unmet: -
- - |this| is [=valid=] - - If the |descriptor|.{{GPUTextureViewDescriptor/aspect}} is -
- : {{GPUTextureAspect/"stencil-only"}} - :: |this|.{{GPUTexture/[[format]]}} must be a [[#depth-formats|depth-stencil format]] - which contains a stencil aspect. - - : {{GPUTextureAspect/"depth-only"}} - :: |this|.{{GPUTexture/[[format]]}} must be a [[#depth-formats|depth-stencil format]] - which contains a depth aspect. -
- - |descriptor|.{{GPUTextureViewDescriptor/mipLevelCount}} must be > 0. - - |descriptor|.{{GPUTextureViewDescriptor/baseMipLevel}} + - |descriptor|.{{GPUTextureViewDescriptor/mipLevelCount}} must be ≤ - |this|.{{GPUTexture/[[mipLevelCount]]}}. - - |descriptor|.{{GPUTextureViewDescriptor/arrayLayerCount}} must be > 0. - - |descriptor|.{{GPUTextureViewDescriptor/baseArrayLayer}} + - |descriptor|.{{GPUTextureViewDescriptor/arrayLayerCount}} must be ≤ - the [$array layer count$] of |this|. - - |descriptor|.{{GPUTextureViewDescriptor/format}} must be |this|.{{GPUTexture/[[format]]}}. -
Issue: Allow for creating views with compatible formats as well.
- - If |descriptor|.{{GPUTextureViewDescriptor/dimension}} is: -
- : {{GPUTextureViewDimension/"1d"}} - :: |this|.{{GPUTexture/[[dimension]]}} must be {{GPUTextureDimension/"1d"}}. - :: |descriptor|.{{GPUTextureViewDescriptor/arrayLayerCount}} must be `1`. - - : {{GPUTextureViewDimension/"2d"}} - :: |this|.{{GPUTexture/[[dimension]]}} must be {{GPUTextureDimension/"2d"}}. - :: |descriptor|.{{GPUTextureViewDescriptor/arrayLayerCount}} must be `1`. - - : {{GPUTextureViewDimension/"2d-array"}} - :: |this|.{{GPUTexture/[[dimension]]}} must be {{GPUTextureDimension/"2d"}}. - - : {{GPUTextureViewDimension/"cube"}} - :: |this|.{{GPUTexture/[[dimension]]}} must be {{GPUTextureDimension/"2d"}}. - :: |descriptor|.{{GPUTextureViewDescriptor/arrayLayerCount}} must be `6`. - :: |this|.{{GPUTexture/[[textureSize]]}}.[=Extent3D/width=] must be - |this|.{{GPUTexture/[[textureSize]]}}.[=Extent3D/height=]. - - : {{GPUTextureViewDimension/"cube-array"}} - :: |this|.{{GPUTexture/[[dimension]]}} must be {{GPUTextureDimension/"2d"}}. - :: |descriptor|.{{GPUTextureViewDescriptor/arrayLayerCount}} must be a multiple of `6`. - :: |this|.{{GPUTexture/[[textureSize]]}}.[=Extent3D/width=] must be - |this|.{{GPUTexture/[[textureSize]]}}.[=Extent3D/height=]. - - : {{GPUTextureViewDimension/"3d"}} - :: |this|.{{GPUTexture/[[dimension]]}} must be {{GPUTextureDimension/"3d"}}. - :: |descriptor|.{{GPUTextureViewDescriptor/arrayLayerCount}} must be `1`. -
-
- - Then: - 1. Generate a {{GPUValidationError}} in the current scope with appropriate error message. - 1. Return a new [=invalid=] {{GPUTextureView}}. - - 1. Let |view| be a new {{GPUTextureView}} object. - 1. Set |view|.{{GPUTextureView/[[texture]]}} to |this|. - 1. Set |view|.{{GPUTextureView/[[descriptor]]}} to |descriptor|. - 1. If |this|.{{GPUTexture/[[textureUsage]]}} contains {{GPUTextureUsage/RENDER_ATTACHMENT}}: - 1. Let |renderExtent| be [$compute render extent$](|this|.{{GPUTexture/[[textureSize]]}}, |descriptor|.{{GPUTextureViewDescriptor/baseMipLevel}}). - 1. Set |view|.{{GPUTextureView/[[renderExtent]]}} to |renderExtent|. - 1. Return |view|. -
-
-
- -
- When resolving GPUTextureViewDescriptor defaults for {{GPUTextureViewDescriptor}} - |descriptor| run the following steps: - - 1. Let |resolved| be a copy of |descriptor|. - 1. If |resolved|.{{GPUTextureViewDescriptor/format}} is `undefined`, - set |resolved|.{{GPUTextureViewDescriptor/format}} to |texture|.{{GPUTexture/[[format]]}}. - 1. If |resolved|.{{GPUTextureViewDescriptor/mipLevelCount}} is `undefined`, - set |resolved|.{{GPUTextureViewDescriptor/mipLevelCount}} to |texture|.{{GPUTexture/[[mipLevelCount]]}} - − {{GPUTextureViewDescriptor/baseMipLevel}}. - 1. If |resolved|.{{GPUTextureViewDescriptor/dimension}} is `undefined` and - |texture|.{{GPUTexture/[[dimension]]}} is: -
- : {{GPUTextureDimension/"1d"}} - :: Set |resolved|.{{GPUTextureViewDescriptor/dimension}} to {{GPUTextureViewDimension/"1d"}}. - - : {{GPUTextureDimension/"2d"}} - :: Set |resolved|.{{GPUTextureViewDescriptor/dimension}} to {{GPUTextureViewDimension/"2d"}}. - - : {{GPUTextureDimension/"3d"}} - :: Set |resolved|.{{GPUTextureViewDescriptor/dimension}} to {{GPUTextureViewDimension/"3d"}}. -
- 1. If |resolved|.{{GPUTextureViewDescriptor/arrayLayerCount}} is `undefined` and - |resolved|.{{GPUTextureViewDescriptor/dimension}} is: -
- : {{GPUTextureViewDimension/"1d"}}, {{GPUTextureViewDimension/"2d"}}, or - {{GPUTextureViewDimension/"3d"}} - :: Set |resolved|.{{GPUTextureViewDescriptor/arrayLayerCount}} to `1`. - - : {{GPUTextureViewDimension/"cube"}} - :: Set |resolved|.{{GPUTextureViewDescriptor/arrayLayerCount}} to `6`. - - : {{GPUTextureViewDimension/"2d-array"}} or {{GPUTextureViewDimension/"cube-array"}} - :: Set |resolved|.{{GPUTextureViewDescriptor/arrayLayerCount}} to - |texture|.{{GPUTexture/[[textureSize]]}}.[=Extent3D/depthOrArrayLayers=] − - {{GPUTextureViewDescriptor/baseArrayLayer}}. -
- - 1. Return |resolved|. -
- -
- To determine the array layer count of {{GPUTexture}} |texture|, run the - following steps: - - 1. If |texture|.{{GPUTexture/[[dimension]]}} is: -
- : {{GPUTextureDimension/"1d"}} or {{GPUTextureDimension/"3d"}} - :: Return `1`. - - : {{GPUTextureDimension/"2d"}} - :: Return |texture|.{{GPUTexture/[[textureSize]]}}.[=Extent3D/depthOrArrayLayers=]. -
-
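The two algorithms above (resolving view descriptor defaults and determining the array layer count) can be sketched together. This is an illustrative TypeScript sketch under stated assumptions: the texture is passed in explicitly, `baseMipLevel` and `baseArrayLayer` are assumed to default to 0 when omitted, and all type and function names are hypothetical.

```typescript
type TextureDimension = "1d" | "2d" | "3d";
type ViewDimension = "1d" | "2d" | "2d-array" | "cube" | "cube-array" | "3d";

interface TextureSketch {
  dimension: TextureDimension;
  format: string;
  mipLevelCount: number;
  depthOrArrayLayers: number;
}

interface ViewDescriptorSketch {
  format?: string;
  dimension?: ViewDimension;
  baseMipLevel?: number;
  mipLevelCount?: number;
  baseArrayLayer?: number;
  arrayLayerCount?: number;
}

// Only "2d" textures have array layers; "1d" and "3d" textures count as one layer.
function arrayLayerCount(texture: TextureSketch): number {
  return texture.dimension === "2d" ? texture.depthOrArrayLayers : 1;
}

function defaultViewLayerCount(
  dimension: ViewDimension,
  texture: TextureSketch,
  baseArrayLayer: number,
): number {
  if (dimension === "cube") return 6;
  if (dimension === "2d-array" || dimension === "cube-array") {
    return texture.depthOrArrayLayers - baseArrayLayer;
  }
  return 1; // "1d", "2d", and "3d" views
}

function resolveViewDefaults(
  texture: TextureSketch,
  descriptor: ViewDescriptorSketch,
): Required<ViewDescriptorSketch> {
  // Assumed defaults (not stated in the text above): both base indices start at 0.
  const baseMipLevel = descriptor.baseMipLevel ?? 0;
  const baseArrayLayer = descriptor.baseArrayLayer ?? 0;
  // An omitted view dimension takes the texture's own dimension.
  const dimension: ViewDimension = descriptor.dimension ?? texture.dimension;
  return {
    format: descriptor.format ?? texture.format,
    dimension,
    baseMipLevel,
    mipLevelCount: descriptor.mipLevelCount ?? texture.mipLevelCount - baseMipLevel,
    baseArrayLayer,
    arrayLayerCount:
      descriptor.arrayLayerCount ?? defaultViewLayerCount(dimension, texture, baseArrayLayer),
  };
}
```

The resolved descriptor has every optional field filled in, which is what the later validation steps of createView() rely on.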
- -## Texture Formats ## {#texture-formats} - -The name of the format specifies the order of components, bits per component, -and data type for the component. - - * `r`, `g`, `b`, `a` = red, green, blue, alpha - * `unorm` = unsigned normalized - * `snorm` = signed normalized - * `uint` = unsigned int - * `sint` = signed int - * `float` = floating point - -If the format has the `-srgb` suffix, then sRGB conversions from gamma to linear -and vice versa are applied during the reading and writing of color values in the -shader. Compressed texture formats are provided by [=features=]. Their naming -should follow the convention here, with the texture name as a prefix, e.g. -`etc2-rgba8unorm`. - -The texel block is a single addressable element of the textures in pixel-based {{GPUTextureFormat}}s, -and a single compressed block of the textures in block-based compressed {{GPUTextureFormat}}s. - -The texel block width and texel block height specify the dimensions of one [=texel block=]. - - For pixel-based {{GPUTextureFormat}}s, the [=texel block width=] and [=texel block height=] are always 1. - - For block-based compressed {{GPUTextureFormat}}s, the [=texel block width=] is the number of texels in each row of one [=texel block=], - and the [=texel block height=] is the number of texel rows in one [=texel block=]. - -The texel block size of a {{GPUTextureFormat}} is the number of bytes used to store one [=texel block=]. -The [=texel block size=] of each {{GPUTextureFormat}} is constant except for {{GPUTextureFormat/"stencil8"}}, {{GPUTextureFormat/"depth24plus"}}, and {{GPUTextureFormat/"depth24plus-stencil8"}}. - - - -The depth aspect of the {{GPUTextureFormat/"depth24plus"}} and {{GPUTextureFormat/"depth24plus-stencil8"}} -formats may be implemented as either a 24-bit unsigned normalized value ("depth24unorm") -or a 32-bit IEEE 754 floating point value ("depth32float"). - -Issue: add something on GPUAdapter(?) 
that gives an estimate of the bytes per texel of "stencil8" - -The {{GPUTextureFormat/"stencil8"}} format may be implemented as -either a real "stencil8", or "depth24stencil8", where the depth aspect is -hidden and inaccessible. - -Note: -While the precision of depth32float is strictly higher than the precision of -depth24unorm for all values in the representable range (0.0 to 1.0), -note that the set of representable values is not exactly the same: -for depth24unorm, 1 ULP has a constant value of 1 / (2<sup>24</sup> − 1); -for depth32float, 1 ULP has a variable value no greater than 1 / (2<sup>24</sup>). - -Issue: {{GPUTextureFormat/"rgb9e5ufloat"}} cannot be used as a color attachment. - -# Samplers # {#samplers} - -## GPUSampler ## {#sampler-interface} - -A {{GPUSampler}} encodes transformations and filtering information that can -be used in a shader to interpret texture resource data. - -{{GPUSampler|GPUSamplers}} are created via {{GPUDevice/createSampler(descriptor)|GPUDevice.createSampler(optional descriptor)}}, -which returns a new sampler object. - - - -{{GPUSampler}} has the following internal slots: - -
- : \[[descriptor]], of type {{GPUSamplerDescriptor}}, readonly - :: - The {{GPUSamplerDescriptor}} with which the {{GPUSampler}} was created. - - : \[[isComparison]] of type {{boolean}}. - :: - Whether the {{GPUSampler}} is used as a comparison sampler. - - : \[[isFiltering]] of type {{boolean}}. - :: - Whether the {{GPUSampler}} weights multiple samples of a texture. -
- -## Sampler Creation ## {#sampler-creation} - -### {{GPUSamplerDescriptor}} ### {#GPUSamplerDescriptor} - -A {{GPUSamplerDescriptor}} specifies the options to use to create a {{GPUSampler}}. - - - -- {{GPUSamplerDescriptor/addressModeU}}, {{GPUSamplerDescriptor/addressModeV}}, - and {{GPUSamplerDescriptor/addressModeW}} specify the address modes for the texture width, - height, and depth coordinates, respectively. -- {{GPUSamplerDescriptor/magFilter}} specifies the sampling behavior when the sample footprint - is smaller than or equal to one texel. -- {{GPUSamplerDescriptor/minFilter}} specifies the sampling behavior when the sample footprint - is larger than one texel. -- {{GPUSamplerDescriptor/mipmapFilter}} specifies behavior for sampling between two mipmap levels. -- {{GPUSamplerDescriptor/lodMinClamp}} and {{GPUSamplerDescriptor/lodMaxClamp}} specify the minimum and - maximum levels of detail, respectively, used internally when sampling a texture. -- If {{GPUSamplerDescriptor/compare}} is provided, the sampler will be a comparison sampler with the specified - {{GPUCompareFunction}}. -- {{GPUSamplerDescriptor/maxAnisotropy}} specifies the maximum anisotropy value clamp used by the sampler. - - Note: most implementations support {{GPUSamplerDescriptor/maxAnisotropy}} values in range between 1 and 16, inclusive. - -Issue: explain how LOD is calculated and if there are differences here between platforms. -Issue: explain what anisotropic sampling is - -{{GPUAddressMode}} describes the behavior of the sampler if the sample footprint extends beyond -the bounds of the sampled texture. - -Issue: Describe a "sample footprint" in greater detail. - - - -
- : "clamp-to-edge" - :: - Texture coordinates are clamped between 0.0 and 1.0, inclusive. - - : "repeat" - :: - Texture coordinates wrap to the other side of the texture. - - : "mirror-repeat" - :: - Texture coordinates wrap to the other side of the texture, but the texture is flipped - when the integer part of the coordinate is odd. -
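The three address modes above can be sketched as a mapping of one normalized texture coordinate. This is only an illustration: the `AddressMode` type and `applyAddressMode` function are hypothetical names, and treating the "integer part" as the floor of the coordinate (so that negative coordinates mirror sensibly) is an assumption, not something the text specifies.

```typescript
// Local stand-in for the spec's address mode enum (hypothetical name).
type AddressMode = "clamp-to-edge" | "repeat" | "mirror-repeat";

// Maps a normalized texture coordinate into [0, 1] according to the address mode.
function applyAddressMode(coord: number, mode: AddressMode): number {
  switch (mode) {
    case "clamp-to-edge":
      return Math.min(1, Math.max(0, coord));
    case "repeat":
      // Keep only the fractional part; floor also handles negative coordinates.
      return coord - Math.floor(coord);
    case "mirror-repeat": {
      // Assumption: the "integer part" is floor(coord).
      const whole = Math.floor(coord);
      const frac = coord - whole;
      // Flip the texture when the integer part is odd.
      return Math.abs(whole % 2) === 1 ? 1 - frac : frac;
    }
  }
}
```

For example, a coordinate of 1.25 maps to 0.25 under "repeat" and to 0.75 under "mirror-repeat".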
- -{{GPUFilterMode}} describes the behavior of the sampler if the sample footprint does not exactly -match one texel. - - - -
- : "nearest" - :: - Return the value of the texel nearest to the texture coordinates. - - : "linear" - :: - Select two texels in each dimension and return a linear interpolation between their values. -
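The difference between the two filter modes can be illustrated in one dimension. The sketch below assumes texel centers at `(i + 0.5) / n` and clamps out-of-range indices to the edge; both conventions, and the names `FilterKind` and `sample1D`, are assumptions for illustration rather than behavior defined here.

```typescript
// Local stand-in for the spec's filter mode enum (hypothetical name).
type FilterKind = "nearest" | "linear";

// Samples a one-dimensional row of texels at normalized coordinate u in [0, 1].
function sample1D(texels: number[], u: number, mode: FilterKind): number {
  const n = texels.length;
  // Clamp an index into the valid texel range.
  const clampIndex = (i: number): number => Math.min(n - 1, Math.max(0, i));
  if (mode === "nearest") {
    // Pick the texel whose cell contains u.
    return texels[clampIndex(Math.floor(u * n))];
  }
  // "linear": blend the two texels whose centers surround u.
  const x = u * n - 0.5;
  const i0 = Math.floor(x);
  const t = x - i0;
  return texels[clampIndex(i0)] * (1 - t) + texels[clampIndex(i0 + 1)] * t;
}
```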
- -{{GPUCompareFunction}} specifies the behavior of a comparison sampler. If a comparison sampler is -used in a shader, an input value is compared to the sampled texture value, and the result of this -comparison test (0.0f for pass, or 1.0f for fail) is used in the filtering operation. - -Issue: describe how filtering interacts with comparison sampling. - - - -
- : "never" - :: - Comparison tests never pass. - - : "less" - :: - A provided value passes the comparison test if it is less than the sampled value. - - : "equal" - :: - A provided value passes the comparison test if it is equal to the sampled value. - - : "less-equal" - :: - A provided value passes the comparison test if it is less than or equal to the sampled value. - - : "greater" - :: - A provided value passes the comparison test if it is greater than the sampled value. - - : "not-equal" - :: - A provided value passes the comparison test if it is not equal to the sampled value. - - : "greater-equal" - :: - A provided value passes the comparison test if it is greater than or equal to the sampled value. - - : "always" - :: - Comparison tests always pass. -
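The comparison functions listed above amount to a small predicate table. A minimal sketch, with `CompareFn` and `passesComparison` as hypothetical names, and with the provided value on the left-hand side of each comparison as the definitions describe:

```typescript
// Local stand-in for the spec's compare function enum (hypothetical name).
type CompareFn =
  | "never" | "less" | "equal" | "less-equal"
  | "greater" | "not-equal" | "greater-equal" | "always";

// Returns whether the provided value passes the test against the sampled value.
function passesComparison(fn: CompareFn, provided: number, sampled: number): boolean {
  switch (fn) {
    case "never": return false;
    case "less": return provided < sampled;
    case "equal": return provided === sampled;
    case "less-equal": return provided <= sampled;
    case "greater": return provided > sampled;
    case "not-equal": return provided !== sampled;
    case "greater-equal": return provided >= sampled;
    case "always": return true;
  }
}
```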
- -
- validating GPUSamplerDescriptor(device, descriptor) - **Arguments:** - - {{GPUDevice}} |device| - - {{GPUSamplerDescriptor}} |descriptor| - - **Returns:** {{boolean}} - - Return `true` if and only if all of the following conditions are satisfied: - - |device| is valid. - - |descriptor|.{{GPUSamplerDescriptor/lodMinClamp}} is greater than or equal to 0. - - |descriptor|.{{GPUSamplerDescriptor/lodMaxClamp}} is greater than or equal to - |descriptor|.{{GPUSamplerDescriptor/lodMinClamp}}. - - |descriptor|.{{GPUSamplerDescriptor/maxAnisotropy}} is greater than or equal to 1. - - When |descriptor|.{{GPUSamplerDescriptor/maxAnisotropy}} is greater than 1, - |descriptor|.{{GPUSamplerDescriptor/magFilter}}, |descriptor|.{{GPUSamplerDescriptor/minFilter}}, - and |descriptor|.{{GPUSamplerDescriptor/mipmapFilter}} must be equal to {{GPUFilterMode/"linear"}}. -
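The conditions above can be restated as a plain function. This is a sketch, not the normative algorithm: `ResolvedSamplerOptions` is a hypothetical type that assumes every member has already been given a value, and device validity is passed in as a boolean because modelling a device is out of scope.

```typescript
// Hypothetical, fully resolved view of the sampler options checked above.
interface ResolvedSamplerOptions {
  lodMinClamp: number;
  lodMaxClamp: number;
  maxAnisotropy: number;
  magFilter: "nearest" | "linear";
  minFilter: "nearest" | "linear";
  mipmapFilter: "nearest" | "linear";
}

// Returns true if and only if every listed condition holds.
function validateSamplerOptions(deviceIsValid: boolean, d: ResolvedSamplerOptions): boolean {
  if (!deviceIsValid) return false;
  if (d.lodMinClamp < 0) return false;
  if (d.lodMaxClamp < d.lodMinClamp) return false;
  if (d.maxAnisotropy < 1) return false;
  if (d.maxAnisotropy > 1) {
    // Anisotropic filtering requires all three filters to be "linear".
    const allLinear =
      d.magFilter === "linear" && d.minFilter === "linear" && d.mipmapFilter === "linear";
    if (!allLinear) return false;
  }
  return true;
}
```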
- -
- : createSampler(descriptor) - :: - Creates a {{GPUSampler}}. - -
- **Called on:** {{GPUDevice}} this. - - **Arguments:** -
-            |descriptor|: Description of the {{GPUSampler}} to create.
-        
- - **Returns:** {{GPUSampler}} - - 1. Let |s| be a new {{GPUSampler}} object. - 1. Set |s|.{{GPUSampler/[[descriptor]]}} to |descriptor|. - 1. Set |s|.{{GPUSampler/[[isComparison]]}} to `false` if the {{GPUSamplerDescriptor/compare}} attribute - of |s|.{{GPUSampler/[[descriptor]]}} is `null` or undefined. Otherwise, set it to `true`. - 1. Set |s|.{{GPUSampler/[[isFiltering]]}} to `false` if none of {{GPUSamplerDescriptor/minFilter}}, - {{GPUSamplerDescriptor/magFilter}}, or {{GPUSamplerDescriptor/mipmapFilter}} has the value of - {{GPUFilterMode/"linear"}}. Otherwise, set it to `true`. - 1. Return |s|. - -
- Valid Usage - - If |descriptor| is not `null` or undefined: - - If [$validating GPUSamplerDescriptor$](this, |descriptor|) returns `false`: - 1. Generate a {{GPUValidationError}} in the current scope with appropriate error message. - 1. Create a new [=invalid=] {{GPUSampler}} and return the result. -
-
-
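The way the two internal slots are derived from the descriptor can be sketched as follows. `SamplerOptionsSketch`, `SamplerSlots`, and `deriveSamplerSlots` are hypothetical names used only for this illustration.

```typescript
// Hypothetical subset of the sampler descriptor that the derivation reads.
interface SamplerOptionsSketch {
  minFilter?: "nearest" | "linear";
  magFilter?: "nearest" | "linear";
  mipmapFilter?: "nearest" | "linear";
  compare?: string | null;
}

interface SamplerSlots {
  isComparison: boolean;
  isFiltering: boolean;
}

function deriveSamplerSlots(d: SamplerOptionsSketch): SamplerSlots {
  // A sampler is a comparison sampler exactly when `compare` is provided.
  const isComparison = d.compare !== undefined && d.compare !== null;
  // A sampler is filtering when at least one filter is "linear".
  const isFiltering = [d.minFilter, d.magFilter, d.mipmapFilter].some((f) => f === "linear");
  return { isComparison, isFiltering };
}
```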
- -# Resource Binding # {#bindings} - -## GPUBindGroupLayout ## {#bind-group-layout} - -A {{GPUBindGroupLayout}} defines the interface between a set of resources bound in a {{GPUBindGroup}} and their accessibility in shader stages. - - - -### Creation ### {#bind-group-layout-creation} - -A {{GPUBindGroupLayout}} is created via {{GPUDevice/createBindGroupLayout()|GPUDevice.createBindGroupLayout()}}. - - - -A {{GPUBindGroupLayoutEntry}} describes a single shader resource binding to be included in a {{GPUBindGroupLayout}}. - - - -{{GPUBindGroupLayoutEntry}} dictionaries have the following members: - -
- : binding - :: - A unique identifier for a resource binding within a - {{GPUBindGroupLayoutEntry}}, a corresponding {{GPUBindGroupEntry}}, - and the {{GPUShaderModule}}s. - - : visibility - :: - A bitset of the members of {{GPUShaderStage}}. - Each set bit indicates that a {{GPUBindGroupLayoutEntry}}'s resource - will be accessible from the associated shader stage. - - : buffer - :: - When not `undefined` indicates the [=binding resource type=] for this {{GPUBindGroupLayoutEntry}} - is {{GPUBufferBinding}}. - - : sampler - :: - When not `undefined` indicates the [=binding resource type=] for this {{GPUBindGroupLayoutEntry}} - is {{GPUSampler}}. - - : texture - :: - When not `undefined` indicates the [=binding resource type=] for this {{GPUBindGroupLayoutEntry}} - is {{GPUTextureView}}. - - : storageTexture - :: - When not `undefined` indicates the [=binding resource type=] for this {{GPUBindGroupLayoutEntry}} - is {{GPUTextureView}}. -
- -The [=binding member=] of a {{GPUBindGroupLayoutEntry}} is determined by which member of the {{GPUBindGroupLayoutEntry}} -is defined: {{GPUBindGroupLayoutEntry/buffer}}, {{GPUBindGroupLayoutEntry/sampler}}, -{{GPUBindGroupLayoutEntry/texture}}, or {{GPUBindGroupLayoutEntry/storageTexture}}. Only one may be -defined for any given {{GPUBindGroupLayoutEntry}}. Each member has an associated {{GPUBindingResource}} -type and each [=binding type=] has an associated [=internal usage=], given by this table: - - - - - - - - - - - - - - - - - - -
-<table>
-    <thead>
-        <tr><th>Binding member</th><th>Resource type</th><th>Binding type</th><th>Binding usage</th></tr>
-    </thead>
-    <tbody>
-        <tr><td rowspan=3>{{GPUBindGroupLayoutEntry/buffer}}</td><td rowspan=3>{{GPUBufferBinding}}</td><td>{{GPUBufferBindingType/"uniform"}}</td><td>[=internal usage/constant=]</td></tr>
-        <tr><td>{{GPUBufferBindingType/"storage"}}</td><td>[=internal usage/storage=]</td></tr>
-        <tr><td>{{GPUBufferBindingType/"read-only-storage"}}</td><td>[=internal usage/storage-read=]</td></tr>
-        <tr><td rowspan=3>{{GPUBindGroupLayoutEntry/sampler}}</td><td rowspan=3>{{GPUSampler}}</td><td>{{GPUSamplerBindingType/"filtering"}}</td><td>[=internal usage/constant=]</td></tr>
-        <tr><td>{{GPUSamplerBindingType/"non-filtering"}}</td><td>[=internal usage/constant=]</td></tr>
-        <tr><td>{{GPUSamplerBindingType/"comparison"}}</td><td>[=internal usage/constant=]</td></tr>
-        <tr><td rowspan=5>{{GPUBindGroupLayoutEntry/texture}}</td><td rowspan=5>{{GPUTextureView}}</td><td>{{GPUTextureSampleType/"float"}}</td><td>[=internal usage/constant=]</td></tr>
-        <tr><td>{{GPUTextureSampleType/"unfilterable-float"}}</td><td>[=internal usage/constant=]</td></tr>
-        <tr><td>{{GPUTextureSampleType/"depth"}}</td><td>[=internal usage/constant=]</td></tr>
-        <tr><td>{{GPUTextureSampleType/"sint"}}</td><td>[=internal usage/constant=]</td></tr>
-        <tr><td>{{GPUTextureSampleType/"uint"}}</td><td>[=internal usage/constant=]</td></tr>
-        <tr><td rowspan=2>{{GPUBindGroupLayoutEntry/storageTexture}}</td><td rowspan=2>{{GPUTextureView}}</td><td>{{GPUStorageTextureAccess/"read-only"}}</td><td>[=internal usage/storage-read=]</td></tr>
-        <tr><td>{{GPUStorageTextureAccess/"write-only"}}</td><td>[=internal usage/storage-write=]</td></tr>
-    </tbody>
-</table>
- -
- To get the layout entry binding type in a given {{GPUBindGroupLayoutEntry}} |entry|: - - 1. If |entry|.{{GPUBindGroupLayoutEntry/buffer}} is not `undefined`: - 1. Return |entry|.{{GPUBindGroupLayoutEntry/buffer}}.{{GPUBufferBindingLayout/type}}. - 1. If |entry|.{{GPUBindGroupLayoutEntry/sampler}} is not `undefined`: - 1. Return |entry|.{{GPUBindGroupLayoutEntry/sampler}}.{{GPUSamplerBindingLayout/type}}. - 1. If |entry|.{{GPUBindGroupLayoutEntry/texture}} is not `undefined`: - 1. Return |entry|.{{GPUBindGroupLayoutEntry/texture}}.{{GPUTextureBindingLayout/sampleType}}. - 1. If |entry|.{{GPUBindGroupLayoutEntry/storageTexture}} is not `undefined`: - 1. Return |entry|.{{GPUBindGroupLayoutEntry/storageTexture}}.{{GPUStorageTextureBindingLayout/access}}. -
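The lookup above can be sketched as a chain of checks in the same order. `LayoutEntrySketch` is a hypothetical, simplified entry shape; returning `undefined` when no member is defined is an assumption, since the steps do not cover that case.

```typescript
// Hypothetical, simplified shape of a bind group layout entry.
interface LayoutEntrySketch {
  buffer?: { type: string };
  sampler?: { type: string };
  texture?: { sampleType: string };
  storageTexture?: { access: string };
}

// Returns the binding type of whichever member is defined, checked in order.
function layoutEntryBindingType(entry: LayoutEntrySketch): string | undefined {
  if (entry.buffer !== undefined) return entry.buffer.type;
  if (entry.sampler !== undefined) return entry.sampler.type;
  if (entry.texture !== undefined) return entry.texture.sampleType;
  if (entry.storageTexture !== undefined) return entry.storageTexture.access;
  // Assumption: no member defined; the steps above leave this case open.
  return undefined;
}
```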
- - - -{{GPUBufferBindingLayout}} dictionaries have the following members: - -
- : type - :: - Indicates the type required for buffers bound to this binding. - - : hasDynamicOffset - :: - Indicates whether this binding requires a dynamic offset. - - : minBindingSize - :: - May be used to indicate the minimum buffer binding size. -
- - - -{{GPUSamplerBindingLayout}} dictionaries have the following members: - -
- : type - :: - Indicates the required type of a sampler bound to this binding. -
- - - -Issue(https://github.com/gpuweb/gpuweb/issues/851): consider making {{GPUTextureBindingLayout/sampleType}} -truly optional. - -{{GPUTextureBindingLayout}} dictionaries have the following members: - -
- : sampleType - :: - Indicates the type required for texture views bound to this binding. - - : viewDimension - :: - Indicates the required {{GPUTextureViewDescriptor/dimension}} for texture views bound to - this binding. - - Note: - This enables Metal-based WebGPU implementations to back the respective bind groups with - `MTLArgumentBuffer` objects that are more efficient to bind at run-time. - - : multisampled - :: - Indicates whether or not texture views bound to this binding must be multisampled. -
- - - -Issue(https://github.com/gpuweb/gpuweb/issues/851): consider making {{GPUStorageTextureBindingLayout/format}} -truly optional. - -{{GPUStorageTextureBindingLayout}} dictionaries have the following members: - -
- : access - :: - Indicates whether texture views bound to this binding will be bound for read-only or - write-only access. - - : format - :: - The required {{GPUTextureViewDescriptor/format}} of texture views bound to this binding. - - : viewDimension - :: - Indicates the required {{GPUTextureViewDescriptor/dimension}} for texture views bound to - this binding. - - Note: - This enables Metal-based WebGPU implementations to back the respective bind groups with - `MTLArgumentBuffer` objects that are more efficient to bind at run-time. -
- -A {{GPUBindGroupLayout}} object has the following internal slots: - -
- : \[[entryMap]] of type [=ordered map=]<{{GPUSize32}}, {{GPUBindGroupLayoutEntry}}>. - :: - The map of binding indices pointing to the {{GPUBindGroupLayoutEntry}}s, - which this {{GPUBindGroupLayout}} describes. - - : \[[dynamicOffsetCount]] of type {{GPUSize32}}. - :: - The number of buffer bindings with dynamic offsets in this {{GPUBindGroupLayout}}. -
- -
- : createBindGroupLayout(descriptor) - :: - Creates a {{GPUBindGroupLayout}}. - -
- **Called on:** {{GPUDevice}} |this|. - - **Arguments:** -
-                |descriptor|: Description of the {{GPUBindGroupLayout}} to create.
-            
- - **Returns:** {{GPUBindGroupLayout}} - - 1. Let |layout| be a new valid {{GPUBindGroupLayout}} object. - 1. Let |limits| be |this|.{{GPUDevice/[[device]]}}.{{device/[[limits]]}}. - 1. Issue the following steps on the [=Device timeline=] of |this|: -
- 1. If any of the following conditions are unsatisfied: -
- - |this| is a [=valid=] {{GPUDevice}}. - - The {{GPUBindGroupLayoutEntry/binding}} of each entry in |descriptor| is unique. - - For each shader stage, the number of entries in |descriptor| with a [$layout entry binding type$] of - {{GPUBufferBindingType/"uniform"}} ≤ - |limits|.{{supported limits/maxUniformBuffersPerShaderStage}}. - - For each shader stage, the number of entries in |descriptor| with a [$layout entry binding type$] of - {{GPUBufferBindingType/"storage"}} ≤ - |limits|.{{supported limits/maxStorageBuffersPerShaderStage}}. - - For each shader stage, the number of entries in |descriptor| with a [=binding member=] of - {{GPUBindGroupLayoutEntry/texture}} ≤ - |limits|.{{supported limits/maxSampledTexturesPerShaderStage}}. - - For each shader stage, the number of entries in |descriptor| with a [=binding member=] of - {{GPUBindGroupLayoutEntry/storageTexture}} ≤ - |limits|.{{supported limits/maxStorageTexturesPerShaderStage}}. - - For each shader stage, the number of entries in |descriptor| with a [=binding member=] of - {{GPUBindGroupLayoutEntry/sampler}} ≤ - |limits|.{{supported limits/maxSamplersPerShaderStage}}. - - The number of entries in |descriptor| with a [$layout entry binding type$] of - {{GPUBufferBindingType/"uniform"}} and {{GPUBindGroupLayoutEntry/buffer}}.{{GPUBufferBindingLayout/hasDynamicOffset}} `true` ≤ - |limits|.{{supported limits/maxDynamicUniformBuffersPerPipelineLayout}}. - - The number of entries in |descriptor| with a [$layout entry binding type$] of - {{GPUBufferBindingType/"storage"}} and {{GPUBindGroupLayoutEntry/buffer}}.{{GPUBufferBindingLayout/hasDynamicOffset}} `true` ≤ - |limits|.{{supported limits/maxDynamicStorageBuffersPerPipelineLayout}}. 
- - - For each {{GPUBindGroupLayoutEntry}} |bindingDescriptor| in |descriptor|.{{GPUBindGroupLayoutDescriptor/entries}}: - - Let |bufferLayout| be |bindingDescriptor|.{{GPUBindGroupLayoutEntry/buffer}} - - Let |samplerLayout| be |bindingDescriptor|.{{GPUBindGroupLayoutEntry/sampler}} - - Let |textureLayout| be |bindingDescriptor|.{{GPUBindGroupLayoutEntry/texture}} - - Let |storageTextureLayout| be |bindingDescriptor|.{{GPUBindGroupLayoutEntry/storageTexture}} - - - Exactly one of |bufferLayout|, |samplerLayout|, |textureLayout|, - or |storageTextureLayout| is not `undefined`. - - - If |bindingDescriptor|.{{GPUBindGroupLayoutEntry/visibility}} includes - {{GPUShaderStage/VERTEX}}: - - The [$layout entry binding type$] of |bindingDescriptor| is not - {{GPUBufferBindingType/"storage"}} or {{GPUStorageTextureAccess/"write-only"}}. - - - If the |textureLayout| is not `undefined` and - |textureLayout|.{{GPUTextureBindingLayout/multisampled}} is `true`: - - |textureLayout|.{{GPUTextureBindingLayout/viewDimension}} is - {{GPUTextureViewDimension/"2d"}}. - - |textureLayout|.{{GPUTextureBindingLayout/sampleType}} is not - {{GPUTextureSampleType/"float"}}. - - - If |storageTextureLayout| is not `undefined`: - - |storageTextureLayout|.{{GPUStorageTextureBindingLayout/viewDimension}} is not - {{GPUTextureViewDimension/"cube"}} or {{GPUTextureViewDimension/"cube-array"}}. - - |storageTextureLayout|.{{GPUStorageTextureBindingLayout/format}} is a format - which can support storage usage. -
- - Then: - 1. Generate a {{GPUValidationError}} in the current scope with appropriate - error message. - 1. Make |layout| [=invalid=] and return |layout|. - - 1. Set |layout|.{{GPUBindGroupLayout/[[dynamicOffsetCount]]}} to the number of - entries in |descriptor| where {{GPUBindGroupLayoutEntry/buffer}} is not `undefined` and - {{GPUBindGroupLayoutEntry/buffer}}.{{GPUBufferBindingLayout/hasDynamicOffset}} is `true`. - 1. For each {{GPUBindGroupLayoutEntry}} |bindingDescriptor| in - |descriptor|.{{GPUBindGroupLayoutDescriptor/entries}}: - 1. Insert |bindingDescriptor| into |layout|.{{GPUBindGroupLayout/[[entryMap]]}} - with the key of |bindingDescriptor|.{{GPUBindGroupLayoutEntry/binding}}. -
- 1. Return |layout|. - -
-
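A few of the checks and the final bookkeeping of the algorithm above can be sketched as follows. Only the binding uniqueness check, the per-stage uniform buffer limit, and the two internal slots are modelled; the remaining conditions are omitted. `BglEntrySketch` and `buildLayoutSlots` are hypothetical names, and a plain object keyed by binding number stands in for the ordered map.

```typescript
// Shader stage bits: VERTEX = 0x1, FRAGMENT = 0x2, COMPUTE = 0x4.
const STAGE_BITS: number[] = [0x1, 0x2, 0x4];

// Hypothetical, simplified layout entry.
interface BglEntrySketch {
  binding: number;
  visibility: number;
  buffer?: { type: string; hasDynamicOffset?: boolean };
}

interface LayoutSlotsSketch {
  entryMap: Record<number, BglEntrySketch>;
  dynamicOffsetCount: number;
}

// Returns the internal slots, or null where the algorithm would make the layout invalid.
function buildLayoutSlots(
  entries: BglEntrySketch[],
  maxUniformBuffersPerShaderStage: number
): LayoutSlotsSketch | null {
  const entryMap: Record<number, BglEntrySketch> = {};
  for (const e of entries) {
    // Each binding number must be unique within the descriptor.
    if (Object.prototype.hasOwnProperty.call(entryMap, e.binding)) return null;
    entryMap[e.binding] = e;
  }
  for (const stage of STAGE_BITS) {
    // Count uniform buffers visible to this stage.
    const count = entries.filter(
      (e) => (e.visibility & stage) !== 0 && e.buffer !== undefined && e.buffer.type === "uniform"
    ).length;
    if (count > maxUniformBuffersPerShaderStage) return null;
  }
  const dynamicOffsetCount = entries.filter(
    (e) => e.buffer !== undefined && e.buffer.hasDynamicOffset === true
  ).length;
  return { entryMap, dynamicOffsetCount };
}
```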
- -### Compatibility ### {#bind-group-compatibility} - -
-Two {{GPUBindGroupLayout}} objects |a| and |b| are considered group-equivalent -if and only if, for any binding number |binding|, one of the following conditions is satisfied: - - it's missing from both |a|.{{GPUBindGroupLayout/[[entryMap]]}} and |b|.{{GPUBindGroupLayout/[[entryMap]]}}. - - |a|.{{GPUBindGroupLayout/[[entryMap]]}}[|binding|] == |b|.{{GPUBindGroupLayout/[[entryMap]]}}[|binding|] -
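The definition above can be sketched as a comparison of two entry maps. The text writes `==` between entries without defining it; the sketch assumes member-wise structural equality. `EntryMapSketch`, `structurallyEqual`, and `groupEquivalent` are hypothetical names, and a plain object keyed by binding number stands in for the ordered map.

```typescript
type EntryMapSketch = Record<number, unknown>;

// Assumption: entries are "equal" when all of their members are recursively equal.
function structurallyEqual(a: unknown, b: unknown): boolean {
  if (a === b) return true;
  if (typeof a !== "object" || typeof b !== "object" || a === null || b === null) return false;
  const ra = a as Record<string, unknown>;
  const rb = b as Record<string, unknown>;
  const ka = Object.keys(ra).sort();
  const kb = Object.keys(rb).sort();
  if (ka.length !== kb.length) return false;
  for (let i = 0; i < ka.length; i++) {
    if (ka[i] !== kb[i]) return false;
    if (!structurallyEqual(ra[ka[i]], rb[kb[i]])) return false;
  }
  return true;
}

function groupEquivalent(a: EntryMapSketch, b: EntryMapSketch): boolean {
  // Only bindings present in at least one map need checking;
  // bindings missing from both satisfy the first condition trivially.
  const keys = Object.keys(a).concat(Object.keys(b));
  for (const key of keys) {
    const inA = Object.prototype.hasOwnProperty.call(a, key);
    const inB = Object.prototype.hasOwnProperty.call(b, key);
    if (!inA || !inB) return false;
    if (!structurallyEqual(a[Number(key)], b[Number(key)])) return false;
  }
  return true;
}
```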
- -If bind group layouts are [=group-equivalent=], they can be used interchangeably in all contexts. - -## GPUBindGroup ## {#gpu-bind-group} - -A {{GPUBindGroup}} defines a set of resources to be bound together in a group - and how the resources are used in shader stages. - - - -### Bind Group Creation ### {#bind-group-creation} - -A {{GPUBindGroup}} is created via {{GPUDevice/createBindGroup()|GPUDevice.createBindGroup()}}. - - - -A {{GPUBindGroupEntry}} describes a single resource to be bound in a {{GPUBindGroup}}. - - - - - - * {{GPUBufferBinding/size}}: If undefined, specifies the range starting at - {{GPUBufferBinding/offset}} and ending at the end of the buffer. - -A {{GPUBindGroup}} object has the following internal slots: - -
- : \[[layout]] of type {{GPUBindGroupLayout}}. - :: - The {{GPUBindGroupLayout}} associated with this {{GPUBindGroup}}. - - : \[[entries]] of type sequence<{{GPUBindGroupEntry}}>. - :: - The set of {{GPUBindGroupEntry}}s this {{GPUBindGroup}} describes. - - : \[[usedResources]] of type [=ordered map=]<[=subresource=], [=list=]<[=internal usage=]>>. - :: - The set of buffer and texture [=subresource=]s used by this bind group, - associated with lists of the [=internal usage=] flags. -
- -
- : createBindGroup(descriptor) - :: - Creates a {{GPUBindGroup}}. - -
- **Called on:** {{GPUDevice}} |this|. - - **Arguments:** -
-                |descriptor|: Description of the {{GPUBindGroup}} to create.
-            
- - **Returns:** {{GPUBindGroup}} - - 1. Let |bindGroup| be a new valid {{GPUBindGroup}} object. - 1. Let |limits| be |this|.{{GPUDevice/[[device]]}}.{{device/[[limits]]}}. - 1. Issue the following steps on the [=Device timeline=] of |this|: -
- 1. If any of the following conditions are unsatisfied: -
- - |this| is a [=valid=] {{GPUDevice}}. - - |descriptor|.{{GPUBindGroupDescriptor/layout}} is [$valid to use with$] |this|. - - The number of {{GPUBindGroupLayoutDescriptor/entries}} of - |descriptor|.{{GPUBindGroupDescriptor/layout}} is exactly equal to - the number of |descriptor|.{{GPUBindGroupDescriptor/entries}}. - - For each {{GPUBindGroupEntry}} |bindingDescriptor| in - |descriptor|.{{GPUBindGroupDescriptor/entries}}: - - Let |resource| be |bindingDescriptor|.{{GPUBindGroupEntry/resource}}. - - There is exactly one {{GPUBindGroupLayoutEntry}} |layoutBinding| - in |descriptor|.{{GPUBindGroupDescriptor/layout}}.{{GPUBindGroupLayoutDescriptor/entries}} - such that |layoutBinding|.{{GPUBindGroupLayoutEntry/binding}} is equal to - |bindingDescriptor|.{{GPUBindGroupEntry/binding}}. - - - If the defined [=binding member=] for |layoutBinding| is -
- : {{GPUBindGroupLayoutEntry/sampler}} - :: - - |resource| is a {{GPUSampler}}. - - |resource| is [$valid to use with$] |this|. - - If the [$layout entry binding type$] of |layoutBinding| is -
- : {{GPUSamplerBindingType/"filtering"}} - :: |resource|.{{GPUSampler/[[isComparison]]}} is `false`. - - : {{GPUSamplerBindingType/"non-filtering"}} - :: - |resource|.{{GPUSampler/[[isFiltering]]}} is `false`. - |resource|.{{GPUSampler/[[isComparison]]}} is `false`. - - : {{GPUSamplerBindingType/"comparison"}} - :: |resource|.{{GPUSampler/[[isComparison]]}} is `true`. -
- - : {{GPUBindGroupLayoutEntry/texture}} - :: - - |resource| is a {{GPUTextureView}}. - - |resource| is [$valid to use with$] |this|. - - Let |texture| be |resource|.{{GPUTextureView/[[texture]]}}. - - |layoutBinding|.{{GPUBindGroupLayoutEntry/texture}}.{{GPUTextureBindingLayout/viewDimension}} - is equal to |resource|'s {{GPUTextureViewDescriptor/dimension}}. - - |layoutBinding|.{{GPUBindGroupLayoutEntry/texture}}.{{GPUTextureBindingLayout/sampleType}} - is compatible with |resource|'s {{GPUTextureViewDescriptor/format}}. - - |texture|'s {{GPUTextureDescriptor/usage}} includes {{GPUTextureUsage/SAMPLED}}. - - If |layoutBinding|.{{GPUBindGroupLayoutEntry/texture}}.{{GPUTextureBindingLayout/multisampled}} - is `true`, |texture|'s {{GPUTextureDescriptor/sampleCount}} - > `1`. Otherwise, |texture|'s {{GPUTextureDescriptor/sampleCount}} is `1`. - - : {{GPUBindGroupLayoutEntry/storageTexture}} - :: - - |resource| is a {{GPUTextureView}}. - - |resource| is [$valid to use with$] |this|. - - Let |texture| be |resource|.{{GPUTextureView/[[texture]]}}. - - |layoutBinding|.{{GPUBindGroupLayoutEntry/storageTexture}}.{{GPUStorageTextureBindingLayout/viewDimension}} - is equal to |resource|'s {{GPUTextureViewDescriptor/dimension}}. - - |layoutBinding|.{{GPUBindGroupLayoutEntry/storageTexture}}.{{GPUStorageTextureBindingLayout/format}} - is equal to |resource|.{{GPUTextureView/[[descriptor]]}}.{{GPUTextureViewDescriptor/format}}. - - |texture|'s {{GPUTextureDescriptor/usage}} includes {{GPUTextureUsage/STORAGE}}. - - : {{GPUBindGroupLayoutEntry/buffer}} - :: - - |resource| is a {{GPUBufferBinding}}. - - |resource|.{{GPUBufferBinding/buffer}} is [$valid to use with$] |this|. - - The bound part designated by |resource|.{{GPUBufferBinding/offset}} and - |resource|.{{GPUBufferBinding/size}} resides inside the buffer. 
- If |layoutBinding|.{{GPUBindGroupLayoutEntry/buffer}}.{{GPUBufferBindingLayout/minBindingSize}} - is not `undefined`: - - The effective binding size, that is either explicit in - |resource|.{{GPUBufferBinding/size}} or derived from - |resource|.{{GPUBufferBinding/offset}} and the full - size of the buffer, is greater than or equal to - |layoutBinding|.{{GPUBindGroupLayoutEntry/buffer}}.{{GPUBufferBindingLayout/minBindingSize}}. - - - If |layoutBinding|.{{GPUBindGroupLayoutEntry/buffer}}.{{GPUBufferBindingLayout/type}} is -
- : {{GPUBufferBindingType/"uniform"}} - :: |resource|.{{GPUBufferBinding/buffer}}.{{GPUBufferDescriptor/usage}} - includes {{GPUBufferUsage/UNIFORM}}. - :: |resource|.{{GPUBufferBinding/size}} ≤ - |limits|.{{supported limits/maxUniformBufferBindingSize}}. - :: Issue: This validation should take into account the default when {{GPUBufferBinding/size}} is not set. - Also should {{GPUBufferBinding/size}} default to the `buffer.byteLength - offset` or - `min(buffer.byteLength - offset, limits.maxUniformBufferBindingSize)`? - - : {{GPUBufferBindingType/"storage"}} or - {{GPUBufferBindingType/"read-only-storage"}} - :: |resource|.{{GPUBufferBinding/buffer}}.{{GPUBufferDescriptor/usage}} - includes {{GPUBufferUsage/STORAGE}}. - :: |resource|.{{GPUBufferBinding/size}} ≤ - |limits|.{{supported limits/maxStorageBufferBindingSize}}. -
- -
- -
- - Issue: define the association between texture formats and component types - - Then: - 1. Generate a {{GPUValidationError}} in the current scope with appropriate - error message. - 1. Make |bindGroup| [=invalid=] and return |bindGroup|. - - 1. Let |bindGroup|.{{GPUBindGroup/[[layout]]}} = - |descriptor|.{{GPUBindGroupDescriptor/layout}}. - 1. Let |bindGroup|.{{GPUBindGroup/[[entries]]}} = - |descriptor|.{{GPUBindGroupDescriptor/entries}}. - 1. Let |bindGroup|.{{GPUBindGroup/[[usedResources]]}} = {}. - - 1. For each {{GPUBindGroupEntry}} |bindingDescriptor| in - |descriptor|.{{GPUBindGroupDescriptor/entries}}: - 1. Let |internalUsage| be the [=binding usage=] for |layoutBinding|. - 1. Each [=subresource=] seen by |resource| is added to {{GPUBindGroup/[[usedResources]]}} as |internalUsage|. -
- 1. Return |bindGroup|. - - Issue: define the "effective buffer binding size" separately. -
-
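The buffer checks above refer to an effective binding size that is either explicit or derived from the offset and the buffer size. A minimal sketch of that derivation and the `minBindingSize` check, using hypothetical function names and plain numbers in place of real buffer objects:

```typescript
// Returns the effective binding size, or null if the bound range
// does not reside inside the buffer.
function effectiveBindingSize(bufferSize: number, offset: number, size?: number): number | null {
  if (offset < 0 || offset > bufferSize) return null;
  // When size is undefined, the range runs from offset to the end of the buffer.
  const effective = size === undefined ? bufferSize - offset : size;
  if (effective < 0 || offset + effective > bufferSize) return null;
  return effective;
}

// The minBindingSize condition only applies when minBindingSize is defined.
function satisfiesMinBindingSize(effective: number, minBindingSize?: number): boolean {
  return minBindingSize === undefined || effective >= minBindingSize;
}
```

For a 256-byte buffer bound at offset 64 with no explicit size, the effective binding size is 192 bytes.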
- -## GPUPipelineLayout ## {#pipeline-layout} - -A {{GPUPipelineLayout}} defines the mapping between resources of all {{GPUBindGroup}} objects set up during command encoding in {{GPUProgrammablePassEncoder/setBindGroup(index, bindGroup, dynamicOffsets)|setBindGroup}}, and the shaders of the pipeline set by {{GPURenderEncoderBase/setPipeline(pipeline)|GPURenderEncoderBase.setPipeline}} or {{GPUComputePassEncoder/setPipeline(pipeline)|GPUComputePassEncoder.setPipeline}}. - -The full binding address of a resource can be defined as a trio of: - 1. shader stage mask, to which the resource is visible - 2. bind group index - 3. binding number - -The components of this address can also be seen as the binding space of a pipeline. A {{GPUBindGroup}} (with the corresponding {{GPUBindGroupLayout}}) covers that space for a fixed bind group index. The contained bindings need to be a superset of the resources used by the shader at this bind group index. - - - -{{GPUPipelineLayout}} has the following internal slots: - -
- : \[[bindGroupLayouts]] of type [=list=]<{{GPUBindGroupLayout}}>. - :: - The {{GPUBindGroupLayout}} objects provided at creation in {{GPUPipelineLayoutDescriptor/bindGroupLayouts|GPUPipelineLayoutDescriptor.bindGroupLayouts}}. -
- -Note: using the same {{GPUPipelineLayout}} for many {{GPURenderPipeline}} or {{GPUComputePipeline}} pipelines guarantees that the user agent doesn't need to rebind any resources internally when there is a switch between these pipelines. - -
-{{GPUComputePipeline}} object X was created with {{GPUPipelineLayout/[[bindGroupLayouts]]|GPUPipelineLayout.bindGroupLayouts}} A, B, C. {{GPUComputePipeline}} object Y was created with {{GPUPipelineLayout/[[bindGroupLayouts]]|GPUPipelineLayout.bindGroupLayouts}} A, D, C. Suppose the command encoding sequence has two dispatches: - - 1. {{GPUProgrammablePassEncoder/setBindGroup(index, bindGroup, dynamicOffsets)|setBindGroup(0, ...)}} - 1. {{GPUProgrammablePassEncoder/setBindGroup(index, bindGroup, dynamicOffsets)|setBindGroup(1, ...)}} - 1. {{GPUProgrammablePassEncoder/setBindGroup(index, bindGroup, dynamicOffsets)|setBindGroup(2, ...)}} - 1. {{GPUComputePassEncoder/setPipeline(pipeline)|setPipeline(X)}} - 1. {{GPUComputePassEncoder/dispatch(x, y, z)|dispatch()}} - 1. {{GPUProgrammablePassEncoder/setBindGroup(index, bindGroup, dynamicOffsets)|setBindGroup(1, ...)}} - 1. {{GPUComputePassEncoder/setPipeline(pipeline)|setPipeline(Y)}} - 1. {{GPUComputePassEncoder/dispatch(x, y, z)|dispatch()}} - -In this scenario, the user agent would have to re-bind the group slot 2 for the second dispatch, even though neither the {{GPUBindGroupLayout}} at index 2 of {{GPUPipelineLayout/[[bindGroupLayouts]]|GPUPipelineLayout.bindGroupLayouts}} nor the {{GPUBindGroup}} at slot 2 changes. -
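The scenario above is consistent with a rule where a bound group stays usable only while every layout at a lower or equal index is unchanged. That rule is an assumption modelled on how some native APIs define pipeline layout compatibility; it is not stated in this text. In the sketch, `slotsToRebind` is a hypothetical helper and layouts are represented by labels, with label equality standing in for [=group-equivalent=].

```typescript
// Returns the bind group slots that must be set again when switching
// from a pipeline with layouts `previous` to one with layouts `next`.
function slotsToRebind(previous: string[], next: string[]): number[] {
  // Find the first index at which the two layout sequences differ.
  let firstDifference = 0;
  while (
    firstDifference < previous.length &&
    firstDifference < next.length &&
    previous[firstDifference] === next[firstDifference]
  ) {
    firstDifference++;
  }
  // Assumption: every slot from the first difference upward is disturbed.
  const slots: number[] = [];
  for (let i = firstDifference; i < next.length; i++) {
    slots.push(i);
  }
  return slots;
}
```

With layouts A, B, C for X and A, D, C for Y, the sketch reports slots 1 and 2: slot 1 is set explicitly by the application, and slot 2 is the one the user agent has to re-bind.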
- -Issue: should this example and the note be moved to some "best practices" document? - -Note: the expected usage of the {{GPUPipelineLayout}} is placing the most common and the least frequently changing bind groups at the "bottom" of the layout, meaning lower bind group slot numbers, like 0 or 1. The more frequently a bind group needs to change between draw calls, the higher its index should be. This general guideline allows the user agent to minimize state changes between draw calls, and consequently lower the CPU overhead. - -### Creation ### {#pipeline-layout-creation} - -A {{GPUPipelineLayout}} is created via {{GPUDevice/createPipelineLayout()|GPUDevice.createPipelineLayout()}}. - - - -
- : createPipelineLayout(descriptor) - :: - Creates a {{GPUPipelineLayout}}. - -
- **Called on:** {{GPUDevice}} |this|. - - **Arguments:** -
-                |descriptor|: Description of the {{GPUPipelineLayout}} to create.
-            
- - **Returns:** {{GPUPipelineLayout}} - - 1. If any of the following conditions are unsatisfied: -
- - |this| is a [=valid=] {{GPUDevice}}. - - There are - |this|.{{GPUDevice/[[device]]}}.{{device/[[limits]]}}.{{supported limits/maxBindGroups}} - or fewer elements in - |descriptor|.{{GPUPipelineLayoutDescriptor/bindGroupLayouts}}. - - Every {{GPUBindGroupLayout}} in |descriptor|.{{GPUPipelineLayoutDescriptor/bindGroupLayouts}} - is [$valid to use with$] |this|. -
- - Then: - 1. Generate a {{GPUValidationError}} in the current scope with appropriate error message. - 1. Create a new [=invalid=] {{GPUPipelineLayout}} and return the result. - - 1. Let |pl| be a new {{GPUPipelineLayout}} object. - 1. Set the |pl|.{{GPUPipelineLayout/[[bindGroupLayouts]]}} to - |descriptor|.{{GPUPipelineLayoutDescriptor/bindGroupLayouts}}. - 1. Return |pl|. - - Issue: there will be more limits applicable to the whole pipeline layout. -
-
- -Note: two {{GPUPipelineLayout}} objects are considered equivalent for any usage -if their internal {{GPUPipelineLayout/[[bindGroupLayouts]]}} sequences contain -{{GPUBindGroupLayout}} objects that are [=group-equivalent=]. - -# Shader Modules # {#shader-modules} - -## GPUShaderModule ## {#shader-module} - - - -{{GPUShaderModule}} is {{Serializable}}. It is a reference to an internal -shader module object, and {{Serializable}} means that the reference can be -*copied* between realms (threads/workers), allowing multiple realms to access -it concurrently. Since {{GPUShaderModule}} is immutable, there are no race -conditions. - -### Shader Module Creation ### {#shader-module-creation} - - - -{{GPUShaderModuleDescriptor/sourceMap}}, if defined, MAY be interpreted as a -source-map-v3 format. (https://sourcemaps.info/spec.html) -Source maps are optional, but serve as a standardized way to support dev-tool -integration such as source-language debugging. - -
- : createShaderModule(descriptor) - :: - Creates a {{GPUShaderModule}}. - -
- **Called on:** {{GPUDevice}} this. - - **Arguments:** -
-                descriptor: Description of the {{GPUShaderModule}} to create.
-            
- - **Returns:** {{GPUShaderModule}} - - Issue: Describe {{GPUDevice/createShaderModule()}} algorithm steps. -
-
- -### Shader Module Compilation Information ### {#shader-module-compilation-information} - -
- : compilationInfo() - :: - Returns any messages generated during the {{GPUShaderModule}}'s compilation. - -
- **Called on:** {{GPUShaderModule}} this. - - **Returns:** {{Promise}}<{{GPUCompilationInfo}}> - - Issue: Describe {{GPUShaderModule/compilationInfo()}} algorithm steps. -
-
- -# Pipelines # {#pipelines} - -A pipeline, be it {{GPUComputePipeline}} or {{GPURenderPipeline}}, -represents the complete function done by a combination of the GPU hardware, the driver, -and the user agent, that processes the input data in the shape of bindings and vertex buffers, -and produces some output, like the colors in the output render targets. - -Structurally, the [=pipeline=] consists of a sequence of programmable stages (shaders) -and fixed-function states, such as the blending modes. - -Note: Internally, depending on the target platform, -the driver may convert some of the fixed-function states into shader code, -and link it together with the shaders provided by the user. -This linking is one of the reasons the object is created as a whole. - -This combination state is created as a single object -(by {{GPUDevice/createComputePipeline(descriptor)|GPUDevice.createComputePipeline()}} or {{GPUDevice/createRenderPipeline(descriptor)|GPUDevice.createRenderPipeline()}}), -and switched as one -(by {{GPUComputePassEncoder/setPipeline(pipeline)|GPUComputePassEncoder.setPipeline}} or {{GPURenderEncoderBase/setPipeline(pipeline)|GPURenderEncoderBase.setPipeline}}, respectively). - -## Base pipelines ## {#pipeline-base} - - - -{{GPUPipelineBase}} has the following internal slots: - -
- : \[[layout]] of type `GPUPipelineLayout`. - :: - The definition of the layout of resources which can be used with `this`. -
- -{{GPUPipelineBase}} has the following methods: - -
- : getBindGroupLayout(index) - :: - Gets a {{GPUBindGroupLayout}} that is compatible with the {{GPUPipelineBase}}'s - {{GPUBindGroupLayout}} at `index`. - -
- **Called on:** {{GPUPipelineBase}} |this|. - - **Arguments:** -
-                |index|: Index into the pipeline layout's {{GPUPipelineLayout/[[bindGroupLayouts]]}}
-                    sequence.
-            
- - **Returns:** {{GPUBindGroupLayout}} - - 1. If |index| ≥ - |this|.{{GPUObjectBase/[[device]]}}.{{device/[[limits]]}}.{{supported limits/maxBindGroups}}: - 1. Throw a {{RangeError}}. - - 1. If |this| is not [=valid=]: - 1. Return a new error {{GPUBindGroupLayout}}. - - 1. Return a new {{GPUBindGroupLayout}} object that references the same internal object as - |this|.{{GPUPipelineBase/[[layout]]}}.{{GPUPipelineLayout/[[bindGroupLayouts]]}}[|index|]. - - Issue: Specify this more properly once we have internal objects for {{GPUBindGroupLayout}}. - Alternatively, only spec it as a new internal object that's [=group-equivalent=]. - - Note: Only returning new {{GPUBindGroupLayout}} objects ensures no synchronization is necessary - between the [=Content timeline=] and the [=Device timeline=]. -
-
- -### Default pipeline layout ### {#default-pipeline-layout} - -A {{GPUPipelineBase}} object that was created without a {{GPUPipelineDescriptorBase/layout}} -has a default layout created and used instead. - -
- - 1. Let |groupDescs| be a sequence of |device|.{{device/[[limits]]}}.{{supported limits/maxBindGroups}} - new {{GPUBindGroupLayoutDescriptor}} objects. - 1. For each |groupDesc| in |groupDescs|: - - 1. Set |groupDesc|.{{GPUBindGroupLayoutDescriptor/entries}} to an empty sequence. - - 1. For each {{GPUProgrammableStage}} |stageDesc| in the descriptor used to create the pipeline: - - 1. Let |stageInfo| be the "reflection information" for |stageDesc|. - - Issue: Define the reflection information concept so that this spec can interface with the WGSL - spec and get information what the interface is for a {{GPUShaderModule}} for a specific - entrypoint. - - 1. Let |shaderStage| be the {{GPUShaderStageFlags}} for |stageDesc|.{{GPUProgrammableStage/entryPoint}} - in |stageDesc|.{{GPUProgrammableStage/module}}. - 1. For each resource |resource| in |stageInfo|'s resource interface: - - 1. Let |group| be |resource|'s "group" decoration. - 1. Let |binding| be |resource|'s "binding" decoration. - 1. Let |entry| be a new {{GPUBindGroupLayoutEntry}}. - 1. Set |entry|.{{GPUBindGroupLayoutEntry/binding}} to |binding|. - 1. Set |entry|.{{GPUBindGroupLayoutEntry/visibility}} to |shaderStage|. - 1. If |resource| is for a sampler binding: - - 1. Let |samplerLayout| be a new {{GPUSamplerBindingLayout}}. - 1. Set |entry|.{{GPUBindGroupLayoutEntry/sampler}} to |samplerLayout|. - - 1. If |resource| is for a comparison sampler binding: - - 1. Let |samplerLayout| be a new {{GPUSamplerBindingLayout}}. - 1. Set |samplerLayout|.{{GPUSamplerBindingLayout/type}} to {{GPUSamplerBindingType/"comparison"}}. - 1. Set |entry|.{{GPUBindGroupLayoutEntry/sampler}} to |samplerLayout|. - - 1. If |resource| is for a buffer binding: - - 1. Let |bufferLayout| be a new {{GPUBufferBindingLayout}}. - - 1. Set |bufferLayout|.{{GPUBufferBindingLayout/minBindingSize}} to |resource|'s minimum buffer binding size. - - Issue: link to a definition for "minimum buffer binding size" in the "reflection information". 
- - 1. If |resource| is for a read-only storage buffer: - - 1. Set |bufferLayout|.{{GPUBufferBindingLayout/type}} to {{GPUBufferBindingType/"read-only-storage"}}. - - 1. If |resource| is for a storage buffer: - - 1. Set |bufferLayout|.{{GPUBufferBindingLayout/type}} to {{GPUBufferBindingType/"storage"}}. - - 1. Set |entry|.{{GPUBindGroupLayoutEntry/buffer}} to |bufferLayout|. - - 1. If |resource| is for a sampled texture binding: - - 1. Let |textureLayout| be a new {{GPUTextureBindingLayout}}. - - 1. Set |textureLayout|.{{GPUTextureBindingLayout/sampleType}} to |resource|'s component type. - 1. Set |textureLayout|.{{GPUTextureBindingLayout/viewDimension}} to |resource|'s dimension. - 1. If |resource| is for a multisampled texture: - - 1. Set |textureLayout|.{{GPUTextureBindingLayout/multisampled}} to `true`. - - 1. Set |entry|.{{GPUBindGroupLayoutEntry/texture}} to |textureLayout|. - - 1. If |resource| is for a storage texture binding: - - 1. Let |storageTextureLayout| be a new {{GPUStorageTextureBindingLayout}}. - 1. Set |storageTextureLayout|.{{GPUStorageTextureBindingLayout/format}} to |resource|'s format. - 1. Set |storageTextureLayout|.{{GPUStorageTextureBindingLayout/viewDimension}} to |resource|'s dimension. - - 1. If |resource| is for a read-only storage texture: - - 1. Set |storageTextureLayout|.{{GPUStorageTextureBindingLayout/access}} to {{GPUStorageTextureAccess/"read-only"}}. - - 1. If |resource| is for a write-only storage texture: - - 1. Set |storageTextureLayout|.{{GPUStorageTextureBindingLayout/access}} to {{GPUStorageTextureAccess/"write-only"}}. - - 1. Set |entry|.{{GPUBindGroupLayoutEntry/storageTexture}} to |storageTextureLayout|. - - 1. If |groupDescs|[|group|] has an entry |previousEntry| with {{GPUBindGroupLayoutEntry/binding}} equal to |binding|: - - 1. If |entry| has different {{GPUBindGroupLayoutEntry/visibility}} than |previousEntry|: - - 1. 
Add the bits set in |entry|.{{GPUBindGroupLayoutEntry/visibility}} into |previousEntry|.{{GPUBindGroupLayoutEntry/visibility}} - - 1. If |resource| is for a buffer binding and |entry| has greater - {{GPUBindGroupLayoutEntry/buffer}}.{{GPUBufferBindingLayout/minBindingSize}} - than |previousEntry|: - - 1. Set |previousEntry|.{{GPUBindGroupLayoutEntry/buffer}}.{{GPUBufferBindingLayout/minBindingSize}} - to |entry|.{{GPUBindGroupLayoutEntry/buffer}}.{{GPUBufferBindingLayout/minBindingSize}}. - - 1. If any other property is unequal between |entry| and |previousEntry|: - - 1. Return `null` (which will cause the creation of the pipeline to fail). - - 1. Else - - 1. Append |entry| to |groupDescs|[|group|]. - - 1. Let |groupLayouts| be a new sequence. - 1. For each |groupDesc| in |groupDescs|: - - 1. Append |device|.{{GPUDevice/createBindGroupLayout()}}(|groupDesc|) to |groupLayouts|. - - 1. Let |desc| be a new {{GPUPipelineLayoutDescriptor}}. - 1. Set |desc|.{{GPUPipelineLayoutDescriptor/bindGroupLayouts}} to |groupLayouts|. - 1. Return |device|.{{GPUDevice/createPipelineLayout()}}(|desc|). - - Issue: This fills the pipeline layout with empty bindgroups. Revisit once the behavior of empty bindgroups is specified. - -
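The merge behaviour of the steps above can be sketched in TypeScript. This is a minimal illustration, not the normative algorithm: `ReflectedResource`, `StageReflection`, `LayoutEntry`, and `buildDefaultGroups` are hypothetical names, all binding-specific layout members are collapsed into a single `kind` field, and failing on an out-of-range group index is an assumption that the text does not state.

```typescript
// Hypothetical, simplified stand-ins for the reflection data and layout entries.
type BindingKind = "uniform" | "storage" | "read-only-storage" | "sampler" | "texture";

interface ReflectedResource {
  group: number;          // the resource's "group" decoration
  binding: number;        // the resource's "binding" decoration
  kind: BindingKind;
  minBindingSize: number; // only meaningful for buffer kinds; 0 otherwise
}

interface StageReflection {
  visibility: number;     // shader stage bit for this entry point
  resources: ReflectedResource[];
}

interface LayoutEntry {
  binding: number;
  visibility: number;     // bitmask of every stage that uses the binding
  kind: BindingKind;
  minBindingSize: number;
}

// Returns one entry list per bind group, or null when two stages disagree.
function buildDefaultGroups(maxBindGroups: number, stages: StageReflection[]): LayoutEntry[][] | null {
  const groups: LayoutEntry[][] = [];
  for (let i = 0; i < maxBindGroups; i++) groups.push([]);

  for (const stage of stages) {
    for (const res of stage.resources) {
      const group = groups[res.group];
      // Assumption: a group index outside the list fails pipeline creation.
      if (group === undefined) return null;

      let previous: LayoutEntry | undefined;
      for (const e of group) if (e.binding === res.binding) previous = e;

      if (previous === undefined) {
        group.push({
          binding: res.binding,
          visibility: stage.visibility,
          kind: res.kind,
          minBindingSize: res.minBindingSize,
        });
        continue;
      }
      // Any mismatch other than visibility or minBindingSize is an error.
      if (previous.kind !== res.kind) return null;
      previous.visibility |= stage.visibility;
      previous.minBindingSize = Math.max(previous.minBindingSize, res.minBindingSize);
    }
  }
  return groups;
}
```

Under these assumptions, a uniform buffer used by both the vertex and fragment stages yields a single entry whose visibility has both stage bits set and whose `minBindingSize` is the larger of the two reflected sizes.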
- -### GPUProgrammableStage ### {#GPUProgrammableStage} - - - -A {{GPUProgrammableStage}} describes the entry point in the user-provided -{{GPUShaderModule}} that controls one of the programmable stages of a [=pipeline=]. - -
- validating GPUProgrammableStage(stage, descriptor, layout) - **Arguments:** - - {{GPUShaderStage}} |stage| - - {{GPUProgrammableStage}} |descriptor| - - {{GPUPipelineLayout}} |layout| - - Return `true` if all of the following conditions are satisfied: - - - The |descriptor|.{{GPUProgrammableStage/module}} is a [=valid=] {{GPUShaderModule}}. - - The |descriptor|.{{GPUProgrammableStage/module}} contains - an entry point at |stage| named |descriptor|.{{GPUProgrammableStage/entryPoint}}. - - For each |binding| that is [=statically used=] by the shader entry point, - the [$validating shader binding$](|binding|, |layout|) returns `true`. - - For each texture sampling shader call that is [=statically used=] by the entry point: - 1. Let |texture| be the {{GPUBindGroupLayoutEntry}} corresponding to the sampled texture in the call. - 1. Let |sampler| be the {{GPUBindGroupLayoutEntry}} corresponding to the used sampler in the call. - 1. One of the following conditions is `false`: - - |texture|.{{GPUTextureBindingLayout/sampleType}} is {{GPUTextureSampleType/"unfilterable-float"}}. - - |sampler|.{{GPUSamplerBindingLayout/type}} is {{GPUSamplerBindingType/"filtering"}}. - -
- -
- validating shader binding(binding, layout) - **Arguments:** - - shader |binding|, reflected from the shader module - - {{GPUPipelineLayout}} |layout| - - Consider the shader |binding| annotation of |bindIndex| for the - binding index and |bindGroup| for the bind group index. - - Return `true` if all of the following conditions are satisfied: - - - |layout|.{{GPUPipelineLayout/[[bindGroupLayouts]]}}[|bindGroup|] contains - a {{GPUBindGroupLayoutEntry}} |entry| whose |entry|.{{GPUBindGroupLayoutEntry/binding}} == |bindIndex|. - - If the defined [=binding member=] for |entry| is: -
- : {{GPUBindGroupLayoutEntry/buffer}} - :: If |entry|.{{GPUBindGroupLayoutEntry/buffer}}.{{GPUBufferBindingLayout/type}} is: -
- : {{GPUBufferBindingType/"uniform"}} - :: The |binding| is a uniform buffer. - : {{GPUBufferBindingType/"storage"}} - :: The |binding| is a storage buffer. - : {{GPUBufferBindingType/"read-only-storage"}} - :: The |binding| is a read-only storage buffer. -
- :: If |entry|.{{GPUBindGroupLayoutEntry/buffer}}.{{GPUBufferBindingLayout/minBindingSize}} is not `0`: - - If the last field of the corresponding structure defined in the shader has an unbounded array type, - then the value of |entry|.{{GPUBindGroupLayoutEntry/buffer}}.{{GPUBufferBindingLayout/minBindingSize}} - must be greater than or equal to the byte offset of that field plus the stride of the unbounded array. - - If the corresponding shader structure doesn't end with an unbounded array type, - then the value of |entry|.{{GPUBindGroupLayoutEntry/buffer}}.{{GPUBufferBindingLayout/minBindingSize}} - must be greater than or equal to the size of the structure. - - : {{GPUBindGroupLayoutEntry/sampler}} - :: If |entry|.{{GPUBindGroupLayoutEntry/sampler}}.{{GPUSamplerBindingLayout/type}} is: -
- : {{GPUSamplerBindingType/"filtering"}} - :: the |binding| is a non-comparison sampler - : {{GPUSamplerBindingType/"non-filtering"}} - :: the |binding| is a non-comparison sampler - : {{GPUSamplerBindingType/"comparison"}} - :: the |binding| is a comparison sampler -
- - : {{GPUBindGroupLayoutEntry/texture}} - :: If |entry|.{{GPUBindGroupLayoutEntry/texture}}.{{GPUTextureBindingLayout/multisampled}} is: -
- : `true` - :: the |binding| is a multisampled texture. - : `false` - :: The |binding| is a sampled texture with a sample count of 1. -
- :: The component type of the texture matches - |entry|.{{GPUBindGroupLayoutEntry/texture}}.{{GPUTextureBindingLayout/sampleType}}. - :: The shader view dimension of the texture matches - |entry|.{{GPUBindGroupLayoutEntry/texture}}.{{GPUTextureBindingLayout/viewDimension}}. - - : {{GPUBindGroupLayoutEntry/storageTexture}} - :: If |entry|.{{GPUBindGroupLayoutEntry/storageTexture}}.{{GPUStorageTextureBindingLayout/access}} is: -
- : {{GPUStorageTextureAccess/"read-only"}} - :: The |binding| is a read-only storage texture. - : {{GPUStorageTextureAccess/"write-only"}} - :: The |binding| is a writable storage texture. -
- :: The format of the storage texture matches - |entry|.{{GPUBindGroupLayoutEntry/storageTexture}}.{{GPUStorageTextureBindingLayout/format}}. - :: The shader view dimension of the storage texture matches - |entry|.{{GPUBindGroupLayoutEntry/storageTexture}}.{{GPUStorageTextureBindingLayout/viewDimension}}. - -
-
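The `minBindingSize` rule for buffer bindings can be sketched as below. This is an illustration only: `ReflectedStruct` and `minBindingSizeIsValid` are hypothetical names, and the sketch assumes the reflection information already provides the structure size and, for a trailing unbounded array, its byte offset and stride.

```typescript
// Hypothetical reflection summary for the structure behind a buffer binding.
interface ReflectedStruct {
  size: number; // byte size of the structure when it has no unbounded array
  // Present only when the last field has an unbounded array type.
  unboundedArray?: { offset: number; stride: number };
}

// Mirrors the two minBindingSize bullets: a value of 0 imposes no requirement.
function minBindingSizeIsValid(minBindingSize: number, struct: ReflectedStruct): boolean {
  if (minBindingSize === 0) return true;
  const required = struct.unboundedArray !== undefined
    ? struct.unboundedArray.offset + struct.unboundedArray.stride // room for one array element
    : struct.size;
  return minBindingSize >= required;
}
```

For a structure whose unbounded array starts at byte 16 with a stride of 8, the smallest non-zero `minBindingSize` that passes is 24.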
- -A resource binding is considered to be statically used by a shader entry point -if and only if it's reachable in the control flow graph of the shader module, -starting at the entry point. - -## GPUComputePipeline ## {#compute-pipeline} - -A {{GPUComputePipeline}} is a kind of [=pipeline=] that controls the compute shader stage, -and can be used in {{GPUComputePassEncoder}}. - -Compute inputs and outputs are all contained in the bindings, -according to the given {{GPUPipelineLayout}}. -The outputs correspond to {{GPUBindGroupLayoutEntry/buffer}} bindings with a type of {{GPUBufferBindingType/"storage"}} -and {{GPUBindGroupLayoutEntry/storageTexture}} bindings with an access of {{GPUStorageTextureAccess/"write-only"}}. - -Stages of a compute [=pipeline=]: - 1. Compute shader - - - -### Creation ### {#compute-pipeline-creation} - - - -
- : createComputePipeline(descriptor) - :: - Creates a {{GPUComputePipeline}}. - -
- **Called on:** {{GPUDevice}} |this|. - - **Arguments:** -
-                |descriptor|: Description of the {{GPUComputePipeline}} to create.
-            
- - **Returns:** {{GPUComputePipeline}} - - If any of the following conditions are unsatisfied: -
- - |this| is a [=valid=] {{GPUDevice}}. - - |descriptor|.{{GPUPipelineDescriptorBase/layout}} is [$valid to use with$] |this|. - - [$validating GPUProgrammableStage$]({{GPUShaderStage/COMPUTE}}, - |descriptor|.{{GPUComputePipelineDescriptor/compute}}, - |descriptor|.{{GPUPipelineDescriptorBase/layout}}) succeeds. -
- - Then: - 1. Generate a {{GPUValidationError}} in the current scope with appropriate error message. - 1. Create a new [=invalid=] {{GPUComputePipeline}} and return the result. -
- - : createComputePipelineAsync(descriptor) - :: - Creates a {{GPUComputePipeline}}. The returned {{Promise}} resolves when the created pipeline - is ready to be used without additional delay. - - If pipeline creation fails, the returned {{Promise}} resolves to an [=invalid=] - {{GPUComputePipeline}} object. - - Note: Use of this method is preferred whenever possible, as it prevents - [=queue timeline=] work from blocking on pipeline compilation. - -
- **Called on:** {{GPUDevice}} |this|. - - **Arguments:** -
-                |descriptor|: Description of the {{GPUComputePipeline}} to create.
-            
- - **Returns:** {{Promise}}<{{GPUComputePipeline}}> - - 1. Let |promise| be [=a new promise=]. - 1. Issue the following steps on the [=Device timeline=] of |this|: -
- 1. Let |pipeline| be a new {{GPUComputePipeline}} created as if - |this|.{{GPUDevice/createComputePipeline()}} was called with |descriptor|; - - 1. When |pipeline| is ready to be used, [=resolve=] |promise| with |pipeline|. -
- 1. Return |promise|. -
-
- -## GPURenderPipeline ## {#render-pipeline} - -A {{GPURenderPipeline}} is a kind of [=pipeline=] that controls the vertex -and fragment shader stages, and can be used in {{GPURenderPassEncoder}} -as well as {{GPURenderBundleEncoder}}. - -Render [=pipeline=] inputs are: - - bindings, according to the given {{GPUPipelineLayout}} - - vertex and index buffers, described by {{GPUVertexState}} - - the color attachments, described by {{GPUColorTargetState}} - - optionally, the depth-stencil attachment, described by {{GPUDepthStencilState}} - -Render [=pipeline=] outputs are: - - {{GPUBindGroupLayoutEntry/buffer}} bindings with a {{GPUBufferBindingLayout/type}} of {{GPUBufferBindingType/"storage"}} - - {{GPUBindGroupLayoutEntry/storageTexture}} bindings with a {{GPUStorageTextureBindingLayout/access}} of {{GPUStorageTextureAccess/"write-only"}} - - the color attachments, described by {{GPUColorTargetState}} - - optionally, depth-stencil attachment, described by {{GPUDepthStencilState}} - -A render [=pipeline=] is comprised of the following render stages: - 1. Vertex fetch, controlled by {{GPUVertexState/buffers|GPUVertexState.buffers}} - 2. Vertex shader, controlled by {{GPUVertexState}} - 3. Primitive assembly, controlled by {{GPUPrimitiveState}} - 4. Rasterization, controlled by {{GPUPrimitiveState}}, {{GPUDepthStencilState}}, and {{GPUMultisampleState}} - 5. Fragment shader, controlled by {{GPUFragmentState}} - 6. Stencil test and operation, controlled by {{GPUDepthStencilState}} - 7. Depth test and write, controlled by {{GPUDepthStencilState}} - 8. Output merging, controlled by {{GPUFragmentState/targets|GPUFragmentState.targets}} - - - -{{GPURenderPipeline}} has the following internal slots: - -
- : \[[descriptor]], of type {{GPURenderPipelineDescriptor}} - :: - The {{GPURenderPipelineDescriptor}} describing this pipeline. - - All optional fields of {{GPURenderPipelineDescriptor}} are defined. - - : \[[strip_index_format]], of type {{GPUIndexFormat}}? - :: - The format index data this pipeline requires if using a strip primitive topology, - initially `undefined`. -
- -### Creation ### {#render-pipeline-creation} - - - -A {{GPURenderPipelineDescriptor}} describes the state of a render [=pipeline=] by -configuring each of the [=render stages=]. See [[#rendering-operations]] for the -details. - -- {{GPURenderPipelineDescriptor/vertex}} describes - the vertex shader entry point of the [=pipeline=] and its input buffer layouts. -- {{GPURenderPipelineDescriptor/primitive}} describes - the primitive-related properties of the [=pipeline=]. -- {{GPURenderPipelineDescriptor/depthStencil}} describes - the optional depth-stencil properties, including the testing, operations, and bias. -- {{GPURenderPipelineDescriptor/multisample}} describes - the multi-sampling properties of the [=pipeline=]. -- {{GPURenderPipelineDescriptor/fragment}} describes - the fragment shader entry point of the [=pipeline=] and its output colors. - If it's `null`, the [[#no-color-output]] mode is enabled. - -
- : createRenderPipeline(descriptor) - :: - Creates a {{GPURenderPipeline}}. - -
- **Called on:** {{GPUDevice}} |this|. - - **Arguments:** -
-                |descriptor|: Description of the {{GPURenderPipeline}} to create.
-            
- - **Returns:** {{GPURenderPipeline}} - - 1. Let |pipeline| be a new valid {{GPURenderPipeline}} object. - 1. Issue the following steps on the [=Device timeline=] of |this|: -
- 1. If any of the following conditions are unsatisfied: -
- - |this| is a [=valid=] {{GPUDevice}}. - - |descriptor|.{{GPUPipelineDescriptorBase/layout}} is [$valid to use with$] |this|. - - [$validating GPURenderPipelineDescriptor$](|descriptor|, |this|) succeeds. -
- - Then: - 1. Generate a {{GPUValidationError}} in the current scope with appropriate - error message. - 1. Make |pipeline| [=invalid=]. - - 1. Set |pipeline|.{{GPURenderPipeline/[[descriptor]]}} to |descriptor|. - 1. Set |pipeline|.{{GPURenderPipeline/[[strip_index_format]]}} to - |descriptor|.{{GPURenderPipelineDescriptor/primitive}}.{{GPUPrimitiveState/stripIndexFormat}}. - -
- 1. Return |pipeline|. - - Issue: need description of the render states. -
- - : createRenderPipelineAsync(descriptor) - :: - Creates a {{GPURenderPipeline}}. The returned {{Promise}} resolves when the created pipeline - is ready to be used without additional delay. - - If pipeline creation fails, the returned {{Promise}} resolves to an [=invalid=] - {{GPURenderPipeline}} object. - - Note: Use of this method is preferred whenever possible, as it prevents - [=queue timeline=] work from blocking on pipeline compilation. - -
- **Called on:** {{GPUDevice}} |this|. - - **Arguments:** -
-                |descriptor|: Description of the {{GPURenderPipeline}} to create.
-            
- - **Returns:** {{Promise}}<{{GPURenderPipeline}}> - - 1. Let |promise| be [=a new promise=]. - 1. Issue the following steps on the [=Device timeline=] of |this|: -
- 1. Let |pipeline| be a new {{GPURenderPipeline}} created as if - |this|.{{GPUDevice/createRenderPipeline()}} was called with |descriptor|; - - 1. When |pipeline| is ready to be used, [=resolve=] |promise| with |pipeline|. -
- 1. Return |promise|. -
-
- -
- validating GPURenderPipelineDescriptor(descriptor, device) - **Arguments:** - - {{GPURenderPipelineDescriptor}} |descriptor| - - {{GPUDevice}} |device| - - Return `true` if all of the following conditions are satisfied: - - - [$validating GPUProgrammableStage$]({{GPUShaderStage/VERTEX}}, - |descriptor|.{{GPURenderPipelineDescriptor/vertex}}, - |descriptor|.{{GPUPipelineDescriptorBase/layout}}) succeeds. - - [$validating GPUVertexState$](|device|, |descriptor|.{{GPURenderPipelineDescriptor/vertex}}) succeeds. - - If |descriptor|.{{GPURenderPipelineDescriptor/fragment}} is not `null`: - - [$validating GPUProgrammableStage$]({{GPUShaderStage/FRAGMENT}}, - |descriptor|.{{GPURenderPipelineDescriptor/fragment}}, - |descriptor|.{{GPUPipelineDescriptorBase/layout}}) succeeds. - - [$validating GPUFragmentState$](|descriptor|.{{GPURenderPipelineDescriptor/fragment}}) succeeds. - - If the output SV_Coverage semantics is [=statically used=] by - |descriptor|.{{GPURenderPipelineDescriptor/fragment}}: - - |descriptor|.{{GPURenderPipelineDescriptor/multisample}}.{{GPUMultisampleState/alphaToCoverageEnabled}} is `false`. - - [$validating GPUPrimitiveState$](|descriptor|.{{GPURenderPipelineDescriptor/primitive}}, |device|.{{device/[[features]]}}) succeeds. - - If |descriptor|.{{GPURenderPipelineDescriptor/depthStencil}} is not `null`: - - [$validating GPUDepthStencilState$](|descriptor|.{{GPURenderPipelineDescriptor/depthStencil}}) succeeds. - - [$validating GPUMultisampleState$](|descriptor|.{{GPURenderPipelineDescriptor/multisample}}) succeeds. -
- -Issue: validate interface matching rules between VS and FS. - -Issue: should we validate that `cullMode` is none for points and lines? - -Issue: define what "compatible" means for render target formats. - -Issue: need a proper limit for the maximum number of color targets. - -### Primitive State ### {#primitive-state} - - - - - -
- validating GPUPrimitiveState(|descriptor|, |features|) - **Arguments:** - - {{GPUPrimitiveState}} |descriptor| - - [=list=]<{{GPUFeatureName}}> |features| - - Return `true` if all of the following conditions are satisfied: - - If |descriptor|.{{GPUPrimitiveState/topology}} is: -
- : {{GPUPrimitiveTopology/"line-strip"}} or - {{GPUPrimitiveTopology/"triangle-strip"}} - :: |descriptor|.{{GPUPrimitiveState/stripIndexFormat}} is not `undefined` - : Otherwise - :: |descriptor|.{{GPUPrimitiveState/stripIndexFormat}} is `undefined` -
- - If |descriptor|.{{GPUPrimitiveState/clampDepth}} is `true`: - - |features| must [=list/contain=] {{GPUFeatureName/"depth-clamping"}}. -
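The two primitive-state checks can be sketched in TypeScript as follows. The `PrimitiveState` interface and `validatePrimitiveState` function are hypothetical stand-ins for the dictionary and algorithm above, and the feature list is modelled as a plain string array.

```typescript
type PrimitiveTopology = "point-list" | "line-list" | "line-strip" | "triangle-list" | "triangle-strip";
type IndexFormat = "uint16" | "uint32";

// Hypothetical, reduced view of GPUPrimitiveState: only the validated members.
interface PrimitiveState {
  topology: PrimitiveTopology;
  stripIndexFormat?: IndexFormat;
  clampDepth: boolean;
}

function validatePrimitiveState(desc: PrimitiveState, features: readonly string[]): boolean {
  const isStrip = desc.topology === "line-strip" || desc.topology === "triangle-strip";
  const hasStripFormat = desc.stripIndexFormat !== undefined;
  // Strip topologies require stripIndexFormat; list topologies forbid it.
  if (isStrip !== hasStripFormat) return false;
  // Depth clamping is only allowed when the feature is enabled on the device.
  if (desc.clampDepth && features.indexOf("depth-clamping") === -1) return false;
  return true;
}
```

Because both directions of the strip rule are checked, a `"triangle-list"` descriptor that sets `stripIndexFormat` fails just as a `"triangle-strip"` descriptor that omits it does.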
- - - - - -### Multisample State ### {#multisample-state} - - - -
- validating GPUMultisampleState(|descriptor|) - **Arguments:** - - {{GPUMultisampleState}} |descriptor| - - Return `true` if all of the following conditions are satisfied: - - If |descriptor|.{{GPUMultisampleState/alphaToCoverageEnabled}} is `true`: - - |descriptor|.{{GPUMultisampleState/count}} is greater than 1. -
- -### Fragment State ### {#fragment-state} - - - -
- validating GPUFragmentState(|descriptor|) - Return `true` if all of the following requirements are met: - - - |descriptor|.{{GPUFragmentState/targets}}.length must be ≤ 4. - - For each |colorState| layout descriptor in the list |descriptor|.{{GPUFragmentState/targets}}: - - |colorState|.{{GPUColorTargetState/format}} must be listed in [[#plain-color-formats]] - with {{GPUTextureUsage/RENDER_ATTACHMENT}} capability. - - If |colorState|.{{GPUColorTargetState/blend}} is not `undefined`: - - The |colorState|.{{GPUColorTargetState/format}} must be filterable - according to the [[#plain-color-formats]] table. - - |colorState|.{{GPUColorTargetState/blend}}.{{GPUBlendState/color}} - must be a [=valid GPUBlendComponent=]. - - |colorState|.{{GPUColorTargetState/blend}}.{{GPUBlendState/alpha}} - must be a [=valid GPUBlendComponent=]. - - |colorState|.{{GPUColorTargetState/writeMask}} must be < 16. - - |descriptor|.{{GPUProgrammableStage/module}} must contain an output variable that: - - is [=statically used=] by |descriptor|.{{GPUProgrammableStage/entryPoint}}, and - - has a type that is compatible with |colorState|.{{GPUColorTargetState/format}}. -
- -
- |component| is a valid GPUBlendComponent if it meets the following requirements: - - - If |component|.{{GPUBlendComponent/operation}} is - {{GPUBlendOperation/"min"}} or {{GPUBlendOperation/"max"}}: - - |component|.{{GPUBlendComponent/srcFactor}} and - |component|.{{GPUBlendComponent/dstFactor}} must both be {{GPUBlendFactor/"one"}}. -
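A subset of the fragment-state checks can be sketched as below, covering only the target count, the write mask bound, and the blend component rule. `BlendComponent`, `ColorTarget`, `isValidBlendComponent`, and `targetsPassBasicChecks` are hypothetical names; format capability and shader output matching are deliberately left out because they depend on tables and reflection data not modelled here.

```typescript
type BlendOperation = "add" | "subtract" | "reverse-subtract" | "min" | "max";

// Hypothetical, reduced views of GPUBlendComponent and GPUColorTargetState.
interface BlendComponent {
  operation: BlendOperation;
  srcFactor: string;
  dstFactor: string;
}

interface ColorTarget {
  writeMask: number;
  blend?: { color: BlendComponent; alpha: BlendComponent };
}

// "min" and "max" ignore blend factors, so both factors must be "one".
function isValidBlendComponent(c: BlendComponent): boolean {
  if (c.operation === "min" || c.operation === "max") {
    return c.srcFactor === "one" && c.dstFactor === "one";
  }
  return true;
}

function targetsPassBasicChecks(targets: ColorTarget[]): boolean {
  if (targets.length > 4) return false;
  for (const target of targets) {
    if (target.writeMask >= 16) return false;
    if (target.blend !== undefined) {
      if (!isValidBlendComponent(target.blend.color)) return false;
      if (!isValidBlendComponent(target.blend.alpha)) return false;
    }
  }
  return true;
}
```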
- -Issue: define the area of reach for "statically used" things of `GPUProgrammableStage` - -### Color Target State ### {#color-target-state} - - - - - - - -#### Blend State #### {#blend-state} - - - - - - - -### Depth/Stencil State ### {#depth-stencil-state} - - - - - - - -
- validating GPUDepthStencilState(descriptor) - **Arguments:** - - {{GPUDepthStencilState}} |descriptor| - - Return `true`, if and only if, all of the following conditions are satisfied: - - - |descriptor|.{{GPUDepthStencilState/format}} is listed in [[#depth-formats]]. - - If |descriptor|.{{GPUDepthStencilState/depthWriteEnabled}} is `true` or - |descriptor|.{{GPUDepthStencilState/depthCompare}} is not {{GPUCompareFunction/"always"}}: - - |descriptor|.{{GPUDepthStencilState/format}} must have a depth component. - - If |descriptor|.{{GPUDepthStencilState/stencilFront}} or - |descriptor|.{{GPUDepthStencilState/stencilBack}} are not default values: - - |descriptor|.{{GPUDepthStencilState/format}} must have a stencil component. - - Issue: how can this algorithm support depth/stencil formats that are added in extensions? -
- -### Vertex State ### {#vertex-state} - - - -The index format determines both the data type of index values in a buffer and, when used with -strip primitive topologies ({{GPUPrimitiveTopology/"line-strip"}} or -{{GPUPrimitiveTopology/"triangle-strip"}}), also specifies the primitive restart value. The -primitive restart value is the index value that signals that a new primitive -should be started rather than continuing to construct the strip with the prior indexed -vertices. - -{{GPUPrimitiveState}}s that specify a strip primitive topology must specify a -{{GPUPrimitiveState/stripIndexFormat}} so that the [=primitive restart value=] that will be used -is known at pipeline creation time. {{GPUPrimitiveState}}s that specify a list primitive -topology must set {{GPUPrimitiveState/stripIndexFormat}} to `undefined`, and will use the index -format passed to {{GPURenderEncoderBase/setIndexBuffer()}} when rendering. - - - - - - - - - - - - - - - - - - -
Index formatPrimitive restart value
{{GPUIndexFormat/"uint16"}}0xFFFF
{{GPUIndexFormat/"uint32"}}0xFFFFFFFF
- -#### Vertex Formats #### {#vertex-formats} - -The name of the format specifies the order of components, bits per component, -and data type for the component. - - * `unorm` = unsigned normalized - * `snorm` = signed normalized - * `uint` = unsigned int - * `sint` = signed int - * `float` = floating point - - - - - - - -A vertex buffer is, conceptually, a view into buffer memory as an *array of structures*. -{{GPUVertexBufferLayout/arrayStride}} is the stride, in bytes, between *elements* of that array. -Each element of a vertex buffer is like a *structure* with a memory layout defined by its -{{GPUVertexBufferLayout/attributes}}, which describe the *members* of the structure. - -Each {{GPUVertexAttribute}} describes its -{{GPUVertexAttribute/format}} and its -{{GPUVertexAttribute/offset}}, in bytes, within the structure. - -Each attribute appears as a separate input in a vertex shader, each bound by a numeric *location*, -which is specified by {{GPUVertexAttribute/shaderLocation}}. -Every location must be unique within the {{GPUVertexState}}. - - - - - -
- validating GPUVertexBufferLayout(device, descriptor, vertexStage) - **Arguments:** - - {{GPUDevice}} |device| - - {{GPUVertexBufferLayout}} |descriptor| - - {{GPUProgrammableStage}} |vertexStage| - - Return `true`, if and only if, all of the following conditions are satisfied: - - - |descriptor|.{{GPUVertexBufferLayout/arrayStride}} ≤ - |device|.{{GPUDevice/[[device]]}}.{{device/[[limits]]}}.{{supported limits/maxVertexBufferArrayStride}}. - - |descriptor|.{{GPUVertexBufferLayout/arrayStride}} is a multiple of 4. - - For each attribute |attrib| in the list |descriptor|.{{GPUVertexBufferLayout/attributes}}: - - If |descriptor|.{{GPUVertexBufferLayout/arrayStride}} is zero: - - |attrib|.{{GPUVertexAttribute/offset}} + sizeof(|attrib|.{{GPUVertexAttribute/format}}) ≤ - |device|.{{GPUDevice/[[device]]}}.{{device/[[limits]]}}.{{supported limits/maxVertexBufferArrayStride}}. - - Otherwise: - - |attrib|.{{GPUVertexAttribute/offset}} + sizeof(|attrib|.{{GPUVertexAttribute/format}}) ≤ - |descriptor|.{{GPUVertexBufferLayout/arrayStride}}. - - |attrib|.{{GPUVertexAttribute/offset}} is a multiple of the size of one component of - |attrib|.{{GPUVertexAttribute/format}}. - - |attrib|.{{GPUVertexAttribute/shaderLocation}} is less than - |device|.{{GPUDevice/[[device]]}}.{{device/[[limits]]}}.{{supported limits/maxVertexAttributes}}. - - For every vertex attribute in the shader reflection of |vertexStage|.{{GPUProgrammableStage/module}} - that is known to be [=statically used=] by |vertexStage|.{{GPUProgrammableStage/entryPoint}}, - there is a corresponding |attrib| element of |descriptor|.{{GPUVertexBufferLayout/attributes}} for which - all of the following are true: - - The shader format is |attrib|.{{GPUVertexAttribute/format}}. - - The shader location is |attrib|.{{GPUVertexAttribute/shaderLocation}}. -
- -
- validating GPUVertexState(device, descriptor) - **Arguments:** - - {{GPUDevice}} |device| - - {{GPUVertexState}} |descriptor| - - Return `true`, if and only if, all of the following conditions are satisfied: - - - |descriptor|.{{GPUVertexState/buffers}}.length is less than or equal to - |device|.{{GPUDevice/[[device]]}}.{{device/[[limits]]}}.{{supported limits/maxVertexBuffers}}. - - Each |vertexBuffer| layout descriptor in the list |descriptor|.{{GPUVertexState/buffers}} - passes [$validating GPUVertexBufferLayout$](|device|, |vertexBuffer|, |descriptor|) - - The sum of |vertexBuffer|.{{GPUVertexBufferLayout/attributes}}.length, - over every |vertexBuffer| in |descriptor|.{{GPUVertexState/buffers}}, - is less than or equal to - |device|.{{GPUDevice/[[device]]}}.{{device/[[limits]]}}.{{supported limits/maxVertexAttributes}}. - - Each |attrib| in the union of all {{GPUVertexAttribute}} - across |descriptor|.{{GPUVertexState/buffers}} has a distinct - |attrib|.{{GPUVertexAttribute/shaderLocation}} value. -
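The numeric parts of the two vertex validation algorithms can be sketched as below. This is a simplified illustration under stated assumptions: `formatSize` and `componentSize` are hypothetical fields standing in for values derived from the attribute format, `VertexLimits` is a hypothetical subset of the device limits, and the shader reflection matching step is omitted.

```typescript
// Hypothetical, reduced views of the vertex dictionaries and device limits.
interface VertexAttribute {
  offset: number;
  formatSize: number;    // sizeof(format) in bytes
  componentSize: number; // size of one component of the format in bytes
  shaderLocation: number;
}

interface VertexBufferLayout {
  arrayStride: number;
  attributes: VertexAttribute[];
}

interface VertexLimits {
  maxVertexBuffers: number;
  maxVertexAttributes: number;
  maxVertexBufferArrayStride: number;
}

function validateVertexBufferLayout(limits: VertexLimits, layout: VertexBufferLayout): boolean {
  if (layout.arrayStride > limits.maxVertexBufferArrayStride) return false;
  if (layout.arrayStride % 4 !== 0) return false;
  // A zero stride is bounded by the device limit instead of the stride itself.
  const bound = layout.arrayStride === 0 ? limits.maxVertexBufferArrayStride : layout.arrayStride;
  for (const attrib of layout.attributes) {
    if (attrib.offset + attrib.formatSize > bound) return false;
    if (attrib.offset % attrib.componentSize !== 0) return false;
    if (attrib.shaderLocation >= limits.maxVertexAttributes) return false;
  }
  return true;
}

function validateVertexState(limits: VertexLimits, buffers: VertexBufferLayout[]): boolean {
  if (buffers.length > limits.maxVertexBuffers) return false;
  const seenLocations: number[] = [];
  for (const buffer of buffers) {
    if (!validateVertexBufferLayout(limits, buffer)) return false;
    for (const attrib of buffer.attributes) {
      // Every shaderLocation must be distinct across all buffers.
      if (seenLocations.indexOf(attrib.shaderLocation) !== -1) return false;
      seenLocations.push(attrib.shaderLocation);
    }
  }
  // The total attribute count is bounded as well.
  return seenLocations.length <= limits.maxVertexAttributes;
}
```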
- -# Command Buffers # {#command-buffers} - -Command buffers are pre-recorded lists of [=GPU commands=] that can be submitted to a {{GPUQueue}} -for execution. Each GPU command represents a task to be performed on the GPU, such as -setting state, drawing, copying resources, etc. - -## GPUCommandBuffer ## {#command-buffer} - - - -{{GPUCommandBuffer}} has the following attributes: - -
- : executionTime of type Promise<{{double}}>, readonly - :: - The total time, in seconds, that the GPU took to execute this command buffer. - - Note: - If {{GPUCommandEncoderDescriptor/measureExecutionTime}} is `true`, - this resolves after the command buffer executes. - Otherwise, this rejects with an {{OperationError}}. - -
- Specify the creation and resolution of the promise. - - In {{GPUCommandEncoder/finish()}}, it should be specified that a - new promise is created and stored in this attribute. - The promise starts rejected if {{GPUCommandEncoderDescriptor/measureExecutionTime}} - is `false`. If the finish() fails, then the promise resolves to 0. - - In {{GPUQueue/submit()}}, it should be specified that (if - {{GPUCommandEncoderDescriptor/measureExecutionTime}} is set), work - is issued to read back the execution time, and, when that completes, - the promise is resolved with that value. - If the submit() fails, then the promise resolves to 0. -
-
- -{{GPUCommandBuffer}} has the following internal slots: - -
- : \[[command_list]] of type [=list=]<[=GPU command=]>. - :: - A [=list=] of [=GPU commands=] to be executed on the [=Queue timeline=] when this command - buffer is submitted. -
- -### Creation ### {#command-buffer-creation} - - - - -# Command Encoding # {#command-encoding} - -## GPUCommandEncoder ## {#command-encoder} - - - -{{GPUCommandEncoder}} has the following internal slots: - -
- : \[[command_list]] of type [=list=]<[=GPU command=]>. - :: - A [=list=] of [=GPU commands=] to be executed on the [=Queue timeline=] when the - {{GPUCommandBuffer}} this encoder produces is submitted. - - : \[[state]] of type {{encoder state}}. - :: - The current state of the {{GPUCommandEncoder}}, initially set to {{encoder state/open}}. - - : \[[debug_group_stack]] of type [=stack=]<{{USVString}}>. - :: - A stack of active debug group labels. -
- -Each {{GPUCommandEncoder}} has a current encoder state on the [=Content timeline=] -which may be one of the following: - -
- : "open" - :: - Indicates the {{GPUCommandEncoder}} is available to begin new operations. The {{GPUCommandEncoder/[[state]]}} is - {{encoder state/open}} any time the {{GPUCommandEncoder}} is [=valid=] and has no active - {{GPURenderPassEncoder}} or {{GPUComputePassEncoder}}. - - : "encoding a render pass" - :: - Indicates the {{GPUCommandEncoder}} has an active {{GPURenderPassEncoder}}. The - {{GPUCommandEncoder/[[state]]}} becomes {{encoder state/encoding a render pass}} once - {{GPUCommandEncoder/beginRenderPass()}} is called successfully until {{GPURenderPassEncoder/endPass()}} is called - on the returned {{GPURenderPassEncoder}}, at which point the {{GPUCommandEncoder/[[state]]}} - (if the encoder is still valid) reverts to {{encoder state/open}}. - - : "encoding a compute pass" - :: - Indicates the {{GPUCommandEncoder}} has an active {{GPUComputePassEncoder}}. The - {{GPUCommandEncoder/[[state]]}} becomes {{encoder state/encoding a compute pass}} once - {{GPUCommandEncoder/beginComputePass()}} is called successfully until {{GPUComputePassEncoder/endPass()}} is - called on the returned {{GPUComputePassEncoder}}, at which point the {{GPUCommandEncoder/[[state]]}} - (if the encoder is still valid) reverts to {{encoder state/open}}. - - : "closed" - :: - Indicates the {{GPUCommandEncoder}} is no longer available for any operations. The - {{GPUCommandEncoder/[[state]]}} becomes {{encoder state/closed}} once {{GPUCommandEncoder/finish()}} is called - or the {{GPUCommandEncoder}} otherwise becomes [=invalid=]. -
- -### Creation ### {#command-encoder-creation} - - - -
- : measureExecutionTime - :: - Enable measurement of the GPU execution time of the entire command buffer. -
- -
- : createCommandEncoder(descriptor) - :: - Creates a {{GPUCommandEncoder}}. - -
- **Called on:** {{GPUDevice}} this. - - **Arguments:** -
-                descriptor: Description of the {{GPUCommandEncoder}} to create.
-            
- - **Returns:** {{GPUCommandEncoder}} - - Issue: Describe {{GPUDevice/createCommandEncoder()}} algorithm steps. -
-
- -## Pass Encoding ## {#command-encoder-pass-encoding} - -
- : beginRenderPass(descriptor) - :: - Begins encoding a render pass described by |descriptor|. - -
- **Called on:** {{GPUCommandEncoder}} |this|. - - **Arguments:** -
-                |descriptor|: Description of the {{GPURenderPassEncoder}} to create.
-            
- - **Returns:** {{GPURenderPassEncoder}} - - Issue the following steps on the [=Device timeline=] of |this|: -
- 1. If any of the following conditions are unsatisfied, generate a validation - error and stop. -
- - |this|.{{GPUCommandEncoder/[[state]]}} is {{encoder state/open}}. - - |descriptor| meets the - [$GPURenderPassDescriptor/Valid Usage$] rules. -
- 1. Set |this|.{{GPUCommandEncoder/[[state]]}} to {{encoder state/encoding a render pass}}. - 1. For each |colorAttachment| in |descriptor|.{{GPURenderPassDescriptor/colorAttachments}}: - 1. The [=texture subresource=] seen by |colorAttachment|.{{GPURenderPassColorAttachment/view}} - is considered to be used as [=internal usage/attachment=] for the - duration of the render pass. - 1. Let |depthStencilAttachment| be |descriptor|.{{GPURenderPassDescriptor/depthStencilAttachment}}. - 1. If |depthStencilAttachment| is not `null`: - 1. If |depthStencilAttachment|.{{GPURenderPassDepthStencilAttachment/depthReadOnly}} and - {{GPURenderPassDepthStencilAttachment/stencilReadOnly}} are set: - 1. The [=texture subresources=] seen by |depthStencilAttachment|.{{GPURenderPassDepthStencilAttachment/view}} - are considered to be used as [=internal usage/attachment-read=] for the duration of the render pass. - 1. Else, the [=texture subresource=] seen by |depthStencilAttachment|.{{GPURenderPassDepthStencilAttachment/view}} - is considered to be used as [=internal usage/attachment=] for the duration of the render pass. -
- - Issue: specify the behavior of read-only depth/stencil - Issue: Enqueue attachment loads (with loadOp clear). -
- - : beginComputePass(descriptor) - :: - Begins encoding a compute pass described by |descriptor|. - -
- **Called on:** {{GPUCommandEncoder}} |this|. - - **Arguments:** -
-                descriptor:
-            
- - **Returns:** {{GPUComputePassEncoder}} - - Issue the following steps on the [=Device timeline=] of |this|: -
- 1. If any of the following conditions are unsatisfied, generate a validation - error and stop. -
- - |this|.{{GPUCommandEncoder/[[state]]}} is {{encoder state/open}}. -
- 1. Set |this|.{{GPUCommandEncoder/[[state]]}} to {{encoder state/encoding a compute pass}}. -
-
-
- -## Copy Commands ## {#copy-commands} - -### GPUImageDataLayout ### {#gpu-image-data-layout} - - - -A {{GPUImageDataLayout}} is a layout of images within some linear memory. -It's used when copying data between a [=texture=] and a [=buffer=], or when scheduling a -write into a [=texture=] from the {{GPUQueue}}. - - - For {{GPUTextureDimension/2d}} textures, data is copied between one or multiple contiguous [=images=] and [=array layers=]. - - For {{GPUTextureDimension/3d}} textures, data is copied between one or multiple contiguous [=images=] and depth [=slices=]. - -Operations that copy between byte arrays and textures always work with rows of [=texel blocks=], -which we'll call block rows. It's not possible to update only a part of a [=texel block=]. - -Issue: Define images more precisely. In particular, define them as being comprised of [=texel blocks=]. - -Issue: Define the exact copy semantics, by reference to common algorithms shared by the copy methods. - -
- : bytesPerRow - :: - The stride, in bytes, between the beginning of each [=block row=] and the subsequent [=block row=]. - - Required if there are multiple [=block rows=] (i.e. the height or depth is more than one block). - - : rowsPerImage - :: - Number of [=block rows=] per single [=image=] of the [=texture=]. - {{GPUImageDataLayout/rowsPerImage}} × - {{GPUImageDataLayout/bytesPerRow}} is the stride, in bytes, between the beginning of each [=image=] of data and the subsequent [=image=]. - - Required if there are multiple [=images=] (i.e. the depth is more than one). -
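The stride rules above can be illustrated with a small addressing helper. This is a minimal sketch, not part of the WebGPU API: the `ImageDataLayout` interface and the `blockRowOffset` function are hypothetical names, and the sketch assumes `offset` marks the byte position of the first block row in the linear data.

```typescript
// Hypothetical mirror of the GPUImageDataLayout members used in this section.
interface ImageDataLayout {
  offset: number;       // assumed: byte offset of the first block row in the linear data
  bytesPerRow: number;  // stride, in bytes, between consecutive block rows
  rowsPerImage: number; // number of block rows in one image
}

// Byte offset at which block row `row` of image `image` begins.
// The image stride is rowsPerImage × bytesPerRow, as described above.
function blockRowOffset(layout: ImageDataLayout, image: number, row: number): number {
  const bytesPerImage = layout.rowsPerImage * layout.bytesPerRow;
  return layout.offset + image * bytesPerImage + row * layout.bytesPerRow;
}
```

With `bytesPerRow` of 256 and `rowsPerImage` of 4, consecutive images start 1024 bytes apart regardless of how many bytes each row actually uses.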
- -### GPUImageCopyBuffer ### {#gpu-image-copy-buffer} - - - -A {{GPUImageCopyBuffer}} contains the actual [=texture=] data placed in a [=buffer=] according to {{GPUImageDataLayout}}. - -
-validating GPUImageCopyBuffer - - **Arguments:** - - {{GPUImageCopyBuffer}} |imageCopyBuffer| - - **Returns:** {{boolean}} - - Return `true` if and only if all of the following conditions are satisfied: - - |imageCopyBuffer|.{{GPUImageCopyBuffer/buffer}} must be a [=valid=] {{GPUBuffer}}. - - |imageCopyBuffer|.{{GPUImageDataLayout/bytesPerRow}} must be a multiple of 256. - -
- -### GPUImageCopyTexture ### {#gpu-image-copy-texture} - - - -A {{GPUImageCopyTexture}} is a view of a sub-region of one or multiple contiguous [=texture subresources=] with the initial -offset {{GPUOrigin3D}} in texels, used when copying data from or to a {{GPUTexture}}. - - * {{GPUImageCopyTexture/origin}}: If unspecified, defaults to `[0, 0, 0]`. - -
-validating GPUImageCopyTexture - - **Arguments:** - - {{GPUImageCopyTexture}} |imageCopyTexture| - - {{GPUExtent3D}} |copySize| - - **Returns:** {{boolean}} - - Let: - - |blockWidth| be the [=texel block width=] of |imageCopyTexture|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}}. - - |blockHeight| be the [=texel block height=] of |imageCopyTexture|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}}. - - Return `true` if and only if all of the following conditions apply: - - |imageCopyTexture|.{{GPUImageCopyTexture/texture}} must be a [=valid=] {{GPUTexture}}. - - |imageCopyTexture|.{{GPUImageCopyTexture/mipLevel}} must be less than the {{GPUTexture/[[mipLevelCount]]}} of - |imageCopyTexture|.{{GPUImageCopyTexture/texture}}. - - |imageCopyTexture|.{{GPUImageCopyTexture/origin}}.[=Origin3D/x=] must be a multiple of |blockWidth|. - - |imageCopyTexture|.{{GPUImageCopyTexture/origin}}.[=Origin3D/y=] must be a multiple of |blockHeight|. - - The [=imageCopyTexture subresource size=] of |imageCopyTexture| is equal to |copySize| if either of - the following conditions is true: - - |imageCopyTexture|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}} is a depth-stencil format. - - |imageCopyTexture|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[sampleCount]]}} is greater than 1. - -
- -Issue(gpuweb/gpuweb#69): Define the copies with {{GPUTextureDimension/1d}} and {{GPUTextureDimension/3d}} textures. - -### GPUImageCopyImageBitmap ### {#gpu-image-copy-image-bitmap-copy} - - - - * {{GPUImageCopyImageBitmap/origin}}: If unspecified, defaults to `[0, 0]`. - -
- : copyBufferToBuffer(source, sourceOffset, destination, destinationOffset, size) - :: - Encode a command into the {{GPUCommandEncoder}} that copies data from a sub-region of a - {{GPUBuffer}} to a sub-region of another {{GPUBuffer}}. - -
- **Called on:** {{GPUCommandEncoder}} |this|. - - **Arguments:** -
-                |source|: The {{GPUBuffer}} to copy from.
-                |sourceOffset|: Offset in bytes into |source| to begin copying from.
-                |destination|: The {{GPUBuffer}} to copy to.
-                |destinationOffset|: Offset in bytes into |destination| to place the copied data.
-                |size|: Bytes to copy.
-            
- - **Returns:** {{undefined}} - - If any of the following conditions are unsatisfied, generate a validation error and stop. -
- - |this|.{{GPUCommandEncoder/[[state]]}} is {{encoder state/open}}. - - |source| is [$valid to use with$] |this|. - - |destination| is [$valid to use with$] |this|. - - |source|.{{GPUBuffer/[[usage]]}} contains {{GPUBufferUsage/COPY_SRC}}. - - |destination|.{{GPUBuffer/[[usage]]}} contains {{GPUBufferUsage/COPY_DST}}. - - |size| is a multiple of 4. - - |sourceOffset| is a multiple of 4. - - |destinationOffset| is a multiple of 4. - - (|sourceOffset| + |size|) does not overflow a {{GPUSize64}}. - - (|destinationOffset| + |size|) does not overflow a {{GPUSize64}}. - - |source|.{{GPUBuffer/[[size]]}} is greater than or equal to (|sourceOffset| + |size|). - - |destination|.{{GPUBuffer/[[size]]}} is greater than or equal to (|destinationOffset| + |size|). - - |source| and |destination| are not the same {{GPUBuffer}}. -
- - Issue(gpuweb/gpuweb#21): Define the state machine for GPUCommandEncoder. - - Issue(gpuweb/gpuweb#69): figure out how to handle overflows in the spec. -
-
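The numeric conditions for `copyBufferToBuffer` can be sketched as a pure function. This is an illustration under stated assumptions, not the normative algorithm: `BufferInfo`, `MAX_SAFE`, and `validateCopyBufferToBuffer` are hypothetical names, the usage flags are illustrative bit values, the encoder-state and "valid to use with" checks are omitted, and the `GPUSize64` overflow check is approximated with JavaScript's safe-integer limit.

```typescript
// Illustrative usage bit flags (assumed values for this sketch).
const COPY_SRC = 0x0004;
const COPY_DST = 0x0008;

// Stand-in for the GPUSize64 overflow check: JavaScript's largest safe integer.
const MAX_SAFE = 9007199254740991;

// Hypothetical stand-in for the internal slots of a GPUBuffer.
interface BufferInfo {
  size: number;  // [[size]] in bytes
  usage: number; // [[usage]] bit flags
}

function validateCopyBufferToBuffer(
  source: BufferInfo,
  sourceOffset: number,
  destination: BufferInfo,
  destinationOffset: number,
  size: number
): boolean {
  return (
    (source.usage & COPY_SRC) !== 0 &&
    (destination.usage & COPY_DST) !== 0 &&
    // Size and both offsets must be multiples of 4.
    size % 4 === 0 &&
    sourceOffset % 4 === 0 &&
    destinationOffset % 4 === 0 &&
    // Approximation of "does not overflow a GPUSize64".
    sourceOffset + size <= MAX_SAFE &&
    destinationOffset + size <= MAX_SAFE &&
    // Both ranges must fit inside their buffers.
    source.size >= sourceOffset + size &&
    destination.size >= destinationOffset + size &&
    // Source and destination must be different buffers.
    source !== destination
  );
}
```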
- -### Copy Between Buffer and Texture ### {#copy-between-buffer-texture} - -WebGPU provides {{GPUCommandEncoder/copyBufferToTexture()}} for buffer-to-texture copies and -{{GPUCommandEncoder/copyTextureToBuffer()}} for texture-to-buffer copies. - -The following definitions and validation rules apply to both {{GPUCommandEncoder/copyBufferToTexture()}} -and {{GPUCommandEncoder/copyTextureToBuffer()}}. - -[=imageCopyTexture subresource size=] and [=Valid Texture Copy Range=] also apply to -{{GPUCommandEncoder/copyTextureToTexture()}}. - -
- -imageCopyTexture subresource size - - **Arguments:** - - {{GPUImageCopyTexture}} |imageCopyTexture| - - **Returns:** {{GPUExtent3D}} - - The [=imageCopyTexture subresource size=] of |imageCopyTexture| is calculated as follows: - - Its [=Extent3D/width=], [=Extent3D/height=] and [=Extent3D/depthOrArrayLayers=] are the width, height, and depth, respectively, - of the [=physical size=] of |imageCopyTexture|.{{GPUImageCopyTexture/texture}} [=subresource=] at [=mipmap level=] - |imageCopyTexture|.{{GPUImageCopyTexture/mipLevel}}. - -
- -Issue: define this as an algorithm with (texture, mipmapLevel) parameters and use the call syntax instead of referring to the definition by label. - -
- validating linear texture data(layout, byteSize, format, copyExtent) - - **Arguments:** - : {{GPUImageDataLayout}} |layout| - :: Layout of the linear texture data. - : {{GPUSize64}} |byteSize| - :: Total size of the linear data, in bytes. - : {{GPUTextureFormat}} |format| - :: Format of the texture. - : {{GPUExtent3D}} |copyExtent| - :: Extent of the texture to copy. - - 1. Let |blockWidth|, |blockHeight|, and |blockSize| be the - [=texel block width=], [=texel block height|height=], and - [=texel block size|size=] of |format|. - - 1. It is assumed that |copyExtent|.[=Extent3D/width=] is a multiple of |blockWidth| - and |copyExtent|.[=Extent3D/height=] is a multiple of |blockHeight|. Let: - - |widthInBlocks| be |copyExtent|.[=Extent3D/width=] ÷ |blockWidth|. - - |heightInBlocks| be |copyExtent|.[=Extent3D/height=] ÷ |blockHeight|. - - |bytesInLastRow| be |blockSize| × |widthInBlocks|. - - 1. Fail if the following conditions are not satisfied: -
- - If |heightInBlocks| > 1, - |layout|.{{GPUImageDataLayout/bytesPerRow}} must be specified. - - If |copyExtent|.[=Extent3D/depthOrArrayLayers=] > 1, - |layout|.{{GPUImageDataLayout/bytesPerRow}} and - |layout|.{{GPUImageDataLayout/rowsPerImage}} must be specified. - - If specified, |layout|.{{GPUImageDataLayout/bytesPerRow}} - must be greater than or equal to |bytesInLastRow|. - - If specified, |layout|.{{GPUImageDataLayout/rowsPerImage}} - must be greater than or equal to |heightInBlocks|. -
- - 1. Let |requiredBytesInCopy| be 0. - - 1. If |copyExtent|.[=Extent3D/depthOrArrayLayers=] > 1: - 1. Let |bytesPerImage| be - |layout|.{{GPUImageDataLayout/bytesPerRow}} × - |layout|.{{GPUImageDataLayout/rowsPerImage}}. - 1. Let |bytesBeforeLastImage| be - |bytesPerImage| × (|copyExtent|.[=Extent3D/depthOrArrayLayers=] − 1). - 1. Add |bytesBeforeLastImage| to |requiredBytesInCopy|. - - 1. If |copyExtent|.[=Extent3D/depthOrArrayLayers=] > 0: - - 1. If |heightInBlocks| > 1, add - |layout|.{{GPUImageDataLayout/bytesPerRow}} × - (|heightInBlocks| − 1) - to |requiredBytesInCopy|. - - 1. If |heightInBlocks| > 0, add - |bytesInLastRow| to |requiredBytesInCopy|. - - 1. Fail if the following conditions are not satisfied: -
- - |layout|.{{GPUImageDataLayout/offset}} + |requiredBytesInCopy| ≤ |byteSize|. -
-
- -
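The steps of "validating linear texture data" can be followed in code. The sketch below is a direct transcription under stated assumptions: the interfaces and the function name are hypothetical, and the texel block width, height, and size are passed in directly (as `BlockInfo`) instead of being looked up from a `GPUTextureFormat`, because the format table is not part of this section.

```typescript
// Hypothetical stand-ins for GPUImageDataLayout and GPUExtent3D.
interface LinearLayout {
  offset: number;
  bytesPerRow?: number;
  rowsPerImage?: number;
}
interface CopyExtent {
  width: number;
  height: number;
  depthOrArrayLayers: number;
}
// Assumed to be derived from the texture format by the caller.
interface BlockInfo {
  blockWidth: number;
  blockHeight: number;
  blockSize: number; // bytes per texel block
}

function validateLinearTextureData(
  layout: LinearLayout,
  byteSize: number,
  block: BlockInfo,
  copyExtent: CopyExtent
): boolean {
  // copyExtent is assumed to be a multiple of the block dimensions.
  const widthInBlocks = copyExtent.width / block.blockWidth;
  const heightInBlocks = copyExtent.height / block.blockHeight;
  const bytesInLastRow = block.blockSize * widthInBlocks;
  const depth = copyExtent.depthOrArrayLayers;

  // First group of conditions: required members and minimum strides.
  if (heightInBlocks > 1 && layout.bytesPerRow === undefined) return false;
  if (depth > 1 && (layout.bytesPerRow === undefined || layout.rowsPerImage === undefined)) {
    return false;
  }
  if (layout.bytesPerRow !== undefined && layout.bytesPerRow < bytesInLastRow) return false;
  if (layout.rowsPerImage !== undefined && layout.rowsPerImage < heightInBlocks) return false;

  // The defaults below are never used when the member is required (checked above).
  const bytesPerRow = layout.bytesPerRow ?? 0;
  const rowsPerImage = layout.rowsPerImage ?? 0;

  let requiredBytesInCopy = 0;
  if (depth > 1) {
    const bytesPerImage = bytesPerRow * rowsPerImage;
    requiredBytesInCopy += bytesPerImage * (depth - 1);
  }
  if (depth > 0) {
    if (heightInBlocks > 1) requiredBytesInCopy += bytesPerRow * (heightInBlocks - 1);
    if (heightInBlocks > 0) requiredBytesInCopy += bytesInLastRow;
  }

  return layout.offset + requiredBytesInCopy <= byteSize;
}
```

Note that the last row of the last image only needs `bytesInLastRow` bytes, not a full `bytesPerRow` stride, so a tightly sized buffer can still pass validation.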
- -Valid Texture Copy Range - -Given a {{GPUImageCopyTexture}} |imageCopyTexture| and a {{GPUExtent3D}} |copySize|, let - - |blockWidth| be the [=texel block width=] of |imageCopyTexture|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}}. - - |blockHeight| be the [=texel block height=] of |imageCopyTexture|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}}. - -The following validation rules apply: - - - If the {{GPUTexture/[[dimension]]}} of |imageCopyTexture|.{{GPUImageCopyTexture/texture}} is - {{GPUTextureDimension/1d}}: - - Both |copySize|.[=Extent3D/height=] and [=Extent3D/depthOrArrayLayers=] must be 1. - - If the {{GPUTexture/[[dimension]]}} of |imageCopyTexture|.{{GPUImageCopyTexture/texture}} is - {{GPUTextureDimension/2d}}: - - (|imageCopyTexture|.{{GPUImageCopyTexture/origin}}.[=Origin3D/x=] + |copySize|.[=Extent3D/width=]), - (|imageCopyTexture|.{{GPUImageCopyTexture/origin}}.[=Origin3D/y=] + |copySize|.[=Extent3D/height=]), and - (|imageCopyTexture|.{{GPUImageCopyTexture/origin}}.[=Origin3D/z=] + |copySize|.[=Extent3D/depthOrArrayLayers=]) - must be less than or equal to the - [=Extent3D/width=], [=Extent3D/height=], and [=Extent3D/depthOrArrayLayers=], respectively, - of the [=imageCopyTexture subresource size=] of |imageCopyTexture|. - - |copySize|.[=Extent3D/width=] must be a multiple of |blockWidth|. - - |copySize|.[=Extent3D/height=] must be a multiple of |blockHeight|. - -
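For the `2d` case, the rules above reduce to bounds checks plus block alignment. The following is a minimal sketch with hypothetical names (`Origin`, `Extent`, `isValidTextureCopyRange2d`); it takes the already computed imageCopyTexture subresource size and the block dimensions as inputs and does not cover `1d` or `3d` textures.

```typescript
// Hypothetical stand-ins for GPUOrigin3D and GPUExtent3D.
interface Origin {
  x: number;
  y: number;
  z: number;
}
interface Extent {
  width: number;
  height: number;
  depthOrArrayLayers: number;
}

function isValidTextureCopyRange2d(
  origin: Origin,
  copySize: Extent,
  subresourceSize: Extent, // imageCopyTexture subresource size, computed by the caller
  blockWidth: number,
  blockHeight: number
): boolean {
  return (
    // The copied region must lie inside the subresource on every axis.
    origin.x + copySize.width <= subresourceSize.width &&
    origin.y + copySize.height <= subresourceSize.height &&
    origin.z + copySize.depthOrArrayLayers <= subresourceSize.depthOrArrayLayers &&
    // The copy size must cover whole texel blocks.
    copySize.width % blockWidth === 0 &&
    copySize.height % blockHeight === 0
  );
}
```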
- -Issue(gpuweb/gpuweb#69): Define the copies with {{GPUTextureDimension/1d}} and -{{GPUTextureDimension/3d}} textures. - -Issue(gpuweb/gpuweb#537): Additional restrictions on rowsPerImage if needed. - -Issue(gpuweb/gpuweb#652): Define the copies with {{GPUTextureFormat/"depth24plus"}}, -{{GPUTextureFormat/"depth24plus-stencil8"}}, and {{GPUTextureFormat/"stencil8"}}. - -Issue: convert "Valid Texture Copy Range" into an algorithm with parameters, similar to "validating linear texture data" - -
- : copyBufferToTexture(source, destination, copySize) - :: - Encode a command into the {{GPUCommandEncoder}} that copies data from a sub-region of a - {{GPUBuffer}} to a sub-region of one or multiple contiguous [=texture subresources=]. - -
- **Called on:** {{GPUCommandEncoder}} |this|. - - **Arguments:** -
-                |source|: Combined with |copySize|, defines the region of the source buffer.
-                |destination|: Combined with |copySize|, defines the region of the destination [=texture subresource=].
-                |copySize|:
-            
- - **Returns:** {{undefined}} - - If any of the following conditions are unsatisfied, generate a validation error and stop. -
- - |this|.{{GPUCommandEncoder/[[state]]}} is {{encoder state/open}}. - - [$validating GPUImageCopyBuffer$](|source|) returns `true`. - - |source|.{{GPUImageCopyBuffer/buffer}}.{{GPUBuffer/[[usage]]}} contains - {{GPUBufferUsage/COPY_SRC}}. - - [$validating GPUImageCopyTexture$](|destination|, |copySize|) returns `true`. - - |destination|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[textureUsage]]}} contains - {{GPUTextureUsage/COPY_DST}}. - - |destination|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[sampleCount]]}} is 1. - - If |destination|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}} is a depth-stencil format: - - |destination|.{{GPUImageCopyTexture/aspect}} must refer to a single copyable aspect of - |destination|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}}. - See [[#depth-formats|depth-formats]]. - - [=Valid Texture Copy Range=] applies to |destination| and |copySize|. - - If |destination|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}} is not a depth/stencil format: - - |source|.{{GPUImageDataLayout/offset}} is a multiple of the [=texel block size=] of - |destination|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}}. - - If |destination|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}} is a depth/stencil format: - - |source|.{{GPUImageDataLayout/offset}} is a multiple of 4. - - [$validating linear texture data$](|source|, - |source|.{{GPUImageCopyBuffer/buffer}}.{{GPUBuffer/[[size]]}}, - |destination|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}}, - |copySize|) succeeds. -
-
- - : copyTextureToBuffer(source, destination, copySize) - :: - Encode a command into the {{GPUCommandEncoder}} that copies data from a sub-region of one or - multiple contiguous [=texture subresources=] to a sub-region of a {{GPUBuffer}}. - -
- **Called on:** {{GPUCommandEncoder}} |this|. - - **Arguments:** -
-                |source|: Combined with |copySize|, defines the region of the source [=texture subresources=].
-                |destination|: Combined with |copySize|, defines the region of the destination buffer.
-                |copySize|:
-            
- - **Returns:** {{undefined}} - - If any of the following conditions are unsatisfied, generate a validation error and stop. -
- - |this|.{{GPUCommandEncoder/[[state]]}} is {{encoder state/open}}. - - [$validating GPUImageCopyTexture$](|source|, |copySize|) returns `true`. - - |source|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[textureUsage]]}} contains - {{GPUTextureUsage/COPY_SRC}}. - - |source|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[sampleCount]]}} is 1. - - If |source|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}} is a depth-stencil format: - - |source|.{{GPUImageCopyTexture/aspect}} must refer to a single copyable aspect of - |source|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}}. - See [[#depth-formats|depth-formats]]. - - [$validating GPUImageCopyBuffer$](|destination|) returns `true`. - - |destination|.{{GPUImageCopyBuffer/buffer}}.{{GPUBuffer/[[usage]]}} contains - {{GPUBufferUsage/COPY_DST}}. - - [=Valid Texture Copy Range=] applies to |source| and |copySize|. - - If |source|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}} is not a depth/stencil format: - - |destination|.{{GPUImageDataLayout/offset}} is a multiple of the [=texel block size=] of - |source|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}}. - - If |source|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}} is a depth/stencil format: - - |destination|.{{GPUImageDataLayout/offset}} is a multiple of 4. - - [$validating linear texture data$](|destination|, - |destination|.{{GPUImageCopyBuffer/buffer}}.{{GPUBuffer/[[size]]}}, - |source|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}}, - |copySize|) succeeds. -
-
- - : copyTextureToTexture(source, destination, copySize) - :: - Encode a command into the {{GPUCommandEncoder}} that copies data from a sub-region of one - or multiple contiguous [=texture subresources=] to another sub-region of one or - multiple contiguous [=texture subresources=]. - -
- **Called on:** {{GPUCommandEncoder}} |this|. - - **Arguments:** -
-                |source|: Combined with |copySize|, defines the region of the source [=texture subresources=].
-                |destination|: Combined with |copySize|, defines the region of the destination [=texture subresources=].
-                |copySize|:
-            
- - **Returns:** {{undefined}} - - 1. If any of the following conditions are unsatisfied, generate a validation error and stop. -
- - |this|.{{GPUCommandEncoder/[[state]]}} is {{encoder state/open}}. - - [$validating GPUImageCopyTexture$](|source|, |copySize|) returns `true`. - - |source|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[textureUsage]]}} contains - {{GPUTextureUsage/COPY_SRC}}. - - [$validating GPUImageCopyTexture$](|destination|, |copySize|) returns `true`. - - |destination|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[textureUsage]]}} contains - {{GPUTextureUsage/COPY_DST}}. - - |source|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[sampleCount]]}} is equal to - |destination|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[sampleCount]]}}. - - |source|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}} is equal to - |destination|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}}. - - If |source|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}} is a depth-stencil format: - - |source|.{{GPUImageCopyTexture/aspect}} and |destination|.{{GPUImageCopyTexture/aspect}} - must both refer to all aspects of |source|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}} - and |destination|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}}, respectively. - - [=Valid Texture Copy Range=] applies to |source| and |copySize|. - - [=Valid Texture Copy Range=] applies to |destination| and |copySize|. - - The [$set of subresources for texture copy$](|source|, |copySize|) and - the [$set of subresources for texture copy$](|destination|, |copySize|) are disjoint. -
-
-
- -
- The set of subresources for texture copy(|imageCopyTexture|, |copySize|) - is the set containing: - - - If |imageCopyTexture|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[dimension]]}} - is {{GPUTextureDimension/"2d"}}: - - For each |arrayLayer| of the |copySize|.[=Extent3D/depthOrArrayLayers=] [=array layers=] - starting at |imageCopyTexture|.{{GPUImageCopyTexture/origin}}.[=Origin3D/z=]: - - The [=subresource=] of |imageCopyTexture|.{{GPUImageCopyTexture/texture}} at - [=mipmap level=] |imageCopyTexture|.{{GPUImageCopyTexture/mipLevel}} and - [=array layer=] |arrayLayer|. - - Otherwise: - - The [=subresource=] of |imageCopyTexture|.{{GPUImageCopyTexture/texture}} at - [=mipmap level=] |imageCopyTexture|.{{GPUImageCopyTexture/mipLevel}}. -
- -## Debug Markers ## {#command-encoder-debug-markers} - -Both command encoders and programmable pass encoders provide methods to apply debug labels to groups -of commands or insert a single label into the command sequence. Debug groups can be nested to create -a hierarchy of labeled commands. These labels may be passed to the native API backends for tooling, -may be used by the user agent's internal tooling, or may be a no-op when such tooling is not -available or applicable. - -Debug groups in a {{GPUCommandEncoder}} or {{GPUProgrammablePassEncoder}} -must be well nested. - -
- : pushDebugGroup(groupLabel) - :: - Marks the beginning of a labeled group of commands for the {{GPUCommandEncoder}}. - -
- **Called on:** {{GPUCommandEncoder}} |this|. - - **Arguments:** -
-                |groupLabel|: The label for the command group.
-            
- - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|: -
- - If any of the following conditions are unsatisfied, make |this| [=invalid=] and stop. -
- - |this|.{{GPUCommandEncoder/[[state]]}} is {{encoder state/open}}. -
- - [=stack/Push=] |groupLabel| onto |this|.{{GPUCommandEncoder/[[debug_group_stack]]}}. -
-
- - : popDebugGroup() - :: - Marks the end of a labeled group of commands for the {{GPUCommandEncoder}}. - -
- **Called on:** {{GPUCommandEncoder}} |this|. - - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|: -
- - If any of the following conditions are unsatisfied, make |this| [=invalid=] and stop. -
- - |this|.{{GPUCommandEncoder/[[state]]}} is {{encoder state/open}}. - - |this|.{{GPUCommandEncoder/[[debug_group_stack]]}}'s [=stack/size=] is greater than 0. -
- - [=stack/Pop=] an entry off |this|.{{GPUCommandEncoder/[[debug_group_stack]]}}. -
-
- - : insertDebugMarker(markerLabel) - :: - Marks a point in a stream of commands with a label string. - -
- **Called on:** {{GPUCommandEncoder}} |this|. - - **Arguments:** -
-                markerLabel: The label to insert.
-            
- - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|: -
- - If any of the following conditions are unsatisfied, make |this| [=invalid=] and stop. -
- - |this|.{{GPUCommandEncoder/[[state]]}} is {{encoder state/open}}. -
-
-
-
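The "well nested" requirement for debug groups can be modeled with a plain stack. This is a minimal sketch, assuming a hypothetical `DebugGroupTracker` class: it mirrors only the `[[debug_group_stack]]` handling described above (popping an empty stack invalidates the encoder, and finishing requires an empty stack) and leaves out the encoder-state checks.

```typescript
// Hypothetical model of [[debug_group_stack]] on a command encoder.
class DebugGroupTracker {
  private stack: string[] = [];
  private valid = true;

  // pushDebugGroup: record the label of the newly opened group.
  pushDebugGroup(groupLabel: string): void {
    this.stack.push(groupLabel);
  }

  // popDebugGroup: popping with no open group makes the encoder invalid.
  popDebugGroup(): void {
    if (this.stack.length === 0) {
      this.valid = false;
      return;
    }
    this.stack.pop();
  }

  // finish() requires a valid encoder with every group closed.
  canFinish(): boolean {
    return this.valid && this.stack.length === 0;
  }
}
```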
- -## Queries ## {#command-encoder-queries} - -
- : writeTimestamp(querySet, queryIndex) - :: - Writes a timestamp value into |querySet| when all previous commands have completed executing. - -
- **Called on:** {{GPUCommandEncoder}} |this|. - - **Arguments:** -
-                |querySet|: The query set that will store the timestamp values.
-                |queryIndex|: The index of the query in the query set.
-            
- - **Returns:** {{undefined}} - - 1. If |this|.{{GPUDevice/[[device]]}}.{{device/[[features]]}} does not [=list/contain=] - {{GPUFeatureName/"timestamp-query"}}, throw a {{TypeError}}. - 1. If any of the following conditions are unsatisfied, generate a validation error and stop. -
- - |this|.{{GPUCommandEncoder/[[state]]}} is {{encoder state/open}}. - - |querySet| is [$valid to use with$] |this|. - - |querySet|.{{GPUQuerySet/[[descriptor]]}}.{{GPUQuerySetDescriptor/type}} is {{GPUQueryType/"timestamp"}}. - - |queryIndex| < |querySet|.{{GPUQuerySet/[[descriptor]]}}.{{GPUQuerySetDescriptor/count}}. -
- - Issue: Describe {{GPUCommandEncoder/writeTimestamp()}} algorithm steps. -
- - : resolveQuerySet(querySet, firstQuery, queryCount, destination, destinationOffset) - :: - -
- **Called on:** {{GPUCommandEncoder}} |this|. - - **Arguments:** -
-                querySet:
-                firstQuery:
-                queryCount:
-                destination:
-                destinationOffset:
-            
- - **Returns:** {{undefined}} - - If any of the following conditions are unsatisfied, generate a {{GPUValidationError}} and stop. -
- - |this|.{{GPUCommandEncoder/[[state]]}} is {{encoder state/open}}. - - |querySet| is [$valid to use with$] |this|. - - |destination| is [$valid to use with$] |this|. - - |destination|.{{GPUBuffer/[[usage]]}} contains {{GPUBufferUsage/QUERY_RESOLVE}}. - - |firstQuery| is less than the number of queries in |querySet|. - - (|firstQuery| + |queryCount|) is less than or equal to the number of queries in |querySet|. - - |destinationOffset| is a multiple of 8. - - |destinationOffset| + 8 × |queryCount| ≤ |destination|.{{GPUBuffer/[[size]]}}. -
- - Issue: Describe {{GPUCommandEncoder/resolveQuerySet()}} algorithm steps. -
-
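The range conditions for `resolveQuerySet` can be checked with simple arithmetic, since each resolved query occupies 8 bytes in the destination buffer. The sketch below uses a hypothetical `ResolveArgs` shape and function name, reduces the `QUERY_RESOLVE` usage check to a boolean supplied by the caller, and omits the encoder-state and "valid to use with" checks.

```typescript
// Hypothetical bundle of the values the numeric checks depend on.
interface ResolveArgs {
  querySetCount: number;             // number of queries in the query set
  firstQuery: number;
  queryCount: number;
  destinationSize: number;           // [[size]] of the destination buffer, in bytes
  destinationHasQueryResolve: boolean; // whether [[usage]] contains QUERY_RESOLVE
  destinationOffset: number;
}

function validateResolveQuerySet(args: ResolveArgs): boolean {
  return (
    args.destinationHasQueryResolve &&
    args.firstQuery < args.querySetCount &&
    args.firstQuery + args.queryCount <= args.querySetCount &&
    // The destination offset must be 8-byte aligned.
    args.destinationOffset % 8 === 0 &&
    // Each resolved query writes 8 bytes.
    args.destinationOffset + 8 * args.queryCount <= args.destinationSize
  );
}
```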
- -## Finalization ## {#command-encoder-finalization} - -A {{GPUCommandBuffer}} containing the commands recorded by the {{GPUCommandEncoder}} can be created -by calling {{GPUCommandEncoder/finish()}}. Once {{GPUCommandEncoder/finish()}} has been called the -command encoder can no longer be used. - -
- : finish(descriptor) - :: - Completes recording of the command sequence and returns a corresponding {{GPUCommandBuffer}}. - -
- **Called on:** {{GPUCommandEncoder}} |this|. - - **Arguments:** -
-                descriptor:
-            
- - **Returns:** {{GPUCommandBuffer}} - - 1. Let |commandBuffer| be a new {{GPUCommandBuffer}}. - 1. Issue the following steps on the [=Device timeline=] of |this|: -
- 1. If any of the following conditions are unsatisfied, generate a validation - error and stop. -
- - |this| is [=valid=]. - - |this|.{{GPUCommandEncoder/[[debug_group_stack]]}}'s [=stack/size=] is 0. - - |this|.{{GPUCommandEncoder/[[state]]}} is {{encoder state/open}}. - - Every [=usage scope=] contained in |this| satisfies the [=usage scope validation=]. -
- - 1. Set |this|.{{GPUCommandEncoder/[[state]]}} to {{encoder state/closed}}. - 1. Let |commandBuffer|.{{GPUCommandBuffer/[[command_list]]}} be a [=list/clone=] - of |this|.{{GPUCommandEncoder/[[command_list]]}}. -
- - 1. Return |commandBuffer|. -
-
- -# Programmable Passes # {#programmable-passes} - - - -{{GPUProgrammablePassEncoder}} has the following internal slots: - -
- : \[[command_encoder]] of type {{GPUCommandEncoder}}. - :: - The {{GPUCommandEncoder}} that created this programmable pass. - - : \[[debug_group_stack]] of type [=stack=]<{{USVString}}>. - :: - A stack of active debug group labels. - - : \[[bind_groups]], of type [=ordered map=]<{{GPUIndex32}}, {{GPUBindGroup}}> - :: - The current {{GPUBindGroup}} for each index, initially empty. -
- -## Bind Groups ## {#programmable-passes-bind-groups} - -
- : setBindGroup(index, bindGroup, dynamicOffsets) - :: - Sets the current {{GPUBindGroup}} for the given index. - -
- **Called on:** {{GPUProgrammablePassEncoder}} |this|. - - **Arguments:** -
-                |index|: The index to set the bind group at.
-                |bindGroup|: Bind group to use for subsequent render or compute commands.
-
-                
-                
-            
- - Issue: Resolve bikeshed conflict when using `argumentdef` with overloaded functions that prevents us from - defining |dynamicOffsets|. - - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|.{{GPUObjectBase/[[device]]}}: -
- 1. If any of the following conditions are unsatisfied, make |this| [=invalid=] and stop. -
- - |bindGroup| is [$valid to use with$] |this|. - - |index| < |this|.{{GPUObjectBase/[[device]]}}.{{device/[[limits]]}}.{{supported limits/maxBindGroups}}. - - |dynamicOffsets|.length is - |bindGroup|.{{GPUBindGroup/[[layout]]}}.{{GPUBindGroupLayout/[[dynamicOffsetCount]]}}. - - - [$Iterate over each dynamic binding offset$] in |bindGroup| and - run the following steps for each |bufferBinding|, |minBindingSize|, - and |dynamicOffsetIndex|: - - - Let |bufferDynamicOffset| be |dynamicOffsets|[|dynamicOffsetIndex|]. - - |bufferBinding|.{{GPUBufferBinding/offset}} + |bufferDynamicOffset| + - |minBindingSize| ≤ - |bufferBinding|.{{GPUBufferBinding/buffer}}.{{GPUBuffer/[[size]]}}. -
- 1. Set |this|.{{GPUProgrammablePassEncoder/[[bind_groups]]}}[|index|] to be |bindGroup|. -
-
- - : setBindGroup(index, bindGroup, dynamicOffsetsData, dynamicOffsetsDataStart, dynamicOffsetsDataLength) - :: - Sets the current {{GPUBindGroup}} for the given index, specifying dynamic offsets as a subset - of a {{Uint32Array}}. - -
- **Called on:** {{GPUProgrammablePassEncoder}} |this|. - - **Arguments:** -
-                |index|: The index to set the bind group at.
-                |bindGroup|: Bind group to use for subsequent render or compute commands.
-                |dynamicOffsetsData|: Array containing buffer offsets in bytes for each entry in
-                    |bindGroup| marked as {{GPUBindGroupLayoutEntry/buffer}}.{{GPUBufferBindingLayout/hasDynamicOffset}}.
-                |dynamicOffsetsDataStart|: Offset in elements into |dynamicOffsetsData| where the
-                    buffer offset data begins.
-                |dynamicOffsetsDataLength|: Number of buffer offsets to read from |dynamicOffsetsData|.
-            
- - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|.{{GPUObjectBase/[[device]]}}: -
- 1. If any of the following conditions are unsatisfied, make |this| [=invalid=] and stop. -
- - |bindGroup| is [$valid to use with$] |this|. - - |index| < |this|.{{GPUObjectBase/[[device]]}}.{{device/[[limits]]}}.{{supported limits/maxBindGroups}}. - - |dynamicOffsetsDataLength| is - |bindGroup|.{{GPUBindGroup/[[layout]]}}.{{GPUBindGroupLayout/[[dynamicOffsetCount]]}}. - - |dynamicOffsetsDataStart| + |dynamicOffsetsDataLength| ≤ |dynamicOffsetsData|.length. - - - [$Iterate over each dynamic binding offset$] in |bindGroup| and - run the following steps for each |bufferBinding|, |minBindingSize|, - and |dynamicOffsetIndex|: - - - Let |bufferDynamicOffset| be - |dynamicOffsetsData|[|dynamicOffsetIndex| + |dynamicOffsetsDataStart|]. - - |bufferBinding|.{{GPUBufferBinding/offset}} + |bufferDynamicOffset| + - |minBindingSize| ≤ - |bufferBinding|.{{GPUBufferBinding/buffer}}.{{GPUBuffer/[[size]]}}. -
- 1. Set |this|.{{GPUProgrammablePassEncoder/[[bind_groups]]}}[|index|] to be |bindGroup|. -
-
-
- -
- To Iterate over each dynamic binding offset in a given {{GPUBindGroup}} |bindGroup| - with a given list of |steps| to be executed for each dynamic offset: - - 1. Let |dynamicOffsetIndex| be `0`. - 1. Let |layout| be |bindGroup|.{{GPUBindGroup/[[layout]]}}. - 1. For each {{GPUBindGroupEntry}} |entry| in |bindGroup|.{{GPUBindGroup/[[entries]]}}: - 1. Let |bindingDescriptor| be the {{GPUBindGroupLayoutEntry}} at - |layout|.{{GPUBindGroupLayout/[[entryMap]]}}[|entry|.{{GPUBindGroupEntry/binding}}]. - 1. If |bindingDescriptor|.{{GPUBindGroupLayoutEntry/buffer}} is not `undefined` and - |bindingDescriptor|.{{GPUBindGroupLayoutEntry/buffer}}.{{GPUBufferBindingLayout/hasDynamicOffset}} is `true`: - 1. Let |bufferBinding| be |entry|.{{GPUBindGroupEntry/resource}}. - 1. Let |minBindingSize| be |bindingDescriptor|.{{GPUBindGroupLayoutEntry/buffer}}.{{GPUBufferBindingLayout/minBindingSize}}. - 1. Call |steps| with |bufferBinding|, |minBindingSize|, and |dynamicOffsetIndex|. - 1. Set |dynamicOffsetIndex| to |dynamicOffsetIndex| + `1`. -
- -
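The iteration above, combined with the bounds check from `setBindGroup`, can be sketched as follows. All names here are hypothetical simplifications: `BufferLayout` stands in for the `buffer` member of a `GPUBindGroupLayoutEntry`, `BoundBuffer` flattens a `GPUBufferBinding` together with its buffer's size, and the entry map is a plain object keyed by binding number.

```typescript
// Simplified stand-in for GPUBindGroupLayoutEntry.buffer.
interface BufferLayout {
  hasDynamicOffset: boolean;
  minBindingSize: number;
}
// Simplified stand-in for a GPUBufferBinding plus the size of its buffer.
interface BoundBuffer {
  bufferSize: number;
  offset: number;
}
interface BindGroupEntry {
  binding: number;
  resource: BoundBuffer;
}

function validateDynamicOffsets(
  entries: BindGroupEntry[],
  entryMap: { [binding: number]: BufferLayout | undefined },
  dynamicOffsets: number[]
): boolean {
  // The number of offsets must equal the number of dynamic buffer bindings.
  let dynamicOffsetCount = 0;
  for (const entry of entries) {
    const layout = entryMap[entry.binding];
    if (layout !== undefined && layout.hasDynamicOffset) dynamicOffsetCount += 1;
  }
  if (dynamicOffsets.length !== dynamicOffsetCount) return false;

  // Walk the entries in order, consuming one offset per dynamic binding.
  let dynamicOffsetIndex = 0;
  for (const entry of entries) {
    const layout = entryMap[entry.binding];
    if (layout === undefined || !layout.hasDynamicOffset) continue;
    const bufferDynamicOffset = dynamicOffsets[dynamicOffsetIndex];
    const end = entry.resource.offset + bufferDynamicOffset + layout.minBindingSize;
    if (end > entry.resource.bufferSize) return false;
    dynamicOffsetIndex += 1;
  }
  return true;
}
```

The offsets are matched to bindings purely by iteration order, so the caller has to supply them in the same order in which the dynamic bindings appear in the bind group's entries.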
- Validate encoder bind groups(encoder, pipeline) - - **Arguments:** - : {{GPUProgrammablePassEncoder}} |encoder| - :: Encoder whose bind groups are being validated. - : {{GPUPipelineBase}} |pipeline| - :: Pipeline that |encoder|'s bind groups must be compatible with. - - If any of the following conditions are unsatisfied, return `false`: -
- - |pipeline| must not be `null`. - - For each pair of ({{GPUIndex32}} |index|, {{GPUBindGroupLayout}} |bindGroupLayout|) in - |pipeline|.{{GPUPipelineBase/[[layout]]}}.{{GPUPipelineLayout/[[bindGroupLayouts]]}}: - - Let |bindGroup| be |encoder|.{{GPUProgrammablePassEncoder/[[bind_groups]]}}[|index|]. - - |bindGroup| must not be `null`. - - |bindGroup|.{{GPUBindGroup/[[layout]]}} must be [=group-equivalent=] with |bindGroupLayout|. - - Issue: Check buffer bindings against `minBindingSize` if present. -
- - Otherwise return `true`. -
- -## Debug Markers ## {#programmable-passes-debug-markers} - -Debug marker methods for programmable pass encoders provide the same functionality as -[[#command-encoder-debug-markers|command encoder debug markers]] while recording a programmable -pass. - -
- : pushDebugGroup(groupLabel) - :: - Marks the beginning of a labeled group of commands for the {{GPUProgrammablePassEncoder}}. - -
- **Called on:** {{GPUProgrammablePassEncoder}} |this|. - - **Arguments:** -
-                |groupLabel|: The label for the command group.
-            
- - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|: -
- 1. [=stack/Push=] |groupLabel| onto |this|.{{GPUProgrammablePassEncoder/[[debug_group_stack]]}}. -
-
- - : popDebugGroup() - :: - Marks the end of a labeled group of commands for the {{GPUProgrammablePassEncoder}}. - -
- **Called on:** {{GPUProgrammablePassEncoder}} |this|. - - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|: -
- 1. If any of the following conditions are unsatisfied, generate a validation - error and stop. -
- - |this|.{{GPUProgrammablePassEncoder/[[debug_group_stack]]}}'s [=stack/size=] is greater than 0. -
- 1. [=stack/Pop=] an entry off of |this|.{{GPUProgrammablePassEncoder/[[debug_group_stack]]}}. -
-
- - : insertDebugMarker(markerLabel) - :: - Inserts a single debug marker label into the {{GPUProgrammablePassEncoder}}'s command sequence. - -
- **Called on:** {{GPUProgrammablePassEncoder}} this. - - **Arguments:** -
-                markerLabel: The label to insert.
-            
- - **Returns:** {{undefined}} -
-
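The push and pop behaviour of the debug group stack described above can be sketched as follows. `DebugGroupTracker` is a hypothetical class written for this example. Recording a message in `validationErrors` stands in for "generate a validation error and stop", which the real API handles through its own error mechanism.

```javascript
// Sketch only: models [[debug_group_stack]] with a plain array.
class DebugGroupTracker {
  constructor() {
    this.debugGroupStack = [];
    this.validationErrors = [];
  }

  pushDebugGroup(groupLabel) {
    this.debugGroupStack.push(groupLabel);
  }

  popDebugGroup() {
    // Popping is only valid while at least one group is open.
    if (this.debugGroupStack.length === 0) {
      this.validationErrors.push("popDebugGroup() called with no open debug group");
      return;
    }
    this.debugGroupStack.pop();
  }
}

const tracker = new DebugGroupTracker();
tracker.pushDebugGroup("shadow pass");
tracker.popDebugGroup();
tracker.popDebugGroup(); // Unbalanced: records a validation error.
```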
- -# Compute Passes # {#compute-passes} - -## GPUComputePassEncoder ## {#compute-pass-encoder} - - - -{{GPUComputePassEncoder}} has the following internal slots: - -
- : \[[pipeline]], of type {{GPUComputePipeline}} - :: - The current {{GPUComputePipeline}}, initially `null`. -
- -### Creation ### {#compute-pass-encoder-creation} - - - -### Dispatch ### {#compute-pass-encoder-dispatch} - -
- : setPipeline(pipeline) - :: - Sets the current {{GPUComputePipeline}}. - -
- **Called on:** {{GPUComputePassEncoder}} this. - - **Arguments:** -
-                |pipeline|: The compute pipeline to use for subsequent dispatch commands.
-            
- - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|.{{GPUObjectBase/[[device]]}}: -
- 1. If any of the following conditions are unsatisfied, make |this| [=invalid=] and stop. -
- - |pipeline| is [$valid to use with$] |this|. -
- 1. Set |this|.{{GPUComputePassEncoder/[[pipeline]]}} to be |pipeline|. -
-
- - : dispatch(x, y, z) - :: - Dispatch work to be performed with the current {{GPUComputePipeline}}. - See [[#computing-operations]] for the detailed specification. - -
- **Called on:** {{GPUComputePassEncoder}} this. - - **Arguments:** -
-                |x|: X dimension of the grid of workgroups to dispatch.
-                |y|: Y dimension of the grid of workgroups to dispatch.
-                |z|: Z dimension of the grid of workgroups to dispatch.
-            
- - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|.{{GPUObjectBase/[[device]]}}: -
- 1. If any of the following conditions are unsatisfied, make |this| [=invalid=] and stop. -
- - [$Validate encoder bind groups$](|this|, |this|.{{GPUComputePassEncoder/[[pipeline]]}}) - is `true`. -
- - 1. [=list/Append=] a [=GPU command=] to - |this|.{{GPUProgrammablePassEncoder/[[command_encoder]]}}.{{GPUCommandEncoder/[[command_list]]}} - that captures the {{GPUComputePassEncoder}} state of |this| as |passState| and, - when executed, issues the following steps on the appropriate [=Queue timeline=]: -
- 1. Dispatch a grid of workgroups with dimensions [|x|, |y|, |z|] with - |passState|.{{GPUComputePassEncoder/[[pipeline]]}} using - |passState|.{{GPUProgrammablePassEncoder/[[bind_groups]]}}. -
-
-
- - : dispatchIndirect(indirectBuffer, indirectOffset) - :: - Dispatch work to be performed with the current {{GPUComputePipeline}} using parameters read - from a {{GPUBuffer}}. - See [[#computing-operations]] for the detailed specification. - - The indirect dispatch parameters encoded in the buffer must be a tightly - packed block of **three 32-bit unsigned integer values (12 bytes total)**, given in the same - order as the arguments for {{GPUComputePassEncoder/dispatch()}}. For example: - -
-            let dispatchIndirectParameters = new Uint32Array(3);
-            dispatchIndirectParameters[0] = x;
-            dispatchIndirectParameters[1] = y;
-            dispatchIndirectParameters[2] = z;
-        
- -
- **Called on:** {{GPUComputePassEncoder}} this. - - **Arguments:** -
-                |indirectBuffer|: Buffer containing the [=indirect dispatch parameters=].
-                |indirectOffset|: Offset in bytes into |indirectBuffer| where the dispatch data begins.
-            
- - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|.{{GPUObjectBase/[[device]]}}: -
- 1. If any of the following conditions are unsatisfied, make |this| [=invalid=] and stop. -
- - [$Validate encoder bind groups$](|this|, |this|.{{GPUComputePassEncoder/[[pipeline]]}}) - is `true`. - - |indirectBuffer| is [$valid to use with$] |this|. - - |indirectBuffer|.{{GPUBuffer/[[usage]]}} contains {{GPUBufferUsage/INDIRECT}}. - - |indirectOffset| + sizeof([=indirect dispatch parameters=]) ≤ - |indirectBuffer|.{{GPUBuffer/[[size]]}}. - - |indirectOffset| is a multiple of 4. -
- 1. Add |indirectBuffer| to the [=usage scope=] as {{GPUBufferUsage/INDIRECT}}. -
-
-
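The size and alignment conditions for `dispatchIndirect()` above can be checked with simple arithmetic. The sketch below covers only those two conditions (not the usage or bind group checks). The helper names and the use of a plain `ArrayBuffer` as staging memory are assumptions made for this example.

```javascript
// Three tightly packed 32-bit unsigned integers: 12 bytes.
const INDIRECT_DISPATCH_PARAMETERS_SIZE = 3 * Uint32Array.BYTES_PER_ELEMENT;

// Returns true when the offset is 4-byte aligned and the parameters fit in the buffer.
function indirectDispatchRangeIsValid(bufferSize, indirectOffset) {
  const aligned = indirectOffset % 4 === 0;
  const inBounds = indirectOffset + INDIRECT_DISPATCH_PARAMETERS_SIZE <= bufferSize;
  return aligned && inBounds;
}

// Writes x, y, z at a byte offset inside CPU-side staging memory.
function writeDispatchParameters(arrayBuffer, indirectOffset, x, y, z) {
  new Uint32Array(arrayBuffer, indirectOffset, 3).set([x, y, z]);
}

const staging = new ArrayBuffer(32);
writeDispatchParameters(staging, 16, 8, 4, 1);
```

With a 32-byte buffer, an offset of 16 is acceptable, while 24 leaves only 8 bytes and 18 is not a multiple of 4.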
- -### Queries ### {#compute-pass-encoder-queries} - -
- : beginPipelineStatisticsQuery(querySet, queryIndex) - :: - -
- **Called on:** {{GPUComputePassEncoder}} |this|. - - **Arguments:** -
-                querySet:
-                queryIndex:
-            
- - **Returns:** {{undefined}} - - 1. If |this|.{{GPUDevice/[[device]]}}.{{device/[[features]]}} does not [=list/contain=] - {{GPUFeatureName/"pipeline-statistics-query"}}, throw a {{TypeError}}. - - Issue: Describe {{GPUComputePassEncoder/beginPipelineStatisticsQuery()}} algorithm steps. -
- - : endPipelineStatisticsQuery() - :: - -
- **Called on:** {{GPUComputePassEncoder}} |this|. - - **Returns:** {{undefined}} - - 1. If |this|.{{GPUDevice/[[device]]}}.{{device/[[features]]}} does not [=list/contain=] - {{GPUFeatureName/"pipeline-statistics-query"}}, throw a {{TypeError}}. - - Issue: Describe {{GPUComputePassEncoder/endPipelineStatisticsQuery()}} algorithm steps. -
- - : writeTimestamp(querySet, queryIndex) - :: - Writes a timestamp value into |querySet| when all previous commands have completed executing. - -
- **Called on:** {{GPUComputePassEncoder}} |this|. - - **Arguments:** -
-                |querySet|: The query set that will store the timestamp values.
-                |queryIndex|: The index of the query in the query set.
-            
- - **Returns:** {{undefined}} - - 1. If |this|.{{GPUDevice/[[device]]}}.{{device/[[features]]}} does not [=list/contain=] - {{GPUFeatureName/"timestamp-query"}}, throw a {{TypeError}}. - 1. If any of the following conditions are unsatisfied, generate a validation error and stop. -
- - |querySet| is [$valid to use with$] |this|. - - |querySet|.{{GPUQuerySet/[[descriptor]]}}.{{GPUQuerySetDescriptor/type}} is {{GPUQueryType/"timestamp"}}. - - |queryIndex| < |querySet|.{{GPUQuerySet/[[descriptor]]}}.{{GPUQuerySetDescriptor/count}}. -
- - Issue: Describe {{GPUComputePassEncoder/writeTimestamp()}} algorithm steps. -
-
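The `writeTimestamp()` checks listed above separate a synchronous `TypeError` (feature not enabled) from validation failures. A minimal sketch of that split follows, assuming a hypothetical `deviceFeatures` array and a plain-object `querySet` with `type` and `count` fields. The "valid to use with" condition is omitted.

```javascript
// Sketch only: returns true when the remaining validation conditions hold.
function checkWriteTimestamp(deviceFeatures, querySet, queryIndex) {
  // A missing feature is reported immediately as an exception.
  if (!deviceFeatures.includes("timestamp-query")) {
    throw new TypeError("the timestamp-query feature is not enabled");
  }
  // The remaining conditions would generate a validation error instead.
  return querySet.type === "timestamp" && queryIndex < querySet.count;
}

const timestampSet = { type: "timestamp", count: 2 };
const inRange = checkWriteTimestamp(["timestamp-query"], timestampSet, 1);
const outOfRange = checkWriteTimestamp(["timestamp-query"], timestampSet, 2);
```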
- -### Finalization ### {#compute-pass-encoder-finalization} - -The compute pass encoder can be ended by calling {{GPUComputePassEncoder/endPass()}} once the user -has finished recording commands for the pass. Once {{GPUComputePassEncoder/endPass()}} has been -called the compute pass encoder can no longer be used. - -
- : endPass() - :: - Completes recording of the compute pass commands sequence. - -
- **Called on:** {{GPUComputePassEncoder}} |this|. - - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|: -
- 1. If any of the following conditions are unsatisfied, generate a validation - error and stop. -
- - |this|.{{GPUProgrammablePassEncoder/[[debug_group_stack]]}}'s [=stack/size=] is 0. - - Issue: Add remaining validation. -
-
-
-
- -# Render Passes # {#render-passes} - -## GPURenderPassEncoder ## {#render-pass-encoder} - - - - * In indirect draw calls, the base instance field (inside the indirect - buffer data) must be set to zero. - -{{GPURenderEncoderBase}} has the following internal slots: - -
- : \[[pipeline]], of type {{GPURenderPipeline}} - :: - The current {{GPURenderPipeline}}, initially `null`. - - : \[[index_buffer]], of type {{GPUBuffer}} - :: - The current buffer to read index data from, initially `null`. - - : \[[index_format]], of type {{GPUIndexFormat}} - :: - The format of the index data in {{GPURenderEncoderBase/[[index_buffer]]}}. - - : \[[vertex_buffers]], of type [=ordered map=]<slot, {{GPUBuffer}}> - :: - The current {{GPUBuffer}}s to read vertex data from for each slot, initially empty. -
- -{{GPURenderPassEncoder}} has the following internal slots: - -
- : \[[attachment_size]] - :: - Set to the following extents: - - `width, height` = the dimensions of the pass's render attachments - - : \[[occlusion_query_set]], of type {{GPUQuerySet}}. - :: - The {{GPUQuerySet}} to store occlusion query results for the pass, which is initialized with - {{GPURenderPassDescriptor}}.{{GPURenderPassDescriptor/occlusionQuerySet}} at pass creation time. - - : \[[occlusion_query_active]], of type {{boolean}}. - :: - Whether the pass's {{GPURenderPassEncoder/[[occlusion_query_set]]}} is being written. - - : \[[viewport]] - :: Current viewport rectangle and depth range. -
- -When a {{GPURenderPassEncoder}} is created, it has the following default state: - * {{GPURenderPassEncoder/[[viewport]]}}: - * `x, y` = `0.0, 0.0` - * `width, height` = the dimensions of the pass's render targets - * `minDepth, maxDepth` = `0.0, 1.0` - * Scissor rectangle: - * `x, y` = `0, 0` - * `width, height` = the dimensions of the pass's render targets - -### Creation ### {#render-pass-encoder-creation} - - - -
- : colorAttachments - :: - The set of {{GPURenderPassColorAttachment}} values in this sequence defines which - color attachments will be output to when executing this render pass. - - : depthStencilAttachment - :: - The {{GPURenderPassDepthStencilAttachment}} value that defines the depth/stencil - attachment that will be output to and tested against when executing this render pass. - - : occlusionQuerySet - :: - The {{GPUQuerySet}} value defines where the occlusion query results will be stored for this pass. -
- -
- Valid Usage - - Given a {{GPURenderPassDescriptor}} |this| the following validation rules apply: - - 1. |this|.{{GPURenderPassDescriptor/colorAttachments}}.length must be less than or equal to the - [=maximum color attachments=]. - 1. |this|.{{GPURenderPassDescriptor/colorAttachments}}.length must be greater than `0` or - |this|.{{GPURenderPassDescriptor/depthStencilAttachment}} must not be `null`. - 1. For each |colorAttachment| in |this|.{{GPURenderPassDescriptor/colorAttachments}}: - - 1. |colorAttachment| must meet the [$GPURenderPassColorAttachment/GPURenderPassColorAttachment Valid Usage$] rules. - - 1. If |this|.{{GPURenderPassDescriptor/depthStencilAttachment}} is not `null`: - - 1. |this|.{{GPURenderPassDescriptor/depthStencilAttachment}} must meet the [$GPURenderPassDepthStencilAttachment/GPURenderPassDepthStencilAttachment Valid Usage$] rules. - - 1. Each {{GPURenderPassColorAttachment/view}} in |this|.{{GPURenderPassDescriptor/colorAttachments}} - and |this|.{{GPURenderPassDescriptor/depthStencilAttachment}}.{{GPURenderPassDepthStencilAttachment/view}}, - if present, must all have the same {{GPUTexture/[[sampleCount]]}}. - - 1. For each {{GPURenderPassColorAttachment/view}} in |this|.{{GPURenderPassDescriptor/colorAttachments}} - and |this|.{{GPURenderPassDescriptor/depthStencilAttachment}}.{{GPURenderPassDepthStencilAttachment/view}}, - if present, the {{GPUTextureView/[[renderExtent]]}} must match. - - 1. If |this|.{{GPURenderPassDescriptor/occlusionQuerySet}} is not `null`: - - 1. |this|.{{GPURenderPassDescriptor/occlusionQuerySet}}.{{GPUQuerySet/[[descriptor]]}}.{{GPUQuerySetDescriptor/type}} - must be {{GPUQueryType/occlusion}}. - - Issue: Define maximum color attachments - - Issue(gpuweb/gpuweb#503): support for no attachments -
- -
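The descriptor-level rules above can be sketched as a single predicate. This is an illustration under several assumptions: views are plain objects carrying hypothetical `sampleCount` and `renderExtent` fields, `maxColorAttachments` is passed in because the limit is not yet defined in the text, and the per-attachment Valid Usage rules are not checked.

```javascript
// Sketch only: checks the descriptor-wide rules, not the per-attachment ones.
function validateRenderPassDescriptor(descriptor, maxColorAttachments) {
  const colorAttachments = descriptor.colorAttachments;
  const depthStencil = descriptor.depthStencilAttachment ?? null;
  const occlusionQuerySet = descriptor.occlusionQuerySet ?? null;

  if (colorAttachments.length > maxColorAttachments) return false;
  // At least one attachment of some kind is required.
  if (colorAttachments.length === 0 && depthStencil === null) return false;

  // Collect every attachment view, then compare each against the first.
  const views = colorAttachments.map((attachment) => attachment.view);
  if (depthStencil !== null) views.push(depthStencil.view);
  const first = views[0];
  for (const view of views) {
    if (view.sampleCount !== first.sampleCount) return false;
    if (view.renderExtent.width !== first.renderExtent.width) return false;
    if (view.renderExtent.height !== first.renderExtent.height) return false;
  }

  if (occlusionQuerySet !== null && occlusionQuerySet.type !== "occlusion") return false;
  return true;
}

// Hypothetical views for illustration.
const colorView = { sampleCount: 4, renderExtent: { width: 640, height: 480 } };
const depthView = { sampleCount: 4, renderExtent: { width: 640, height: 480 } };
const singleSampledView = { sampleCount: 1, renderExtent: { width: 640, height: 480 } };

const matching = validateRenderPassDescriptor(
  { colorAttachments: [{ view: colorView }], depthStencilAttachment: { view: depthView } }, 4);
const mismatchedSamples = validateRenderPassDescriptor(
  { colorAttachments: [{ view: colorView }], depthStencilAttachment: { view: singleSampledView } }, 4);
const noAttachments = validateRenderPassDescriptor({ colorAttachments: [] }, 4);
```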
- For a given {{GPURenderPassDescriptor}} value |descriptor|, the syntax: - - - |descriptor|.renderExtent refers to - {{GPUTextureView/[[renderExtent]]}} of any {{GPUTextureView/[[descriptor]]}} - in either |descriptor|.{{GPURenderPassDescriptor/depthStencilAttachment}}.{{GPURenderPassDepthStencilAttachment/view}}, - or any of the {{GPURenderPassColorAttachment/view}} in |descriptor|.{{GPURenderPassDescriptor/colorAttachments}}. - - Issue: make it a define once we reference this from other places - - Note: the [$GPURenderPassDescriptor/Valid Usage$] guarantees that all of the render extents - of the attachments are the same, so we can take any of them, assuming the descriptor is valid. -
- -#### Color Attachments #### {#color-attachments} - - - -
- : view - :: - A {{GPUTextureView}} describing the texture [=subresource=] that will be output to for this - color attachment. - - : resolveTarget - :: - A {{GPUTextureView}} describing the texture [=subresource=] that will receive the resolved - output for this color attachment if {{GPURenderPassColorAttachment/view}} is - multisampled. - - : loadValue - :: - If a {{GPULoadOp}}, indicates the load operation to perform on - {{GPURenderPassColorAttachment/view}} prior to executing the render pass. - If a {{GPUColor}}, indicates the value to clear {{GPURenderPassColorAttachment/view}} - to prior to executing the render pass. - - Note: It is recommended to prefer a clear-value; see {{GPULoadOp/"load"}}. - - : storeOp - :: - The store operation to perform on {{GPURenderPassColorAttachment/view}} - after executing the render pass. -
- -
- GPURenderPassColorAttachment Valid Usage - - Given a {{GPURenderPassColorAttachment}} |this| the following validation rules - apply: - - 1. |this|.{{GPURenderPassColorAttachment/view}} must have a renderable color format. - 1. |this|.{{GPURenderPassColorAttachment/view}}.{{GPUTextureView/[[texture]]}}.{{GPUTexture/[[textureUsage]]}} - must contain {{GPUTextureUsage/RENDER_ATTACHMENT}}. - 1. |this|.{{GPURenderPassColorAttachment/view}} must be a view of a single [=subresource=]. - 1. If |this|.{{GPURenderPassColorAttachment/resolveTarget}} is not `null`: - - 1. |this|.{{GPURenderPassColorAttachment/view}} must be multisampled. - 1. |this|.{{GPURenderPassColorAttachment/resolveTarget}} must not be multisampled. - 1. |this|.{{GPURenderPassColorAttachment/resolveTarget}}.{{GPUTextureView/[[texture]]}}.{{GPUTexture/[[textureUsage]]}} - must contain {{GPUTextureUsage/RENDER_ATTACHMENT}}. - 1. |this|.{{GPURenderPassColorAttachment/resolveTarget}} must be a view of a single [=subresource=]. - - 1. The dimensions of the [=subresource=]s seen by |this|.{{GPURenderPassColorAttachment/resolveTarget}} - and |this|.{{GPURenderPassColorAttachment/view}} must match. - 1. |this|.{{GPURenderPassColorAttachment/resolveTarget}}.{{GPUTextureView/[[texture]]}}.{{GPUTexture/[[format]]}} - must match |this|.{{GPURenderPassColorAttachment/view}}.{{GPUTextureView/[[texture]]}}.{{GPUTexture/[[format]]}}. - 1. Issue: Describe any remaining resolveTarget validation - - Issue: Describe the remaining validation rules for this type. -
- -#### Depth/Stencil Attachments #### {#depth-stencil-attachments} - - - -
- : view - :: - A {{GPUTextureView}} describing the texture [=subresource=] that will be output to - and read from for this depth/stencil attachment. - - : depthLoadValue - :: - If a {{GPULoadOp}}, indicates the load operation to perform on - {{GPURenderPassDepthStencilAttachment/view}}'s depth component prior to - executing the render pass. - If a `float`, indicates the value to clear {{GPURenderPassDepthStencilAttachment/view}}'s - depth component to prior to executing the render pass. - - Note: It is recommended to prefer a clear-value; see {{GPULoadOp/"load"}}. - - : depthStoreOp - :: - The store operation to perform on {{GPURenderPassDepthStencilAttachment/view}}'s - depth component after executing the render pass. - - Note: It is recommended to prefer a clear-value; see {{GPULoadOp/"load"}}. - - : depthReadOnly - :: - Indicates that the depth component of {{GPURenderPassDepthStencilAttachment/view}} - is read only. - - : stencilLoadValue - :: - If a {{GPULoadOp}}, indicates the load operation to perform on - {{GPURenderPassDepthStencilAttachment/view}}'s stencil component prior to - executing the render pass. - If a {{GPUStencilValue}}, indicates the value to clear - {{GPURenderPassDepthStencilAttachment/view}}'s stencil component to prior to - executing the render pass. - - : stencilStoreOp - :: - The store operation to perform on {{GPURenderPassDepthStencilAttachment/view}}'s - stencil component after executing the render pass. - - : stencilReadOnly - :: - Indicates that the stencil component of {{GPURenderPassDepthStencilAttachment/view}} - is read only. -
- -
- GPURenderPassDepthStencilAttachment Valid Usage - - Given a {{GPURenderPassDepthStencilAttachment}} |this| the following validation - rules apply: - - 1. |this|.{{GPURenderPassDepthStencilAttachment/view}} must have a renderable - depth-and/or-stencil format. - 1. |this|.{{GPURenderPassDepthStencilAttachment/view}} must be a view of a - single [=texture subresource=]. - 1. |this|.{{GPURenderPassDepthStencilAttachment/view}}.{{GPUTexture/[[textureUsage]]}} - must contain {{GPUTextureUsage/RENDER_ATTACHMENT}}. - 1. If |this|.{{GPURenderPassDepthStencilAttachment/depthReadOnly}} is `true`, - |this|.{{GPURenderPassDepthStencilAttachment/depthLoadValue}} must be - {{GPULoadOp/"load"}} and |this|.{{GPURenderPassDepthStencilAttachment/depthStoreOp}} - must be {{GPUStoreOp/"store"}}. - 1. If |this|.{{GPURenderPassDepthStencilAttachment/stencilReadOnly}} is `true`, - |this|.{{GPURenderPassDepthStencilAttachment/stencilLoadValue}} must be - {{GPULoadOp/"load"}} and |this|.{{GPURenderPassDepthStencilAttachment/stencilStoreOp}} - must be {{GPUStoreOp/"store"}}. - - Issue: Describe the remaining validation rules for this type. -
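The two read-only rules above amount to an implication: a read-only aspect must be loaded and stored unchanged. A minimal sketch follows, assuming the attachment is a plain object with the dictionary member names used in the text. The format, subresource, and usage rules are not covered.

```javascript
// Sketch only: checks the depthReadOnly / stencilReadOnly rules and nothing else.
function readOnlyRulesHold(attachment) {
  const depthOk = !attachment.depthReadOnly ||
    (attachment.depthLoadValue === "load" && attachment.depthStoreOp === "store");
  const stencilOk = !attachment.stencilReadOnly ||
    (attachment.stencilLoadValue === "load" && attachment.stencilStoreOp === "store");
  return depthOk && stencilOk;
}

// Read-only depth that is loaded and stored, with a writable stencil that is cleared.
const readOnlyDepth = {
  depthReadOnly: true, depthLoadValue: "load", depthStoreOp: "store",
  stencilReadOnly: false, stencilLoadValue: 0, stencilStoreOp: "store",
};
// Invalid: a read-only depth aspect cannot be cleared to 1.0.
const clearedReadOnlyDepth = { ...readOnlyDepth, depthLoadValue: 1.0 };

const acceptsReadOnlyDepth = readOnlyRulesHold(readOnlyDepth);
const rejectsClearedDepth = !readOnlyRulesHold(clearedReadOnlyDepth);
```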
- -#### Load & Store Operations #### {#load-and-store-ops} - - - -
- : "load" - :: - Loads the existing value for this attachment into the render pass. - - Note: - On some GPU hardware (primarily mobile), providing a clear-value is significantly cheaper - because it avoids loading data from main memory into tile-local memory. - On other GPU hardware, there isn't a significant difference. As a result, it is - recommended to use a clear-value, rather than {{GPULoadOp/"load"}}, in cases where the - initial value doesn't matter (e.g. the render target will be cleared using a skybox). -
- - - -### Drawing ### {#render-pass-encoder-drawing} - -
- : setPipeline(pipeline) - :: - Sets the current {{GPURenderPipeline}}. - -
- **Called on:** {{GPURenderEncoderBase}} this. - - **Arguments:** -
-                |pipeline|: The render pipeline to use for subsequent drawing commands.
-            
- - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|.{{GPUObjectBase/[[device]]}}: -
- 1. If any of the following conditions are unsatisfied, make |this| [=invalid=] and stop. -
- - |pipeline| is [$valid to use with$] |this|. - - Issue: Validate that |pipeline| is compatible with the render pass descriptor. -
- 1. Set |this|.{{GPURenderEncoderBase/[[pipeline]]}} to be |pipeline|. -
-
- - : setIndexBuffer(buffer, indexFormat, offset, size) - :: - Sets the current index buffer. - -
- **Called on:** {{GPURenderEncoderBase}} this. - - **Arguments:** -
-                |buffer|: Buffer containing index data to use for subsequent drawing commands.
-                |indexFormat|: Format of the index data contained in |buffer|.
-                |offset|: Offset in bytes into |buffer| where the index data begins.
-                |size|: Size in bytes of the index data in |buffer|.
-                    If `0`, |buffer|.{{GPUBuffer/[[size]]}} - |offset| is used.
-            
- - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|.{{GPUObjectBase/[[device]]}}: -
- 1. If any of the following conditions are unsatisfied, make |this| [=invalid=] and stop. -
- - |buffer| is [$valid to use with$] |this|. - - |buffer|.{{GPUBuffer/[[usage]]}} contains {{GPUBufferUsage/INDEX}}. - - |offset| + |size| ≤ |buffer|.{{GPUBuffer/[[size]]}}. -
- 1. Add |buffer| to the [=usage scope=] as [=internal usage/input=]. - 1. Set |this|.{{GPURenderEncoderBase/[[index_buffer]]}} to be |buffer|. - 1. Set |this|.{{GPURenderEncoderBase/[[index_format]]}} to be |indexFormat|. -
-
- - : setVertexBuffer(slot, buffer, offset, size) - :: - Sets the current vertex buffer for the given slot. - -
- **Called on:** {{GPURenderEncoderBase}} this. - - **Arguments:** -
-                |slot|: The vertex buffer slot to set the vertex buffer for.
-                |buffer|: Buffer containing vertex data to use for subsequent drawing commands.
-                |offset|: Offset in bytes into |buffer| where the vertex data begins.
-                |size|: Size in bytes of the vertex data in |buffer|.
-                    If `0`, |buffer|.{{GPUBuffer/[[size]]}} - |offset| is used.
-            
- - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|.{{GPUObjectBase/[[device]]}}: -
- 1. If any of the following conditions are unsatisfied, make |this| [=invalid=] and stop. -
- - |buffer| is [$valid to use with$] |this|. - - |buffer|.{{GPUBuffer/[[usage]]}} contains {{GPUBufferUsage/VERTEX}}. - - |slot| < |this|.{{GPUObjectBase/[[device]]}}.{{device/[[limits]]}}.{{supported limits/maxVertexBuffers}}. - - |offset| + |size| ≤ |buffer|.{{GPUBuffer/[[size]]}}. -
- 1. Add |buffer| to the [=usage scope=] as [=internal usage/input=]. - 1. Set |this|.{{GPURenderEncoderBase/[[vertex_buffers]]}}[|slot|] to be |buffer|. -
-
- - : draw(vertexCount, instanceCount, firstVertex, firstInstance) - :: - Draws primitives. - See [[#rendering-operations]] for the detailed specification. - -
- **Called on:** {{GPURenderEncoderBase}} this. - - **Arguments:** -
-                vertexCount: The number of vertices to draw.
-                instanceCount: The number of instances to draw.
-                firstVertex: Offset into the vertex buffers, in vertices, to begin drawing from.
-                firstInstance: First instance to draw.
-            
- - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|.{{GPUObjectBase/[[device]]}}: -
- If any of the following conditions are unsatisfied, make |this| [=invalid=] and stop. -
- - It is [$valid to draw$] with |this|. -
-
-
- - : drawIndexed(indexCount, instanceCount, firstIndex, baseVertex, firstInstance) - :: - Draws indexed primitives. - See [[#rendering-operations]] for the detailed specification. - -
- **Called on:** {{GPURenderEncoderBase}} this. - - **Arguments:** -
-                indexCount: The number of indices to draw.
-                instanceCount: The number of instances to draw.
-                firstIndex: Offset into the index buffer, in indices, to begin drawing from.
-                baseVertex: Added to each index value before indexing into the vertex buffers.
-                firstInstance: First instance to draw.
-            
- - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|.{{GPUObjectBase/[[device]]}}: -
- If any of the following conditions are unsatisfied, make |this| [=invalid=] and stop. -
- - It is [$valid to draw indexed$] with |this|. -
-
-
- - : drawIndirect(indirectBuffer, indirectOffset) - :: - Draws primitives using parameters read from a {{GPUBuffer}}. - See [[#rendering-operations]] for the detailed specification. - - The indirect draw parameters encoded in the buffer must be a tightly - packed block of **four 32-bit unsigned integer values (16 bytes total)**, given in the same - order as the arguments for {{GPURenderEncoderBase/draw()}}. For example: - -
-            let drawIndirectParameters = new Uint32Array(4);
-            drawIndirectParameters[0] = vertexCount;
-            drawIndirectParameters[1] = instanceCount;
-            drawIndirectParameters[2] = firstVertex;
-            drawIndirectParameters[3] = firstInstance;
-        
- -
- **Called on:** {{GPURenderEncoderBase}} this. - - **Arguments:** -
-                |indirectBuffer|: Buffer containing the [=indirect draw parameters=].
-                |indirectOffset|: Offset in bytes into |indirectBuffer| where the drawing data begins.
-            
- - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|.{{GPUObjectBase/[[device]]}}: -
- 1. If any of the following conditions are unsatisfied, make |this| [=invalid=] and stop. -
- - It is [$valid to draw$] with |this|. - - |indirectBuffer| is [$valid to use with$] |this|. - - |indirectBuffer|.{{GPUBuffer/[[usage]]}} contains {{GPUBufferUsage/INDIRECT}}. - - |indirectOffset| + sizeof([=indirect draw parameters=]) ≤ - |indirectBuffer|.{{GPUBuffer/[[size]]}}. - - |indirectOffset| is a multiple of 4. -
- 1. Add |indirectBuffer| to the [=usage scope=] as [=internal usage/input=]. -
-
- - : drawIndexedIndirect(indirectBuffer, indirectOffset) - :: - Draws indexed primitives using parameters read from a {{GPUBuffer}}. - See [[#rendering-operations]] for the detailed specification. - - The indirect drawIndexed parameters encoded in the buffer must be a - tightly packed block of **five 32-bit unsigned integer values (20 bytes total)**, given in - the same order as the arguments for {{GPURenderEncoderBase/drawIndexed()}}. For example: - -
-            let drawIndexedIndirectParameters = new Uint32Array(5);
-            drawIndexedIndirectParameters[0] = indexCount;
-            drawIndexedIndirectParameters[1] = instanceCount;
-            drawIndexedIndirectParameters[2] = firstIndex;
-            drawIndexedIndirectParameters[3] = baseVertex;
-            drawIndexedIndirectParameters[4] = firstInstance;
-        
- -
- **Called on:** {{GPURenderEncoderBase}} this. - - **Arguments:** -
-                |indirectBuffer|: Buffer containing the [=indirect drawIndexed parameters=].
-                |indirectOffset|: Offset in bytes into |indirectBuffer| where the drawing data begins.
-            
- - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|.{{GPUObjectBase/[[device]]}}: -
- 1. If any of the following conditions are unsatisfied, make |this| [=invalid=] and stop. -
- - It is [$valid to draw indexed$] with |this|. - - |indirectBuffer| is [$valid to use with$] |this|. - - |indirectBuffer|.{{GPUBuffer/[[usage]]}} contains {{GPUBufferUsage/INDIRECT}}. - - |indirectOffset| + sizeof([=indirect drawIndexed parameters=]) ≤ - |indirectBuffer|.{{GPUBuffer/[[size]]}}. - - |indirectOffset| is a multiple of 4. -
- 1. Add |indirectBuffer| to the [=usage scope=] as [=internal usage/input=]. -
-
-
- -
- To determine if it's valid to draw with {{GPURenderEncoderBase}} |encoder| - run the following steps: - - If any of the following conditions are unsatisfied, return `false`: -
- - [$Validate encoder bind groups$](|encoder|, |encoder|.{{GPURenderEncoderBase/[[pipeline]]}}) - must be `true`. - - Let |pipelineDescriptor| be |encoder|.{{GPURenderEncoderBase/[[pipeline]]}}.{{GPURenderPipeline/[[descriptor]]}}. - - For each {{GPUIndex32}} |slot| from `0` to - |pipelineDescriptor|.{{GPURenderPipelineDescriptor/vertex}}.{{GPUVertexState/buffers}}.length: - - |encoder|.{{GPURenderEncoderBase/[[vertex_buffers]]}}[|slot|] must not be `null`. -
- - Otherwise return `true`. -
- -
- To determine if it's valid to draw indexed with {{GPURenderEncoderBase}} |encoder| - run the following steps: - - If any of the following conditions are unsatisfied, return `false`: -
- - It must be [$valid to draw$] with |encoder|. - - - |encoder|.{{GPURenderEncoderBase/[[index_buffer]]}} must not be `null`. - - Let |stripIndexFormat| be |encoder|.{{GPURenderEncoderBase/[[pipeline]]}}.{{GPURenderPipeline/[[strip_index_format]]}}. - - If |stripIndexFormat| is not `undefined`: - - |encoder|.{{GPURenderEncoderBase/[[index_format]]}} must be |stripIndexFormat|. -
- - Otherwise return `true`. -
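The two draw-validity algorithms above compose: an indexed draw must first satisfy the non-indexed conditions. The sketch below models the encoder state with plain objects. The field names and the `validateBindGroups` callback are assumptions for this example. The slot range is treated as `0` up to, but not including, the number of vertex buffer layouts, which is one reading of the text.

```javascript
// Sketch only: encoder.vertexBuffers is a Map from slot to buffer.
function validToDraw(encoder, validateBindGroups) {
  // Bind group validation is expected to fail when no pipeline is set.
  if (!validateBindGroups(encoder, encoder.pipeline)) return false;
  const bufferLayouts = encoder.pipeline.descriptor.vertex.buffers;
  for (let slot = 0; slot < bufferLayouts.length; slot++) {
    if ((encoder.vertexBuffers.get(slot) ?? null) === null) return false;
  }
  return true;
}

function validToDrawIndexed(encoder, validateBindGroups) {
  if (!validToDraw(encoder, validateBindGroups)) return false;
  if (encoder.indexBuffer === null) return false;
  const stripIndexFormat = encoder.pipeline.stripIndexFormat;
  // Strip topologies pin the index format; other topologies leave it undefined.
  if (stripIndexFormat !== undefined && encoder.indexFormat !== stripIndexFormat) return false;
  return true;
}

// Hypothetical stand-in for bind group validation: only checks that a pipeline is set.
const pipelineIsSet = (encoder, pipeline) => pipeline !== null;

const stripPipeline = {
  descriptor: { vertex: { buffers: [{}, {}] } },
  stripIndexFormat: "uint16",
};
const readyEncoder = {
  pipeline: stripPipeline,
  vertexBuffers: new Map([[0, "positions"], [1, "normals"]]),
  indexBuffer: "indices",
  indexFormat: "uint16",
};

const canDraw = validToDraw(readyEncoder, pipelineIsSet);
const canDrawIndexed = validToDrawIndexed(readyEncoder, pipelineIsSet);
```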
- -### Rasterization state ### {#render-pass-encoder-rasterization-state} - -The {{GPURenderPassEncoder}} has several methods which affect how draw commands are rasterized to -attachments used by this encoder. - -
- : setViewport(x, y, width, height, minDepth, maxDepth) - :: - Sets the viewport used during the rasterization stage to linearly map from normalized device - coordinates to viewport coordinates. - -
- **Called on:** {{GPURenderPassEncoder}} |this|. - - **Arguments:** -
-                |x|: Minimum X value of the viewport in pixels.
-                |y|: Minimum Y value of the viewport in pixels.
-                |width|: Width of the viewport in pixels.
-                |height|: Height of the viewport in pixels.
-                |minDepth|: Minimum depth value of the viewport.
-                |maxDepth|: Maximum depth value of the viewport.
-            
- - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|: -
- 1. If any of the following conditions are unsatisfied, generate a validation - error and stop. -
- - |x| is greater than or equal to `0`. - - |y| is greater than or equal to `0`. - - |width| is greater than or equal to `0`. - - |height| is greater than or equal to `0`. - - |x| + |width| is less than or equal to - |this|.{{GPURenderPassEncoder/[[attachment_size]]}}.width. - - |y| + |height| is less than or equal to - |this|.{{GPURenderPassEncoder/[[attachment_size]]}}.height. - - |minDepth| is greater than or equal to `0.0` and less than or equal to `1.0`. - - |maxDepth| is greater than or equal to `0.0` and less than or equal to `1.0`. - - |maxDepth| is greater than |minDepth|. -
- 1. Set |this|.{{GPURenderPassEncoder/[[viewport]]}} to the extents |x|, |y|, |width|, |height|, |minDepth|, and |maxDepth|. -
- - Issue: Allow GPUs to use fixed-point or rounded viewport coordinates -
- - : setScissorRect(x, y, width, height) - :: - Sets the scissor rectangle used during the rasterization stage. - After transformation into viewport coordinates any fragments which fall outside the scissor - rectangle will be discarded. - -
- **Called on:** {{GPURenderPassEncoder}} |this|. - - **Arguments:** -
-                |x|: Minimum X value of the scissor rectangle in pixels.
-                |y|: Minimum Y value of the scissor rectangle in pixels.
-                |width|: Width of the scissor rectangle in pixels.
-                |height|: Height of the scissor rectangle in pixels.
-            
- - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|: -
- 1. If any of the following conditions are unsatisfied, generate a validation - error and stop. -
- - |x|+|width| is less than or equal to - |this|.{{GPURenderPassEncoder/[[attachment_size]]}}.width. - - |y|+|height| is less than or equal to - |this|.{{GPURenderPassEncoder/[[attachment_size]]}}.height. -
- 1. Set the scissor rectangle to the extents |x|, |y|, |width|, and |height|. -
-
- - : setBlendConstant(color) - :: - Sets the constant blend color and alpha values used with {{GPUBlendFactor/"constant"}} - and {{GPUBlendFactor/"one-minus-constant"}} {{GPUBlendFactor}}s. - -
- **Called on:** {{GPURenderPassEncoder}} this. - - **Arguments:** -
-                color: The color to use when blending.
-            
-
- - : setStencilReference(reference) - :: - Sets the stencil reference value used during stencil tests with the - {{GPUStencilOperation/"replace"}} {{GPUStencilOperation}}. - -
- **Called on:** {{GPURenderPassEncoder}} this. - - **Arguments:** -
-                reference: The stencil reference value.
-            
-
-
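The validation conditions listed for `setViewport()` and `setScissorRect()` above translate directly into boolean checks. In this sketch `attachmentSize` is a hypothetical plain object standing in for the `[[attachment_size]]` slot, and the function names are chosen for this example only.

```javascript
// Sketch only: mirrors the setViewport() conditions in the order they are listed.
function viewportIsValid(attachmentSize, x, y, width, height, minDepth, maxDepth) {
  return x >= 0 && y >= 0 && width >= 0 && height >= 0 &&
    x + width <= attachmentSize.width &&
    y + height <= attachmentSize.height &&
    minDepth >= 0.0 && minDepth <= 1.0 &&
    maxDepth >= 0.0 && maxDepth <= 1.0 &&
    maxDepth > minDepth;
}

// Sketch only: setScissorRect() checks just that the rectangle fits the attachment.
function scissorRectIsValid(attachmentSize, x, y, width, height) {
  return x + width <= attachmentSize.width &&
    y + height <= attachmentSize.height;
}

const attachmentSize = { width: 800, height: 600 };
const fullViewport = viewportIsValid(attachmentSize, 0, 0, 800, 600, 0.0, 1.0);
const overflowingViewport = viewportIsValid(attachmentSize, 400, 0, 800, 600, 0.0, 1.0);
const flatDepthRange = viewportIsValid(attachmentSize, 0, 0, 800, 600, 0.5, 0.5);
const halfScissor = scissorRectIsValid(attachmentSize, 0, 0, 400, 600);
```

As written in the conditions, a viewport whose `minDepth` equals its `maxDepth` is rejected, because `maxDepth` must be strictly greater.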
- -### Queries ### {#render-pass-encoder-queries} - -
- : beginOcclusionQuery(queryIndex) - :: - -
- **Called on:** {{GPURenderPassEncoder}} |this|. - - **Arguments:** -
-                |queryIndex|: The index of the query in the query set.
-            
- - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|.{{GPUObjectBase/[[device]]}}: -
- 1. If any of the following conditions are unsatisfied, generate a validation error and stop. -
- - |this|.{{GPURenderPassEncoder/[[occlusion_query_set]]}} is not `null`. - - |queryIndex| < |this|.{{GPURenderPassEncoder/[[occlusion_query_set]]}}.{{GPUQuerySet/[[descriptor]]}}.{{GPUQuerySetDescriptor/count}}. - - The query at |queryIndex| has not been previously written to in this pass. - - |this|.{{GPURenderPassEncoder/[[occlusion_query_active]]}} is `false`. -
- - 1. Set |this|.{{GPURenderPassEncoder/[[occlusion_query_active]]}} to `true`. -
-
- - : endOcclusionQuery() - :: - -
- **Called on:** {{GPURenderPassEncoder}} this. - - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|.{{GPUObjectBase/[[device]]}}: -
- 1. If any of the following conditions are unsatisfied, generate a validation error and stop. -
- - |this|.{{GPURenderPassEncoder/[[occlusion_query_active]]}} is `true`. -
- - 1. Set |this|.{{GPURenderPassEncoder/[[occlusion_query_active]]}} to `false`. -
-
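Note: The begin/end pairing rules above behave like a small state machine. The following non-normative sketch models them in plain JavaScript. The class `OcclusionQueryTracker` and its fields are hypothetical. Recording an index as written at the moment the query begins is an assumption of this sketch, since the steps above do not say when the write is recorded.

```javascript
// Hypothetical model of the occlusion query state of one render pass.
// Methods return false where the spec would generate a validation error.
class OcclusionQueryTracker {
  constructor(querySetCount) {
    // null models a pass whose [[occlusion_query_set]] is null.
    this.querySetCount = querySetCount;
    this.active = false;       // models [[occlusion_query_active]]
    this.written = new Set();  // indices already used in this pass
  }

  begin(queryIndex) {
    if (this.querySetCount === null) return false;
    if (!(queryIndex < this.querySetCount)) return false;
    if (this.written.has(queryIndex)) return false;
    if (this.active) return false; // queries cannot be nested
    this.active = true;
    this.written.add(queryIndex);  // assumption: recorded at begin time
    return true;
  }

  end() {
    if (!this.active) return false; // nothing to end
    this.active = false;
    return true;
  }
}
```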
- - : beginPipelineStatisticsQuery(querySet, queryIndex) - :: - -
- **Called on:** {{GPURenderPassEncoder}} |this|. - - **Arguments:** -
-                querySet:
-                queryIndex:
-            
- - **Returns:** {{undefined}} - - 1. If |this|.{{GPUDevice/[[device]]}}.{{device/[[features]]}} does not [=list/contain=] - {{GPUFeatureName/"pipeline-statistics-query"}}, throw a {{TypeError}}. - - Issue: Describe {{GPURenderPassEncoder/beginPipelineStatisticsQuery()}} algorithm steps. -
- - : endPipelineStatisticsQuery() - :: - -
- **Called on:** {{GPURenderPassEncoder}} |this|. - - **Returns:** {{undefined}} - - 1. If |this|.{{GPUDevice/[[device]]}}.{{device/[[features]]}} does not [=list/contain=] - {{GPUFeatureName/"pipeline-statistics-query"}}, throw a {{TypeError}}. - - Issue: Describe {{GPURenderPassEncoder/endPipelineStatisticsQuery()}} algorithm steps. -
- - : writeTimestamp(querySet, queryIndex) - :: - Writes a timestamp value into |querySet| when all previous commands have completed executing. - -
- **Called on:** {{GPURenderPassEncoder}} |this|. - - **Arguments:** -
-                querySet: The query set that will store the timestamp values.
-                queryIndex: The index of the query in the query set.
-            
- - **Returns:** {{undefined}} - - 1. If |this|.{{GPUDevice/[[device]]}}.{{device/[[features]]}} does not [=list/contain=] - {{GPUFeatureName/"timestamp-query"}}, throw a {{TypeError}}. - 1. If any of the following conditions are unsatisfied, generate a validation error and stop. -
- - |querySet| is [$valid to use with$] |this|. - - |querySet|.{{GPUQuerySet/[[descriptor]]}}.{{GPUQuerySetDescriptor/type}} is {{GPUQueryType/"timestamp"}}. - - |queryIndex| < |querySet|.{{GPUQuerySet/[[descriptor]]}}.{{GPUQuerySetDescriptor/count}}. - - The query in |querySet| at index |queryIndex| has not been written earlier in this render pass. -
- - Issue: Describe {{GPURenderPassEncoder/writeTimestamp()}} algorithm steps. -
-
- -### Bundles ### {#render-pass-encoder-bundles} - -
- : executeBundles(bundles) - :: - Executes the commands previously recorded into the given {{GPURenderBundle}}s as part of - this render pass. - - When a {{GPURenderBundle}} is executed, it does not inherit the render pass's pipeline, bind - groups, or vertex and index buffers. After a {{GPURenderBundle}} has executed, the render - pass's pipeline, bind groups, and vertex and index buffers are cleared. - - Note: state is cleared even if zero {{GPURenderBundle|GPURenderBundles}} are executed. - -
- **Called on:** {{GPURenderPassEncoder}} this. - - **Arguments:** -
-                bundles: List of render bundles to execute.
-            
- - **Returns:** {{undefined}} - - Issue: Describe {{GPURenderPassEncoder/executeBundles()}} algorithm steps. -
-
- -### Finalization ### {#render-pass-encoder-finalization} - -The render pass encoder can be ended by calling {{GPURenderPassEncoder/endPass()}} once the user -has finished recording commands for the pass. Once {{GPURenderPassEncoder/endPass()}} has been -called the render pass encoder can no longer be used. - -
- : endPass() - :: - Completes recording of the render pass command sequence. - -
- **Called on:** {{GPURenderPassEncoder}} |this|. - - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|: -
- 1. If any of the following conditions are unsatisfied, generate a validation - error and stop. -
- - |this|.{{GPUProgrammablePassEncoder/[[debug_group_stack]]}}'s [=stack/size=] is 0. - - |this|.{{GPURenderPassEncoder/[[occlusion_query_active]]}} is `false`. - - Issue: Add remaining validation. -
- - Issue: Enqueue the attachment stores (with storeOp clear). -
-
-
- -# Bundles # {#bundles} - -## GPURenderBundle ## {#render-bundle} - - - -### Creation ### {#render-bundle-creation} - - - - - -
- : createRenderBundleEncoder(descriptor) - :: - Creates a {{GPURenderBundleEncoder}}. - -
- **Called on:** {{GPUDevice}} this. - - **Arguments:** -
-                descriptor: Description of the {{GPURenderBundleEncoder}} to create.
-            
- - **Returns:** {{GPURenderBundleEncoder}} - - Issue: Describe {{GPUDevice/createRenderBundleEncoder()}} algorithm steps. -
-
- -### Encoding ### {#render-bundle-encoding} - - - -### Finalization ### {#render-bundle-finalization} - -
- : finish(descriptor) - :: - Completes recording of the render bundle command sequence. - -
- **Called on:** {{GPURenderBundleEncoder}} this. - - **Arguments:** -
-                descriptor:
-            
- - **Returns:** {{GPURenderBundle}} - - Issue: Describe {{GPURenderBundleEncoder/finish()}} algorithm steps. -
-
- -# Queues # {#queues} - - - -{{GPUQueue}} has the following methods: - -
- : writeBuffer(buffer, bufferOffset, data, dataOffset, size) - :: - Issues a write operation of the provided data into a {{GPUBuffer}}. - -
- **Called on:** {{GPUQueue}} |this|. - - **Arguments:** -
-                |buffer|: The buffer to write to.
-                |bufferOffset|: Offset in bytes into |buffer| to begin writing at.
-                |data|: Data to write into |buffer|.
-                |dataOffset|: Offset into |data| to begin writing from. Given in elements if
-                    |data| is a `TypedArray` and bytes otherwise.
-                |size|: Size of content to write from |data| to |buffer|. Given in elements if
-                    |data| is a `TypedArray` and bytes otherwise.
-            
- - **Returns:** {{undefined}} - - 1. If |data| is an {{ArrayBuffer}} or {{DataView}}, let the element type be "byte". - Otherwise, |data| is a TypedArray; let the element type be the type of the TypedArray. - 1. Let |dataSize| be the size of |data|, in elements. - 1. If |size| is unspecified, - let |contentsSize| be |dataSize| − |dataOffset|. - Otherwise, let |contentsSize| be |size|. - 1. If any of the following conditions are unsatisfied, - throw {{OperationError}} and stop. - -
- - |contentsSize| ≥ 0. - - |dataOffset| + |contentsSize| ≤ |dataSize|. - - |contentsSize|, converted to bytes, is a multiple of 4 bytes. -
- 1. Let |dataContents| be [=get a copy of the buffer source|a copy of the bytes held by the buffer source=]. - 1. Let |contents| be the |contentsSize| elements of |dataContents| starting at - an offset of |dataOffset| elements. - 1. Issue the following steps on the [=Queue timeline=] of |this|: -
- 1. If any of the following conditions are unsatisfied, - generate a validation error and stop. -
- - |buffer| is [$valid to use with$] |this|. - - |buffer|.{{GPUBuffer/[[state]]}} is [=buffer state/unmapped=]. - - |buffer|.{{GPUBuffer/[[usage]]}} includes {{GPUBufferUsage/COPY_DST}}. - - |bufferOffset|, converted to bytes, is a multiple of 4 bytes. - - |bufferOffset| + |contentsSize|, converted to bytes, ≤ |buffer|.{{GPUBuffer/[[size]]}} bytes. -
- 1. Write |contents| into |buffer| starting at |bufferOffset|. -
-
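Note: The size computation and the first set of checks in `writeBuffer()` can be sketched as follows. This is a non-normative illustration in plain JavaScript; `resolveWriteBufferContents` is a hypothetical helper, and a plain `Error` named `OperationError` stands in for the exception the algorithm throws.

```javascript
// Hypothetical helper: computes |contentsSize| in elements and in bytes,
// throwing where the algorithm above throws OperationError.
function resolveWriteBufferContents(data, dataOffset = 0, size = undefined) {
  // ArrayBuffer and DataView are measured in bytes, TypedArrays in elements.
  const isTypedArray = ArrayBuffer.isView(data) && !(data instanceof DataView);
  const bytesPerElement = isTypedArray ? data.BYTES_PER_ELEMENT : 1;
  const dataSize = isTypedArray ? data.length : data.byteLength;
  const contentsSize = size === undefined ? dataSize - dataOffset : size;

  const ok =
    contentsSize >= 0 &&
    dataOffset + contentsSize <= dataSize &&
    (contentsSize * bytesPerElement) % 4 === 0;
  if (!ok) {
    const error = new Error("invalid writeBuffer range");
    error.name = "OperationError"; // stand-in for the real exception type
    throw error;
  }
  return { contentsSize, byteLength: contentsSize * bytesPerElement };
}

// Six floats starting at element 2 of an eight-element array.
const floats = resolveWriteBufferContents(new Float32Array(8), 2);
```

Because sizes and offsets are given in elements for a `TypedArray`, `floats.contentsSize` is 6 while `floats.byteLength` is 24.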
- - : writeTexture(destination, data, dataLayout, size) - :: - Issues a write operation of the provided data into a {{GPUTexture}}. - -
- **Called on:** {{GPUQueue}} |this|. - - **Arguments:** -
-                |destination|: The [=texture subresource=] and origin to write to.
-                |data|: Data to write into |destination|.
-                |dataLayout|: Layout of the content in |data|.
-                |size|: Extents of the content to write from |data| to |destination|.
-            
- - **Returns:** {{undefined}} - - 1. Let |dataBytes| be [=get a copy of the buffer source|a copy of the bytes held by the buffer source=] |data|. - 1. Let |dataByteSize| be the number of bytes in |dataBytes|. - 1. If any of the following conditions are unsatisfied, - throw {{OperationError}} and stop. -
- - [$validating linear texture data$](|dataLayout|, - |dataByteSize|, - |destination|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}}, - |size|) succeeds. -
- 1. Let |contents| be the contents of the [=images=] seen by - viewing |dataBytes| with |dataLayout| and |size|. - - Issue: Specify more formally. - 1. Issue the following steps on the [=Queue timeline=] of |this|: -
- 1. If any of the following conditions are unsatisfied, - generate a validation error and stop. -
- - [$validating GPUImageCopyTexture$](|destination|, |size|) returns `true`. - - |destination|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[textureUsage]]}} - includes {{GPUTextureUsage/COPY_DST}}. - - |destination|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[sampleCount]]}} is 1. - - [=Valid Texture Copy Range=](|destination|, |size|) - is satisfied. - - |destination|.{{GPUImageCopyTexture/aspect}} refers to a single copyable aspect - of |destination|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}}. - See [[#depth-formats|depth-formats]]. - - Note: unlike - {{GPUCommandEncoder}}.{{GPUCommandEncoder/copyBufferToTexture()}}, - there is no alignment requirement on - |dataLayout|.{{GPUImageDataLayout/bytesPerRow}}. -
- 1. Write |contents| into |destination|. - - Issue: Specify more formally. -
-
- - : copyImageBitmapToTexture(source, destination, copySize) - :: - Schedules a copy operation of the contents of an image bitmap into the destination texture. - -
- **Called on:** {{GPUQueue}} this. - - **Arguments:** -
-                |source|: {{ImageBitmap}} and origin to copy to |destination|.
-                |destination|: The [=texture subresource=] and origin to write to.
-                |copySize|: Extents of the content to write from |source| to |destination|.
-            
- - **Returns:** {{undefined}} - - If any of the following conditions are unsatisfied, throw an {{OperationError}} and stop. -
- - |copySize|.[=Extent3D/depthOrArrayLayers=] is `1`. - - |destination|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[dimension]]}} is {{GPUTextureDimension/"2d"}}. - - |destination|.{{GPUImageCopyTexture/texture}}.{{GPUTexture/[[format]]}} is one of the following: - - {{GPUTextureFormat/"rgba8unorm"}} - - {{GPUTextureFormat/"rgba8unorm-srgb"}} - - {{GPUTextureFormat/"bgra8unorm"}} - - {{GPUTextureFormat/"bgra8unorm-srgb"}} - - {{GPUTextureFormat/"rgb10a2unorm"}} - - {{GPUTextureFormat/"rgba16float"}} - - {{GPUTextureFormat/"rgba32float"}} - - {{GPUTextureFormat/"rg8unorm"}} - - {{GPUTextureFormat/"rg16float"}} -
-
- - : submit(commandBuffers) - :: - Schedules the execution of the command buffers by the GPU on this queue. - -
- **Called on:** {{GPUQueue}} this. - - **Arguments:** -
-                |commandBuffers|:
-            
- - **Returns:** {{undefined}} - - Issue the following steps on the [=Device timeline=] of |this|: -
- 1. If any of the following conditions are unsatisfied, generate a validation error and stop. -
- - Every {{GPUBuffer}} referenced in any element of |commandBuffers| is in the - `"unmapped"` [=buffer state=]. - - Every {{GPUQuerySet}} referenced in a command in any element of |commandBuffers| is - in the [=query set state/available=] state. For occlusion queries, - {{GPURenderPassDescriptor/occlusionQuerySet}} in {{GPUCommandEncoder/beginRenderPass()}} - does not constitute a reference, while {{GPURenderPassEncoder/beginOcclusionQuery()}} - does. -
- - 1. Issue the following steps on the [=Queue timeline=] of |this|: -
- 1. For each |commandBuffer| in |commandBuffers|: - 1. Execute each command in |commandBuffer|.{{GPUCommandBuffer/[[command_list]]}}. -
-
-
- - : onSubmittedWorkDone() - :: - Returns a {{Promise}} that resolves once this queue finishes processing all the work submitted - up to this moment. - -
- **Called on:** {{GPUQueue}} this. - - **Arguments:** -
-            
- - **Returns:** {{Promise}}<{{undefined}}> - - Issue: Describe {{GPUQueue/onSubmittedWorkDone()}} algorithm steps. -
-
- -Queries {#queries} -================ - -## GPUQuerySet ## {#queryset} - - - -{{GPUQuerySet}} has the following internal slots: - -
- : \[[descriptor]], of type {{GPUQuerySetDescriptor}} - :: - The {{GPUQuerySetDescriptor}} describing this query set. - - All optional fields of {{GPUQuerySetDescriptor}} are defined. - - : \[[state]] of type [=query set state=]. - :: - The current state of the {{GPUQuerySet}}. -
- -Each {{GPUQuerySet}} has a current query set state on the [=Device timeline=] -which is one of the following: - - - "available" where the {{GPUQuerySet}} is - available for GPU operations on its content. - - "destroyed" where the {{GPUQuerySet}} is - no longer available for any operations except {{GPUQuerySet/destroy}}. - -### QuerySet Creation ### {#queryset-creation} - -A {{GPUQuerySetDescriptor}} specifies the options to use in creating a {{GPUQuerySet}}. - - - -
- : type - :: - The type of queries managed by {{GPUQuerySet}}. - - : count - :: - The number of queries managed by {{GPUQuerySet}}. - - : pipelineStatistics - :: - The set of {{GPUPipelineStatisticName}} values in this sequence defines which pipeline statistics will be returned in the new query set. -
- -
- : createQuerySet(descriptor) - :: - Creates a {{GPUQuerySet}}. - -
- **Called on:** {{GPUDevice}} this. - - **Arguments:** -
-                descriptor: Description of the {{GPUQuerySet}} to create.
-            
- - **Returns:** {{GPUQuerySet}} - - 1. If |descriptor|.{{GPUQuerySetDescriptor/type}} is {{GPUQueryType/"pipeline-statistics"}}, - but |this|.{{GPUDevice/[[device]]}}.{{device/[[features]]}} does not [=list/contain=] - {{GPUFeatureName/"pipeline-statistics-query"}}, throw a {{TypeError}}. - 1. If |descriptor|.{{GPUQuerySetDescriptor/type}} is {{GPUQueryType/"timestamp"}}, - but |this|.{{GPUDevice/[[device]]}}.{{device/[[features]]}} does not [=list/contain=] - {{GPUFeatureName/"timestamp-query"}}, throw a {{TypeError}}. - 1. If any of the following requirements are unmet, return an error query set and stop. -
- - |this| must be a [=valid=] {{GPUDevice}}. - - |descriptor|.{{GPUQuerySetDescriptor/count}} must be ≤ 8192. - - If |descriptor|.{{GPUQuerySetDescriptor/type}} is - {{GPUQueryType/"pipeline-statistics"}}: - - - |descriptor|.{{GPUQuerySetDescriptor/pipelineStatistics}} must not - contain duplicate entries. - - Otherwise: - - - |descriptor|.{{GPUQuerySetDescriptor/pipelineStatistics}} must be - [=list/empty=]. -
- 1. Let |q| be a new {{GPUQuerySet}} object. - 1. Set |q|.{{GPUQuerySet/[[descriptor]]}} to |descriptor|. - 1. Set |q|.{{GPUQuerySet/[[state]]}} to [=query set state/available=]. - 1. Return |q|. -
-
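Note: The feature checks and descriptor requirements of `createQuerySet()` can be sketched as follows. This is a non-normative illustration in plain JavaScript; `validateQuerySetDescriptor` and the `deviceFeatures` array are hypothetical, and the requirement that the device itself is valid is omitted.

```javascript
// Hypothetical helper: throws TypeError for a missing feature, returns false
// where the algorithm would return an error query set, and true otherwise.
function validateQuerySetDescriptor(descriptor, deviceFeatures) {
  const { type, count, pipelineStatistics = [] } = descriptor;

  if (type === "pipeline-statistics" &&
      !deviceFeatures.includes("pipeline-statistics-query")) {
    throw new TypeError("pipeline-statistics-query is not enabled");
  }
  if (type === "timestamp" && !deviceFeatures.includes("timestamp-query")) {
    throw new TypeError("timestamp-query is not enabled");
  }

  if (count > 8192) return false;
  if (type === "pipeline-statistics") {
    // Duplicate statistic names are not allowed.
    const unique = new Set(pipelineStatistics);
    if (unique.size !== pipelineStatistics.length) return false;
  } else if (pipelineStatistics.length !== 0) {
    // Other query types must not list any pipeline statistics.
    return false;
  }
  return true;
}
```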
- -### QuerySet Destruction ### {#queryset-destruction} - -An application that no longer requires a {{GPUQuerySet}} can choose to lose access to it before -garbage collection by calling {{GPUQuerySet/destroy()}}. - -
- : destroy() - :: - Destroys the {{GPUQuerySet}}. - -
- **Called on:** {{GPUQuerySet}} |this|. - - **Returns:** {{undefined}} - - 1. Set |this|.{{GPUQuerySet/[[state]]}} to [=query set state/destroyed=]. -
-
- -## QueryType ## {#querytype} - - - -## Occlusion Query ## {#occlusion} - -Occlusion queries are only available in render passes. They query the number of fragment samples that pass -all the per-fragment tests for a set of drawing commands, including scissor, sample mask, alpha to -coverage, stencil, and depth tests. Any non-zero result value for the query indicates that at least -one sample passed the tests and reached the output merging stage of the render pipeline; 0 indicates -that no samples passed the tests. - -When beginning a render pass, {{GPURenderPassDescriptor}}.{{GPURenderPassDescriptor/occlusionQuerySet}} -must be set to be able to use occlusion queries during the pass. An occlusion query is begun -and ended by calling {{GPURenderPassEncoder/beginOcclusionQuery()}} and -{{GPURenderPassEncoder/endOcclusionQuery()}} in pairs that cannot be nested. - -## Pipeline Statistics Query ## {#pipeline-statistics} - - - -When resolving a pipeline statistics query, each result is written as a {{GPUSize64}}, and the number and order of the results written to the GPU buffer match the number and order of the {{GPUPipelineStatisticName}} values specified in {{GPUQuerySetDescriptor/pipelineStatistics}}. - -The {{GPURenderPassEncoder/beginPipelineStatisticsQuery()}} and {{GPURenderPassEncoder/endPipelineStatisticsQuery()}} calls (on both {{GPUComputePassEncoder}} and {{GPURenderPassEncoder}}) cannot be nested. A pipeline statistics query must be ended before beginning another one. - -Pipeline statistics queries require that {{GPUFeatureName/pipeline-statistics-query}} is available on the device. - -## Timestamp Query ## {#timestamp} - -Timestamp queries allow an application to write timestamp values to a {{GPUQuerySet}} by calling {{GPURenderPassEncoder/writeTimestamp()}} on {{GPUComputePassEncoder}}, {{GPURenderPassEncoder}}, or {{GPUCommandEncoder}}, and then resolve the timestamp values in **nanoseconds** (as {{GPUSize64}} values) to a {{GPUBuffer}} (using {{GPUCommandEncoder/resolveQuerySet()}}). 
- -Timestamp queries require that {{GPUFeatureName/timestamp-query}} is available on the device. - -Note: Timestamp values may be zero if the physical device resets its timestamp counter; applications should ignore such a value and the values that follow it. - -Issue: Write normative text about timestamp value resets. - -Issue: Because timestamp queries provide high-resolution GPU timestamps, we need to decide what constraints, if any, are on their availability. - -# Canvas Rendering & Swap Chains # {#canvas-rendering} - -## {{HTMLCanvasElement/getContext()|HTMLCanvasElement.getContext()}} ## {#canvas-getcontext} - -A {{GPUCanvasContext}} object can be obtained via the {{HTMLCanvasElement/getContext()}} -method of an {{HTMLCanvasElement}} instance by -passing the string literal `'gpupresent'` as its `contextType` argument. - -
- Get a {{GPUCanvasContext}} from an offscreen {{HTMLCanvasElement}}: -
-        const canvas = document.createElement('canvas');
-        const context = canvas.getContext('gpupresent');
-        const swapChain = context.configureSwapChain(/* ... */);
-        // ...
-    
-
- -## GPUCanvasContext ## {#canvas-context} - - - -{{GPUCanvasContext}} has the following internal slots: - -
- : \[[canvas]] of type {{HTMLCanvasElement}}. - :: - The canvas this context was created from. -
- -{{GPUCanvasContext}} has the following methods: - -
- : configureSwapChain(descriptor) - :: - Configures the swap chain for this canvas, and returns a new - {{GPUSwapChain}} object representing it. Destroys any swap chain - previously returned by `configureSwapChain`, including all of the - textures it has produced. - -
- **Called on:** {{GPUCanvasContext}} |this|. - - **Arguments:** -
-                |descriptor|: Description of the {{GPUSwapChain}} to configure.
-            
- - **Returns:** {{GPUSwapChain}} - - 1. Issue the following steps on the [=Device timeline=] of |this|: -
- 1. If any of the following conditions are unsatisfied, generate a validation error and stop. -
- - |descriptor|.{{GPUSwapChainDescriptor/device}} is a [=valid=] {{GPUDevice}}. - - [=Supported swap chain formats=] [=set/contains=] |descriptor|.{{GPUSwapChainDescriptor/format}}. -
- - Issue: Describe remaining {{GPUCanvasContext/configureSwapChain()}} algorithm steps. -
-
- - : getSwapChainPreferredFormat(adapter) - :: - Returns an optimal {{GPUTextureFormat}} to use for swap chains with this context and the - given adapter. - -
- **Called on:** {{GPUCanvasContext}} this. - - **Arguments:** -
-                |adapter|: Adapter the swap chain format should be queried for.
-            
- - **Returns:** {{GPUTextureFormat}} - -
- 1. Return an optimal {{GPUTextureFormat}} to use when creating a {{GPUSwapChain}} - with the given |adapter|. Must be one of the [=supported swap chain formats=]. -
-
-
- -## GPUSwapChainDescriptor ## {#swapchain-descriptor} - -The supported swap chain formats are a [=set=] of {{GPUTextureFormat}}s that must be -supported when specified as a {{GPUSwapChainDescriptor}}.{{GPUSwapChainDescriptor/format}} regardless -of the given {{GPUSwapChainDescriptor}}.{{GPUSwapChainDescriptor/device}}, initially set to: -«{{GPUTextureFormat/"bgra8unorm"}}, {{GPUTextureFormat/"bgra8unorm-srgb"}}, -{{GPUTextureFormat/"rgba8unorm"}}, {{GPUTextureFormat/"rgba8unorm-srgb"}}». - - - -## GPUSwapChain ## {#swapchain} - - - -{{GPUSwapChain}} has the following internal slots: - -
- : \[[context]] of type {{GPUCanvasContext}} - :: - The context this swap chain was configured for. - - : \[[descriptor]] of type {{GPUSwapChainDescriptor}} - :: - The descriptor this swap chain was created with. - - : \[[currentTexture]] of type {{GPUTexture}}, nullable - :: - The current texture that will be returned by the swap chain when calling - {{GPUSwapChain/getCurrentTexture()}}, and the next one to be composited to the document. - Initially set to the result of [$allocating a new swap chain texture$] for this swap chain. -
- -{{GPUSwapChain}} has the following methods: - -
- : getCurrentTexture() - :: - Gets the {{GPUTexture}} that will next be composited to the document by the - {{GPUCanvasContext}} that created this swap chain. - -
- **Called on:** {{GPUSwapChain}} |this|. - - **Returns:** {{GPUTexture}} - - 1. If |this|.{{GPUSwapChain/[[currentTexture]]}} is `null`: - 1. Let |this|.{{GPUSwapChain/[[currentTexture]]}} be the result of [$allocating a - new swap chain texture$] for |this|. - 1. Return |this|.{{GPUSwapChain/[[currentTexture]]}}. -
- - Note: Developers can expect that the same {{GPUTexture}} object will be returned by every - call to {{GPUSwapChain/getCurrentTexture()}} made within the same frame (i.e. between - invocations of [=Update the rendering=]). -
- -
- During the "update the rendering [of the] `Document`" step of the "[=Update the rendering=]" - HTML processing model, each {{GPUSwapChain}} |swapChain| must present the - swap chain content to the canvas by running the following steps: - - 1. Let |texture| be |swapChain|.{{GPUSwapChain/[[currentTexture]]}}. - 1. If |texture| is `null`, stop. - 1. Set |swapChain|.{{GPUSwapChain/[[currentTexture]]}} to `null`. - 1. Ensure that all submitted work items (e.g. queue submissions) have - completed writing to |texture|. - 1. Update |swapChain|.{{GPUSwapChain/[[context]]}}.{{GPUCanvasContext/[[canvas]]}} with the - contents of |texture|. - 1. Call {{GPUTexture/destroy()}} on |texture|. - - Issue: The texture should mark its `[[destroyed]]` field as true rather than calling the - {{GPUTexture/destroy()}} method if we separate object invalid and destroyed states. -
- -
- To Allocate a new swap chain texture - for {{GPUSwapChain}} |swapChain| run the following steps: - - 1. Let |canvas| be |swapChain|.{{GPUSwapChain/[[context]]}}.{{GPUCanvasContext/[[canvas]]}}. - 1. Let |device| be |swapChain|.{{GPUSwapChain/[[descriptor]]}}.{{GPUSwapChainDescriptor/device}}. - 1. Let |descriptor| be a new {{GPUTextureDescriptor}}. - 1. Set |descriptor|.{{GPUTextureDescriptor/size}} to [|canvas|.width, |canvas|.height, 1]. - 1. Set |descriptor|.{{GPUTextureDescriptor/format}} to - |swapChain|.{{GPUSwapChain/[[descriptor]]}}.{{GPUSwapChainDescriptor/format}}. - 1. Set |descriptor|.{{GPUTextureDescriptor/usage}} to - |swapChain|.{{GPUSwapChain/[[descriptor]]}}.{{GPUSwapChainDescriptor/usage}}. - 1. Let |texture| be a new {{GPUTexture}} created as if |device|.{{GPUDevice/createTexture()}} - were called with |descriptor|. -
If a previously presented texture from |swapChain| matches the required criteria, - its GPU memory may be re-used.
- 1. Ensure |texture| is cleared to `(0, 0, 0, 0)`. - 1. Return |texture|. -
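Note: The interaction between `getCurrentTexture()`, presentation, and texture allocation described above can be modelled in plain JavaScript as follows. This is a non-normative sketch; `SwapChainModel`, its `allocateTexture` callback, and the `destroyed` flag on the texture objects are all hypothetical stand-ins, and waiting for submitted work and updating the canvas are not modelled.

```javascript
// Hypothetical model of the [[currentTexture]] slot of a swap chain.
class SwapChainModel {
  constructor(allocateTexture) {
    this.allocateTexture = allocateTexture;
    // [[currentTexture]] is initially a freshly allocated texture.
    this.currentTexture = allocateTexture();
  }

  // Lazily allocates after a present, otherwise returns the same object.
  getCurrentTexture() {
    if (this.currentTexture === null) {
      this.currentTexture = this.allocateTexture();
    }
    return this.currentTexture;
  }

  // Models the presentation steps run during "update the rendering".
  present() {
    const texture = this.currentTexture;
    if (texture === null) return null;
    this.currentTexture = null;
    texture.destroyed = true; // stands in for calling destroy()
    return texture;
  }
}

let allocations = 0;
const swapChain = new SwapChainModel(() => ({ id: ++allocations, destroyed: false }));
const first = swapChain.getCurrentTexture();
const sameFrame = swapChain.getCurrentTexture();
const presented = swapChain.present();
const nextFrame = swapChain.getCurrentTexture();
```

Within one frame `first` and `sameFrame` are the same object. After `present()`, that texture is marked destroyed and `nextFrame` is a new allocation.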
- - -## GPUCanvasCompositingAlphaMode ## {#GPUCanvasCompositingAlphaMode} - -This enum selects how the swap chain canvas will paint onto the page. - - - - - - - -
GPUCanvasCompositingAlphaMode - Description - dst.rgb - dst.a -
{{GPUCanvasCompositingAlphaMode/opaque}} - Paint RGB as opaque and ignore alpha values. - If the content is not already opaque, implementations may need to clear alpha to opaque during presentation. - |dst.rgb = src.rgb| - |dst.a = 1| -
{{GPUCanvasCompositingAlphaMode/premultiplied}} - Composite assuming color values are premultiplied by their alpha value. - 100% red 50% opaque is [0.5, 0, 0, 0.5]. - Color values must be less than or equal to their alpha value. - [1.0, 0, 0, 0.5] is "super-luminant" and cannot reliably be displayed. - |dst.rgb = src.rgb + dst.rgb*(1-src.a)| - |dst.a = src.a + dst.a*(1-src.a)| -
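Note: The `dst.rgb` and `dst.a` formulas in the table above can be written out in plain JavaScript as follows. This is a non-normative sketch; `compositePixel` is a hypothetical helper operating on `[r, g, b, a]` arrays of numbers in the range 0 to 1.

```javascript
// Hypothetical helper applying the per-mode formulas to a single pixel.
function compositePixel(src, dst, mode) {
  const [sr, sg, sb, sa] = src;
  const [dr, dg, db, da] = dst;
  if (mode === "opaque") {
    // dst.rgb = src.rgb, dst.a = 1
    return [sr, sg, sb, 1];
  }
  if (mode === "premultiplied") {
    // dst.rgb = src.rgb + dst.rgb*(1-src.a), dst.a = src.a + dst.a*(1-src.a)
    const k = 1 - sa;
    return [sr + dr * k, sg + dg * k, sb + db * k, sa + da * k];
  }
  throw new TypeError("unknown compositing alpha mode: " + mode);
}

// 100% red at 50% opacity, premultiplied, over opaque blue.
const blended = compositePixel([0.5, 0, 0, 0.5], [0, 0, 1, 1], "premultiplied");
```

With these inputs each term is a product or sum of 0, 0.5, and 1, so `blended` is `[0.5, 0, 0.5, 1]`.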
- -# Errors & Debugging # {#errors-and-debugging} - -## Fatal Errors ## {#fatal-errors} - - - - -## Error Scopes ## {#error-scopes} - - - - - - - -
- : pushErrorScope(filter) - :: - Issue: Define pushErrorScope. - - : popErrorScope() - :: - Issue: Define popErrorScope. - - Rejects with {{OperationError}} if: - - - The device is lost. - - There are no error scopes on the stack. -
- -## Telemetry ## {#telemetry} - - - - - -# Detailed Operations # {#detailed-operations} - -This section describes the details of various GPU operations. - -## Transfer ## {#transfer-operations} - -Issue: describe the transfers at a high level - -## Computing ## {#computing-operations} - -Computing operations provide direct access to the GPU's programmable hardware. -Compute shaders do not have pipeline inputs or outputs; their results are -side effects from writing data into storage bindings bound as -{{GPUBufferBindingType/"storage"|GPUBufferBindingType."storage"}} and {{GPUStorageTextureBindingLayout}}. -These operations are encoded within {{GPUComputePassEncoder}} as: - - {{GPUComputePassEncoder/dispatch()}} - - {{GPUComputePassEncoder/dispatchIndirect()}} - -Issue: describe the computing algorithm - -## Rendering ## {#rendering-operations} - -Rendering is done by a set of GPU operations that are executed within {{GPURenderPassEncoder}}, -and result in modifications of the texture data viewed by the render pass attachments. -These operations are encoded with: - - {{GPURenderEncoderBase/draw()}} - - {{GPURenderEncoderBase/drawIndexed()}} - - {{GPURenderEncoderBase/drawIndirect()}} - - {{GPURenderEncoderBase/drawIndexedIndirect()}} - -Note: rendering is the traditional use of GPUs, and is supported by multiple fixed-function -blocks in hardware. - -A RenderState is an internal object representing the state -of the current {{GPURenderPassEncoder}} during command encoding. -[=RenderState=] is a spec namespace for the following definitions: -
- For a given {{GPURenderPassEncoder}} |pass|, the syntax: - - - |pass|.indexBuffer refers to - the index buffer bound via {{GPURenderEncoderBase/setIndexBuffer()}}, if any. - - |pass|.vertexBuffers refers to - [=list=]<vertex buffer> bound by {{GPURenderEncoderBase/setVertexBuffer()}}. - - |pass|.bindGroups refers to - [=list=]<{{GPUBindGroup}}> bound by {{GPUProgrammablePassEncoder/setBindGroup(index, bindGroup, dynamicOffsets)}}. -
- -The main rendering algorithm: - -
- render(descriptor, drawCall, state) - - **Arguments:** - - |descriptor|: Description of the current {{GPURenderPipeline}}. - - |drawCall|: The draw call parameters. - - |state|: [=RenderState=] of the {{GPURenderEncoderBase}} where the draw call is issued. - - 1. **Resolve indices**. See [[#index-resolution]]. - - Let |vertexList| be the result of [$resolve indices$](|drawCall|, |state|). - - 1. **Process vertices**. See [[#vertex-processing]]. - - Execute [$process vertices$](|vertexList|, |drawCall|, |descriptor|.{{GPURenderPipelineDescriptor/vertex}}, |state|). - - 1. **Assemble primitives**. See [[#primitive-assembly]]. - - Execute [$assemble primitives$](|vertexList|, |drawCall|, |descriptor|.{{GPURenderPipelineDescriptor/primitive}}). - - 1. **Clip primitives**. See [[#primitive-clipping]]. - - 1. Rasterize. See [[#rasterization]]. - - 1. Process fragments. - Issue: fill out the section - 1. Process depth/stencil. - Issue: fill out the section - 1. Write pixels. - Issue: fill out the section -
- -### Index Resolution ### {#index-resolution} - -At the first stage of rendering, the pipeline builds -a list of vertices to process for each instance. - -
- resolve indices(drawCall, state) - - **Arguments:** - - |drawCall|: The draw call parameters. - - |state|: The active [=RenderState=]. - - **Returns:** list of integer indices. - - 1. Let |vertexIndexList| be an empty list of indices. - 1. If |drawCall| is an indexed draw call: - 1. initialize the |vertexIndexList| with |drawCall|.indexCount integers. - 1. for |i| in range 0 .. |drawCall|.indexCount (non-inclusive): - 1. let |vertexIndex| be [$fetch index$](|i| + |drawCall|.firstIndex, - |state|.[=RenderState/indexBuffer=].buffer, |state|.[=RenderState/indexBuffer=].offset, - |state|.[=RenderState/indexBuffer=].format) + |drawCall|.baseVertex - 1. assign the |vertexIndexList| item |i| to be |vertexIndex| - 1. Otherwise: - 1. initialize the |vertexIndexList| with |drawCall|.vertexCount integers. - 1. for |i| in range 0 .. |drawCall|.vertexCount (non-inclusive): - 1. assign the |vertexIndexList| item |i| to be |drawCall|.firstVertex + |i| - 1. Return |vertexIndexList|. - - Note: in case of indirect draw calls, the `indexCount`, `vertexCount`, - and other properties of |drawCall| are read from the indirect buffer - instead of the draw command itself. - - Issue: specify indirect commands better. -
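Note: The index resolution steps above can be sketched in plain JavaScript as follows. This is a non-normative illustration; `resolveIndices`, the `indexed` flag on the draw call object, and the `fetchIndex` callback (which stands in for the fetch index algorithm with the index buffer already bound) are hypothetical. The sketch builds the list by appending, which yields the same result as pre-sizing the list and assigning each item.

```javascript
// Hypothetical sketch of "resolve indices" for direct (non-indirect) draws.
function resolveIndices(drawCall, fetchIndex) {
  const vertexIndexList = [];
  if (drawCall.indexed) {
    // Indexed draw: read each index and bias it by baseVertex.
    for (let i = 0; i < drawCall.indexCount; i++) {
      const vertexIndex = fetchIndex(i + drawCall.firstIndex) + drawCall.baseVertex;
      vertexIndexList.push(vertexIndex);
    }
  } else {
    // Non-indexed draw: consecutive indices starting at firstVertex.
    for (let i = 0; i < drawCall.vertexCount; i++) {
      vertexIndexList.push(drawCall.firstVertex + i);
    }
  }
  return vertexIndexList;
}

// Assumed index data, for illustration only.
const indexData = [0, 1, 2, 2, 1, 3];
const indexedList = resolveIndices(
  { indexed: true, indexCount: 3, firstIndex: 2, baseVertex: 10 },
  (i) => indexData[i]
);
const plainList = resolveIndices(
  { indexed: false, vertexCount: 3, firstVertex: 4 },
  null
);
```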
- -
- fetch index(i, buffer, offset, format) - - **Arguments:** - - |i|: Index of a vertex index to fetch. - - |buffer|: {{GPUBuffer}} containing index data. - - |offset|: Base offset into the |buffer|. - - |format|: {{GPUIndexFormat}} of the index. - - Let |stride| be defined by the |format|: -
- : {{GPUIndexFormat/"uint16"}} - :: 2 - : {{GPUIndexFormat/"uint32"}} - :: 4 -
- Interpret the data in |buffer| starting with |offset| + |i| * |stride| - of size |stride| bytes as an unsigned integer and return it. -
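Note: The fetch index algorithm above amounts to a strided unsigned read. The following non-normative sketch shows this in plain JavaScript, using an `ArrayBuffer` as a stand-in for the contents of a {{GPUBuffer}}. The function name `fetchIndex` and the choice of little-endian byte order are assumptions of this sketch.

```javascript
// Hypothetical sketch: reads the i-th index of the given format from `buffer`.
function fetchIndex(i, buffer, offset, format) {
  const view = new DataView(buffer);
  if (format === "uint16") {
    return view.getUint16(offset + i * 2, true); // stride 2, little-endian assumed
  }
  if (format === "uint32") {
    return view.getUint32(offset + i * 4, true); // stride 4, little-endian assumed
  }
  throw new TypeError("unknown index format: " + format);
}

// Build a small buffer: 4 bytes of padding followed by three uint16 indices.
const indexBytes = new ArrayBuffer(12);
const writer = new DataView(indexBytes);
writer.setUint16(4, 7, true);
writer.setUint16(6, 65535, true);
writer.setUint16(8, 9, true);
const secondIndex = fetchIndex(1, indexBytes, 4, "uint16");
```

Here `secondIndex` is read from byte offset 4 + 1 * 2 = 6 and is 65535.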
- -### Vertex Processing ### {#vertex-processing} - -The vertex processing stage is a programmable stage of the render [=pipeline=] that -processes the vertex attribute data, and produces -clip space positions for {#primitive-clipping}, as well as other data for the -{#fragment-processing}. - -
- process vertices(vertexIndexList, drawCall, desc, state) - - **Arguments:** - - |vertexIndexList|: List of vertex indices to process. - - |drawCall|: The draw call parameters. - - |desc|: The descriptor of type {{GPUVertexState}}. - - |state|: The active [=RenderState=]. - - Each vertex |vertexIndex| in the |vertexIndexList|, - in each instance of index |rawInstanceIndex|, is processed independently. - The |rawInstanceIndex| is in range from 0 to |drawCall|.instanceCount - 1, inclusive. - This processing happens in parallel, and any side effects, such as - writes into {{GPUBufferBindingType/"storage"|GPUBufferBindingType."storage"}} bindings, - may happen in any order. - 1. Let |instanceIndex| be |rawInstanceIndex| + |drawCall|.baseInstance. - 1. For each non-`null` |vertexBufferLayout| in the list of |desc|.{{GPUVertexState/buffers}}: - 1. Let |i| be the index of the buffer layout in this list. - 1. Let |vertexBuffer| and |vertexBufferOffset| be the buffer and offset in - |state|.[=RenderState/vertexBuffers=] bindings at slot |i| - 1. Let |vertexElementIndex| be dependent on |vertexBufferLayout|.{{GPUVertexBufferLayout/stepMode}}: -
- : {{GPUInputStepMode/"vertex"}} - :: |vertexIndex| - : {{GPUInputStepMode/"instance"}} - :: |instanceIndex| -
- 1. For each |attributeDesc| in |vertexBufferLayout|.{{GPUVertexBufferLayout/attributes}}: - 1. Let |attributeOffset| be |vertexBufferOffset| + - |vertexElementIndex| * |vertexBufferLayout|.{{GPUVertexBufferLayout/arrayStride}} + - |attributeDesc|.{{GPUVertexAttribute/offset}}. - 1. Load the attribute |data| of format |attributeDesc|.{{GPUVertexAttribute/format}} - from |vertexBuffer| starting at offset |attributeOffset|. - 1. Convert the |data| into a shader-visible format. -
- An attribute of type {{GPUVertexFormat/"unorm8x2"}} will be converted from - 2 bytes of fixed-point unsigned 8-bit integers into 2 floating-point values - as `vec2` in WGSL. -
- 1. Bind the |data| to vertex shader input - location |attributeDesc|.{{GPUVertexAttribute/shaderLocation}}. - 1. For each {{GPUBindGroup}} group at |index| in |state|.[=RenderState/bindGroups=]: - 1. For each resource {{GPUBindingResource}} in the bind group: - 1. Let |entry| be the corresponding {{GPUBindGroupLayoutEntry}} for this resource. - 1. If |entry|.{{GPUBindGroupLayoutEntry}}.visibility includes {{GPUShaderStage/VERTEX}}: - - Bind the resource to the shader under group |index| and binding {{GPUBindGroupLayoutEntry/binding|GPUBindGroupLayoutEntry.binding}}. - 1. Set the shader builtins: - - Set the `VertexIndex` builtin, if any, to |vertexIndex|. - - Set the `InstanceIndex` builtin, if any, to |instanceIndex|. - 1. Invoke vertex shader entry point described by |desc|. - - Note: The target platform caches the results of vertex shader invocations. - There is no guarantee that any |vertexIndex| that repeats more than once will - result in multiple invocations. Similarly, there is no guarantee that a single |vertexIndex| - will only be processed once. -
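The attribute-fetch part of this algorithm can be sketched as follows. The helper names (`elementIndexFor`, `attributeOffsetFor`, `loadUnorm8x2`) are hypothetical and exist only for illustration; the vertex buffer is modelled as a `Uint8Array`, and the division by 255 reflects the usual meaning of an 8-bit unsigned normalized value.

```javascript
// Pick the element index according to the buffer layout's step mode.
function elementIndexFor(stepMode, vertexIndex, instanceIndex) {
  return stepMode === "instance" ? instanceIndex : vertexIndex;
}

// Byte offset of one attribute, following the formula in the steps above.
function attributeOffsetFor(vertexBufferOffset, elementIndex, arrayStride, attributeOffset) {
  return vertexBufferOffset + elementIndex * arrayStride + attributeOffset;
}

// Convert "unorm8x2" data: two unsigned bytes become two floats in [0, 1].
// `bytes` is a Uint8Array standing in for the vertex buffer (assumption).
function loadUnorm8x2(bytes, offset) {
  return [bytes[offset] / 255, bytes[offset + 1] / 255];
}
```

For example, with a buffer offset of 16, an array stride of 8, and an attribute offset of 4, the attribute for element 3 starts at byte 16 + 3 × 8 + 4 = 44.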
- -### Primitive Assembly ### {#primitive-assembly} - -Primitives are assembled by a fixed-function stage of GPUs. - -
- assemble primitives(vertexIndexList, drawCall, desc) - - **Arguments:** - - |vertexIndexList|: List of vertex indices to process. - - |drawCall|: The draw call parameters. - - |desc|: The descriptor of type {{GPUPrimitiveState}}. - - For each instance, the primitives get assembled from the vertices that have been - processed by the shaders, based on the |vertexIndexList|. - - 1. First, if |desc|.{{GPUPrimitiveState/stripIndexFormat}} is not `null` - (which means the primitive topology is a strip), and the |drawCall| is indexed, - the |vertexIndexList| is split into sub-lists - using the maximum value of this index format as a separator. - - Example: a |vertexIndexList| with values `[1, 2, 65535, 4, 5, 6]` of type {{GPUIndexFormat/"uint16"}} - will be split in sub-lists `[1, 2]` and `[4, 5, 6]`. - - 1. For each of the sub-lists |vl|, primitive generation is done according to the - |desc|.{{GPUPrimitiveState/topology}}: -
- : {{GPUPrimitiveTopology/"line-list"}} - :: - Line primitives are composed from (|vl|.0, |vl|.1), - then (|vl|.2, |vl|.3), then (|vl|.4, |vl|.5), etc. - Each subsequent primitive takes 2 vertices. - - : {{GPUPrimitiveTopology/"line-strip"}} - :: - Line primitives are composed from (|vl|.0, |vl|.1), - then (|vl|.1, |vl|.2), then (|vl|.2, |vl|.3), etc. - Each subsequent primitive takes 1 vertex. - - : {{GPUPrimitiveTopology/"triangle-list"}} - :: - Triangle primitives are composed from (|vl|.0, |vl|.1, |vl|.2), - then (|vl|.3, |vl|.4, |vl|.5), then (|vl|.6, |vl|.7, |vl|.8), etc. - Each subsequent primitive takes 3 vertices. - - : {{GPUPrimitiveTopology/"triangle-strip"}} - :: - Triangle primitives are composed from (|vl|.0, |vl|.1, |vl|.2), - then (|vl|.2, |vl|.1, |vl|.3), then (|vl|.2, |vl|.3, |vl|.4), - then (|vl|.4, |vl|.3, |vl|.5), etc. - Each subsequent primitive takes 1 vertex. -
- - Issue: should this be defined more formally? - - Any incomplete primitives are dropped. - -
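A possible reading of the assembly rules, written as a JavaScript sketch. The function names are hypothetical, vertices are represented by their indices only, and only the four topologies listed above are handled.

```javascript
// Split an index list at the strip-restart value, which is the maximum
// value of the index format (0xFFFF for "uint16", 0xFFFFFFFF for "uint32").
function splitAtRestart(vertexIndexList, stripIndexFormat) {
  const restart = stripIndexFormat === "uint16" ? 0xFFFF : 0xFFFFFFFF;
  const subLists = [[]];
  for (const index of vertexIndexList) {
    if (index === restart) {
      subLists.push([]);
    } else {
      subLists[subLists.length - 1].push(index);
    }
  }
  return subLists;
}

// Build primitives from one sub-list; incomplete primitives are dropped
// because each loop stops once too few vertices remain.
function assemblePrimitives(vl, topology) {
  const prims = [];
  switch (topology) {
    case "line-list":
      for (let i = 0; i + 1 < vl.length; i += 2) prims.push([vl[i], vl[i + 1]]);
      break;
    case "line-strip":
      for (let i = 0; i + 1 < vl.length; i += 1) prims.push([vl[i], vl[i + 1]]);
      break;
    case "triangle-list":
      for (let i = 0; i + 2 < vl.length; i += 3) prims.push([vl[i], vl[i + 1], vl[i + 2]]);
      break;
    case "triangle-strip":
      for (let i = 0; i + 2 < vl.length; i += 1) {
        // Odd-numbered triangles swap their first two vertices, matching
        // the (2, 1, 3), (4, 3, 5) pattern in the text.
        prims.push(i % 2 === 0
          ? [vl[i], vl[i + 1], vl[i + 2]]
          : [vl[i + 1], vl[i], vl[i + 2]]);
      }
      break;
    default:
      throw new Error("topology not covered by this sketch: " + topology);
  }
  return prims;
}
```

Applied to the example above, `splitAtRestart([1, 2, 65535, 4, 5, 6], "uint16")` yields the sub-lists `[1, 2]` and `[4, 5, 6]`; each sub-list is then assembled on its own.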
- -### Primitive Clipping ### {#primitive-clipping} - -Vertex shaders have to produce a built-in "position" (of type `vec4`), -which denotes the clip position of a vertex. - -Issue: link to WGSL built-ins - -Primitives are clipped to the clip volume, which, for any [=clip position=] |p| -inside a primitive, is defined by the following inequalities: - - −|p|.w ≤ |p|.x ≤ |p|.w - - −|p|.w ≤ |p|.y ≤ |p|.w - - 0 ≤ |p|.z ≤ |p|.w (depth clipping) - -If |descriptor|.{{GPURenderPipelineDescriptor/primitive}}.{{GPUPrimitiveState/clampDepth}} is `true`, -the [=depth clipping=] restriction of the [=clip volume=] is not applied. - -A primitive passes through this stage unchanged if every one of its edges -lies entirely inside the [=clip volume=]. -If the edges of a primitive intersect the boundary of the [=clip volume=], -the intersecting edges are reconnected by new edges that lie along the boundary of the [=clip volume=]. -For triangular primitives (|descriptor|.{{GPURenderPipelineDescriptor/primitive}}.{{GPUPrimitiveState/topology}} is -{{GPUPrimitiveTopology/"triangle-list"}} or {{GPUPrimitiveTopology/"triangle-strip"}}), this reconnection -may result in the introduction of new vertices into the polygon, internally. - -If a primitive intersects an edge of the [=clip volume=]’s boundary, -the clipped polygon must include a point on this boundary edge. - -If the vertex shader outputs other floating-point values (scalars and vectors), qualified with -"perspective" interpolation, they also get clipped. -The output values associated with a vertex that lies within the clip volume are unaffected by clipping. -If a primitive is clipped, however, the output values assigned to vertices produced by clipping are clipped. - -Considering an edge between vertices |a| and |b| that got clipped, resulting in the vertex |c|, -let's define |t| to be the ratio between the edge vertices: -|c|.p = |t| × |a|.p + (1 − |t|) × |b|.p, -where |x|.p is the output [=clip position=] of a vertex |x|. 
- -For each vertex output value "v" with a corresponding fragment input, -|a|.v and |b|.v would be the outputs for |a| and |b| vertices respectively. -The clipped shader output |c|.v is produced based on the interpolation qualifier: -
- : "flat" - :: - Flat interpolation is unaffected, and is based on provoking vertex, - which is the first vertex in the primitive. The output value is the same - for the whole primitive, and matches the vertex output of the [=provoking vertex=]: - |c|.v = [=provoking vertex=].v - - : "linear" - :: - The interpolation ratio gets adjusted against the perspective coordinates of the - [=clip position=]s, so that the result of interpolation is linear in screen space. - - Issue: provide more specifics here, if possible - - : "perspective" - :: - The value is linearly interpolated in clip space, producing perspective-correct values: - - |c|.v = |t| × |a|.v + (1 − |t|) × |b|.v -
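As a worked illustration of the "perspective" case, the sketch below clips one edge against the z = 0 boundary of the clip volume and interpolates an output value with the same ratio |t|. The function names and the `{ p, v }` vertex shape are hypothetical, and the edge is assumed to actually cross the plane (so `a.p[2]` and `b.p[2]` differ).

```javascript
// Component-wise t * a + (1 - t) * b, as used for both |c|.p and |c|.v.
function mix(t, a, b) {
  return a.map((value, k) => t * value + (1 - t) * b[k]);
}

// Clip the edge (a, b) at the z = 0 plane.
// Each vertex is { p: [x, y, z, w], v: [...] } (an assumed shape).
function clipEdgeAtZeroDepth(a, b) {
  // Solve t * a.z + (1 - t) * b.z = 0 for t.
  const t = b.p[2] / (b.p[2] - a.p[2]);
  return { t, p: mix(t, a.p, b.p), v: mix(t, a.v, b.v) };
}
```

With |a|.z = 3 and |b|.z = −1, this gives |t| = 0.25, so the new vertex takes a quarter of its output value from |a| and three quarters from |b|.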
- -Issue: link to interpolation qualifiers in WGSL - -### Rasterization ### {#rasterization} - -Rasterization is the hardware processing stage that maps the generated primitives -to the 2-dimensional rendering area of the framebuffer - -the set of render attachments in the current {{GPURenderPassEncoder}}. -This rendering area is split into an even grid of pixels. - -Rasterization determines the set of pixels affected by a primitive. In case of multi-sampling, -each pixel is further split into |descriptor|.{{GPURenderPipelineDescriptor/multisample}}.{{GPUMultisampleState/count}} -samples. The locations of samples are the same for each pixel, but not defined in this spec. - -Issue: do we want to force-enable the "Standard sample locations" in Vulkan? - -The [=framebuffer=] coordinates start from the top-left corner of the render targets. -Each unit corresponds exactly to a pixel. See {#coordinate-systems} for more information. - -1. First, the clipped vertices are transformed into NDC - normalized device coordinates. - Given the output position |p|, the [=NDC=] coordinates are computed as: - - ndc(|p|) = vector(|p|.x ÷ |p|.w, |p|.y ÷ |p|.w, |p|.z ÷ |p|.w) - -1. Let |viewport| be {{GPURenderPassEncoder/[[viewport]]}} of the current render pass. - Then the [=NDC=] coordinates |n| are converted into [=framebuffer=] coordinates, based on the size of the render targets: - - framebufferCoords(n) = vector(|viewport|.`x` + 0.5×(|n|.x+1)×|viewport|.`width`, |viewport|.`y` + 0.5×(|n|.y+1)×|viewport|.`height`) - -1. The specific rasterization algorithm depends on {{GPURenderPipelineDescriptor/primitive}}.{{GPUPrimitiveState/topology}}: -
- : {{GPUPrimitiveTopology/"point-list"}} - :: The point, if not filtered by [[#primitive-clipping]], goes into [[#point-rasterization]]. - : {{GPUPrimitiveTopology/"line-list"}} or {{GPUPrimitiveTopology/"line-strip"}} - :: The line cut by [[#primitive-clipping]] goes into [[#line-rasterization]]. - : {{GPUPrimitiveTopology/"triangle-list"}} or {{GPUPrimitiveTopology/"triangle-strip"}} - :: The polygon produced in [[#primitive-clipping]] goes into [[#polygon-rasterization]]. -
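The first two steps of this list can be transcribed directly. The sketch below follows the formulas exactly as written above; positions and viewports are modelled as plain objects, which is an assumption made for illustration.

```javascript
// Perspective divide: clip-space position to normalized device coordinates.
function ndc(p) {
  return { x: p.x / p.w, y: p.y / p.w, z: p.z / p.w };
}

// Map NDC x/y to framebuffer coordinates for a given viewport,
// using the formula from the text without modification.
function framebufferCoords(n, viewport) {
  return {
    x: viewport.x + 0.5 * (n.x + 1) * viewport.width,
    y: viewport.y + 0.5 * (n.y + 1) * viewport.height,
  };
}
```

For a clip position (1, −2, 1, 2) and an 800 × 600 viewport at the origin, the NDC coordinates are (0.5, −1, 0.5) and the framebuffer coordinates are (600, 0).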
- -Issue: reword the "goes into" part - -Let's define fragment destination to be a combination of the pixel position with -the sample index, in case [[#sample-frequency-shading]] is active. - -The result of rasterization is a set of points, each associated with the following data: - - [=fragment destination=] - - multisample coverage mask (see {#sample-masking}) - - depth, in [=NDC=] coordinates. - - barycentric coordinates - -Issue: define barycentric coordinates -Issue: define the depth computation algorithm - -#### Point Rasterization #### {#point-rasterization} - -A single [=fragment destination=] is selected within the pixel containing the -[=framebuffer=] coordinates of the point. - -The coverage mask depends on multi-sampling mode: -
- : sample-frequency - :: coverageMask = 1 ≪ `sampleIndex` - : pixel-frequency multi-sampling - :: coverageMask = (1 ≪ |descriptor|.{{GPURenderPipelineDescriptor/multisample}}.{{GPUMultisampleState/count}}) − 1 - : no multi-sampling - :: coverageMask = 1 -
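The three cases can be written as a small function. This sketch reads the pixel-frequency expression as (1 ≪ count) − 1, that is, one set bit per sample; that reading is an assumption about operator precedence. The `mode` strings are hypothetical labels for the three rows above, and sample counts below 32 are assumed.

```javascript
// Coverage mask for a rasterized point.
function pointCoverageMask(mode, sampleCount, sampleIndex) {
  switch (mode) {
    case "sample-frequency":
      return (1 << sampleIndex) >>> 0;      // only the shaded sample
    case "pixel-frequency":
      return ((1 << sampleCount) - 1) >>> 0; // every sample of the pixel
    case "none":
      return 1;                             // single-sampled target
    default:
      throw new Error("unknown mode: " + mode);
  }
}
```

With 4 samples, pixel-frequency shading yields `0b1111`, while sample-frequency shading of sample 2 yields `0b0100`.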
- -#### Line Rasterization #### {#line-rasterization} - -Issue: fill out this section - -#### Polygon Rasterization #### {#polygon-rasterization} - -Let |v|(|i|) be the [=framebuffer=] coordinates for the clipped vertex number |i| (starting with 1) -in a rasterized polygon of |n| vertices. - -Note: this section uses the term "polygon" instead of a "triangle", -since the [[#primitive-clipping]] stage may have introduced additional vertices. -This is non-observable by the application. - -The first step of polygon rasterization is determining if the polygon is front-facing or back-facing. -This depends on the sign of the |area| occupied by the polygon in [=framebuffer=] coordinates: - -|area| = 0.5 × ((|v|(1).x × |v|(|n|).y − |v|(|n|).x × |v|(1).y) + ∑ (|v|(|i|+1).x × |v|(|i|).y − |v|(|i|).x × |v|(|i|+1).y)) - -The sign of |area| is interpreted based on the {{GPURenderPipelineDescriptor/primitive}}.{{GPUPrimitiveState/frontFace}}: -
- : {{GPUFrontFace/"ccw"}} - :: |area| > 0 is considered [=front-facing=], otherwise [=back-facing=] - : {{GPUFrontFace/"cw"}} - :: |area| < 0 is considered [=front-facing=], otherwise [=back-facing=] -
- -The polygon can be culled by {{GPURenderPipelineDescriptor/primitive}}.{{GPUPrimitiveState/cullMode}}: -
- : {{GPUCullMode/"none"}} - :: All polygons pass this test. - : {{GPUCullMode/"front"}} - :: The [=front-facing=] polygons are discarded, - and are not processed in later stages of the render pipeline. - : {{GPUCullMode/"back"}} - :: The [=back-facing=] polygons are discarded. -
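The facing and culling tests can be sketched together. The area function transcribes the signed-area formula with the sum taken over consecutive vertex pairs (|i| from 1 to |n| − 1), which is this sketch's reading of the ∑ term; arrays are zero-based here, so `v[0]` corresponds to vertex 1. The function names are hypothetical.

```javascript
// Signed area of a polygon given as [{x, y}, ...] in framebuffer coordinates.
function polygonArea(v) {
  const n = v.length;
  // Wrap-around term between the first and last vertex.
  let sum = v[0].x * v[n - 1].y - v[n - 1].x * v[0].y;
  // Terms for each pair of consecutive vertices.
  for (let i = 0; i + 1 < n; i++) {
    sum += v[i + 1].x * v[i].y - v[i].x * v[i + 1].y;
  }
  return 0.5 * sum;
}

// Interpret the sign of the area according to frontFace ("ccw" or "cw").
function isFrontFacing(area, frontFace) {
  return frontFace === "ccw" ? area > 0 : area < 0;
}

// Decide whether cullMode ("none", "front", "back") discards the polygon.
function isCulled(frontFacing, cullMode) {
  if (cullMode === "front") return frontFacing;
  if (cullMode === "back") return !frontFacing;
  return false;
}
```

For the triangle (0, 0), (4, 0), (0, 4) the function returns −8, so the triangle is front-facing under `"cw"` and back-facing under `"ccw"`; reversing the vertex order flips the sign.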
- -The next step is determining a set of fragments inside the polygon in framebuffer space - -these are locations scheduled for the per-fragment operations. -The determination is based on |descriptor|.{{GPURenderPipelineDescriptor/multisample}}: -
- : disabled - :: [=Fragment=]s are associated with pixel centers. That is, all the points with coordinates |C|, where - fract(|C|) = vector2(0.5, 0.5) in the [=framebuffer=] space, enclosed in the polygon, are included. - If a pixel center is on the edge of the polygon, whether or not it's included is not defined. - - Note: this becomes a subject of precision for the rasterizer. - - : enabled - :: Each pixel is associated with |descriptor|.{{GPURenderPipelineDescriptor/multisample}}.{{GPUMultisampleState/count}} - locations, which are implementation-defined. - The locations are ordered, and the list is the same for each pixel of the [=framebuffer=]. - Each location corresponds to one fragment in the multisampled [=framebuffer=]. - - The rasterizer builds a mask of locations being hit inside each pixel and provides it as the "sample-mask" - built-in to the fragment shader. -
- -### Fragment Processing ### {#fragment-processing} - -TODO: fill out this section - -### No Color Output ### {#no-color-output} - -In no-color-output mode, the [=pipeline=] does not produce any color attachment outputs. - -The [=pipeline=] still performs rasterization and produces depth values -based on the vertex position output. The depth testing and stencil operations can still be used. - -### Alpha to Coverage ### {#alpha-to-coverage} - -In alpha-to-coverage mode, an additional alpha-to-coverage mask -of MSAA samples is generated based on the |alpha| component of the -fragment shader output value of the {{GPURenderPipelineDescriptor/fragment}}.{{GPUFragmentState/targets}}[0]. - -The algorithm for producing the extra mask is platform-dependent and can vary for different pixels. -It guarantees that: - - if |alpha| is 0.0 or less, the result is 0x0 - - if |alpha| is 1.0 or greater, the result is 0xFFFFFFFF - - if |alpha| is greater than some other |alpha1|, - then the produced sample mask has at least as many bits set to 1 as the mask for |alpha1| - -### Sample frequency shading ### {#sample-frequency-shading} - -TODO: fill out the section - -### Sample Masking ### {#sample-masking} - -The final sample mask for a pixel is computed as: -[=rasterization mask=] & {{GPUMultisampleState/mask}} & [=shader-output mask=]. - -Only the lower {{GPUMultisampleState/count}} bits of the mask are considered. - -If bit |N| of the [=final sample mask=] (counting from the least-significant bit) has a value of "0", -the sample color outputs (corresponding to sample |N|) to all attachments of the fragment shader are discarded. -Also, no depth test or stencil operations are executed on the relevant samples of the depth-stencil attachment. - -Note: the color output for sample |N| is produced by the fragment shader execution -with SV_SampleIndex == |N| for the current pixel. -If the fragment shader doesn't use this semantic, it's only executed once per pixel. 
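The mask combination at the start of this section can be sketched with 32-bit integer arithmetic. The function and parameter names are hypothetical; the three input masks correspond to the rasterization mask, the pipeline's `mask`, and the shader-output mask.

```javascript
// Combine the three masks and keep only the lowest `count` bits.
function finalSampleMask(rasterizationMask, pipelineMask, shaderOutputMask, count) {
  // JavaScript shifts wrap at 32 bits, so handle count >= 32 explicitly.
  const lowBits = count >= 32 ? 0xFFFFFFFF : (1 << count) - 1;
  return (rasterizationMask & pipelineMask & shaderOutputMask & lowBits) >>> 0;
}

// True if the outputs for sample `n` are kept rather than discarded.
function sampleIsWritten(mask, n) {
  return ((mask >>> n) & 1) === 1;
}
```

For example, with 4 samples, a rasterization mask of `0b1011`, a pipeline mask of `0xFFFFFFFF`, and a shader-output mask of `0b0110`, only sample 1 survives.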
- -The rasterization mask is produced by the rasterization stage, -based on the shape of the rasterized polygon. The samples included in the shape have the corresponding -bits set to 1 in the mask. - -The shader-output mask takes the output value of the SV_Coverage semantic in the fragment shader. -If the semantic is not [=statically used=] by the shader, and {{GPUMultisampleState/alphaToCoverageEnabled}} -is `true`, the [=shader-output mask=] becomes the [=alpha-to-coverage mask=]. Otherwise, it defaults to 0xFFFFFFFF. - -Issue: link to the semantics of SV_SampleIndex and SV_Coverage in WGSL spec. - -# Type Definitions # {#type-definitions} - - - -## Colors & Vectors ## {#colors-and-vectors} - - - -Note: `double` is large enough to precisely hold 32-bit signed/unsigned -integers and single-precision floats. - - - -An Origin2D is a {{GPUOrigin2D}}. -[=Origin2D=] is a spec namespace for the following definitions: - - -
- For a given {{GPUOrigin2D}} value |origin|, depending on its type, the syntax: - - - |origin|.x refers to - either {{GPUOrigin2DDict}}.{{GPUOrigin2DDict/x}} - or the first item of the sequence or 0 if it isn't present. - - |origin|.y refers to - either {{GPUOrigin2DDict}}.{{GPUOrigin2DDict/y}} - or the second item of the sequence or 0 if it isn't present. -
- - - -An Origin3D is a {{GPUOrigin3D}}. -[=Origin3D=] is a spec namespace for the following definitions: - - -
- For a given {{GPUOrigin3D}} value |origin|, depending on its type, the syntax: - - - |origin|.x refers to - either {{GPUOrigin3DDict}}.{{GPUOrigin3DDict/x}} - or the first item of the sequence or 0 if it isn't present. - - |origin|.y refers to - either {{GPUOrigin3DDict}}.{{GPUOrigin3DDict/y}} - or the second item of the sequence or 0 if it isn't present. - - |origin|.z refers to - either {{GPUOrigin3DDict}}.{{GPUOrigin3DDict/z}} - or the third item of the sequence or 0 if it isn't present. -
- - - -An Extent3D is a {{GPUExtent3D}}. -[=Extent3D=] is a spec namespace for the following definitions: - - -
- For a given {{GPUExtent3D}} value |extent|, depending on its type, the syntax: - - - |extent|.width refers to - either {{GPUExtent3DDict}}.{{GPUExtent3DDict/width}} - or the first item of the sequence (1 if not present). - - |extent|.height refers to - either {{GPUExtent3DDict}}.{{GPUExtent3DDict/height}} - or the second item of the sequence (1 if not present). - - |extent|.depthOrArrayLayers refers to - either {{GPUExtent3DDict}}.{{GPUExtent3DDict/depthOrArrayLayers}} - or the third item of the sequence (1 if not present). -
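The dictionary-or-sequence rule above can be sketched as a pair of normalizing helpers. The function names are hypothetical. The defaults for missing sequence items (0 for origins, 1 for extents) come from the text; applying the same defaults to missing dictionary members is an assumption of this sketch.

```javascript
// Normalize a GPUOrigin3D-like value (array or object) to { x, y, z }.
function resolveOrigin3D(origin) {
  if (Array.isArray(origin)) {
    return { x: origin[0] ?? 0, y: origin[1] ?? 0, z: origin[2] ?? 0 };
  }
  // Assumption: absent dictionary members fall back to 0 as well.
  return { x: origin.x ?? 0, y: origin.y ?? 0, z: origin.z ?? 0 };
}

// Normalize a GPUExtent3D-like value to { width, height, depthOrArrayLayers }.
function resolveExtent3D(extent) {
  if (Array.isArray(extent)) {
    return {
      width: extent[0] ?? 1,
      height: extent[1] ?? 1,
      depthOrArrayLayers: extent[2] ?? 1,
    };
  }
  // Assumption: absent dictionary members fall back to 1 as well.
  return {
    width: extent.width ?? 1,
    height: extent.height ?? 1,
    depthOrArrayLayers: extent.depthOrArrayLayers ?? 1,
  };
}
```

With these helpers, `resolveExtent3D([256, 128])` and `resolveExtent3D({ width: 256, height: 128 })` describe the same extent, with `depthOrArrayLayers` equal to 1 in both cases.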
- -# Feature Index # {#feature-index} - -## depth-clamping ## {#depth-clamping} - -Issue: Define functionality when the {{GPUFeatureName/"depth-clamping"}} [=feature=] is enabled. - -**Feature Dictionary Values** - -The following dictionary values are supported if and only if the {{GPUFeatureName/"depth-clamping"}} -[=feature=] is enabled, otherwise they must be set to their default values: - -
- : {{GPUPrimitiveState}} - :: - * {{GPUPrimitiveState/clampDepth}} -
- -## depth24unorm-stencil8 ## {#depth24unorm-stencil8} - -Allows for explicit creation of textures of format {{GPUTextureFormat/"depth24unorm-stencil8"}}. - -**Feature Enums** - -The following enums are supported if and only if the {{GPUFeatureName/"depth24unorm-stencil8"}} -[=feature=] is enabled: - -
- : {{GPUTextureFormat}} - :: - * {{GPUTextureFormat/"depth24unorm-stencil8"}} -
- -## depth32float-stencil8 ## {#depth32float-stencil8} - -Allows for explicit creation of textures of format {{GPUTextureFormat/"depth32float-stencil8"}}. - -**Feature Enums** - -The following enums are supported if and only if the {{GPUFeatureName/"depth32float-stencil8"}} -[=feature=] is enabled: - -
- : {{GPUTextureFormat}} - :: - * {{GPUTextureFormat/"depth32float-stencil8"}} -
- -## pipeline-statistics-query ## {#pipeline-statistics-query} - -Issue: Define functionality when the {{GPUFeatureName/"pipeline-statistics-query"}} [=feature=] is enabled. - -**Feature Enums** - -The following enums are supported if and only if the {{GPUFeatureName/"pipeline-statistics-query"}} -[=feature=] is enabled: - -
- : {{GPUQueryType}} - :: - * {{GPUQueryType/"pipeline-statistics"}} -
- - -## texture-compression-bc ## {#texture-compression-bc} - -Allows for explicit creation of textures of BC compressed formats. - -**Feature Enums** - -The following enums are supported if and only if the {{GPUFeatureName/"texture-compression-bc"}} -[=feature=] is enabled: - -
- : {{GPUTextureFormat}} - :: - * {{GPUTextureFormat/"bc1-rgba-unorm"}} - * {{GPUTextureFormat/"bc1-rgba-unorm-srgb"}} - * {{GPUTextureFormat/"bc2-rgba-unorm"}} - * {{GPUTextureFormat/"bc2-rgba-unorm-srgb"}} - * {{GPUTextureFormat/"bc3-rgba-unorm"}} - * {{GPUTextureFormat/"bc3-rgba-unorm-srgb"}} - * {{GPUTextureFormat/"bc4-r-unorm"}} - * {{GPUTextureFormat/"bc4-r-snorm"}} - * {{GPUTextureFormat/"bc5-rg-unorm"}} - * {{GPUTextureFormat/"bc5-rg-snorm"}} - * {{GPUTextureFormat/"bc6h-rgb-ufloat"}} - * {{GPUTextureFormat/"bc6h-rgb-float"}} - * {{GPUTextureFormat/"bc7-rgba-unorm"}} - * {{GPUTextureFormat/"bc7-rgba-unorm-srgb"}} -
- -## timestamp-query ## {#timestamp-query} - -Issue: Define functionality when the {{GPUFeatureName/"timestamp-query"}} [=feature=] is enabled. - -**Feature Enums** - -The following enums are supported if and only if the {{GPUFeatureName/"timestamp-query"}} -[=feature=] is enabled: - -
- : {{GPUQueryType}} - :: - * {{GPUQueryType/"timestamp"}} -
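Because these enum values are only usable when the matching feature is enabled, an application would typically check what the adapter offers before requesting a device. A minimal sketch, where `availableFeatures` is a hypothetical helper and the adapter is passed in so the function can be exercised with a stand-in object:

```javascript
// Return the subset of `wanted` feature names that the adapter reports.
// `adapter.features` is set-like, so `has()` is available.
function availableFeatures(adapter, wanted) {
  return wanted.filter((name) => adapter.features.has(name));
}

// Possible use in a page (not executed here):
//   const adapter = await navigator.gpu.requestAdapter();
//   const requiredFeatures = availableFeatures(adapter, ["texture-compression-bc", "timestamp-query"]);
//   const device = await adapter.requestDevice({ requiredFeatures });
```

Requesting only the features that are present lets the application fall back gracefully, for example by shipping uncompressed textures when `"texture-compression-bc"` is missing.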
- -# Appendices # {#appendices} - -## Texture Format Capabilities ## {#texture-format-caps} - -Issue: Add multisampling to the tables below. - -### Plain color formats ### {#plain-color-formats} - -All plain color formats support {{GPUTextureUsage/COPY_SRC}}, {{GPUTextureUsage/COPY_DST}}, and {{GPUTextureUsage/SAMPLED}} usage. - -Only formats with {{GPUTextureSampleType}} {{GPUTextureSampleType/"float"}} can be blended. - -The {{GPUTextureUsage/STORAGE|GPUTextureUsage.STORAGE}} column specifies the support for {{GPUTextureUsage/STORAGE}} -usage in the core API, including both {{GPUStorageTextureAccess/"read-only"}} and {{GPUStorageTextureAccess/"write-only"}}. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-<table>
-  <thead>
-    <tr><th>Format</th><th>{{GPUTextureSampleType}}</th><th>{{GPUTextureUsage/RENDER_ATTACHMENT}}</th><th>{{GPUTextureUsage/STORAGE}}</th></tr>
-  </thead>
-  <tr><th colspan="4">8-bit per component</th></tr>
-  <tr><td>{{GPUTextureFormat/r8unorm}}</td><td>{{GPUTextureSampleType/"float"}},<br>{{GPUTextureSampleType/"unfilterable-float"}}</td><td>✓</td><td></td></tr>
-  <tr><td>{{GPUTextureFormat/r8snorm}}</td><td>{{GPUTextureSampleType/"float"}},<br>{{GPUTextureSampleType/"unfilterable-float"}}</td><td></td><td></td></tr>
-  <tr><td>{{GPUTextureFormat/r8uint}}</td><td>{{GPUTextureSampleType/"uint"}}</td><td>✓</td><td></td></tr>
-  <tr><td>{{GPUTextureFormat/r8sint}}</td><td>{{GPUTextureSampleType/"sint"}}</td><td>✓</td><td></td></tr>
-  <tr><td>{{GPUTextureFormat/rg8unorm}}</td><td>{{GPUTextureSampleType/"float"}},<br>{{GPUTextureSampleType/"unfilterable-float"}}</td><td>✓</td><td></td></tr>
-  <tr><td>{{GPUTextureFormat/rg8snorm}}</td><td>{{GPUTextureSampleType/"float"}},<br>{{GPUTextureSampleType/"unfilterable-float"}}</td><td></td><td></td></tr>
-  <tr><td>{{GPUTextureFormat/rg8uint}}</td><td>{{GPUTextureSampleType/"uint"}}</td><td>✓</td><td></td></tr>
-  <tr><td>{{GPUTextureFormat/rg8sint}}</td><td>{{GPUTextureSampleType/"sint"}}</td><td>✓</td><td></td></tr>
-  <tr><td>{{GPUTextureFormat/rgba8unorm}}</td><td>{{GPUTextureSampleType/"float"}},<br>{{GPUTextureSampleType/"unfilterable-float"}}</td><td>✓</td><td>✓</td></tr>
-  <tr><td>{{GPUTextureFormat/rgba8unorm-srgb}}</td><td>{{GPUTextureSampleType/"float"}},<br>{{GPUTextureSampleType/"unfilterable-float"}}</td><td>✓</td><td></td></tr>
-  <tr><td>{{GPUTextureFormat/rgba8snorm}}</td><td>{{GPUTextureSampleType/"float"}},<br>{{GPUTextureSampleType/"unfilterable-float"}}</td><td></td><td>✓</td></tr>
-  <tr><td>{{GPUTextureFormat/rgba8uint}}</td><td>{{GPUTextureSampleType/"uint"}}</td><td>✓</td><td>✓</td></tr>
-  <tr><td>{{GPUTextureFormat/rgba8sint}}</td><td>{{GPUTextureSampleType/"sint"}}</td><td>✓</td><td>✓</td></tr>
-  <tr><td>{{GPUTextureFormat/bgra8unorm}}</td><td>{{GPUTextureSampleType/"float"}},<br>{{GPUTextureSampleType/"unfilterable-float"}}</td><td>✓</td><td></td></tr>
-  <tr><td>{{GPUTextureFormat/bgra8unorm-srgb}}</td><td>{{GPUTextureSampleType/"float"}},<br>{{GPUTextureSampleType/"unfilterable-float"}}</td><td>✓</td><td></td></tr>
-  <tr><th colspan="4">16-bit per component</th></tr>
-  <tr><td>{{GPUTextureFormat/r16uint}}</td><td>{{GPUTextureSampleType/"uint"}}</td><td>✓</td><td></td></tr>
-  <tr><td>{{GPUTextureFormat/r16sint}}</td><td>{{GPUTextureSampleType/"sint"}}</td><td>✓</td><td></td></tr>
-  <tr><td>{{GPUTextureFormat/r16float}}</td><td>{{GPUTextureSampleType/"float"}},<br>{{GPUTextureSampleType/"unfilterable-float"}}</td><td>✓</td><td></td></tr>
-  <tr><td>{{GPUTextureFormat/rg16uint}}</td><td>{{GPUTextureSampleType/"uint"}}</td><td>✓</td><td></td></tr>
-  <tr><td>{{GPUTextureFormat/rg16sint}}</td><td>{{GPUTextureSampleType/"sint"}}</td><td>✓</td><td></td></tr>
-  <tr><td>{{GPUTextureFormat/rg16float}}</td><td>{{GPUTextureSampleType/"float"}},<br>{{GPUTextureSampleType/"unfilterable-float"}}</td><td>✓</td><td></td></tr>
-  <tr><td>{{GPUTextureFormat/rgba16uint}}</td><td>{{GPUTextureSampleType/"uint"}}</td><td>✓</td><td>✓</td></tr>
-  <tr><td>{{GPUTextureFormat/rgba16sint}}</td><td>{{GPUTextureSampleType/"sint"}}</td><td>✓</td><td>✓</td></tr>
-  <tr><td>{{GPUTextureFormat/rgba16float}}</td><td>{{GPUTextureSampleType/"float"}},<br>{{GPUTextureSampleType/"unfilterable-float"}}</td><td>✓</td><td>✓</td></tr>
-  <tr><th colspan="4">32-bit per component</th></tr>
-  <tr><td>{{GPUTextureFormat/r32uint}}</td><td>{{GPUTextureSampleType/"uint"}}</td><td>✓</td><td>✓</td></tr>
-  <tr><td>{{GPUTextureFormat/r32sint}}</td><td>{{GPUTextureSampleType/"sint"}}</td><td>✓</td><td>✓</td></tr>
-  <tr><td>{{GPUTextureFormat/r32float}}</td><td>{{GPUTextureSampleType/"unfilterable-float"}}</td><td>✓</td><td>✓</td></tr>
-  <tr><td>{{GPUTextureFormat/rg32uint}}</td><td>{{GPUTextureSampleType/"uint"}}</td><td>✓</td><td>✓</td></tr>
-  <tr><td>{{GPUTextureFormat/rg32sint}}</td><td>{{GPUTextureSampleType/"sint"}}</td><td>✓</td><td>✓</td></tr>
-  <tr><td>{{GPUTextureFormat/rg32float}}</td><td>{{GPUTextureSampleType/"unfilterable-float"}}</td><td>✓</td><td>✓</td></tr>
-  <tr><td>{{GPUTextureFormat/rgba32uint}}</td><td>{{GPUTextureSampleType/"uint"}}</td><td>✓</td><td>✓</td></tr>
-  <tr><td>{{GPUTextureFormat/rgba32sint}}</td><td>{{GPUTextureSampleType/"sint"}}</td><td>✓</td><td>✓</td></tr>
-  <tr><td>{{GPUTextureFormat/rgba32float}}</td><td>{{GPUTextureSampleType/"unfilterable-float"}}</td><td>✓</td><td>✓</td></tr>
-  <tr><th colspan="4">mixed component width</th></tr>
-  <tr><td>{{GPUTextureFormat/rgb10a2unorm}}</td><td>{{GPUTextureSampleType/"float"}},<br>{{GPUTextureSampleType/"unfilterable-float"}}</td><td>✓</td><td></td></tr>
-  <tr><td>{{GPUTextureFormat/rg11b10ufloat}}</td><td>{{GPUTextureSampleType/"float"}},<br>{{GPUTextureSampleType/"unfilterable-float"}}</td><td></td><td></td></tr>
-</table>
- -### Depth/stencil formats ### {#depth-formats} - -All depth formats support {{GPUTextureUsage/COPY_SRC}}, {{GPUTextureUsage/COPY_DST}}, {{GPUTextureUsage/SAMPLED}}, and {{GPUTextureUsage/RENDER_ATTACHMENT}} usage. However, the source/destination is restricted based on the format. - -None of the depth formats can be filtered. - - - - - - - - - - - -
Format - Bytes per texel - Aspect - {{GPUTextureSampleType}} - Copy aspect from Buffer - Copy aspect into Buffer -
{{GPUTextureFormat/stencil8}} - 1 − 5 - stencil - {{GPUTextureSampleType/"uint"}} - ✓ -
{{GPUTextureFormat/depth16unorm}} - 2 - depth - {{GPUTextureSampleType/"depth"}} - ✓ -
{{GPUTextureFormat/depth24plus}} - 4 - depth - {{GPUTextureSampleType/"depth"}} - ✗ -
{{GPUTextureFormat/depth24plus-stencil8}} - 4 − 8 - depth - {{GPUTextureSampleType/"depth"}} - ✗ -
stencil - {{GPUTextureSampleType/"uint"}} - ✓ -
{{GPUTextureFormat/depth32float}} - 4 - depth - {{GPUTextureSampleType/"depth"}} - ✗ - ✓ -
- -Copies between depth textures can only happen within the following sets of formats: - - {{GPUTextureFormat/stencil8}}, {{GPUTextureFormat/depth24plus-stencil8}} (stencil component), {{GPUTextureFormat/r8uint}} - - {{GPUTextureFormat/depth16unorm}}, {{GPUTextureFormat/r16uint}} - - {{GPUTextureFormat/depth24plus}}, {{GPUTextureFormat/depth24plus-stencil8}} (depth aspect) - -Additionally, {{GPUTextureFormat/depth32float}} textures can be copied into {{GPUTextureFormat/depth32float}} and {{GPUTextureFormat/r32float}} textures. - -Note: -{{GPUTextureFormat/depth32float}} texel values have a limited range. As a result, copies into -{{GPUTextureFormat/depth32float}} textures are only valid from other {{GPUTextureFormat/depth32float}} textures. - -Issue: clarify if `depth24plus-stencil8` is copyable into `depth24plus` in Metal. - -### Packed formats ### {#packed-formats} - -All packed texture formats support {{GPUTextureUsage/COPY_SRC}}, {{GPUTextureUsage/COPY_DST}}, and {{GPUTextureUsage/SAMPLED}} usages. All of these formats have {{GPUTextureSampleType/"float"}} type and can be filtered on sampling. - - - - - - - - - - - - - - - - - - - - -
Format - Bytes per block - {{GPUTextureSampleType}} - Block Size - [=Feature=] -
{{GPUTextureFormat/rgb9e5ufloat}} - 4 - {{GPUTextureSampleType/"float"}},
{{GPUTextureSampleType/"unfilterable-float"}} -
1 × 1 - -
{{GPUTextureFormat/bc1-rgba-unorm}} - 8 - {{GPUTextureSampleType/"float"}},
{{GPUTextureSampleType/"unfilterable-float"}} -
4 × 4 - {{GPUFeatureName/texture-compression-bc}} -
{{GPUTextureFormat/bc1-rgba-unorm-srgb}} -
{{GPUTextureFormat/bc2-rgba-unorm}} - 16 -
{{GPUTextureFormat/bc2-rgba-unorm-srgb}} -
{{GPUTextureFormat/bc3-rgba-unorm}} - 16 -
{{GPUTextureFormat/bc3-rgba-unorm-srgb}} -
{{GPUTextureFormat/bc4-r-unorm}} - 8 -
{{GPUTextureFormat/bc4-r-snorm}} -
{{GPUTextureFormat/bc5-rg-unorm}} - 16 -
{{GPUTextureFormat/bc5-rg-snorm}} -
{{GPUTextureFormat/bc6h-rgb-ufloat}} - 16 -
{{GPUTextureFormat/bc6h-rgb-float}} -
{{GPUTextureFormat/bc7-rgba-unorm}} - 16 -
{{GPUTextureFormat/bc7-rgba-unorm-srgb}} -
- -## Temporary usages of non-exported dfns ## {#temp-dfn-usages} - -[=Origin2D/x=] [=Origin2D/y=] -[=RenderPassDescriptor/renderExtent=] - -Eventually all of these should disappear but they are useful to avoid warning while building the specification. - -[=vertex buffer=] diff --git a/tools/package.json b/tools/package.json deleted file mode 100644 index 1fc9b8cf28..0000000000 --- a/tools/package.json +++ /dev/null @@ -1,15 +0,0 @@ -{ - "name": "webgpu-tools", - "version": "1.0.0", - "description": "Scripts to help WebGPU development", - "main": "extract-wgsl-agenda.js", - "scripts": { - "test": "echo \"Error: no test specified\" && exit 1" - }, - "author": "", - "license": "ISC", - "dependencies": { - "@octokit/rest": "^17.6.0", - "yargs": "^15.3.1" - } -} diff --git a/tools/wgsl-meeting-helper b/tools/wgsl-meeting-helper deleted file mode 100755 index 2815128ffd..0000000000 --- a/tools/wgsl-meeting-helper +++ /dev/null @@ -1,116 +0,0 @@ -#!/usr/bin/env node - -const yargs = require("yargs"); -const { Octokit } = require("@octokit/rest"); - -const GPUWEB_ORG = "gpuweb"; -const GPUWEB_REPOSITORY = "gpuweb"; - -// The identifiers for some columns in the WGSL project. -// You can list all the columns by calling this script with the "project-info" command. - -const WGSL_UNDER_DISCUSSION_COLUMN = 8444327; -const WGSL_MEETING_COLUMN = 8898490; - -const TOKEN_ENV_NAME = "GPUWEB_GITHUB_TOKEN"; - -async function authenticate(token) { - const octokit = new Octokit({ auth: token }); - await octokit.request("/user"); - return octokit; -} - -async function agenda(kit, column) { - console.log("Suggested meeting agenda from Github project."); - console.log("----"); - const cards = await kit.projects.listCards({ column_id: column }); - for (let card of cards.data) { - // Notes are agenda topics that don't have an issue. - if (card.note) { - console.log(`- ${card.note}`); - } else { - // Check if the card is an issue. e.g. 
https://api.github.com/repos/gpuweb/gpuweb/issues/569 - const url = card.content_url; - const matches = /gpuweb\/gpuweb\/issues\/(\d+)$/.exec(url); - if (matches) { - const issueIdentifier = matches[1]; - const issue = await kit.issues.get({ owner: GPUWEB_ORG, repo: GPUWEB_REPOSITORY, issue_number: issueIdentifier }); - console.log(`- ${issue.data.title.replace(/\s*\[wgsl\]\s*/i, "")} (#${issueIdentifier})`); - console.log(` ${issue.data.html_url}`); - } - } - console.log(""); - } -} - -async function sortMVP(kit, column) { - const cards = await kit.projects.listCards({ column_id: column }); - for (let card of cards.data.reverse()) { // Do it in reverse so order is preserved. - const url = card.content_url; - const matches = /gpuweb\/gpuweb\/issues\/(\d+)$/.exec(url); - if (matches) { - const issueIdentifier = matches[1]; - const issue = await kit.issues.get({ owner: GPUWEB_ORG, repo: GPUWEB_REPOSITORY, issue_number: issueIdentifier }); - if (issue.data.milestone && issue.data.milestone.title == "MVP") { - console.log(`Moving MVP issue #${issueIdentifier} to top of column`); - await kit.projects.moveCard({ card_id: card.id, position: "top" }); - } - } - } -} - -async function projectInfo(kit) { - const projects = await kit.projects.listForOrg({ org: GPUWEB_ORG }); - for (let project of projects.data) { - console.log(`Columns for project "${project.name}" (id: ${project.id})`); - console.log("----"); - const columns = await kit.projects.listColumns({ project_id: project.id }); - for (let column of columns.data) { - console.log(`${column.name} (id: ${column.id})`); - } - console.log(""); - } -} - -const argumentProcessor = yargs - .scriptName("wgsl-meeting-helper") - .usage("$0 ") - .command({ - command: ["agenda", "$0"], - desc: "Build the meeting agenda from the WGSL Github Project.", - handler: () => { - run(agenda, WGSL_MEETING_COLUMN); - } - }) - .command({ - command: "sort-mvp", - desc: "Sorts the Under Discussion column in WGSL to put MVP topics at the top.", 
- handler: () => { - run(sortMVP, WGSL_UNDER_DISCUSSION_COLUMN); - } - }) - .command({ - command: "project-info", - desc: "Dumps info on the projects in GPUWeb.", - handler: () => { - run(projectInfo); - } - }) - .version(false) - .help() - .epilogue(`Requires a Github Personal Access Token to be provided as the environment variable ${TOKEN_ENV_NAME}.`) - -const token = process.env[TOKEN_ENV_NAME]; -if (!token) { - console.log("Error: missing Github Personal Access Token.\n"); - argumentProcessor.showHelp(); - process.exit(); -} - -async function run(targetFunction, ...args) { - const kit = await authenticate(token); - targetFunction(kit, ...args); -} - -argumentProcessor.parse(); - diff --git a/w3c.json b/w3c.json deleted file mode 100644 index 15648136ff..0000000000 --- a/w3c.json +++ /dev/null @@ -1,5 +0,0 @@ -{ - "group": [96877, 125519], - "contacts": ["tidoust", "Kangz", "grorg"], - "repo-type": ["cg-report", "rec-track"] -} \ No newline at end of file diff --git a/webgpu.idl b/webgpu.idl new file mode 100644 index 0000000000..a9f739bf3b --- /dev/null +++ b/webgpu.idl @@ -0,0 +1,1337 @@ +// Copyright (C) [2022] World Wide Web Consortium, +// (Massachusetts Institute of Technology, European Research Consortium for +// Informatics and Mathematics, Keio University, Beihang). +// All Rights Reserved. +// +// This work is distributed under the W3C (R) Software License [1] in the hope +// that it will be useful, but WITHOUT ANY WARRANTY; without even the implied +// warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. +// +// [1] http://www.w3.org/Consortium/Legal/copyright-software + +// **** This file is auto-generated. Do not edit. 
**** + +interface mixin GPUObjectBase { + attribute (USVString or undefined) label; +}; + + +dictionary GPUObjectDescriptorBase { + USVString label; +}; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPUSupportedLimits { + readonly attribute unsigned long maxTextureDimension1D; + readonly attribute unsigned long maxTextureDimension2D; + readonly attribute unsigned long maxTextureDimension3D; + readonly attribute unsigned long maxTextureArrayLayers; + readonly attribute unsigned long maxBindGroups; + readonly attribute unsigned long maxDynamicUniformBuffersPerPipelineLayout; + readonly attribute unsigned long maxDynamicStorageBuffersPerPipelineLayout; + readonly attribute unsigned long maxSampledTexturesPerShaderStage; + readonly attribute unsigned long maxSamplersPerShaderStage; + readonly attribute unsigned long maxStorageBuffersPerShaderStage; + readonly attribute unsigned long maxStorageTexturesPerShaderStage; + readonly attribute unsigned long maxUniformBuffersPerShaderStage; + readonly attribute unsigned long long maxUniformBufferBindingSize; + readonly attribute unsigned long long maxStorageBufferBindingSize; + readonly attribute unsigned long minUniformBufferOffsetAlignment; + readonly attribute unsigned long minStorageBufferOffsetAlignment; + readonly attribute unsigned long maxVertexBuffers; + readonly attribute unsigned long maxVertexAttributes; + readonly attribute unsigned long maxVertexBufferArrayStride; + readonly attribute unsigned long maxInterStageShaderComponents; + readonly attribute unsigned long maxComputeWorkgroupStorageSize; + readonly attribute unsigned long maxComputeInvocationsPerWorkgroup; + readonly attribute unsigned long maxComputeWorkgroupSizeX; + readonly attribute unsigned long maxComputeWorkgroupSizeY; + readonly attribute unsigned long maxComputeWorkgroupSizeZ; + readonly attribute unsigned long maxComputeWorkgroupsPerDimension; +}; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface 
GPUSupportedFeatures { + readonly setlike; +}; + + +enum GPUPredefinedColorSpace { + "srgb", +}; + + +interface mixin NavigatorGPU { + [SameObject, SecureContext] readonly attribute GPU gpu; +}; +Navigator includes NavigatorGPU; +WorkerNavigator includes NavigatorGPU; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPU { + Promise requestAdapter(optional GPURequestAdapterOptions options = {}); +}; + + +dictionary GPURequestAdapterOptions { + GPUPowerPreference powerPreference; + boolean forceFallbackAdapter = false; +}; + + +enum GPUPowerPreference { + "low-power", + "high-performance", +}; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPUAdapter { + readonly attribute DOMString name; + [SameObject] readonly attribute GPUSupportedFeatures features; + [SameObject] readonly attribute GPUSupportedLimits limits; + readonly attribute boolean isFallbackAdapter; + + Promise requestDevice(optional GPUDeviceDescriptor descriptor = {}); +}; + + +dictionary GPUDeviceDescriptor : GPUObjectDescriptorBase { + sequence requiredFeatures = []; + record requiredLimits = {}; + GPUQueueDescriptor defaultQueue = {}; +}; + + +enum GPUFeatureName { + "depth-clip-control", + "depth24unorm-stencil8", + "depth32float-stencil8", + "texture-compression-bc", + "texture-compression-etc2", + "texture-compression-astc", + "timestamp-query", + "indirect-first-instance", +}; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPUDevice : EventTarget { + [SameObject] readonly attribute GPUSupportedFeatures features; + [SameObject] readonly attribute GPUSupportedLimits limits; + + [SameObject] readonly attribute GPUQueue queue; + + undefined destroy(); + + GPUBuffer createBuffer(GPUBufferDescriptor descriptor); + GPUTexture createTexture(GPUTextureDescriptor descriptor); + GPUSampler createSampler(optional GPUSamplerDescriptor descriptor = {}); + GPUExternalTexture importExternalTexture(GPUExternalTextureDescriptor descriptor); + + 
GPUBindGroupLayout createBindGroupLayout(GPUBindGroupLayoutDescriptor descriptor); + GPUPipelineLayout createPipelineLayout(GPUPipelineLayoutDescriptor descriptor); + GPUBindGroup createBindGroup(GPUBindGroupDescriptor descriptor); + + GPUShaderModule createShaderModule(GPUShaderModuleDescriptor descriptor); + GPUComputePipeline createComputePipeline(GPUComputePipelineDescriptor descriptor); + GPURenderPipeline createRenderPipeline(GPURenderPipelineDescriptor descriptor); + Promise createComputePipelineAsync(GPUComputePipelineDescriptor descriptor); + Promise createRenderPipelineAsync(GPURenderPipelineDescriptor descriptor); + + GPUCommandEncoder createCommandEncoder(optional GPUCommandEncoderDescriptor descriptor = {}); + GPURenderBundleEncoder createRenderBundleEncoder(GPURenderBundleEncoderDescriptor descriptor); + + GPUQuerySet createQuerySet(GPUQuerySetDescriptor descriptor); +}; +GPUDevice includes GPUObjectBase; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPUBuffer { + Promise mapAsync(GPUMapModeFlags mode, optional GPUSize64 offset = 0, optional GPUSize64 size); + ArrayBuffer getMappedRange(optional GPUSize64 offset = 0, optional GPUSize64 size); + undefined unmap(); + + undefined destroy(); +}; +GPUBuffer includes GPUObjectBase; + + +dictionary GPUBufferDescriptor : GPUObjectDescriptorBase { + required GPUSize64 size; + required GPUBufferUsageFlags usage; + boolean mappedAtCreation = false; +}; + + +typedef [EnforceRange] unsigned long GPUBufferUsageFlags; +[Exposed=(Window, DedicatedWorker)] +namespace GPUBufferUsage { + const GPUFlagsConstant MAP_READ = 0x0001; + const GPUFlagsConstant MAP_WRITE = 0x0002; + const GPUFlagsConstant COPY_SRC = 0x0004; + const GPUFlagsConstant COPY_DST = 0x0008; + const GPUFlagsConstant INDEX = 0x0010; + const GPUFlagsConstant VERTEX = 0x0020; + const GPUFlagsConstant UNIFORM = 0x0040; + const GPUFlagsConstant STORAGE = 0x0080; + const GPUFlagsConstant INDIRECT = 0x0100; + const GPUFlagsConstant 
QUERY_RESOLVE = 0x0200; +}; + + +typedef [EnforceRange] unsigned long GPUMapModeFlags; +[Exposed=(Window, DedicatedWorker)] +namespace GPUMapMode { + const GPUFlagsConstant READ = 0x0001; + const GPUFlagsConstant WRITE = 0x0002; +}; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPUTexture { + GPUTextureView createView(optional GPUTextureViewDescriptor descriptor = {}); + + undefined destroy(); +}; +GPUTexture includes GPUObjectBase; + + +dictionary GPUTextureDescriptor : GPUObjectDescriptorBase { + required GPUExtent3D size; + GPUIntegerCoordinate mipLevelCount = 1; + GPUSize32 sampleCount = 1; + GPUTextureDimension dimension = "2d"; + required GPUTextureFormat format; + required GPUTextureUsageFlags usage; + sequence viewFormats = []; +}; + + +enum GPUTextureDimension { + "1d", + "2d", + "3d", +}; + + +typedef [EnforceRange] unsigned long GPUTextureUsageFlags; +[Exposed=(Window, DedicatedWorker)] +namespace GPUTextureUsage { + const GPUFlagsConstant COPY_SRC = 0x01; + const GPUFlagsConstant COPY_DST = 0x02; + const GPUFlagsConstant TEXTURE_BINDING = 0x04; + const GPUFlagsConstant STORAGE_BINDING = 0x08; + const GPUFlagsConstant RENDER_ATTACHMENT = 0x10; +}; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPUTextureView { +}; +GPUTextureView includes GPUObjectBase; + + +dictionary GPUTextureViewDescriptor : GPUObjectDescriptorBase { + GPUTextureFormat format; + GPUTextureViewDimension dimension; + GPUTextureAspect aspect = "all"; + GPUIntegerCoordinate baseMipLevel = 0; + GPUIntegerCoordinate mipLevelCount; + GPUIntegerCoordinate baseArrayLayer = 0; + GPUIntegerCoordinate arrayLayerCount; +}; + + +enum GPUTextureViewDimension { + "1d", + "2d", + "2d-array", + "cube", + "cube-array", + "3d", +}; + + +enum GPUTextureAspect { + "all", + "stencil-only", + "depth-only", +}; + + +enum GPUTextureFormat { + // 8-bit formats + "r8unorm", + "r8snorm", + "r8uint", + "r8sint", + + // 16-bit formats + "r16uint", + "r16sint", + "r16float", 
+ "rg8unorm", + "rg8snorm", + "rg8uint", + "rg8sint", + + // 32-bit formats + "r32uint", + "r32sint", + "r32float", + "rg16uint", + "rg16sint", + "rg16float", + "rgba8unorm", + "rgba8unorm-srgb", + "rgba8snorm", + "rgba8uint", + "rgba8sint", + "bgra8unorm", + "bgra8unorm-srgb", + // Packed 32-bit formats + "rgb9e5ufloat", + "rgb10a2unorm", + "rg11b10ufloat", + + // 64-bit formats + "rg32uint", + "rg32sint", + "rg32float", + "rgba16uint", + "rgba16sint", + "rgba16float", + + // 128-bit formats + "rgba32uint", + "rgba32sint", + "rgba32float", + + // Depth/stencil formats + "stencil8", + "depth16unorm", + "depth24plus", + "depth24plus-stencil8", + "depth32float", + + // "depth24unorm-stencil8" feature + "depth24unorm-stencil8", + + // "depth32float-stencil8" feature + "depth32float-stencil8", + + // BC compressed formats usable if "texture-compression-bc" is both + // supported by the device/user agent and enabled in requestDevice. + "bc1-rgba-unorm", + "bc1-rgba-unorm-srgb", + "bc2-rgba-unorm", + "bc2-rgba-unorm-srgb", + "bc3-rgba-unorm", + "bc3-rgba-unorm-srgb", + "bc4-r-unorm", + "bc4-r-snorm", + "bc5-rg-unorm", + "bc5-rg-snorm", + "bc6h-rgb-ufloat", + "bc6h-rgb-float", + "bc7-rgba-unorm", + "bc7-rgba-unorm-srgb", + + // ETC2 compressed formats usable if "texture-compression-etc2" is both + // supported by the device/user agent and enabled in requestDevice. + "etc2-rgb8unorm", + "etc2-rgb8unorm-srgb", + "etc2-rgb8a1unorm", + "etc2-rgb8a1unorm-srgb", + "etc2-rgba8unorm", + "etc2-rgba8unorm-srgb", + "eac-r11unorm", + "eac-r11snorm", + "eac-rg11unorm", + "eac-rg11snorm", + + // ASTC compressed formats usable if "texture-compression-astc" is both + // supported by the device/user agent and enabled in requestDevice. 
+ "astc-4x4-unorm", + "astc-4x4-unorm-srgb", + "astc-5x4-unorm", + "astc-5x4-unorm-srgb", + "astc-5x5-unorm", + "astc-5x5-unorm-srgb", + "astc-6x5-unorm", + "astc-6x5-unorm-srgb", + "astc-6x6-unorm", + "astc-6x6-unorm-srgb", + "astc-8x5-unorm", + "astc-8x5-unorm-srgb", + "astc-8x6-unorm", + "astc-8x6-unorm-srgb", + "astc-8x8-unorm", + "astc-8x8-unorm-srgb", + "astc-10x5-unorm", + "astc-10x5-unorm-srgb", + "astc-10x6-unorm", + "astc-10x6-unorm-srgb", + "astc-10x8-unorm", + "astc-10x8-unorm-srgb", + "astc-10x10-unorm", + "astc-10x10-unorm-srgb", + "astc-12x10-unorm", + "astc-12x10-unorm-srgb", + "astc-12x12-unorm", + "astc-12x12-unorm-srgb", +}; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPUExternalTexture { + readonly attribute boolean expired; +}; +GPUExternalTexture includes GPUObjectBase; + + +dictionary GPUExternalTextureDescriptor : GPUObjectDescriptorBase { + required HTMLVideoElement source; + GPUPredefinedColorSpace colorSpace = "srgb"; +}; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPUSampler { +}; +GPUSampler includes GPUObjectBase; + + +dictionary GPUSamplerDescriptor : GPUObjectDescriptorBase { + GPUAddressMode addressModeU = "clamp-to-edge"; + GPUAddressMode addressModeV = "clamp-to-edge"; + GPUAddressMode addressModeW = "clamp-to-edge"; + GPUFilterMode magFilter = "nearest"; + GPUFilterMode minFilter = "nearest"; + GPUMipmapFilterMode mipmapFilter = "nearest"; + float lodMinClamp = 0; + float lodMaxClamp = 32; + GPUCompareFunction compare; + [Clamp] unsigned short maxAnisotropy = 1; +}; + + +enum GPUAddressMode { + "clamp-to-edge", + "repeat", + "mirror-repeat", +}; + + +enum GPUFilterMode { + "nearest", + "linear", +}; + +enum GPUMipmapFilterMode { + "nearest", + "linear", +}; + + +enum GPUCompareFunction { + "never", + "less", + "equal", + "less-equal", + "greater", + "not-equal", + "greater-equal", + "always", +}; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPUBindGroupLayout { 
+}; +GPUBindGroupLayout includes GPUObjectBase; + + +dictionary GPUBindGroupLayoutDescriptor : GPUObjectDescriptorBase { + required sequence entries; +}; + + +typedef [EnforceRange] unsigned long GPUShaderStageFlags; +[Exposed=(Window, DedicatedWorker)] +namespace GPUShaderStage { + const GPUFlagsConstant VERTEX = 0x1; + const GPUFlagsConstant FRAGMENT = 0x2; + const GPUFlagsConstant COMPUTE = 0x4; +}; + +dictionary GPUBindGroupLayoutEntry { + required GPUIndex32 binding; + required GPUShaderStageFlags visibility; + + GPUBufferBindingLayout buffer; + GPUSamplerBindingLayout sampler; + GPUTextureBindingLayout texture; + GPUStorageTextureBindingLayout storageTexture; + GPUExternalTextureBindingLayout externalTexture; +}; + + +enum GPUBufferBindingType { + "uniform", + "storage", + "read-only-storage", +}; + +dictionary GPUBufferBindingLayout { + GPUBufferBindingType type = "uniform"; + boolean hasDynamicOffset = false; + GPUSize64 minBindingSize = 0; +}; + + +enum GPUSamplerBindingType { + "filtering", + "non-filtering", + "comparison", +}; + +dictionary GPUSamplerBindingLayout { + GPUSamplerBindingType type = "filtering"; +}; + + +enum GPUTextureSampleType { + "float", + "unfilterable-float", + "depth", + "sint", + "uint", +}; + +dictionary GPUTextureBindingLayout { + GPUTextureSampleType sampleType = "float"; + GPUTextureViewDimension viewDimension = "2d"; + boolean multisampled = false; +}; + + +enum GPUStorageTextureAccess { + "write-only", +}; + +dictionary GPUStorageTextureBindingLayout { + GPUStorageTextureAccess access = "write-only"; + required GPUTextureFormat format; + GPUTextureViewDimension viewDimension = "2d"; +}; + + +dictionary GPUExternalTextureBindingLayout { +}; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPUBindGroup { +}; +GPUBindGroup includes GPUObjectBase; + + +dictionary GPUBindGroupDescriptor : GPUObjectDescriptorBase { + required GPUBindGroupLayout layout; + required sequence entries; +}; + + +typedef (GPUSampler or 
GPUTextureView or GPUBufferBinding or GPUExternalTexture) GPUBindingResource; + +dictionary GPUBindGroupEntry { + required GPUIndex32 binding; + required GPUBindingResource resource; +}; + + +dictionary GPUBufferBinding { + required GPUBuffer buffer; + GPUSize64 offset = 0; + GPUSize64 size; +}; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPUPipelineLayout { +}; +GPUPipelineLayout includes GPUObjectBase; + + +dictionary GPUPipelineLayoutDescriptor : GPUObjectDescriptorBase { + required sequence bindGroupLayouts; +}; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPUShaderModule { + Promise compilationInfo(); +}; +GPUShaderModule includes GPUObjectBase; + + +dictionary GPUShaderModuleCompilationHint { + required GPUPipelineLayout layout; +}; + +dictionary GPUShaderModuleDescriptor : GPUObjectDescriptorBase { + required USVString code; + object sourceMap; + record hints; +}; + + +enum GPUCompilationMessageType { + "error", + "warning", + "info", +}; + +[Exposed=(Window, DedicatedWorker), Serializable, SecureContext] +interface GPUCompilationMessage { + readonly attribute DOMString message; + readonly attribute GPUCompilationMessageType type; + readonly attribute unsigned long long lineNum; + readonly attribute unsigned long long linePos; + readonly attribute unsigned long long offset; + readonly attribute unsigned long long length; +}; + +[Exposed=(Window, DedicatedWorker), Serializable, SecureContext] +interface GPUCompilationInfo { + readonly attribute FrozenArray messages; +}; + + +dictionary GPUPipelineDescriptorBase : GPUObjectDescriptorBase { + GPUPipelineLayout layout; +}; + +interface mixin GPUPipelineBase { + GPUBindGroupLayout getBindGroupLayout(unsigned long index); +}; + + +dictionary GPUProgrammableStage { + required GPUShaderModule module; + required USVString entryPoint; + record constants; +}; + +typedef double GPUPipelineConstantValue; // May represent WGSL's bool, f32, i32, u32. 
+ + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPUComputePipeline { +}; +GPUComputePipeline includes GPUObjectBase; +GPUComputePipeline includes GPUPipelineBase; + + +dictionary GPUComputePipelineDescriptor : GPUPipelineDescriptorBase { + required GPUProgrammableStage compute; +}; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPURenderPipeline { +}; +GPURenderPipeline includes GPUObjectBase; +GPURenderPipeline includes GPUPipelineBase; + + +dictionary GPURenderPipelineDescriptor : GPUPipelineDescriptorBase { + required GPUVertexState vertex; + GPUPrimitiveState primitive = {}; + GPUDepthStencilState depthStencil; + GPUMultisampleState multisample = {}; + GPUFragmentState fragment; +}; + + +enum GPUPrimitiveTopology { + "point-list", + "line-list", + "line-strip", + "triangle-list", + "triangle-strip", +}; + + +dictionary GPUPrimitiveState { + GPUPrimitiveTopology topology = "triangle-list"; + GPUIndexFormat stripIndexFormat; + GPUFrontFace frontFace = "ccw"; + GPUCullMode cullMode = "none"; + + // Requires "depth-clip-control" feature. 
+ boolean unclippedDepth = false; +}; + + +enum GPUFrontFace { + "ccw", + "cw", +}; + + +enum GPUCullMode { + "none", + "front", + "back", +}; + + +dictionary GPUMultisampleState { + GPUSize32 count = 1; + GPUSampleMask mask = 0xFFFFFFFF; + boolean alphaToCoverageEnabled = false; +}; + + +dictionary GPUFragmentState : GPUProgrammableStage { + required sequence targets; +}; + + +dictionary GPUColorTargetState { + required GPUTextureFormat format; + + GPUBlendState blend; + GPUColorWriteFlags writeMask = 0xF; // GPUColorWrite.ALL +}; + + +dictionary GPUBlendState { + required GPUBlendComponent color; + required GPUBlendComponent alpha; +}; + + +typedef [EnforceRange] unsigned long GPUColorWriteFlags; +[Exposed=(Window, DedicatedWorker)] +namespace GPUColorWrite { + const GPUFlagsConstant RED = 0x1; + const GPUFlagsConstant GREEN = 0x2; + const GPUFlagsConstant BLUE = 0x4; + const GPUFlagsConstant ALPHA = 0x8; + const GPUFlagsConstant ALL = 0xF; +}; + + +dictionary GPUBlendComponent { + GPUBlendOperation operation = "add"; + GPUBlendFactor srcFactor = "one"; + GPUBlendFactor dstFactor = "zero"; +}; + + +enum GPUBlendFactor { + "zero", + "one", + "src", + "one-minus-src", + "src-alpha", + "one-minus-src-alpha", + "dst", + "one-minus-dst", + "dst-alpha", + "one-minus-dst-alpha", + "src-alpha-saturated", + "constant", + "one-minus-constant", +}; + + +enum GPUBlendOperation { + "add", + "subtract", + "reverse-subtract", + "min", + "max", +}; + + +dictionary GPUDepthStencilState { + required GPUTextureFormat format; + + boolean depthWriteEnabled = false; + GPUCompareFunction depthCompare = "always"; + + GPUStencilFaceState stencilFront = {}; + GPUStencilFaceState stencilBack = {}; + + GPUStencilValue stencilReadMask = 0xFFFFFFFF; + GPUStencilValue stencilWriteMask = 0xFFFFFFFF; + + GPUDepthBias depthBias = 0; + float depthBiasSlopeScale = 0; + float depthBiasClamp = 0; +}; + + +dictionary GPUStencilFaceState { + GPUCompareFunction compare = "always"; + GPUStencilOperation 
failOp = "keep"; + GPUStencilOperation depthFailOp = "keep"; + GPUStencilOperation passOp = "keep"; +}; + + +enum GPUStencilOperation { + "keep", + "zero", + "replace", + "invert", + "increment-clamp", + "decrement-clamp", + "increment-wrap", + "decrement-wrap", +}; + + +enum GPUIndexFormat { + "uint16", + "uint32", +}; + + +enum GPUVertexFormat { + "uint8x2", + "uint8x4", + "sint8x2", + "sint8x4", + "unorm8x2", + "unorm8x4", + "snorm8x2", + "snorm8x4", + "uint16x2", + "uint16x4", + "sint16x2", + "sint16x4", + "unorm16x2", + "unorm16x4", + "snorm16x2", + "snorm16x4", + "float16x2", + "float16x4", + "float32", + "float32x2", + "float32x3", + "float32x4", + "uint32", + "uint32x2", + "uint32x3", + "uint32x4", + "sint32", + "sint32x2", + "sint32x3", + "sint32x4", +}; + + +enum GPUVertexStepMode { + "vertex", + "instance", +}; + + +dictionary GPUVertexState : GPUProgrammableStage { + sequence buffers = []; +}; + + +dictionary GPUVertexBufferLayout { + required GPUSize64 arrayStride; + GPUVertexStepMode stepMode = "vertex"; + required sequence attributes; +}; + + +dictionary GPUVertexAttribute { + required GPUVertexFormat format; + required GPUSize64 offset; + + required GPUIndex32 shaderLocation; +}; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPUCommandBuffer { +}; +GPUCommandBuffer includes GPUObjectBase; + + +dictionary GPUCommandBufferDescriptor : GPUObjectDescriptorBase { +}; + + +interface mixin GPUCommandsMixin { +}; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPUCommandEncoder { + GPURenderPassEncoder beginRenderPass(GPURenderPassDescriptor descriptor); + GPUComputePassEncoder beginComputePass(optional GPUComputePassDescriptor descriptor = {}); + + undefined copyBufferToBuffer( + GPUBuffer source, + GPUSize64 sourceOffset, + GPUBuffer destination, + GPUSize64 destinationOffset, + GPUSize64 size); + + undefined copyBufferToTexture( + GPUImageCopyBuffer source, + GPUImageCopyTexture destination, + GPUExtent3D 
copySize); + + undefined copyTextureToBuffer( + GPUImageCopyTexture source, + GPUImageCopyBuffer destination, + GPUExtent3D copySize); + + undefined copyTextureToTexture( + GPUImageCopyTexture source, + GPUImageCopyTexture destination, + GPUExtent3D copySize); + + undefined clearBuffer( + GPUBuffer buffer, + optional GPUSize64 offset = 0, + optional GPUSize64 size); + + undefined writeTimestamp(GPUQuerySet querySet, GPUSize32 queryIndex); + + undefined resolveQuerySet( + GPUQuerySet querySet, + GPUSize32 firstQuery, + GPUSize32 queryCount, + GPUBuffer destination, + GPUSize64 destinationOffset); + + GPUCommandBuffer finish(optional GPUCommandBufferDescriptor descriptor = {}); +}; +GPUCommandEncoder includes GPUObjectBase; +GPUCommandEncoder includes GPUCommandsMixin; +GPUCommandEncoder includes GPUDebugCommandsMixin; + + +dictionary GPUCommandEncoderDescriptor : GPUObjectDescriptorBase { +}; + + +dictionary GPUImageDataLayout { + GPUSize64 offset = 0; + GPUSize32 bytesPerRow; + GPUSize32 rowsPerImage; +}; + + +dictionary GPUImageCopyBuffer : GPUImageDataLayout { + required GPUBuffer buffer; +}; + + +dictionary GPUImageCopyTexture { + required GPUTexture texture; + GPUIntegerCoordinate mipLevel = 0; + GPUOrigin3D origin = {}; + GPUTextureAspect aspect = "all"; +}; + + +dictionary GPUImageCopyTextureTagged : GPUImageCopyTexture { + GPUPredefinedColorSpace colorSpace = "srgb"; + boolean premultipliedAlpha = false; +}; + + +dictionary GPUImageCopyExternalImage { + required (ImageBitmap or HTMLCanvasElement or OffscreenCanvas) source; + GPUOrigin2D origin = {}; + boolean flipY = false; +}; + + + +interface mixin GPUProgrammablePassEncoder { + undefined setBindGroup(GPUIndex32 index, GPUBindGroup bindGroup, + optional sequence dynamicOffsets = []); + + undefined setBindGroup(GPUIndex32 index, GPUBindGroup bindGroup, + Uint32Array dynamicOffsetsData, + GPUSize64 dynamicOffsetsDataStart, + GPUSize32 dynamicOffsetsDataLength); +}; + + +interface mixin GPUDebugCommandsMixin 
{ + undefined pushDebugGroup(USVString groupLabel); + undefined popDebugGroup(); + undefined insertDebugMarker(USVString markerLabel); +}; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPUComputePassEncoder { + undefined setPipeline(GPUComputePipeline pipeline); + undefined dispatch(GPUSize32 workgroupCountX, optional GPUSize32 workgroupCountY = 1, optional GPUSize32 workgroupCountZ = 1); + undefined dispatchIndirect(GPUBuffer indirectBuffer, GPUSize64 indirectOffset); + + undefined end(); +}; +GPUComputePassEncoder includes GPUObjectBase; +GPUComputePassEncoder includes GPUCommandsMixin; +GPUComputePassEncoder includes GPUDebugCommandsMixin; +GPUComputePassEncoder includes GPUProgrammablePassEncoder; + + +enum GPUComputePassTimestampLocation { + "beginning", + "end", +}; + +dictionary GPUComputePassTimestampWrite { + required GPUQuerySet querySet; + required GPUSize32 queryIndex; + required GPUComputePassTimestampLocation location; +}; + +typedef sequence GPUComputePassTimestampWrites; + +dictionary GPUComputePassDescriptor : GPUObjectDescriptorBase { + GPUComputePassTimestampWrites timestampWrites = []; +}; + + +interface mixin GPURenderEncoderBase { + undefined setPipeline(GPURenderPipeline pipeline); + + undefined setIndexBuffer(GPUBuffer buffer, GPUIndexFormat indexFormat, optional GPUSize64 offset = 0, optional GPUSize64 size); + undefined setVertexBuffer(GPUIndex32 slot, GPUBuffer buffer, optional GPUSize64 offset = 0, optional GPUSize64 size); + + undefined draw(GPUSize32 vertexCount, optional GPUSize32 instanceCount = 1, + optional GPUSize32 firstVertex = 0, optional GPUSize32 firstInstance = 0); + undefined drawIndexed(GPUSize32 indexCount, optional GPUSize32 instanceCount = 1, + optional GPUSize32 firstIndex = 0, + optional GPUSignedOffset32 baseVertex = 0, + optional GPUSize32 firstInstance = 0); + + undefined drawIndirect(GPUBuffer indirectBuffer, GPUSize64 indirectOffset); + undefined drawIndexedIndirect(GPUBuffer indirectBuffer, 
GPUSize64 indirectOffset); +}; + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPURenderPassEncoder { + undefined setViewport(float x, float y, + float width, float height, + float minDepth, float maxDepth); + + undefined setScissorRect(GPUIntegerCoordinate x, GPUIntegerCoordinate y, + GPUIntegerCoordinate width, GPUIntegerCoordinate height); + + undefined setBlendConstant(GPUColor color); + undefined setStencilReference(GPUStencilValue reference); + + undefined beginOcclusionQuery(GPUSize32 queryIndex); + undefined endOcclusionQuery(); + + undefined executeBundles(sequence bundles); + undefined end(); +}; +GPURenderPassEncoder includes GPUObjectBase; +GPURenderPassEncoder includes GPUCommandsMixin; +GPURenderPassEncoder includes GPUDebugCommandsMixin; +GPURenderPassEncoder includes GPUProgrammablePassEncoder; +GPURenderPassEncoder includes GPURenderEncoderBase; + + +enum GPURenderPassTimestampLocation { + "beginning", + "end", +}; + +dictionary GPURenderPassTimestampWrite { + required GPUQuerySet querySet; + required GPUSize32 queryIndex; + required GPURenderPassTimestampLocation location; +}; + +typedef sequence GPURenderPassTimestampWrites; + +dictionary GPURenderPassDescriptor : GPUObjectDescriptorBase { + required sequence colorAttachments; + GPURenderPassDepthStencilAttachment depthStencilAttachment; + GPUQuerySet occlusionQuerySet; + GPURenderPassTimestampWrites timestampWrites = []; +}; + + +dictionary GPURenderPassColorAttachment { + required GPUTextureView view; + GPUTextureView resolveTarget; + + GPUColor clearValue; + required GPULoadOp loadOp; + required GPUStoreOp storeOp; +}; + + +dictionary GPURenderPassDepthStencilAttachment { + required GPUTextureView view; + + float depthClearValue = 0; + GPULoadOp depthLoadOp; + GPUStoreOp depthStoreOp; + boolean depthReadOnly = false; + + GPUStencilValue stencilClearValue = 0; + GPULoadOp stencilLoadOp; + GPUStoreOp stencilStoreOp; + boolean stencilReadOnly = false; +}; + + +enum GPULoadOp { + 
"load", + "clear", +}; + + +enum GPUStoreOp { + "store", + "discard", +}; + + +dictionary GPURenderPassLayout: GPUObjectDescriptorBase { + required sequence colorFormats; + GPUTextureFormat depthStencilFormat; + GPUSize32 sampleCount = 1; +}; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPURenderBundle { +}; +GPURenderBundle includes GPUObjectBase; + + +dictionary GPURenderBundleDescriptor : GPUObjectDescriptorBase { +}; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPURenderBundleEncoder { + GPURenderBundle finish(optional GPURenderBundleDescriptor descriptor = {}); +}; +GPURenderBundleEncoder includes GPUObjectBase; +GPURenderBundleEncoder includes GPUCommandsMixin; +GPURenderBundleEncoder includes GPUDebugCommandsMixin; +GPURenderBundleEncoder includes GPUProgrammablePassEncoder; +GPURenderBundleEncoder includes GPURenderEncoderBase; + + +dictionary GPURenderBundleEncoderDescriptor : GPURenderPassLayout { + boolean depthReadOnly = false; + boolean stencilReadOnly = false; +}; + + +dictionary GPUQueueDescriptor : GPUObjectDescriptorBase { +}; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPUQueue { + undefined submit(sequence commandBuffers); + + Promise onSubmittedWorkDone(); + + undefined writeBuffer( + GPUBuffer buffer, + GPUSize64 bufferOffset, + [AllowShared] BufferSource data, + optional GPUSize64 dataOffset = 0, + optional GPUSize64 size); + + undefined writeTexture( + GPUImageCopyTexture destination, + [AllowShared] BufferSource data, + GPUImageDataLayout dataLayout, + GPUExtent3D size); + + undefined copyExternalImageToTexture( + GPUImageCopyExternalImage source, + GPUImageCopyTextureTagged destination, + GPUExtent3D copySize); +}; +GPUQueue includes GPUObjectBase; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPUQuerySet { + undefined destroy(); +}; +GPUQuerySet includes GPUObjectBase; + + +dictionary GPUQuerySetDescriptor : GPUObjectDescriptorBase { + required 
GPUQueryType type; + required GPUSize32 count; +}; + + +enum GPUQueryType { + "occlusion", + "timestamp", +}; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPUCanvasContext { + readonly attribute (HTMLCanvasElement or OffscreenCanvas) canvas; + + undefined configure(GPUCanvasConfiguration configuration); + undefined unconfigure(); + + GPUTextureFormat getPreferredFormat(GPUAdapter adapter); + GPUTexture getCurrentTexture(); +}; + + + +enum GPUCanvasCompositingAlphaMode { + "opaque", + "premultiplied", +}; + +dictionary GPUCanvasConfiguration { + required GPUDevice device; + required GPUTextureFormat format; + GPUTextureUsageFlags usage = 0x10; // GPUTextureUsage.RENDER_ATTACHMENT + sequence viewFormats = []; + GPUPredefinedColorSpace colorSpace = "srgb"; + GPUCanvasCompositingAlphaMode compositingAlphaMode = "opaque"; + GPUExtent3D size; +}; + + +enum GPUDeviceLostReason { + "destroyed", +}; + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPUDeviceLostInfo { + readonly attribute (GPUDeviceLostReason or undefined) reason; + readonly attribute DOMString message; +}; + +partial interface GPUDevice { + readonly attribute Promise lost; +}; + + +enum GPUErrorFilter { + "out-of-memory", + "validation", +}; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPUOutOfMemoryError { + constructor(); +}; + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPUValidationError { + constructor(DOMString message); + readonly attribute DOMString message; +}; + +typedef (GPUOutOfMemoryError or GPUValidationError) GPUError; + + +partial interface GPUDevice { + undefined pushErrorScope(GPUErrorFilter filter); + Promise popErrorScope(); +}; + + +[Exposed=(Window, DedicatedWorker), SecureContext] +interface GPUUncapturedErrorEvent : Event { + constructor( + DOMString type, + GPUUncapturedErrorEventInit gpuUncapturedErrorEventInitDict + ); + readonly attribute GPUError error; +}; + +dictionary GPUUncapturedErrorEventInit : 
EventInit { + required GPUError error; +}; + + +partial interface GPUDevice { + [Exposed=(Window, DedicatedWorker)] + attribute EventHandler onuncapturederror; +}; + + +typedef [EnforceRange] unsigned long GPUBufferDynamicOffset; +typedef [EnforceRange] unsigned long GPUStencilValue; +typedef [EnforceRange] unsigned long GPUSampleMask; +typedef [EnforceRange] long GPUDepthBias; + +typedef [EnforceRange] unsigned long long GPUSize64; +typedef [EnforceRange] unsigned long GPUIntegerCoordinate; +typedef [EnforceRange] unsigned long GPUIndex32; +typedef [EnforceRange] unsigned long GPUSize32; +typedef [EnforceRange] long GPUSignedOffset32; + +typedef unsigned long GPUFlagsConstant; + + +dictionary GPUColorDict { + required double r; + required double g; + required double b; + required double a; +}; +typedef (sequence or GPUColorDict) GPUColor; + + +dictionary GPUOrigin2DDict { + GPUIntegerCoordinate x = 0; + GPUIntegerCoordinate y = 0; +}; +typedef (sequence or GPUOrigin2DDict) GPUOrigin2D; + + +dictionary GPUOrigin3DDict { + GPUIntegerCoordinate x = 0; + GPUIntegerCoordinate y = 0; + GPUIntegerCoordinate z = 0; +}; +typedef (sequence or GPUOrigin3DDict) GPUOrigin3D; + + +dictionary GPUExtent3DDict { + required GPUIntegerCoordinate width; + GPUIntegerCoordinate height = 1; + GPUIntegerCoordinate depthOrArrayLayers = 1; +}; +typedef (sequence or GPUExtent3DDict) GPUExtent3D; diff --git a/wgsl.html b/wgsl.html new file mode 100644 index 0000000000..ce9d53503a --- /dev/null +++ b/wgsl.html @@ -0,0 +1 @@ + diff --git a/wgsl/.pr-preview.json b/wgsl/.pr-preview.json deleted file mode 100644 index 41eb560613..0000000000 --- a/wgsl/.pr-preview.json +++ /dev/null @@ -1,7 +0,0 @@ -{ - "src_file": "index.bs", - "type": "bikeshed", - "params": { - "force": 1 - } -} diff --git a/wgsl/Makefile b/wgsl/Makefile deleted file mode 100644 index 06cd95a772..0000000000 --- a/wgsl/Makefile +++ /dev/null @@ -1,8 +0,0 @@ -all: index.html - -index.html: index.bs - bikeshed --die-on=everything 
spec index.bs - -online: - curl https://api.csswg.org/bikeshed/ -F file=@index.bs -F output=err - curl https://api.csswg.org/bikeshed/ -F file=@index.bs -F force=1 > index.html diff --git a/wgsl/README.md b/wgsl/README.md index eb78a196de..3ee04a2480 100644 --- a/wgsl/README.md +++ b/wgsl/README.md @@ -1,19 +1,47 @@ # WebGPU Shading Language Specification -## Generating the specification +## Dependencies -The specification is written using [Bikeshed](https://tabatkins.github.io/bikeshed). +The specification is written using [Bikeshed](https://tabatkins.github.io/bikeshed). \ +The WGSL grammar in the specification is validated using [Tree-sitter](https://tree-sitter.github.io/tree-sitter/). -If you have bikeshed installed locally, you can generate the specification with: +To install both `Bikeshed` and `Tree-sitter`, type: +```bash +python3 -m pip install bikeshed==3.0.3 tree_sitter==0.19.0 ``` + +## Generating both the specification and validating grammar (recommended) + +With both `Bikeshed` and `Tree-sitter` locally installed, type: + +```bash make ``` -This simply runs bikeshed on the `index.bs` file. +The rendered specification will be written to `index.html`. -Otherwise, you can use the bikeshed Web API: +## Generating the specification only +With `Bikeshed` locally installed, type: + +```bash +make index.html ``` + +Alternatively, if you do not have `Bikeshed` locally installed, you can use the Bikeshed Web API to generate the specification (slower): + +```bash make online ``` + +Either approach will write the rendered specification to `index.html`. 
+ +## Validating grammar only + +With `Tree-sitter` installed, type: + +```bash +make grammar/grammar.js +``` + diff --git a/wgsl/extract-grammar.py b/wgsl/extract-grammar.py new file mode 100644 index 0000000000..29b39e4ce5 --- /dev/null +++ b/wgsl/extract-grammar.py @@ -0,0 +1,550 @@ +#!/usr/bin/env python3 + +from datetime import date +from string import Template + +import os +import re +import subprocess +import sys + +from tree_sitter import Language, Parser + +HEADER = """ +// Copyright (C) [$YEAR] World Wide Web Consortium, +// (Massachusetts Institute of Technology, European Research Consortium for +// Informatics and Mathematics, Keio University, Beihang). +// All Rights Reserved. +// +// This work is distributed under the W3C (R) Software License [1] in the hope +// that it will be useful, but WITHOUT ANY WARRANTY; without even the implied +// warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. +// +// [1] http://www.w3.org/Consortium/Legal/copyright-software + +// **** This file is auto-generated. Do not edit. **** + +""".lstrip() + +scanner_filename = sys.argv[1] +scanner_file = open(scanner_filename, "r") +# Break up the input into lines, and skip empty lines. 
+scanner_lines = [j for i in [i.split("\n") + for i in scanner_file.readlines()] for j in i if len(j) > 0] +# Replace comments in rule text +scanner_lines = [re.sub('', '', line) for line in scanner_lines] + +grammar_filename = sys.argv[2] +grammar_path = os.path.dirname(grammar_filename) +os.makedirs(grammar_path, exist_ok=True) +grammar_file = open(grammar_filename, "w") + + +def scanner_escape_name(name): + return name.strip().replace("`", "").replace('-', '_').lower().strip() + + +def scanner_escape_regex(regex): + return re.escape(regex.strip()).strip().replace("/", "\\/").replace("\\_", "_").replace("\\%", "%").replace("\\;", ";").replace("\\<", "<").replace("\\>", ">").replace("\\=", "=").replace("\\,", ",").replace("\\:", ":").replace("\\!", "!") + + +class scanner_rule: + @staticmethod + def name(): + return "rule" + + @staticmethod + def begin(lines, i): + line = lines[i].rstrip() + return (line.startswith("
 [ + $._block_comment, + ], + + extras: $ => [ + $._comment, + $._block_comment, + $._space, + ], + + inline: $ => [ + $.global_decl, + $._reserved, + ], + + conflicts: $ => [ + [$.type_decl,$.function_call_expression], + ], + + word: $ => $.ident, + + rules: { +"""[1:-1] +grammar_source += "\n" + + +def grammar_from_rule_item(rule_item): + result = "" + item_choice = False + items = [] + i = 0 + while i < len(rule_item): + i_optional = False + i_repeatone = False + i_skip = 0 + i_item = "" + if rule_item[i].startswith("[=syntax/"): + i_item = rule_item[i].split("[=syntax/")[1].split("=]")[0] + i_item = f"$.{i_item}" + elif rule_item[i].startswith("`/"): + i_item = f"token({rule_item[i][1:-1]})" + elif rule_item[i].startswith("`'"): + i_item = f"token({rule_item[i][1:-1]})" + elif rule_item[i] == "(": + j = i + 1 + j_span = 0 + rule_subitem = [] + while j < len(rule_item): + if rule_item[j] == "(": + j_span += 1 + elif rule_item[j] == ")": + j_span -= 1 + rule_subitem.append(rule_item[j]) + j += 1 + if rule_item[j] == ")" and j_span == 0: + break + i_item = grammar_from_rule_item(rule_subitem) + i = j + if len(rule_item) - i > 1: + if rule_item[i + 1] == "+": + i_repeatone = True + i_skip += 1 + elif rule_item[i + 1] == "?": + i_optional = True + i_skip += 1 + elif rule_item[i + 1] == "*": + i_repeatone = True + i_optional = True + i_skip += 1 + elif rule_item[i + 1] == "|": + item_choice = True + i_skip += 1 + if i_repeatone: + i_item = f"repeat1({i_item})" + if i_optional: + i_item = f"optional({i_item})" + items.append(i_item) + i += 1 + i_skip + if item_choice == True: + result = f"choice({', '.join(items)})" + else: + if len(items) == 1: + result = items[0] + else: + result = f"seq({', '.join(items)})" + return result + + +def grammar_from_rule(key, value): + result = f" {key}: $ =>" + if len(value) == 1: + result += f" {grammar_from_rule_item(value[0])}" + else: + result += " choice(\n {}\n )".format( + ',\n '.join([grammar_from_rule_item(i) for i in value])) 
+ return result + + +scanner_components[scanner_rule.name()]["_comment"] = [["`'//'`", '`/.*/`']] + +# Following sections are to allow out-of-order per syntactic grammar appearance of rules + + +rule_skip = set() + +for rule in ["translation_unit", "global_directive", "global_decl"]: + grammar_source += grammar_from_rule( + rule, scanner_components[scanner_rule.name()][rule]) + ",\n" + rule_skip.add(rule) + + +# Extract literals + + +for key, value in scanner_components[scanner_rule.name()].items(): + if key.endswith("_literal") and key not in rule_skip: + grammar_source += grammar_from_rule(key, value) + ",\n" + rule_skip.add(key) + + +# Extract constituents + + +def not_token_only(value): + result = False + for i in value: + result = result or len( + [j for j in i if not j.startswith("`/") and not j.startswith("`'")]) > 0 + return result + + +for key, value in scanner_components[scanner_rule.name()].items(): + if not key.startswith("_") and key != "ident" and not_token_only(value) and key not in rule_skip: + grammar_source += grammar_from_rule(key, value) + ",\n" + rule_skip.add(key) + + +# Extract tokens + + +for key, value in scanner_components[scanner_rule.name()].items(): + if not key.startswith("_") and key != "ident" and key not in rule_skip: + grammar_source += grammar_from_rule(key, value) + ",\n" + rule_skip.add(key) + + +# Extract underscore + + +for key, value in scanner_components[scanner_rule.name()].items(): + if key.startswith("_") and key != "_comment" and key != "_space" and key not in rule_skip: + grammar_source += grammar_from_rule(key, value) + ",\n" + rule_skip.add(key) + + +# Extract ident + + +grammar_source += grammar_from_rule( + "ident", scanner_components[scanner_rule.name()]["ident"]) + ",\n" +rule_skip.add("ident") + + +# Extract comment + + +grammar_source += grammar_from_rule( + "_comment", scanner_components[scanner_rule.name()]["_comment"]) + ",\n" +rule_skip.add("_comment") + + +# Extract space + + +grammar_source += 
grammar_from_rule( + "_space", scanner_components[scanner_rule.name()]["_space"]) +rule_skip.add("_space") + + +grammar_source += "\n" +grammar_source += r""" + }, +}); +"""[1:-1] + +headerTemplate = Template(HEADER) +grammar_file.write(headerTemplate.substitute( + YEAR=date.today().year) + grammar_source + "\n") +grammar_file.close() + +with open(grammar_path + "/package.json", "w") as grammar_package: + grammar_package.write('{\n') + grammar_package.write(' "name": "tree-sitter-wgsl",\n') + grammar_package.write(' "dependencies": {\n') + grammar_package.write(' "nan": "^2.15.0"\n') + grammar_package.write(' },\n') + grammar_package.write(' "devDependencies": {\n') + grammar_package.write(' "tree-sitter-cli": "^0.20.0"\n') + grammar_package.write(' },\n') + grammar_package.write(' "main": "bindings/node"\n') + grammar_package.write('}\n') + +# External scanner for nested block comments +# For the API, see https://tree-sitter.github.io/tree-sitter/creating-parsers#external-scanners +# See: https://github.com/tree-sitter/tree-sitter-rust/blob/master/src/scanner.c + +os.makedirs(os.path.join(grammar_path, "src"), exist_ok=True) +with open(os.path.join(grammar_path, "src", "scanner.c"), "w") as external_scanner: + external_scanner.write(r""" +#include <tree_sitter/parser.h> +#include <wctype.h> + +enum TokenType { + BLOCK_COMMENT, +}; + +void *tree_sitter_wgsl_external_scanner_create() { return NULL; } +void tree_sitter_wgsl_external_scanner_destroy(void *p) {} +unsigned tree_sitter_wgsl_external_scanner_serialize(void *p, char *buffer) { return 0; } +void tree_sitter_wgsl_external_scanner_deserialize(void *p, const char *b, unsigned n) {} + +static void advance(TSLexer *lexer) { + lexer->advance(lexer, false); +} + +bool tree_sitter_wgsl_external_scanner_scan(void *payload, TSLexer *lexer, + const bool *valid_symbols) { + while (iswspace(lexer->lookahead)) lexer->advance(lexer, true); + + if (lexer->lookahead == '/') { + advance(lexer); + if (lexer->lookahead != '*') return false; + advance(lexer); +
+ bool after_star = false; + unsigned nesting_depth = 1; + for (;;) { + switch (lexer->lookahead) { + case '\0': + /* This signals the end of input. Since nesting depth is + * greater than zero, the scanner is in the middle of + * a block comment. Block comments must be affirmatively + * terminated. + */ + return false; + case '*': + advance(lexer); + after_star = true; + break; + case '/': + if (after_star) { + advance(lexer); + after_star = false; + nesting_depth--; + if (nesting_depth == 0) { + lexer->result_symbol = BLOCK_COMMENT; + return true; + } + } else { + advance(lexer); + after_star = false; + if (lexer->lookahead == '*') { + nesting_depth++; + advance(lexer); + } + } + break; + default: + advance(lexer); + after_star = false; + break; + } + } + } + + return false; +} +"""[1:-1]) + +subprocess.run(["npm", "install"], cwd=grammar_path, check=True) +subprocess.run(["npx", "tree-sitter", "generate"], + cwd=grammar_path, check=True) +# Following are commented for future reference to expose playground +# Remove "--docker" if local environment matches with the container +# subprocess.run(["npx", "tree-sitter", "build-wasm", "--docker"], +# cwd=grammar_path, check=True) + +Language.build_library( + grammar_path + "/build/wgsl.so", + [ + grammar_path, + ] +) + +WGSL_LANGUAGE = Language(grammar_path + "/build/wgsl.so", "wgsl") + +parser = Parser() +parser.set_language(WGSL_LANGUAGE) + +error_list = [] + +for key, value in scanner_components[scanner_example.name()].items(): + if "expect-error" in key: + continue + value = value[:] + if "function-scope" in key: + value = ["fn function__scope____() {"] + value + ["}"] + if "type-scope" in key: + # Initialize with zero-value expression.
+ value = ["let type_scope____: "] + value + ["="] + value + ["()"] + [";"] + program = "\n".join(value) + tree = parser.parse(bytes(program, "utf8")) + if tree.root_node.has_error: + error_list.append((program, tree)) + # TODO Semantic CI + +if len(error_list) > 0: + for error in error_list: + print("Example:") + print(error[0]) + print("Tree:") + print(error[1].root_node.sexp()) + raise Exception("Grammar is not compatible with examples!") diff --git a/wgsl/index.bs b/wgsl/index.bs deleted file mode 100644 index 7c83842079..0000000000 --- a/wgsl/index.bs +++ /dev/null @@ -1,7490 +0,0 @@ - - - - -
-{
-  "WebGPU": {
-    "authors": [
-      "Dzmitry Malyshau",
-      "Justin Fan",
-      "Kai Ninomiya"
-    ],
-    "href": "https://gpuweb.github.io/gpuweb/",
-    "title": "WebGPU",
-    "status": "Editor's Draft",
-    "publisher": "W3C",
-    "deliveredBy": [
-      "https://github.com/gpuweb/gpuweb"
-    ]
-  },
-  "VulkanMemoryModel": {
-    "authors": [
-      "Jeff Bolz",
-      "Alan Baker",
-      "Tobias Hector",
-      "David Neto",
-      "Robert Simpson",
-      "Brian Sumner"
-    ],
-    "href": "https://www.khronos.org/registry/vulkan/specs/1.2-extensions/html/vkspec.html#memory-model",
-    "title": "Vulkan Memory Model",
-    "publisher": "Khronos Group"
-  }
-}
-
- -# Introduction # {#intro} - -WebGPU Shader Language ([SHORTNAME]) is the shader language for [[!WebGPU]]. -That is, an application using the WebGPU API uses [SHORTNAME] to express the programs, known as shaders, -that run on the GPU. - -
- - [[stage(fragment)]] - fn main() -> [[location(0)]] vec4<f32> { - return vec4<f32>(0.4, 0.4, 0.8, 1.0); - } - -
- - -## Goals ## {#goals} - - * Trivially convertible to SPIR-V - * Constructs are defined as normative references to their SPIR-V counterparts - * All features in [SHORTNAME] are directly translatable to SPIR-V. (No polymorphism, no general pointers, no overloads, etc.) - * Features and semantics are exactly the ones of SPIR-V - * Each item in this spec *must* provide the mapping to SPIR-V for the construct - -## Technical Overview ## {#technical-overview} - -WebGPU issues a unit of work to the GPU in the form of a [[WebGPU#gpu-command|GPU command]]. -[SHORTNAME] is concerned with two kinds of GPU commands: -* a draw command executes a [=GPURenderPipeline|render pipeline=] - in the context of [=pipeline input|inputs=], [=pipeline output|outputs=], and attached [=resources=]. -* a dispatch command executes a [=GPUComputePipeline|compute pipeline=] - in the context of [=pipeline input|inputs=] and attached [=resources=]. - -Both kinds of pipelines use shaders written in [SHORTNAME]. - -A shader is the portion of a [SHORTNAME] program that executes a [=shader stage=] in a pipeline. -A shader comprises: -* An [=entry point=] [=function/function=]. -* The transitive closure of all called functions, starting with the entry point. - This set includes both [=user-defined function|user-defined=] and [=built-in function|built-in=] functions. - (For a more rigorous definition, see "[=functions in a shader stage=]".) -* The set of variables and constants [=statically accessed=] by all those functions. -* The set of types used to define or analyze all those functions, variables, and constants. - -When executing a shader stage, the implementation: -* Binds [=resources=] to variables in the shader's [=resource interface of a shader|resource interface=], - making the contents of those resources available to the shader during execution. -* Allocates storage for other [=module scope|module-scope=] variables, - and populates that storage with the specified initial values.
-* Populates the formal parameters of the entry point, if they exist, with the stage's pipeline inputs. -* Connects the entry point return value, if one exists, to the stage's pipeline outputs. -* Then it invokes the entry point. - -A [SHORTNAME] program is organized into: -* Functions, which specify execution behaviour. -* Statements, which are declarations or units of executable behaviour. -* Literals, which are text representations for pure mathematical values. -* Constants, each providing a name for a value computed at a specific time. -* Variables, each providing a name for storage for holding a value. -* Expressions, each of which combines a set of values to produce a result value. -* Types, each of which describes: - * A set of values. - * Constraints on supported expressions. - * The semantics of those expressions. - -[SHORTNAME] is an imperative language: behaviour is specified as a sequence of statements to execute. -Statements: -* Declare constants or variables -* Modify the contents of variables -* Modify execution order using structured programming constructs: - * Selective execution: if/else/elseif, switch - * Repetition: loop, for - * Escaping a nested execution construct: break, continue - * Refactoring: function call and return - * Discard (fragment shaders only): terminating the invocation and throwing away the output -* Evaluate expressions to compute values as part of the above behaviours. - -[SHORTNAME] is statically typed: each value computed by a particular expression is in a specific type, -determined only by examining the program source. - -[SHORTNAME] has types to describe booleans, numbers, vectors, matrices, and aggregations -of these in the form of arrays and structures. -Additional types describe memory. - -[SHORTNAME] has texture and sampler types. -Together with their associated built-in functions, these support functionality -commonly used for graphics rendering, and commonly provided by GPUs. 
- -The work of a shader stage is partitioned into one or more invocations, -each of which executes the entry point, but under slightly different conditions. -Invocations in a shader stage share access to certain variables: -* All invocations in the stage share the resources in the shader interface. -* In a [=compute shader stage|compute shader=], invocations in the same - [=compute shader stage/workgroup=] share - variables in the [=storage classes/workgroup=] [=storage class=]. - Invocations in different workgroups do not share those variables. - -However, the invocations act on different sets of pipeline inputs, including built-in inputs -that provide an identifying value to distinguish an invocation from its peers. -Also, each invocation has its own independent storage space in the form -of variables in the [=storage classes/private=] and [=storage classes/function=] storage classes. - -Invocations within a shader stage execute concurrently, and may often execute in parallel. -The shader author is responsible for ensuring the dynamic behaviour of the invocations -in a shader stage: -* Meet the uniformity requirements of certain primitive operations, including texture sampling and control barriers. -* Coordinate potentially conflicting accesses to shared variables, to avoid race conditions. - -[SHORTNAME] sometimes permits several possible behaviours for a given feature. -This is a portability hazard, as different implementations may exhibit the different behaviours. -The design of [SHORTNAME] aims to minimize such cases, but is constrained by feasibility, -and goals for achieving high performance across a broad range of devices. 
- -## Notation ## {#notation} - -The floor expression is defined over real numbers |x|: - -* ⌊|x|⌋ = |k|, where |k| is the unique integer such that |k| ≤ |x| &lt; |k|+1 - -The ceiling expression is defined over real numbers |x|: - -* ⌈|x|⌉ = |k|, where |k| is the unique integer such that |k|-1 &lt; |x| ≤ |k| - -The roundUp function is defined for positive integers |k| and |n| as: - -* roundUp(|k|, |n|) = ⌈|n| ÷ |k|⌉ × |k| - -The transpose of an |n|-column |m|-row matrix |A| is the |m|-column |n|-row matrix -|A|<sup>T</sup> formed by copying the rows of |A| as the columns of |A|<sup>T</sup>: - -* transpose(|A|) = |A|<sup>T</sup> -* transpose(|A|)<sub>|i|,|j|</sub> = |A|<sub>|j|,|i|</sub> - -The transpose of a column vector is defined by interpreting the column vector as a 1-column matrix. -Similarly, the transpose of a row vector is defined by interpreting the row vector as a 1-row matrix. - -# Shader Lifecycle # {#program-lifecycle} - -There are four key events in the lifecycle of a [SHORTNAME] program and the shaders it may contain. -The first two correspond to the WebGPU API methods used to prepare a [SHORTNAME] program -for execution. -The last two are the start and end of execution of a shader. - -The events are: - -1. Shader module creation - * This occurs when the - [[WebGPU#dom-gpudevice-createshadermodule|WebGPU createShaderModule]] method - is called. - The source text for a [SHORTNAME] program is provided at this time. -2. Pipeline creation - * This occurs when the - [[WebGPU#dom-gpudevice-createcomputepipeline|WebGPU createComputePipeline]] method - or the - [[WebGPU#dom-gpudevice-createrenderpipeline|WebGPU createRenderPipeline]] method - is invoked. - These methods use one or more previously created shader modules, together with other - configuration information. -3. Shader execution start - * This occurs when a [=draw command|draw=] or [=dispatch command=] is issued to the GPU, - begins executing the pipeline, - and invokes the [=shader stage=] [=entry point=] function. -4.
Shader execution end - * This occurs when all work in the shader completes: - * all its [=invocations=] terminate - * and all accesses to [=resources=] complete - * outputs, if any, are passed to downstream pipeline stages. - -The events are ordered due to: -* data dependencies: shader execution requires a pipeline, and a pipeline requires a shader module. -* causality: the shader must start executing before it can finish executing. - -## Kinds of errors ## {#kinds-of-errors} - -A program error is a failure to satisfy the requirements of this specification. - -There are three kinds of errors, corresponding to the shader lifecycle: - -* A shader-creation error - is an error feasibly detectable at [=shader module creation=] time. - Detection must rely only on the [SHORTNAME] program source text - and other information available to the `createShaderModule` API method. - -* A pipeline-creation error - is an error feasibly detectable at [=pipeline creation=] time. - Detection must rely only on the [SHORTNAME] program source text - and other information available to the particular pipeline creation API method. - -* A dynamic error is an error occurring during shader execution. - These errors may or may not be detectable. - -Note: For example, a race condition may not be detectable. - -Each requirement in this specification corresponds to a single kind of error. -Generally, a requirement corresponds to the earliest error kind at which its violation could be feasibly detected. -When unclear, the corresponding error kind is explicitly specified. - -The WebGPU specification describes the consequences of each kind of error. - -TODO: Update the WebGPU spec, referring back to the three kinds of errors defined here. - -# Textual structure TODO # {#textual-structure} - -TODO: This is a stub. - -A [SHORTNAME] program is text. -This specification does not prescribe a particular encoding for that text. 
- -## Comments ## {#comments} - -Comments begin with `//` and continue to the end of the current line. There are no multi-line comments. - -TODO: What indicates the end of a line? (E.g. A line ends at the next linefeed or at the end of the program) - -## Tokens TODO ## {#tokens} - -## Literals TODO ## {#literals} - - - - -
<table><thead><tr><th>Token<th>Definition</thead> -
<tr><td>`DECIMAL_FLOAT_LITERAL`<td>`(-?[0-9]*\.[0-9]+ | -?[0-9]+\.[0-9]*)((e|E)(\+|-)?[0-9]+)?` -
<tr><td>`HEX_FLOAT_LITERAL`<td>`-?0x([0-9a-fA-F]*\.?[0-9a-fA-F]+ | [0-9a-fA-F]+\.[0-9a-fA-F]*)(p|P)(\+|-)?[0-9]+` -
<tr><td>`INT_LITERAL`<td>`-?0x[0-9a-fA-F]+ | 0 | -?[1-9][0-9]*` -
<tr><td>`UINT_LITERAL`<td>`0x[0-9a-fA-F]+u | 0u | [1-9][0-9]*u`</table> -
- -Note: literals are parsed greedily. This means that for statements like `a -5` - this will *not* parse as `a` `minus` `5` but instead as `a` `-5` which - may be unexpected. A space must be inserted after the `-` if the first - expression is desired. - -
-const_literal
-  : INT_LITERAL
-  | UINT_LITERAL
-  | FLOAT_LITERAL
-  | TRUE
-  | FALSE
-
- -
-FLOAT_LITERAL
-  : DECIMAL_FLOAT_LITERAL
-  | HEX_FLOAT_LITERAL
-
- - -## Keywords TODO ## {#keywords} - -TODO: *Stub* - -See [[#keyword-summary]] for a list of keywords. - -## Identifiers TODO ## {#identifiers} - - - - -
<table><thead><tr><th>Token<th>Definition</thead> -
<tr><td>`IDENT`<td>`[a-zA-Z][0-9a-zA-Z_]*`</table> -
- -An identifier must not have the same spelling as a keyword or as a reserved keyword. - -## Attributes ## {#attributes} - -An attribute modifies an object or type. -[SHORTNAME] provides a unified syntax for applying attributes. -Attributes are used for a variety of purposes such as specifying the interface with the API. -Generally speaking, from the language's point-of-view, attributes can be -ignored for the purposes of type and semantic checking. - -An attribute must not be specified more than once per object or type. - -
-attribute_list
-  : ATTR_LEFT (attribute COMMA)* attribute ATTR_RIGHT
-
-attribute
-  : IDENT PAREN_LEFT literal_or_ident PAREN_RIGHT
-  | IDENT
-
-literal_or_ident
-  : FLOAT_LITERAL
-  | INT_LITERAL
-  | UINT_LITERAL
-  | IDENT
-
- - - - - - -
Attributes defined in [SHORTNAME]
AttributeValid ValuesDescription -
`access` - `read`, `write`, or `read_write` - Must only be applied to a type used as a store type for a variable in - the [=storage classes/storage=] storage class or a variable of [storage - texture](#texture-storage) type. - - Specifies the access qualification of a storage [=resource=] variable. - -
`align` - positive i32 literal - Must only be applied to a member of a [=structure=] type. - - Must be a power of 2. - - See memory layout [alignment and size](#alignment-and-size). - -
`binding` - non-negative i32 literal - Must only be applied to a [=resource=] variable. - - Specifies the binding number of the resource in a bind [=attribute/group=]. - See [[#resource-interface]]. - -
`block` - *None* - Must only be applied to a [=structure=] type. - - Indicates this structure type represents the contents of a buffer - resource occupying a single binding slot in the [=resource interface of a - shader|shader's resource interface=]. - - The `block` attribute must be applied to a structure type used as the - [=store type=] of a [=uniform buffer=] or [=storage buffer=] variable. - - A structure type with the block attribute must not be: - * the element type of an [=array=] type - * the member type in another structure - -
`builtin` - a builtin variable identifier - Must only be applied to an entry point function parameter, entry point - return type, or member of a [=structure=]. - - Declares a builtin variable. - See [[#builtin-variables]]. - -
`constant_id` - non-negative i32 literal - Must only be applied to module scope constant declaration of [=scalar=] type. - - Specifies a [=pipeline-overridable=] constant. - -
`group` - non-negative i32 literal - Must only be applied to a [=resource=] variable. - - Specifies the binding group of the resource. - See [[#resource-interface]]. - -
`interpolate` - One or two parameters. - - The first parameter must be an [=interpolation type=]. - The second parameter, if present, must specify the [=interpolation sampling=]. - Must only be applied to an entry point function parameter, entry point - return type, or member of a [=structure=] type. - Must only be applied to declarations of scalars or vectors of floating-point type. - Must not be used with the [=compute=] shader stage. - If the first parameter is `flat`, the second parameter must not be specified. - - Specifies how the user-defined IO must be interpolated. - The attribute is only significant on user-defined [=vertex=] outputs - and [=fragment=] inputs. - See [[#interpolation]]. - -
`location` - non-negative i32 literal - Must only be applied to an entry point function parameter, entry point - return type, or member of a [=structure=] type. - Must only be applied to declarations of [=numeric scalar=] or [=numeric - vector=] type. - Must not be used with the [=compute=] shader stage. - - Specifies a part of the user-defined IO of an entry point. - See [[#input-output-locations]]. - -
`size` - positive i32 literal - Must only be applied to a member of a [=structure=] type. - - The number of bytes reserved in the struct for this member. - -
`stage` - `compute`, `vertex`, or `fragment` - Must only be applied to a function declaration. - - Declares an entry point by specifying its pipeline stage. - -
`stride` - positive i32 literal - Must only be applied to an [=array=] type. - - The number of bytes from the start of one element of the array to the - start of the next element. - -
`workgroup_size` - One, two or three parameters. - - Each parameter is either a positive i32 literal or the name of a - [=pipeline-overridable=] constant of i32 type. - Must only be applied to a [=compute shader stage=] function declaration. - - Specifies the x, y, and z dimensions of the [=workgroup grid=] for the compute shader. - - The first parameter specifies the x dimension. - The second parameter, if provided, specifies the y dimension, otherwise is assumed to be 1. - The third parameter, if provided, specifies the z dimension, otherwise is assumed to be 1. - Each dimension must be at least 1 and at most an upper bound specified by the WebGPU API. - -
- - -## Directives TODO ## {#directives} - -A directive is a token sequence which modifies how a [SHORTNAME] -program is processed by a WebGPU implementation. -See [[#enable-directive-section]]. - -## Declaration and scope ## {#declaration-and-scope} - -A declaration associates an identifier with one of -the following kinds of objects: -* a type -* a value -* a variable -* a function -* a formal parameter - -In other words, a declaration introduces a name for an object. - -The scope of a declaration is the set of -program locations where a use of the declared identifier potentially denotes -its associated object. -We say the identifier is in scope -(of the declaration) at those source locations. - -Each kind of declaration has its own rule for determining its scope. -In general the scope is a span of text beginning immediately after the end of the -declaration. - -Certain objects are provided by the WebGPU implementation, -and are treated as if they have already been declared at the start of a [SHORTNAME] program. -We say such objects are predeclared. -Their scope is the entire [SHORTNAME] program. -Examples of predeclared objects are: -* [=built-in functions=], and -* built-in types. - -A declaration must not introduce a name when that identifier is -already in scope with the same end scope as another instance of that name. -When an identifier is used in scope of one or more declarations for that name, -the identifier will denote the object of the declaration appearing closest to -that use. -We say the identifier use resolves to that declaration. - -Note: A declaration always precedes its identifier's scope. -Therefore, the nearest in scope declaration of an identifier always precedes the -use of the identifier. - -
- - // Invalid, cannot reuse built-in function names. - var&lt;private&gt; modf : f32 = 0.0; - - // Valid, foo_1 is in scope until the end of the program. - var&lt;private&gt; foo : f32 = 0.0; // foo_1 - - // Valid, bar_1 is in scope until the end of the program. - var&lt;private&gt; bar : u32 = 0u; // bar_1 - - // Valid, my_func_1 is in scope until the end of the program. - // Valid, foo_2 is in scope until the end of the function. - fn my_func(foo : f32) { // my_func_1, foo_2 - // Any reference to 'foo' resolves to the function parameter. - - // Invalid, the scope of foo_3 ends at the end of the function. - var foo : f32; // foo_3 - - // Valid, bar_2 is in scope until the end of the function. - var bar : u32; // bar_2 - // References to 'bar' resolve to bar_2 - { - // Valid, bar_3 is in scope until the end of the compound statement. - var bar : u32; // bar_3 - // References to 'bar' resolve to bar_3 - - // Invalid, bar_4 has the same end scope as bar_3. - var bar : i32; // bar_4 - - // Valid, i_1 is in scope until the end of the for loop - for (var i : i32 = 0; i &lt; 10; i = i + 1) { // i_1 - // Invalid, i_2 has the same end scope as i_1. - var i : i32 = 1; // i_2. - } - } - - // Invalid, bar_5 has the same end scope as bar_2. - var bar : u32; // bar_5 - } - - // Invalid, bar_6 has the same end scope as bar_1. - var&lt;private&gt; bar : u32 = 1u; // bar_6 - - // Invalid, my_func_2 has the same end scope as my_func_1. - fn my_func() { } // my_func_2 - - // Valid, my_foo_1 is in scope until the end of the program. - fn my_foo( - // Valid, my_foo_2 is in scope until the end of the function. - my_foo : i32 // my_foo_2 - ) { } - -
- -There are multiple levels of scoping depending on how and where things are -declared. - -When an identifier is used, it must be in scope for some declaration, or as part of a directive. - -A declaration is at module scope if the declaration appears outside -the text of any other declaration. - -Note: Only a [=function declaration=] can contain other declarations. - -# Types # {#types} - -Programs calculate values. - -In [SHORTNAME], a type is a set of values, and each value belongs to exactly one type. -A value's type determines the syntax and semantics of operations that can be performed on that value. - -For example, the mathematical number 1 corresponds to three distinct values in [SHORTNAME]: -* the 32-bit signed integer value `1`, -* the 32-bit unsigned integer value `1u`, and -* the 32-bit floating point value `1.0`. - -[SHORTNAME] treats these as different because their machine representation and operations differ. - -A type is either [=predeclared=], or created in [SHORTNAME] source via a [=declaration=]. - -We distinguish between the *concept* of a type and the *syntax* in [SHORTNAME] to denote that type. -In many cases the spelling of a type in this specification is the same as its [SHORTNAME] syntax. -For example: -* the set of 32-bit unsigned integer values is spelled `u32` in this specification, - and also in a [SHORTNAME] program. -* the spelling is different for structure types, or types containing structures. - -Some [SHORTNAME] types are only used for analyzing a source program and -for determining the program's runtime behaviour. -This specification will describe such types, but they do not appear in [SHORTNAME] source text. - -Note: [SHORTNAME] [=reference types=] are not written in [SHORTNAME] programs. See TODO forward reference to ptr/ref. - -## Type Checking ## {#type-checking-section} - -A [SHORTNAME] value is computed by evaluating an expression. 
-An expression is a segment of source text -parsed as one of the [SHORTNAME] grammar rules whose name ends with "`_expression`". -An expression *E* can contain subexpressions which are expressions properly contained -in the outer expression *E*. - -The particular value produced by an expression evaluation depends on: -* static context: - the source text surrounding the expression, and -* dynamic context: - the state of the invocation evaluating the expression, - and the execution context in which the invocation is running. - -The values that may result from evaluating a particular expression will always belong to a specific [SHORTNAME] type, -known as the static type of the expression. -The rules of [SHORTNAME] are designed so that the static type of an expression depends only on the expression's static context. - -Statements often use expressions, and may place requirements on the static types of those expressions. -For example: -* The condition expression of an `if` statement must be of type [=bool=]. -* In a `let` declaration, the initializer must evaluate to the declared type of the constant. - -Type checking a successfully parsed [SHORTNAME] program is the process of mapping -each expression to its static type, -and determining if the type requirements of each statement are satisfied. - -A type assertion is a mapping from some [SHORTNAME] source expression to a [SHORTNAME] type. -The notation - -> *e* : *T* - -is a type assertion meaning *T* is the static type of [SHORTNAME] expression *e*. - -Note: A type assertion is a statement of fact about the text of a program. -It is not a runtime check. - -Finding static types for expressions can be performed by recursively applying type rules. -A type rule has two parts: -* A conclusion, stated as a type assertion for an expression. - The expression in the type assertion is specified schematically, - using *italicized* names to denote subexpressions - or other syntactically-determined parameters. 
-* Preconditions, consisting of: - * Type assertions for subexpressions, when there are subexpressions. - * Conditions on the other schematic parameters, if any. - * How the expression is used in a statement. - * Optionally, other static context. - -A type rule applies to an expression when: -* The rule's conclusion matches a valid parse of the expression, and -* The rule's preconditions are satisfied. - -TODO: write an example such as `1+2`, or `3 - a`, where `a` is in scope of a let declaration with `i32` type. - -The type rules are designed so that if parsing succeeds, at most one type rule will apply to each expression. -If a type rule applies to an expression, then the conclusion is asserted, and therefore determines the static type of the expression. - -A [SHORTNAME] source program is well-typed when: -* The static type can be determined for each expression in the program by applying the type rules, and -* The type requirements for each statement are satisfied. - -Otherwise there is a type error and the source program is not a valid [SHORTNAME] program. - -[SHORTNAME] is a statically typed language -because type checking a [SHORTNAME] program will either succeed or -discover a type error, while only having to inspect the program source text. - -TODO(dneto): Lazy-decay is a tie-breaking rule. The above description can accommodate it by -using priority-levels on potentially-matching type rules. - -### Type rule tables ### {#typing-tables-section} - -The [SHORTNAME] [=type rules=] are organized into type rule tables, -with one row per type rule. - -The semantics of an expression is the effect of evaluating that expression, -and is primarily the production of a result value. -The *Description* column of the type rule that applies to an expression will specify the expression's semantics. -The semantics usually depends on the values of the type rule parameters, including -the assumed values of any subexpressions. 
-Sometimes the semantics of an expression includes effects other than producing -a result value, such as the non-result-value effects of its subexpressions. - -TODO: example: non-result-value effect is any side effect of a function call subexpression. - -For convenience, the type tables use the following shorthands: - - -
*Scalar* | [=scalar=] types: one of bool, i32, u32, f32 -
*BoolVec* | [[#vector-types]] with bool component -
*Int* | i32 or u32 -
*IntVec* | [[#vector-types]] with an *Int* component -
*Integral* | *Int* or [[#vector-types]] with an *Int* component -
*SignedIntegral* | i32 or [[#vector-types]] with an i32 component -
*FloatVec* | [[#vector-types]] with f32 component -
*Floating* | f32 or *FloatVec* -
*Arity(T)* | number of components in [[#vector-types]] *T* -
- -TODO(dneto): Do we still need all these shorthands? - -## Plain Types ## {#plain-types-section} - -Plain types are the types for representing boolean values, numbers, vectors, -matrices, or aggregations of such values. - -Note: Plain types in [SHORTNAME] are analogous to Plain-Old-Data types in C++. - -### Boolean Type ### {#bool-type} - -The bool type contains the values `true` and `false`. - -### Integer Types ### {#integer-types} - -The u32 type is the set of 32-bit unsigned integers. - -The i32 type is the set of 32-bit signed integers. -It uses a two's complement representation, with the sign bit in the most significant bit position. - -### Floating Point Type ### {#floating-point-types} - -The f32 type is the set of 32-bit floating point values of the IEEE 754 binary32 (single precision) format. -See [[#floating-point-evaluation]] for details. - -### Scalar Types ### {#scalar-types} - -The scalar types are [=bool=], [=i32=], [=u32=], and [=f32=]. - -The numeric scalar types are [=i32=], [=u32=], and [=f32=]. - -### Vector Types ### {#vector-types} - -A vector is a grouped sequence of 2, 3, or 4 [=scalar=] components. - - - - -
Type | Description -
vec*N*<*T*> | Vector of *N* elements of type *T*. - *N* must be in {2, 3, 4} and *T* - must be one of the [=scalar=] types. - We say *T* is the component type of the vector. -
- -A vector is a numeric vector if its component type is a [=numeric scalar=]. - -Key use cases of a vector include: - -* to express both a direction and a magnitude. -* to express a position in space. -* to express a color in some color space. - For example, the components could be intensities of red, green, and blue, - while the fourth component could be an alpha (opacity) value. - -Many operations on vectors act component-wise, i.e. the result vector is -formed by operating on each component independently. - -
- - vec2<f32> // is a vector of two f32s. - -
- -### Matrix Types ### {#matrix-types} - -A matrix is a grouped sequence of 2, 3, or 4 floating point vectors. - - - - - -
Type | Description -
mat|N|x|M|<f32> - Matrix of |N| columns and |M| rows, where |N| and |M| are both in {2, 3, 4}. - Equivalently, it can be viewed as |N| column vectors of type vec|M|<f32>. -
- -The key use case for a matrix is to embody a linear transformation. -In this interpretation, the vectors of a matrix are treated as column vectors. - -The product operator (`*`) is used to either: - -* scale the transformation by a scalar magnitude. -* apply the transformation to a vector. -* combine the transformation with another matrix. - -See [[#arithmetic-expr]]. - -
- - mat2x3<f32> // This is a 2 column, 3 row matrix of 32-bit floats. - // Equivalently, it is 2 column vectors of type vec3<f32>. - -
- -### Array Types ### {#array-types} - -An array is an indexable grouping of element values. - - - - -
Type | Description -
array<|E|,|N|> | An |N|-element array of elements of type |E|.
- |N| must be 1 or larger. -
array<|E|> | A runtime-sized array of elements of type |E|, - also known as a runtime array. - These may only appear in specific contexts.
-
- - -The first element in an array is at index 0, and each successive element is at the next integer index. -See [[#array-access-expr]]. - -An array element type must be one of: -* a [=scalar=] type -* a vector type -* a matrix type -* an array type -* a [=structure=] type - -[SHORTNAME] defines the following attributes that can be applied to array types: -* [=attribute/stride=] - -Restrictions on runtime-sized arrays: -* The last member of the structure type defining the [=store type=] - for a variable in the [=storage classes/storage=] storage class may be a runtime-sized array. -* A runtime-sized array must not be used as the store type or contained within - a store type in any other cases. -* The type of an expression must not be a runtime-sized array type. - -Issue: (dneto): Complete description of `Array` - -### Structure Types ### {#struct-types} - -A structure is a grouping of named member values. - - - - - -
Type | Description -
struct<|T|<sub>1</sub>,...,|T|<sub>N</sub>> - An ordered tuple of |N| members of types - |T|<sub>1</sub> through |T|<sub>N</sub>, with |N| being an integer greater than 0. - A structure type declaration specifies an identifier name for each member. - Two members of the same structure type must not have the same name. -
- -A structure member type must be one of: -* a [=scalar=] type -* a vector type -* a matrix type -* an array type -* a [=structure=] type - -Note: The structure member type restriction and the array element type restriction are -mutually reinforcing. -Combined, they imply that a pointer may not appear in any level -of nesting within either an array or structure. -Similarly, the same limitations apply to textures and samplers. - -
- - // A structure with two members. - struct Data { - a : i32; - b : vec2<f32>; - }; - -
- -
-struct_decl
-  : attribute_list* STRUCT IDENT struct_body_decl
-
- -
-struct_body_decl
-  : BRACE_LEFT struct_member* BRACE_RIGHT
-
-struct_member
-  : attribute_list* variable_ident_decl SEMICOLON
-
- -[SHORTNAME] defines the following attributes that can be applied to structure types: - * [=attribute/block=] - -[SHORTNAME] defines the following attributes that can be applied to structure members: - * [=attribute/builtin=] - * [=attribute/location=] - * [=attribute/stride=] - * [=attribute/align=] - * [=attribute/size=] - -Note: Layout attributes may be required if the structure type is used -to define a [=uniform buffer=] or a [=storage buffer=]. See [[#memory-layouts]]. - -
- - struct my_struct { - a : f32; - b : vec4<f32>; - }; - -
- -
- - OpName %my_struct "my_struct" - OpMemberName %my_struct 0 "a" - OpMemberDecorate %my_struct 0 Offset 0 - OpMemberName %my_struct 1 "b" - OpMemberDecorate %my_struct 1 Offset 16 - %my_struct = OpTypeStruct %float %v4float - -
- -
- - // Runtime Array - type RTArr = [[stride(16)]] array<vec4<f32>>; - [[block]] struct S { - a : f32; - b : f32; - data : RTArr; - }; - -
- -
- - OpName %S "S" - OpMemberName %S 0 "a" - OpMemberDecorate %S 0 Offset 0 - OpMemberName %S 1 "b" - OpMemberDecorate %S 1 Offset 4 - OpMemberName %S 2 "data" - OpMemberDecorate %S 2 Offset 16 - OpDecorate %rt_arr ArrayStride 16 - %rt_arr = OpTypeRuntimeArray %v4float - %S = OpTypeStruct %float %float %rt_arr - -
- -### Composite Types ### {#composite-types} - -A type is composite if it has internal structure -expressed as a composition of other types. -The internal parts do not overlap, and are called components. - -The composite types are: - -* [=vector=] type -* [=matrix=] type -* [=array=] type -* [=structure=] type - -A [=plain type|plain=] type is either [=composite=] or [=scalar=]. - -## Memory ## {#memory} - -In [SHORTNAME], a value of [=storable=] type may be stored in memory, for later retrieval. -This section describes the structure of memory, and how [SHORTNAME] types are used to -describe the contents of memory. - -In general [SHORTNAME] follows the [[!VulkanMemoryModel|Vulkan Memory Model]]. - -### Memory Locations ### {#memory-locations-section} - -Memory consists of a set of distinct memory locations. -Each memory location is 8-bits -in size. An operation affecting memory interacts with a set of one or more -memory locations. - -Two sets of memory locations overlap if the intersection of -their sets of memory locations is non-empty. Each variable declaration has a -set of memory locations that does not overlap with the sets of memory locations of -any other variable declaration. Memory operations on structures and arrays may -access padding between elements, but must not access padding at the end of the -structure or array. - -### Storable Types ### {#storable-types} - -The following types are storable: - -* [[#scalar-types]] -* [[#vector-types]] -* [[#matrix-types]] -* [[#array-types]] if its element type is storable. -* [[#struct-types]] if all its members are storable. 
-* [[#atomic-types]] - -### IO-shareable Types ### {#io-shareable-types} - -The following types are IO-shareable: - -* [=scalar=] types -* [=numeric vector=] types -* [[#matrix-types]] -* [[#array-types]] if its element type is IO-shareable, and the array is not [=runtime-sized=] -* [[#struct-types]] if all its members are IO-shareable - -The following kinds of values must be of IO-shareable type: - -* Values read from or written to built-in variables. -* Values accepted as inputs from an upstream pipeline stage. -* Values written as output for downstream processing in the pipeline, or to an output attachment. - -Note: Only built-in pipeline inputs may have a boolean type. -A user input or output data attribute must not be of [=bool=] type or contain a [=bool=] type. -See [[#pipeline-inputs-outputs]]. - -### Host-shareable Types ### {#host-shareable-types} - -Host-shareable types are used to describe the contents of buffers which are shared between -the host and the GPU, or copied between host and GPU without format translation. -When used for this purpose, the type must be additionally decorated with layout attributes -as described in [[#memory-layouts]]. -We will see in [[#module-scope-variables]] that the [=store type=] of [=uniform buffer=] and [=storage buffer=] -variables must be host-shareable. - -The following types are host-shareable: - -* [=numeric scalar=] types -* [=numeric vector=] types -* [[#matrix-types]] -* [[#array-types]] if the array element type is host-shareable -* [[#struct-types]] if each member is host-shareable -* [[#atomic-types]] - -[SHORTNAME] defines the following attributes that affect memory layouts: - * [=attribute/stride=] - * [=attribute/align=] - * [=attribute/size=] - -Note: An [=IO-shareable=] type *T* would also be host-shareable if *T* and its subtypes have -appropriate [=stride=] attributes, and if *T* is not [=bool=] and does not contain a [=bool=]. 
-Additionally, a [=runtime-sized=] array is host-shareable but is not IO-shareable. - -Note: Both IO-shareable and host-shareable types have concrete sizes, but they are counted differently. -IO-shareable types are sized by a location-count metric, see [[#input-output-locations]]. -Host-shareable types are sized by a byte-count metric, see [[#memory-layouts]]. - -### Storage Classes ### {#storage-class} - -Memory locations are partitioned into storage classes. -Each storage class has unique properties determining -mutability, visibility, the values it may contain, -and how to use variables with it. - - - - - -
Storage Classes
Storage class - Readable by shader?
Writable by shader? -
Sharing among invocations - Variable scope - Restrictions on stored values - Notes -
function - Read-write - Same invocation only - [=Function scope=] - [=Storable=] - -
private - Read-write - Same invocation only - [=Module scope=] - [=Storable=] - -
workgroup - Read-write - Invocations in the same [=compute shader stage|compute shader=] [=compute shader stage/workgroup=] - [=Module scope=] - [=Storable=] - -
uniform - Read-only - Invocations in the same [=shader stage=] - [=Module scope=] - [=Host-shareable=] - For [=uniform buffer=] variables -
storage - Readable.
- Also writable if the variable is not read-only. -
Invocations in the same [=shader stage=] - [=Module scope=] - [=Host-shareable=] - For [=storage buffer=] variables -
handle - Read-only - Invocations in the same shader stage - [=Module scope=] - Opaque representation of handle to a sampler or texture - Used for sampler and texture variables
- The token `handle` is reserved: it is never used in a [SHORTNAME] program. -
- -Issue: The note about read-only [=storage classes/storage=] variables may change depending -on the outcome of https://github.com/gpuweb/gpuweb/issues/935 - -
-storage_class
-  : IN
-  | OUT
-  | FUNCTION
-  | PRIVATE
-  | WORKGROUP
-  | UNIFORM
-  | STORAGE
-
- - - - -
WGSL storage class | SPIR-V storage class -
uniform | Uniform -
workgroup | Workgroup -
handle | UniformConstant -
storage | StorageBuffer -
private | Private -
function | Function -
- - -### Memory Layout ### {#memory-layouts} - -[=Uniform buffer=] and [=storage buffer=] variables are used to share -bulk data organized as a sequence of bytes in memory. -Buffers are shared between the CPU and the GPU, or between different shader stages -in a pipeline, or between different pipelines. - -Because buffer data are shared without reformatting or translation, -buffer producers and consumers must agree on the memory layout, -which is the description of how the bytes in a buffer are organized into typed [SHORTNAME] values. - -The [=store type=] of a buffer variable must be [=host-shareable=], with fully elaborated memory layout, as described below. - -Each buffer variable must be declared in either the [=storage classes/uniform=] or [=storage classes/storage=] storage classes. - -The memory layout of a type is significant only when evaluating an expression with: -* a variable in the [=storage classes/uniform=] or [=storage classes/storage=] storage class, or -* a pointer into the [=storage classes/uniform=] or [=storage classes/storage=] storage class. - -An 8-bit byte is the most basic unit of [=host-shareable=] memory. -The terms defined in this section express counts of 8-bit bytes. - -We will use the following notation: -* AlignOf(|T|) is the alignment of host-shareable type |T|. -* AlignOf(|S|, |M|) is the alignment of member |M| of the host-shareable structure |S|. -* SizeOf(|T|) is the size of host-shareable type |T|. -* SizeOf(|S|, |M|) is the size of member |M| of the host-shareable structure |S|. -* StrideOf(|A|) is the element stride of host-shareable array type |A|. -* OffsetOf(|S|, |M|) is the offset of member |M| from the start of the host-shareable structure |S|. - - -#### Alignment and Size #### {#alignment-and-size} - -Each [=host-shareable=] data type has a default alignment and size value. 
-The alignment and size values for a given structure member can differ from the -defaults if the [=attribute/align=] and / or [=attribute/size=] decorations are used. - -Alignment guarantees that a value's address in memory will be a multiple of the -specified value. This can enable more efficient hardware instructions to be used -to access the value or satisfy more restrictive hardware requirements on certain -storage classes (see [storage class constraints](#storage-class-constraints)). - -Note: Each alignment value is always a power of two, by construction. - -The size of a type or structure member is the number of contiguous bytes -reserved in host-shareable memory for the purpose of storing a value of the type -or structure member. -The size may include non-addressable padding at the end of the type. -Consequently, loads and stores of a value might access fewer memory locations -than the value's size. - -Alignment and size for host-shareable types are defined recursively in the -following table: - - - - - -
- Default alignment and size for host-shareable types
-
Host-shareable type |T| - [=AlignOf=](|T|) - [=SizeOf=](|T|) -
[=i32=], [=u32=], or [=f32=] - 4 - 4 -
vec2<|T|> - 8 - 8 -
vec3<|T|> - 16 - 12 -
vec4<|T|> - 16 - 16 -
mat|N|x|M| (col-major)
-

(General form)

-
[=AlignOf=](vec|M|) - [=SizeOf=](array<vec|M|, |N|>) -
mat2x2<f32> - 8 - 16 -
mat3x2<f32> - 8 - 24 -
mat4x2<f32> - 8 - 32 -
mat2x3<f32> - 16 - 32 -
mat3x3<f32> - 16 - 48 -
mat4x3<f32> - 16 - 64 -
mat2x4<f32> - 16 - 32 -
mat3x4<f32> - 16 - 48 -
mat4x4<f32> - 16 - 64 -
struct |S| - max([=AlignOf=](|S|, M<sub>1</sub>), ... , [=AlignOf=](|S|, M<sub>N</sub>))
-
[=roundUp=]([=AlignOf=](|S|), [=OffsetOf=](|S|, |L|) + [=SizeOf=](|S|, |L|))

- Where |L| is the last member of the structure -
array<|E|, |N|>
-

(Implicit stride)

-
[=AlignOf=](|E|) - |N| * [=roundUp=]([=AlignOf=](|E|), [=SizeOf=](|E|)) -
array<|E|>
-

(Implicit stride)

-
[=AlignOf=](|E|) - N<sub>runtime</sub> * [=roundUp=]([=AlignOf=](|E|), [=SizeOf=](|E|))

- Where N<sub>runtime</sub> is the runtime-determined number of elements of |T| -
[[[=stride=](|Q|)]]
array<|E|, |N|> -
[=AlignOf=](|E|) - |N| * |Q| -
[[[=stride=](|Q|)]]
array<|E|> -
[=AlignOf=](|E|) - N<sub>runtime</sub> * |Q| -
atomic<|T|> - [=AlignOf=](|T|) - [=SizeOf=](|T|) -
- - -#### Structure Layout Rules #### {#structure-layout-rules} - -Each structure member has a default size and alignment value. These values are -used to calculate each member's byte offset from the start of the structure. - -Structure members will use their type's size and alignment, unless the -structure member is explicitly annotated with [=attribute/size=] and / or -[=attribute/align=] decorations, in which case those member decorations take -precedence. - -The first structure member always has a zero byte offset from the start of the -structure. - -Subsequent members have the following byte offset from the start of the structure: -

- [=OffsetOf=](|S|, M<sub>N</sub>) = [=roundUp=]([=AlignOf=](|S|, M<sub>N</sub>), [=OffsetOf=](|S|, M<sub>N-1</sub>) + [=SizeOf=](|S|, M<sub>N-1</sub>))
- Where M<sub>N</sub> is the current member and M<sub>N-1</sub> is the previous member -

- -Structure members must not overlap. If a structure member is decorated with the -[=attribute/size=] attribute, the value must be at least as large as the -default size of the member's type. - -The alignment of a structure is equal to the largest alignment of all of its -members: -

- [=AlignOf=](|S|) = max([=AlignOf=](|S|, M<sub>1</sub>), ... , [=AlignOf=](|S|, M<sub>N</sub>)) -

- -The size of a structure is equal to the offset plus the size of its last member, -rounded to the next multiple of the structure's alignment: -

- [=SizeOf=](|S|) = [=roundUp=]([=AlignOf=](|S|), [=OffsetOf=](|S|, |L|) + [=SizeOf=](|S|, |L|))
- Where |L| is the last member of the structure -

- -
- - [[block]] struct A { // align(8) size(24) - u : f32; // offset(0) align(4) size(4) - v : f32; // offset(4) align(4) size(4) - w : vec2<f32>; // offset(8) align(8) size(8) - x : f32; // offset(16) align(4) size(4) - // -- implicit struct size padding -- // offset(20) size(4) - }; - - [[block]] struct B { // align(16) size(160) - a : vec2<f32>; // offset(0) align(8) size(8) - // -- implicit member alignment padding -- // offset(8) size(8) - b : vec3<f32>; // offset(16) align(16) size(12) - c : f32; // offset(28) align(4) size(4) - d : f32; // offset(32) align(4) size(4) - // -- implicit member alignment padding -- // offset(36) size(4) - e : A; // offset(40) align(8) size(24) - f : vec3<f32>; // offset(64) align(16) size(12) - // -- implicit member alignment padding -- // offset(76) size(4) - g : array<A, 3>; // offset(80) align(8) size(72) stride(24) - h : i32; // offset(152) align(4) size(4) - // -- implicit struct size padding -- // offset(156) size(4) - }; - - [[group(0), binding(0)]] - var<storage> storage_buffer : [[access(read_write)]] B; - -
- -
- - [[block]] struct A { // align(8) size(32) - u : f32; // offset(0) align(4) size(4) - v : f32; // offset(4) align(4) size(4) - w : vec2<f32>; // offset(8) align(8) size(8) - [[size(16)]] x: f32; // offset(16) align(4) size(16) - }; - - [[block]] struct B { // align(16) size(208) - a : vec2<f32>; // offset(0) align(8) size(8) - // -- implicit member alignment padding -- // offset(8) size(8) - b : vec3<f32>; // offset(16) align(16) size(12) - c : f32; // offset(28) align(4) size(4) - d : f32; // offset(32) align(4) size(4) - // -- implicit member alignment padding -- // offset(36) size(12) - [[align(16)]] e : A; // offset(48) align(16) size(32) - f : vec3<f32>; // offset(80) align(16) size(12) - // -- implicit member alignment padding -- // offset(92) size(4) - g : [[stride(32)]] array<A, 3>; // offset(96) align(8) size(96) - h : i32; // offset(192) align(4) size(4) - // -- implicit struct size padding -- // offset(196) size(12) - }; - - [[group(0), binding(0)]] - var<uniform> uniform_buffer : B; - -
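The offset, alignment, and size rules of this section can be checked mechanically. The following Python sketch is illustrative only and is not part of [SHORTNAME]. The names `round_up`, `Member`, and `struct_layout` are hypothetical, and the sketch assumes that roundUp(k, n) is the smallest multiple of k that is greater than or equal to n. It recomputes the layout of structures `A` and `B` from the first example above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


def round_up(k: int, n: int) -> int:
    """Smallest multiple of k that is >= n (the assumed meaning of roundUp)."""
    return (n + k - 1) // k * k


@dataclass
class Member:
    """One structure member; 'align' and 'size' are those of the member's type."""
    name: str
    align: int
    size: int
    align_attr: Optional[int] = None  # models an explicit align attribute
    size_attr: Optional[int] = None   # models an explicit size attribute


def struct_layout(members: List[Member]) -> Tuple[List[int], int, int]:
    """Return (member offsets, structure alignment, structure size) in bytes."""
    offsets: List[int] = []
    end = 0           # OffsetOf + SizeOf of the previous member
    struct_align = 1
    for m in members:
        align = m.align if m.align_attr is None else m.align_attr
        size = m.size if m.size_attr is None else m.size_attr
        if size < m.size:
            raise ValueError(f"size attribute on '{m.name}' is smaller than its type")
        offset = round_up(align, end)  # yields 0 for the first member
        offsets.append(offset)
        end = offset + size
        struct_align = max(struct_align, align)
    return offsets, struct_align, round_up(struct_align, end)


# Structure A from the first example: u, v : f32; w : vec2<f32>; x : f32.
A = [Member("u", 4, 4), Member("v", 4, 4), Member("w", 8, 8), Member("x", 4, 4)]
a_offsets, a_align, a_size = struct_layout(A)

# Structure B from the first example; e has type A, g has type array<A, 3>.
B = [
    Member("a", 8, 8), Member("b", 16, 12), Member("c", 4, 4), Member("d", 4, 4),
    Member("e", a_align, a_size), Member("f", 16, 12),
    Member("g", a_align, 3 * a_size), Member("h", 4, 4),
]
b_offsets, b_align, b_size = struct_layout(B)
print(a_offsets, a_align, a_size)
print(b_offsets, b_align, b_size)
```

Under these assumptions the sketch yields offsets 0, 4, 8, 16 with alignment 8 and size 24 for `A`, and offsets 0, 16, 28, 32, 40, 64, 80, 152 with alignment 16 and size 160 for `B`. Passing `size_attr=16` for `x` and `align_attr=16` for `e` models the attributes used in the second example.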
- -#### Array Layout Rules #### {#array-layout-rules} - -An array element stride is the number of bytes from the start of one array -element to the start of the next element. - -The first array element always has a zero byte offset from the start of the -array. - -If the array type is annotated with an explicit [=stride=] decoration then this -will be used as the array stride, otherwise the array uses an implicit stride -equal to the size of the array's element type, rounded up to the alignment of -the element type: - -

- [=StrideOf=](array<|T|[, |N|]>) = [=roundUp=]([=AlignOf=](|T|), [=SizeOf=](|T|)) -

- -In all cases, the array stride must be a multiple of the element alignment. - -
- - // Array with an implicit element stride of 16 bytes - var implicit_stride : array<vec3<f32>, 8>; - - // Array with an explicit element stride of 32 bytes - var explicit_stride : [[stride(32)]] array<vec3<f32>, 8>; - -
- -Arrays decorated with the [=stride=] attribute must have a stride that is at -least the size of the element type, and be a multiple of the element type's -alignment value. - -The array size is equal to the element stride multiplied by the number of -elements: -

- [=SizeOf=](array<|T|, |N|>) = [=StrideOf=](array<|T|, |N|>) × |N|
- [=SizeOf=](array<|T|>) = [=StrideOf=](array<|T|>) × N<sub>runtime</sub> -

- -The array alignment is equal to the element alignment: -

- [=AlignOf=](array<|T|[, |N|]>) = [=AlignOf=](|T|) -

- -For example, the layout for a `[[stride(S)]] array<T, 3>` type is equivalent to -the following structure: - -
- - struct Array { - [[size(S)]] element_0 : T; - [[size(S)]] element_1 : T; - [[size(S)]] element_2 : T; - }; - -
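The stride and size rules for arrays can be sketched in the same way. The Python below is illustrative only; the function names `array_stride` and `array_size` are hypothetical, and roundUp(k, n) is again assumed to be the smallest multiple of k that is greater than or equal to n.

```python
from typing import Optional


def round_up(k: int, n: int) -> int:
    """Smallest multiple of k that is >= n (the assumed meaning of roundUp)."""
    return (n + k - 1) // k * k


def array_stride(elem_align: int, elem_size: int, explicit: Optional[int] = None) -> int:
    """Element stride in bytes: the explicit stride if given, else the implicit one."""
    if explicit is None:
        return round_up(elem_align, elem_size)
    # An explicit stride must cover the element and keep every element aligned.
    if explicit < elem_size or explicit % elem_align != 0:
        raise ValueError("invalid explicit stride for this element type")
    return explicit


def array_size(stride: int, count: int) -> int:
    """Array size in bytes: element stride multiplied by the element count."""
    return stride * count


# vec3<f32> has alignment 16 and size 12.
implicit = array_stride(16, 12)                 # implicit stride
explicit = array_stride(16, 12, explicit=32)    # models a stride attribute of 32
print(implicit, array_size(implicit, 8))
print(explicit, array_size(explicit, 8))
```

For an 8-element array of `vec3<f32>`, the implicit stride is 16 bytes (128 bytes in total) and an explicit stride of 32 gives 256 bytes. An explicit stride of 8 is rejected because it is smaller than the element size.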
- -#### Internal Layout of Values #### {#internal-value-layout} - -This section describes how the internals of a value are placed in the byte locations -of a buffer, given an assumed placement of the overall value. -These layouts depend on the value's type, the [=attribute/stride=] attribute on -array types, and the [=attribute/align=] and [=attribute/size=] attributes on -structure type members. - -The data will appear identically regardless of storage class. - -When a value |V| of type [=u32=] or [=i32=] is placed at byte offset |k| of a -host-shared buffer, then: - * Byte |k| contains bits 0 through 7 of |V| - * Byte |k|+1 contains bits 8 through 15 of |V| - * Byte |k|+2 contains bits 16 through 23 of |V| - * Byte |k|+3 contains bits 24 through 31 of |V| - -Note: Recall that [=i32=] uses two's complement representation, so the sign bit -is in bit position 31. - -A value |V| of type [=f32=] is represented in IEEE 754 binary32 format. -It has one sign bit, 8 exponent bits, and 23 fraction bits. -When |V| is placed at byte offset |k| of a host-shared buffer, then: - * Byte |k| contains bits 0 through 7 of the fraction. - * Byte |k|+1 contains bits 8 through 15 of the fraction. - * Bits 0 through 6 of byte |k|+2 contain bits 16 through 22 of the fraction. - * Bit 7 of byte |k|+2 contains bit 0 of the exponent. - * Bits 0 through 6 of byte |k|+3 contain bits 1 through 7 of the exponent. - * Bit 7 of byte |k|+3 contains the sign bit. - -Note: The above rules imply that numeric values in host-shared buffers -are stored in little-endian format. 
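The byte placement of 32-bit scalars can be illustrated on the host side. The Python sketch below is not part of [SHORTNAME]; the helper names `u32_bytes` and `f32_bytes` are hypothetical. It assembles the four bytes from the bit fields described above, taking the 23 fraction bits of an [=f32=] to be numbered 0 through 22, and compares the result with the little-endian encodings produced by the standard `struct` module.

```python
import struct


def u32_bytes(value: int) -> bytes:
    """Bytes k..k+3 of a u32: byte k+i holds bits 8*i through 8*i+7."""
    return bytes((value >> (8 * i)) & 0xFF for i in range(4))


def f32_bytes(sign: int, exponent: int, fraction: int) -> bytes:
    """Bytes k..k+3 of an f32 assembled from its binary32 fields."""
    b0 = fraction & 0xFF                                      # fraction bits 0..7
    b1 = (fraction >> 8) & 0xFF                               # fraction bits 8..15
    b2 = ((fraction >> 16) & 0x7F) | ((exponent & 1) << 7)    # fraction bits 16..22, exponent bit 0
    b3 = ((exponent >> 1) & 0x7F) | ((sign & 1) << 7)         # exponent bits 1..7, sign bit
    return bytes([b0, b1, b2, b3])


# 1.0 has sign 0, biased exponent 127, and fraction 0.
one = f32_bytes(0, 127, 0)
# -2.5 has sign 1, biased exponent 128, and fraction 0x200000 (binary 1.01 x 2^1).
minus_two_and_a_half = f32_bytes(1, 128, 0x200000)

print(u32_bytes(0x11223344).hex())
print(one.hex(), struct.pack("<f", 1.0).hex())
print(minus_two_and_a_half.hex(), struct.pack("<f", -2.5).hex())
```

The `"<I"` and `"<f"` format codes request little-endian packing, so agreement between the two encodings is consistent with the note above.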
- -When a value |V| of vector type vec|N|<|T|> is placed at -byte offset |k| of a host-shared buffer, then: - * |V|.x is placed at byte offset |k| - * |V|.y is placed at byte offset |k|+4 - * If |N| ≥ 3, then |V|.z is placed at byte offset |k|+8 - * If |N| ≥ 4, then |V|.w is placed at byte offset |k|+12 - -When a matrix value |M| is placed at byte offset |k| of a host-shared memory buffer, then: - * If |M| has 2 rows, then: - * Column vector |i| of |M| is placed at byte offset |k| + 8 × |i| - * If |M| has 3 or 4 rows, then: - * Column vector |i| of |M| is placed at byte offset |k| + 16 × |i| - -When a value of array type |A| is placed at byte offset |k| of a host-shared memory buffer, -then: - * Element |i| of the array is placed at byte offset |k| + |i| × [=StrideOf=](|A|) - -When a value of structure type |S| is placed at byte offset |k| of a host-shared memory buffer, -then: - * The |i|'th member of the structure value is placed at byte offset |k| + [=OffsetOf=](|S|,|i|) - - -#### Storage Class Constraints #### {#storage-class-constraints} - -The [=storage classes/storage=] and [=storage classes/uniform=] storage classes -have different buffer layout constraints which are described in this section. - -All structure and array types directly or indirectly referenced by a variable -must obey the constraints of the variable's storage class. -Violations of a storage class constraint result in a compile-time error. - -In this section we define RequiredAlignOf(|S|, |C|) as the -required alignment of host-shareable type |S| when used by storage class |C|. - - - - - - - - - -
- Alignment requirements of a host-shareable type for - [=storage classes/storage=] and [=storage classes/uniform=] storage classes -
Host-shareable type |S| - [=RequiredAlignOf=](|S|, [=storage classes/storage=]) - [=RequiredAlignOf=](|S|, [=storage classes/uniform=]) -
[=i32=], [=u32=], or [=f32=] - [=AlignOf=](|S|) - [=AlignOf=](|S|) -
vec|N|<`T`> - [=AlignOf=](|S|) - [=AlignOf=](|S|) -
mat|N|x|M|<f32> - [=AlignOf=](|S|) - [=AlignOf=](|S|) -
array<|T|,|N|> - [=AlignOf=](|T|) - [=roundUp=](16, [=AlignOf=](|T|)) -
array<|T|> - [=AlignOf=](|T|) - [=roundUp=](16, [=AlignOf=](|T|)) -
struct<T<sub>0</sub>, ..., T<sub>N</sub>> - max([=AlignOf=](T<sub>0</sub>), ..., [=AlignOf=](T<sub>N</sub>)) - [=roundUp=](16, max([=AlignOf=](T<sub>0</sub>), ..., [=AlignOf=](T<sub>N</sub>)))
-
atomic<|T|> - [=AlignOf=](|T|) - [=AlignOf=](|T|) -
- -All structure members of type |T| must have a byte offset from the start of the -structure that is a multiple of the [=RequiredAlignOf=](|T|, |C|) for the storage -class |C|: - -

- [=OffsetOf=](|S|, |M|) = |k| × [=RequiredAlignOf=](|T|, |C|)
- Where |k| is a non-negative integer and |M| is a member of structure |S| with type |T| -

- -All arrays of element type |T| must have an element [=stride=] that is a -multiple of [=RequiredAlignOf=](|T|, |C|) for the storage class |C|: - -

- [=StrideOf=](array<|T|[, |N|]>) = |k| × [=RequiredAlignOf=](|T|, |C|)
- Where |k| is a positive integer -
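The offset and stride constraints above can be sketched as a small checker. This is only an illustrative sketch, not part of the specification: the function names and the `kind` strings are hypothetical, the caller supplies the relevant `AlignOf` value (the element type's alignment for arrays, the maximum member alignment for structures, as in the table), and the uniform rule "array elements are aligned to 16 byte boundaries" is interpreted here as "the stride is a multiple of 16".

```python
def round_up(k: int, n: int) -> int:
    """Smallest multiple of k that is >= n (plays the role of roundUp(k, n))."""
    return ((n + k - 1) // k) * k


def required_align_of(kind: str, align: int, storage_class: str) -> int:
    """RequiredAlignOf for a host-shareable type.

    kind  -- hypothetical tag: "scalar", "vector", "matrix", "atomic",
             "array" or "struct"
    align -- the AlignOf value named in the table row for that kind
    """
    if storage_class not in ("storage", "uniform"):
        raise ValueError("only 'storage' and 'uniform' have layout constraints")
    if storage_class == "uniform" and kind in ("array", "struct"):
        # Arrays and structures are rounded up to 16 in the uniform class.
        return round_up(16, align)
    return align


def offset_is_valid(offset: int, kind: str, align: int, storage_class: str) -> bool:
    """A member offset must be a multiple of the member's required alignment."""
    return offset >= 0 and offset % required_align_of(kind, align, storage_class) == 0


def stride_is_valid(stride: int, elem_kind: str, elem_align: int, storage_class: str) -> bool:
    """An array stride must be a multiple of the element's required alignment.

    Assumption: a zero stride is treated as invalid, and in the uniform class
    the stride must additionally be a multiple of 16.
    """
    if stride <= 0:
        return False
    if stride % required_align_of(elem_kind, elem_align, storage_class) != 0:
        return False
    return storage_class != "uniform" or stride % 16 == 0
```

For example, assuming an alignment of 4 for `f32`, an array of `f32` may use a stride of 4 in the storage class but needs a stride of at least 16 in the uniform class.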

- - -The [=storage classes/uniform=] storage class also requires that: - -* Array elements are aligned to 16-byte boundaries. -* Structures must have a size that is divisible by 16 bytes. - -Note: When the underlying target is a Vulkan device, we assume the device does -not support the `scalarBlockLayout` feature. -Therefore, a data value must not be placed in the padding at the end of a structure or matrix, -nor in the padding at the last element of an array. -Counting such padding as part of the size allows [SHORTNAME] to capture this constraint. - -## Memory View Types ## {#memory-view-types} - -In addition to calculating with [=plain types|plain=] values, a [SHORTNAME] program will -also often read values from memory or write values to memory. -Operations that read or write to memory are called memory accesses. -Each memory access is performed via a [=memory view=]. - -A memory view is a set of [=memory locations=] in a particular [=storage class=], -together with an interpretation of the contents of those locations as a [SHORTNAME] [=type=]. - - -[SHORTNAME] has two kinds of types for representing memory views: -[=reference types=] and [=pointer types=]. - - - - - - -
ConstraintTypeDescription -
|SC| is a [=storage class=],
|T| is a [=storable=] type -
ref<|SC|,|T|> - The reference type - identified with the set of [=memory views=] for memory locations in |SC| holding values of type |T|.
- In this context |T| is known as the store type.
- Reference types are not written in [SHORTNAME] program source; instead they are used to analyze a [SHORTNAME] program. -
|SC| is a [=storage class=],
|T| is a [=storable=] type -
ptr<|SC|,|T|> - The pointer type - identified with the set of [=memory views=] for memory locations in |SC| holding values of type |T|.
- In this context |T| is known as the pointee type.
- Pointer types appear in [SHORTNAME] program source. -
- -
- - fn my_function( - // 'ptr<function,i32>' is the type of a pointer value that references storage - // for keeping an 'i32' value, using memory locations in the 'function' storage - // class. Here 'i32' is the pointee type. - ptr_int: ptr<function,i32>, - - // 'ptr<private,array<f32,50>>' is the type of a pointer value that refers to - // storage for keeping an array of 50 elements of type 'f32', using memory - // locations in the 'private' storage class. - // Here the pointee type is 'array<f32,50>'. - ptr_array: ptr<private, array<f32, 50>> - ) { } - -
- -Reference types and pointer types are both sets of memory views: -a particular memory view is associated with a unique reference value and also a unique pointer value: - -
-Each pointer value |p| of type ptr<|SC|,|T|> corresponds to a unique reference value |r| of type ref<|SC|,|T|>, -and vice versa, -where |p| and |r| describe the same memory view. -
- -In [SHORTNAME] a reference value always corresponds to the memory view -for some or all of the memory locations for some variable. -This defines the originating variable for the reference value. -A pointer value always corresponds to a reference value, and so the originating variable -of a pointer is the same as the originating variable of the corresponding reference. - -Note: The originating variable is a dynamic concept. -The originating variable for a formal parameter of a function depends on the call sites for the function. -Different call sites may supply pointers into different originating variables. - -References and pointers are distinguished by how they are used: - -* The type of a [=variable=] is a reference type. -* The [=address-of=] operation (unary `&`) converts a reference value to its corresponding pointer value. -* The [=indirection=] operation (unary `*`) converts a pointer value to its corresponding reference value. -* A const declaration can be of pointer type, but not of reference type. -* A [=formal parameter=] can be of pointer type, but not of reference type. -* An [=assignment statement=] updates the contents of memory via a reference: - * The left-hand side of the assignment statement must be of reference type. - * The right-hand side of the assignment statement must evaluate to the store type of the left-hand side. -* The Load Rule: Inside a function, a reference is automatically dereferenced (read from) to satisfy type rules: - * In a function, when a reference expression |r| with store type |T| is used in a statement or an expression, where - * The only potentially matching type rules require |r| to have a value of type |T|, then - * That type rule requirement is considered to have been met, and - * The result of evaluating |r| in that context is the value (of type |T|) stored in the memory locations - referenced by |r| at the time of evaluation. - -Defining references in this way enables simple idiomatic use of variables: - -
- - [[stage(compute)]] - fn main() { - // 'i' has reference type ref<function,i32> - // The memory locations for 'i' store the i32 value 0. - var i: i32 = 0; - - // 'i + 1' can only match a type rule where the 'i' subexpression is of type i32. - // So the expression 'i + 1' has type i32, and at evaluation, the 'i' subexpression - // evaluates to the i32 value stored in the memory locations for 'i' at the time - // of evaluation. - const one: i32 = i + 1; - - // Update the value in the locations referenced by 'i' so they hold the value 2. - i = one + 1; - - // Update the value in the locations referenced by 'i' so they hold the value 5. - // The evaluation of the right-hand-side occurs before the assignment takes effect. - i = i + 3; - } - -
- -
- - var<private> age: i32; - fn get_age() -> i32 { - // The type of the expression in the return statement must be 'i32' since it - // must match the declared return type of the function. - // The 'age' expression is of type ref<private,i32>. - // Apply the Load Rule, since the store type of the reference matches the - // required type of the expression, and no other type rule applies. - // The evaluation of 'age' in this context is the i32 value loaded from the - // memory locations referenced by 'age' at the time the return statement is - // executed. - return age; - } - - fn caller() { - age = 21; - // The copy_age constant will get the i32 value 21. - const copy_age: i32 = get_age(); - } - -
- -Defining pointers in this way enables two key use cases: - -* Using a const declaration with pointer type, to form a short name for part of the contents of a variable. -* Using a formal parameter of a function to refer to the storage of a variable that is accessible to the calling function. - * The call to such a function must supply a pointer value for that operand. - This often requires using the unary `&` operation to get a pointer to the variable's contents. - -Note: The following examples use [SHORTNAME] features explained later in this specification. - -
- - struct Particle { - position: vec3<f32>; - velocity: vec3<f32>; - }; - [[block]] struct System { - active_index: i32; - timestep: f32; - particles: array<Particle,100>; - }; - [[group(0), binding(0)]] var<storage> system: [[access(read_write)]] System; - - [[stage(compute)]] - fn main() { - // Form a pointer to a specific Particle in storage memory. - const active_particle: ptr<storage,Particle> = - &system.particles[system.active_index]; - - const delta_position: vec3<f32> = (*active_particle).velocity * system.timestep; - const current_position: vec3<f32> = (*active_particle).position; - (*active_particle).position = delta_position + current_position; - } - -
- -
- - fn add_one(x: ptr<function,i32>) { - // Update the locations for 'x' to contain the next higher integer value, - // (or to wrap around to the largest negative i32 value). - // On the left-hand side, unary '*' converts the pointer to a reference that - // can then be assigned to. - // On the right-hand side: - // - Unary '*' converts the pointer to a reference - // - The only matching type rule is for addition (+) and requires '*x' to - // have type i32, which is the store type for '*x'. So the Load Rule - // applies and '*x' evaluates to the value stored in the memory for '*x' - // at the time of evaluation, which is the i32 value for 0. - // - Add 1 to 0, to produce a final value of 1 for the right-hand side. - // Store 1 into the memory for '*x'. - *x = *x + 1; - } - - [[stage(compute)]] - fn main() { - var i: i32 = 0; - - // Modify the contents of 'i' so it will contain 1. - // Use unary '&' to get a pointer value for 'i'. - // This is a clear signal that the called function has access to the storage - // for 'i', and may modify it. - add_one(&i); - const one: i32 = i; // 'one' has value 1. - } - -
- -### Forming reference and pointer values ### {#forming-references-and-pointers} - -A reference value is formed in one of the following ways: - -* The identifier [=resolves|resolving=] to an [=in scope|in-scope=] variable *v* denotes the reference value for *v*'s storage. - * The resolved variable is the [=originating variable=] for the reference. -* Use the [=indirection=] (unary `*`) operation on a pointer. - * The originating variable of the result is defined as the originating variable of the pointer. -* Use a composite reference component expression. - In each case the originating variable of the result is defined as the originating variable of the - original reference. - * Given a reference with a vector store type, appending a single-letter vector access phrase - results in a reference to the named component of the vector. - See [[#component-reference-from-vector-reference]]. - * Given a reference with a vector store type, appending an array index access phrase - results in a reference to the indexed component of the vector. - See [[#component-reference-from-vector-reference]]. - * Given a reference with a matrix store type, appending an array index access phrase - results in a reference to the indexed column vector of the matrix. - See [[#matrix-access-expr]]. - * Given a reference with an array store type, appending an array index access phrase - results in a reference to the indexed element of the array. - See [[#array-access-expr]]. - * Given a reference with a structure store type, appending a member access phrase - results in a reference to the named member of the structure. - See [[#struct-access-expr]]. - -
- - struct S { - age: i32; - weight: f32; - }; - var<private> person : S; - - fn f() { - var uv: vec2<f32>; - // Evaluate the left-hand side of the assignment: - // Evaluate 'uv.x' to yield a reference: - // 1. First evaluate 'uv', yielding a reference to the storage for - // the 'uv' variable. The result has type ref<function,vec2<f32>>. - // 2. Then apply the '.x' vector access phrase, yielding a reference to - // the storage for the first component of the vector pointed at by the - // reference value from the previous step. - // The result has type ref<function,f32>. - // Evaluating the right-hand side of the assignment yields the f32 value 1.0. - // Store the f32 value 1.0 into the storage memory locations referenced by uv.x. - uv.x = 1.0; - - // Evaluate the left-hand side of the assignment: - // Evaluate 'uv[1]' to yield a reference: - // 1. First evaluate 'uv', yielding a reference to the storage for - // the 'uv' variable. The result has type ref<function,vec2<f32>>. - // 2. Then apply the '[1]' array index phrase, yielding a reference to - // the storage for second component of the vector referenced from - // the previous step. The result has type ref<function,f32>. - // Evaluating the right-hand side of the assignment yields the f32 value 2.0. - // Store the f32 value 2.0 into the storage memory locations referenced by uv[1]. - uv[1] = 2.0; - - var m: mat3x2<f32>; - // When evaluating 'm[2]': - // 1. First evaluate 'm', yielding a reference to the storage for - // the 'm' variable. The result has type ref<function,mat3x2<f32>>. - // 2. Then apply the '[2]' array index phrase, yielding a reference to - // the storage for the third column vector pointed at by the reference - // value from the previous step. - // Therefore the 'm[2]' expression has type ref<function,vec2<f32>>. - // The 'const' declaration is for type vec2<f32>, so the declaration - // statement requires the initializer to be of type vec2<f32>. 
- // The Load Rule applies (because no other type rule can apply), and - // the evaluation of the initializer yields the vec2<f32> value loaded - // from the memory locations referenced by 'm[2]' at the time the declaration - // is executed. - const p_m_col2: vec2<f32> = m[2]; - - var A: array<i32,5>; - // When evaluating 'A[4]' - // 1. First evaluate 'A', yielding a reference to the storage for - // the 'A' variable. The result has type ref<function,array<i32,5>>. - // 2. Then apply the '[4]' array index phrase, yielding a reference to - // the storage for the fifth element of the array referenced by - // the reference value from the previous step. - // The result value has type ref<function,i32>. - // The const declaration requires the right-hand-side to be of type i32. - // The Load Rule applies (because no other type rule can apply), and - // the evaluation of the initializer yields the i32 value loaded from - // the memory locations referenced by 'A[4]' at the time the declaration - // is executed. - const A_4_value: i32 = A[4]; - - // When evaluating 'person.weight' - // 1. First evaluate 'person', yielding a reference to the storage for - // the 'person' variable declared at module scope. - // The result has type ref<private,S>. - // 2. Then apply the '.weight' member access phrase, yielding a reference to - // the storage for the second member of the memory referenced by - // the reference value from the previous step. - // The result has type ref<private,f32>. - // The const declaration requires the right-hand-side to be of type f32. - // The Load Rule applies (because no other type rule can apply), and - // the evaluation of the initializer yields the f32 value loaded from - // the memory locations referenced by 'person.weight' at the time the - // declaration is executed. - const person_weight: f32 = person.weight; - } - -
- -A pointer value is formed in one of the following ways: - -* Use the [=address-of=] (unary '&') operator on a reference. - * The originating variable of the result is defined as the originating variable of the reference. -* If a function [=formal parameter=] has pointer type, then when the function is invoked - at runtime the uses of the formal parameter denote the pointer value - provided to the corresponding operand at the call site in the calling function. - * The originating variable of the formal parameter (at runtime) is defined as - the originating variable of the pointer operand at the call site. - -
- - // Declare a variable in the private storage class, for storing an f32 value. - var<private> x: f32; - - fn f() { - // Declare a variable in the function storage class, for storing an i32 value. - var y: i32; - - // The name 'x' resolves to the module-scope variable 'x', - // and has reference type ref<private,f32>. - // Applying the unary '&' operator converts the reference to a pointer. - const x_ptr: ptr<private,f32> = &x; - - // The name 'y' resolves to the function-scope variable 'y', - // and has reference type ref<function,i32>. - // Applying the unary '&' operator converts the reference to a pointer. - const y_ptr: ptr<function,i32> = &y; - - // A new variable, distinct from the variable declared at module scope. - var x: u32; - - // Here, the name 'x' resolves to the function-scope variable 'x' declared in - // the previous statement, and has type ref<function,u32>. - // Applying the unary '&' operator converts the reference to a pointer. - const inner_x_ptr: ptr<function,u32> = &x; - } - -
- - -### Comparison with references and pointers in other languages ### {#pointers-other-languages} - -This section is informative, not normative. - -References and pointers in [SHORTNAME] are more restricted than in other languages. -In particular: - -* In [SHORTNAME] a reference can't directly be declared as an alias to another reference or variable, - either as a variable or as a formal parameter. -* In [SHORTNAME] pointers and references are not [=storable=]. - That is, the content of a [SHORTNAME] variable may not contain a pointer or a reference. -* In [SHORTNAME] a function must not return a pointer or reference. -* In [SHORTNAME] there is no way to convert between integer values and pointer values. -* In [SHORTNAME] there is no way to forcibly change the type of a pointer value into another pointer type. - * A composite component reference expression is different: - it takes a reference to a composite value and yields a reference to - one of the components or elements inside the composite value. - These are considered different references in [SHORTNAME], even though they may - have the same machine address at a lower level of implementation abstraction. -* In [SHORTNAME] there is no way to forcibly change the type of a reference value into another reference type. -* In [SHORTNAME] there is no way to allocate new storage from a "heap". -* In [SHORTNAME] there is no way to explicitly destroy a variable. - The storage for a [SHORTNAME] variable becomes inaccessible only when the variable goes out of scope. - -Note: From the above rules, it is not possible to form a "dangling" pointer, -i.e. a pointer that does not reference the storage for a valid (or "live") -originating variable. - -## Texture and Sampler Types ## {#texture-types} - -A texel is a scalar or vector used as the smallest independently accessible element of a [=texture=]. -The word *texel* is short for *texture element*. 
- -A texture is a collection of texels supporting special operations useful for rendering. -In [SHORTNAME], those operations are invoked via texture builtin functions. -See [[#texture-builtin-functions]] for a complete list. - -A [SHORTNAME] texture corresponds to a [[WebGPU#gputexture|WebGPU GPUTexture]]. - -A texture is either arrayed or non-arrayed: - -* A non-arrayed texture is a grid of texels. Each texel has a unique grid coordinate. -* An arrayed texture is a homogeneous array of grids of texels. - In an arrayed texture, each texel is identified with its unique combination of array index and grid coordinate. - -A texture has the following features: - -: texel format -:: The data in each texel. See [[#texel-formats]] -: dimensionality -:: The number of dimensions in the grid coordinates, and how the coordinates are interpreted. - The number of dimensions is 1, 2, or 3. - In some cases the third coordinate is decomposed so as to specify a cube face and a layer index. -: size -:: The extent of grid coordinates along each dimension -: mipmap levels -:: The mipmap level count is at least 1 for sampled textures, and equal to 1 for storage textures.
- Mip level 0 contains a full size version of the texture. - Each successive mip level contains a filtered version of the previous mip level - at half the size (within rounding) of the previous mip level.
- When sampling a texture, an explicit or implicitly-computed level-of-detail is used - to select the mip levels from which to read texel data. These are then combined via - filtering to produce the sampled value. -: arrayed -:: whether the texture is arrayed -: array size -:: the number of homogeneous grids, if the texture is arrayed - -A texture's representation is typically optimized for rendering operations. -To achieve this, many details are hidden from the programmer, including data layouts, data types, and -internal operations that cannot be expressed directly in the shader language. - -As a consequence, a shader does not have direct access to the texel storage within a texture variable. -Instead, use texture builtin functions as follows: - -* Within the shader: - * Declare a module-scope variable in the [=storage classes/handle=] storage class, - where the [=store type=] is one of the texture types described in later sections. - * Inside a function, call one of the texture builtin functions, and provide - the texture variable as the first parameter. -* When constructing the WebGPU pipeline, the texture variable's store type and binding - must be compatible with the corresponding bind group layout entry. - -In this way, the set of supported operations for a texture type -is determined by the availability of texture builtin functions accepting that texture type -as the first parameter. - -### Texel formats ### {#texel-formats} - -In [SHORTNAME], certain texture types are parameterized by texel format. - -A texel format is characterized by: - -: channels -:: Each channel contains a scalar. - A texel format has up to four channels: `r`, `g`, `b`, and `a`, - normally corresponding to the concepts of red, green, blue, and alpha channels. -: channel format -:: The number of bits in the channel, and how those bits are interpreted. - -Each texel format in [SHORTNAME] corresponds to a [[WebGPU#enumdef-gputextureformat|WebGPU GPUTextureFormat]] -with the same name. 
- -Only certain texel formats are used in [SHORTNAME] source code. -The channel formats used to define those texel formats are listed in the -Channel Formats table. -The last column specifies the conversion from the stored channel bits to the value used in the shader. -This is also known as the channel transfer function, or CTF. - - - - - -
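-<table>
-  <caption>Channel Formats</caption>
-  <thead>
-    <tr><th>Channel format</th><th>Number of stored bits</th><th>Interpretation of stored bits</th><th>Shader type</th><th>Shader value<br>(Channel Transfer Function)</th></tr>
-  </thead>
-  <tr><td>8unorm</td><td>8</td><td>unsigned integer |v| ∈ {0,...,255}</td><td>f32</td><td>|v| ÷ 255</td></tr>
-  <tr><td>8snorm</td><td>8</td><td>signed integer |v| ∈ {-128,...,127}</td><td>f32</td><td>max(-1, |v| ÷ 127)</td></tr>
-  <tr><td>8uint</td><td>8</td><td>unsigned integer |v| ∈ {0,...,255}</td><td>u32</td><td>|v|</td></tr>
-  <tr><td>8sint</td><td>8</td><td>signed integer |v| ∈ {-128,...,127}</td><td>i32</td><td>|v|</td></tr>
-  <tr><td>16uint</td><td>16</td><td>unsigned integer |v| ∈ {0,...,65535}</td><td>u32</td><td>|v|</td></tr>
-  <tr><td>16sint</td><td>16</td><td>signed integer |v| ∈ {-32768,...,32767}</td><td>i32</td><td>|v|</td></tr>
-  <tr><td>16float</td><td>16</td><td>IEEE 754 16-bit floating point value |v|, with 1 sign bit, 5 exponent bits, 10 mantissa bits</td><td>f32</td><td>|v|</td></tr>
-  <tr><td>32uint</td><td>32</td><td>32-bit unsigned integer value |v|</td><td>u32</td><td>|v|</td></tr>
-  <tr><td>32sint</td><td>32</td><td>32-bit signed integer value |v|</td><td>i32</td><td>|v|</td></tr>
-  <tr><td>32float</td><td>32</td><td>IEEE 754 32-bit floating point value |v|</td><td>f32</td><td>|v|</td></tr>
-</table>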
- -The texel formats listed in the -Texel Formats for Storage Textures table -correspond to the [[WebGPU#plain-color-formats|WebGPU plain color formats]] -which support the [[WebGPU#dom-gputextureusage-storage|WebGPU STORAGE]] usage. -These texel formats are used to parameterize the storage texture types defined -in [[#texture-storage]]. - -When the texel format does not have all four channels, then: - -* When reading the texel: - * If the texel format has no green channel, then the second component of the shader value is 0. - * If the texel format has no blue channel, then the third component of the shader value is 0. - * If the texel format has no alpha channel, then the fourth component of the shader value is 1. -* When writing the texel, shader value components for missing channels are ignored. - -The last column in the table below uses the format-specific -[=channel transfer function=] from the [=channel formats=] table. - - - - - -
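-<table>
-  <caption>Texel Formats for Storage Textures</caption>
-  <thead>
-    <tr><th>Texel format</th><th>Channel format</th><th>Channels in memory order</th><th>Corresponding shader value</th></tr>
-  </thead>
-  <tr><td>rgba8unorm</td><td>8unorm</td><td>r, g, b, a</td><td>vec4&lt;f32&gt;(CTF(r), CTF(g), CTF(b), CTF(a))</td></tr>
-  <tr><td>rgba8snorm</td><td>8snorm</td><td>r, g, b, a</td><td>vec4&lt;f32&gt;(CTF(r), CTF(g), CTF(b), CTF(a))</td></tr>
-  <tr><td>rgba8uint</td><td>8uint</td><td>r, g, b, a</td><td>vec4&lt;u32&gt;(CTF(r), CTF(g), CTF(b), CTF(a))</td></tr>
-  <tr><td>rgba8sint</td><td>8sint</td><td>r, g, b, a</td><td>vec4&lt;i32&gt;(CTF(r), CTF(g), CTF(b), CTF(a))</td></tr>
-  <tr><td>rgba16uint</td><td>16uint</td><td>r, g, b, a</td><td>vec4&lt;u32&gt;(CTF(r), CTF(g), CTF(b), CTF(a))</td></tr>
-  <tr><td>rgba16sint</td><td>16sint</td><td>r, g, b, a</td><td>vec4&lt;i32&gt;(CTF(r), CTF(g), CTF(b), CTF(a))</td></tr>
-  <tr><td>rgba16float</td><td>16float</td><td>r, g, b, a</td><td>vec4&lt;f32&gt;(CTF(r), CTF(g), CTF(b), CTF(a))</td></tr>
-  <tr><td>r32uint</td><td>32uint</td><td>r</td><td>vec4&lt;u32&gt;(CTF(r), 0u, 0u, 1u)</td></tr>
-  <tr><td>r32sint</td><td>32sint</td><td>r</td><td>vec4&lt;i32&gt;(CTF(r), 0, 0, 1)</td></tr>
-  <tr><td>r32float</td><td>32float</td><td>r</td><td>vec4&lt;f32&gt;(CTF(r), 0.0, 0.0, 1.0)</td></tr>
-  <tr><td>rg32uint</td><td>32uint</td><td>r, g</td><td>vec4&lt;u32&gt;(CTF(r), CTF(g), 0u, 1u)</td></tr>
-  <tr><td>rg32sint</td><td>32sint</td><td>r, g</td><td>vec4&lt;i32&gt;(CTF(r), CTF(g), 0, 1)</td></tr>
-  <tr><td>rg32float</td><td>32float</td><td>r, g</td><td>vec4&lt;f32&gt;(CTF(r), CTF(g), 0.0, 1.0)</td></tr>
-  <tr><td>rgba32uint</td><td>32uint</td><td>r, g, b, a</td><td>vec4&lt;u32&gt;(CTF(r), CTF(g), CTF(b), CTF(a))</td></tr>
-  <tr><td>rgba32sint</td><td>32sint</td><td>r, g, b, a</td><td>vec4&lt;i32&gt;(CTF(r), CTF(g), CTF(b), CTF(a))</td></tr>
-  <tr><td>rgba32float</td><td>32float</td><td>r, g, b, a</td><td>vec4&lt;f32&gt;(CTF(r), CTF(g), CTF(b), CTF(a))</td></tr>
-</table>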
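The conversion from stored texel channels to a four-component shader value can be sketched as follows. This is a hedged illustration rather than normative text: `ctf` and `shader_value` are hypothetical names, a texel is modelled as a dictionary from channel name to stored value, and the 8-bit integer formats are assumed to pass the stored integer through unchanged (like the 16- and 32-bit integer formats), because their shader type is an integer.

```python
FLOAT_FORMATS = ("8unorm", "8snorm", "16float", "32float")
INTEGER_FORMATS = ("8uint", "8sint", "16uint", "16sint", "32uint", "32sint")


def ctf(channel_format: str, v):
    """Channel transfer function: stored channel value -> shader value."""
    if channel_format == "8unorm":
        return v / 255
    if channel_format == "8snorm":
        # -128 and -127 both map to -1.0.
        return max(-1.0, v / 127)
    if channel_format in INTEGER_FORMATS or channel_format in ("16float", "32float"):
        # Assumption: these formats yield the stored value unchanged.
        return v
    raise ValueError(f"unknown channel format: {channel_format}")


def shader_value(channel_format: str, stored: dict) -> tuple:
    """Build the 4-component shader value for a texel being read.

    Missing green/blue channels read as 0, a missing alpha channel reads as 1.
    The red channel is required; every listed texel format has one.
    """
    is_float = channel_format in FLOAT_FORMATS
    zero, one = (0.0, 1.0) if is_float else (0, 1)
    components = [ctf(channel_format, stored["r"])]
    for name, default in (("g", zero), ("b", zero), ("a", one)):
        if name in stored:
            components.append(ctf(channel_format, stored[name]))
        else:
            components.append(default)
    return tuple(components)
```

For instance, `shader_value("32uint", {"r": 7})` models reading an `r32uint` texel and yields `(7, 0, 0, 1)`.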
- -The following table lists the correspondence between WGSL texel formats and -[SPIR-V image formats](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#_a_id_image_format_a_image_format). - - - - - -
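-<table>
-  <caption>Mapping texel formats to SPIR-V</caption>
-  <thead>
-    <tr><th>Texel format</th><th>SPIR-V Image Format</th><th>SPIR-V Enabling Capability</th></tr>
-  </thead>
-  <tr><td>rgba8unorm</td><td>Rgba8</td><td>Shader</td></tr>
-  <tr><td>rgba8snorm</td><td>Rgba8Snorm</td><td>Shader</td></tr>
-  <tr><td>rgba8uint</td><td>Rgba8ui</td><td>Shader</td></tr>
-  <tr><td>rgba8sint</td><td>Rgba8i</td><td>Shader</td></tr>
-  <tr><td>rgba16uint</td><td>Rgba16ui</td><td>Shader</td></tr>
-  <tr><td>rgba16sint</td><td>Rgba16i</td><td>Shader</td></tr>
-  <tr><td>rgba16float</td><td>Rgba16f</td><td>Shader</td></tr>
-  <tr><td>r32uint</td><td>R32ui</td><td>Shader</td></tr>
-  <tr><td>r32sint</td><td>R32i</td><td>Shader</td></tr>
-  <tr><td>r32float</td><td>R32f</td><td>Shader</td></tr>
-  <tr><td>rg32uint</td><td>Rg32ui</td><td>StorageImageExtendedFormats</td></tr>
-  <tr><td>rg32sint</td><td>Rg32i</td><td>StorageImageExtendedFormats</td></tr>
-  <tr><td>rg32float</td><td>Rg32f</td><td>StorageImageExtendedFormats</td></tr>
-  <tr><td>rgba32uint</td><td>Rgba32ui</td><td>Shader</td></tr>
-  <tr><td>rgba32sint</td><td>Rgba32i</td><td>Shader</td></tr>
-  <tr><td>rgba32float</td><td>Rgba32f</td><td>Shader</td></tr>
-</table>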
- -### Sampled Texture Types ### {#sampled-texture-type} - -
-`texture_1d`
-  %1 = OpTypeImage %type 1D 0 0 0 1 Unknown
-
-`texture_2d`
-  %1 = OpTypeImage %type 2D 0 0 0 1 Unknown
-
-`texture_2d_array`
-  %1 = OpTypeImage %type 2D 0 1 0 1 Unknown
-
-`texture_3d`
-  %1 = OpTypeImage %type 3D 0 0 0 1 Unknown
-
-`texture_cube`
-  %1 = OpTypeImage %type Cube 0 0 0 1 Unknown
-
-`texture_cube_array`
-  %1 = OpTypeImage %type Cube 0 1 0 1 Unknown
-
-* type must be `f32`, `i32` or `u32` -* The parameterized type for the images is the type after conversion from sampling. - E.g. you can have an image with texels with 8-bit unorm components, but when you sample - them you get a 32-bit float result (or vec-of-f32). - -### Multisampled Texture Types ### {#multisampled-texture-type} - -
-`texture_multisampled_2d`
-  %1 = OpTypeImage %type 2D 0 0 1 1 Unknown
-
-* type must be `f32`, `i32` or `u32` - -### Storage Texture Types ### {#texture-storage} - -A read-only storage texture supports reading a single texel without the use of a sampler, -with automatic conversion of the stored texel value to a usable shader value. A write-only storage -texture supports writing a single texel, with automatic conversion -of the shader value to a stored texel value. -See [[#texture-builtin-functions]]. - -A storage texture type must be parameterized by one of the -[=storage-texel-format|texel formats for storage textures=]. -The texel format determines the conversion function as specified in [[#texel-formats]]. - -For a write-only storage texture the *inverse* of the conversion function is used to convert the shader value to -the stored texel. - -TODO(dneto): Move description of the conversion to the builtin function that actually does the reading. - -
-`texture_storage_1d`
-  // %1 = OpTypeImage sampled_type 1D 0 0 0 2 image_format
-
-`texture_storage_2d`
-  // %1 = OpTypeImage sampled_type 2D 0 0 0 2 image_format
-
-`texture_storage_2d_array`
-  // %1 = OpTypeImage sampled_type 2D 0 1 0 2 image_format
-
-`texture_storage_3d`
-  // %1 = OpTypeImage sampled_type 3D 0 0 0 2 image_format
-
- -In the SPIR-V mapping: -* The *Image Format* parameter of the image type declaration is - as specified by the SPIR-V texel format correspondence table in [[#texel-formats]]. -* The *Sampled Type* parameter of the image type declaration is - the SPIR-V scalar type corresponding to the channel format for the texel format. - -When mapping to SPIR-V, a read-only storage texture variable must have a `NonWritable` decoration and -a write-only storage texture variable must have a `NonReadable` decoration. - -For example: - -
- - var tbuf : [[access(read)]] texture_storage_1d<rgba8unorm>; - - // Maps to the following SPIR-V: - // OpDecorate %tbuf NonWritable - // ... - // %float = OpTypeFloat 32 - // %image_type = OpTypeImage %float 1D 0 0 0 2 Rgba8 - // %image_ptr_type = OpTypePointer UniformConstant %image_type - // %tbuf = OpVariable %image_ptr_type UniformConstant - -
- -
- - var tbuf : [[access(write)]] texture_storage_1d<rgba8unorm>; - - // Maps to the following SPIR-V: - // OpDecorate %tbuf NonReadable - // ... - // %float = OpTypeFloat 32 - // %image_type = OpTypeImage %float 1D 0 0 0 2 Rgba8 - // %image_ptr_type = OpTypePointer UniformConstant %image_type - // %tbuf = OpVariable %image_ptr_type UniformConstant - -
- -### Depth Texture Types ### {#texture-depth} -
-`texture_depth_2d`
-  %1 = OpTypeImage %f32 2D 1 0 0 1 Unknown
-
-`texture_depth_2d_array`
-  %1 = OpTypeImage %f32 2D 1 1 0 1 Unknown
-
-`texture_depth_cube`
-  %1 = OpTypeImage %f32 Cube 1 0 0 1 Unknown
-
-`texture_depth_cube_array`
-  %1 = OpTypeImage %f32 Cube 1 1 0 1 Unknown
-
- -### Sampler Type ### {#sampler-type} -
-sampler
-  OpTypeSampler
-
-sampler_comparison
-  OpTypeSampler
-
- -### Texture Types Grammar ### {#texture-types-grammar} -TODO: Add texture usage validation rules. - -
-texture_sampler_types
-  : sampler_type
-  | depth_texture_type
-  | sampled_texture_type LESS_THAN type_decl GREATER_THAN
-  | multisampled_texture_type LESS_THAN type_decl GREATER_THAN
-  | storage_texture_type LESS_THAN texel_format GREATER_THAN
-
-sampler_type
-  : SAMPLER
-  | SAMPLER_COMPARISON
-
-sampled_texture_type
-  : TEXTURE_1D
-  | TEXTURE_2D
-  | TEXTURE_2D_ARRAY
-  | TEXTURE_3D
-  | TEXTURE_CUBE
-  | TEXTURE_CUBE_ARRAY
-
-multisampled_texture_type
-  : TEXTURE_MULTISAMPLED_2D
-
-storage_texture_type
-  : TEXTURE_STORAGE_1D
-  | TEXTURE_STORAGE_2D
-  | TEXTURE_STORAGE_2D_ARRAY
-  | TEXTURE_STORAGE_3D
-
-depth_texture_type
-  : TEXTURE_DEPTH_2D
-  | TEXTURE_DEPTH_2D_ARRAY
-  | TEXTURE_DEPTH_CUBE
-  | TEXTURE_DEPTH_CUBE_ARRAY
-
-texel_format
-  : R8UNORM
-     R8  -- Capability: StorageImageExtendedFormats
-  | R8SNORM
-     R8Snorm  -- Capability: StorageImageExtendedFormats
-  | R8UINT
-     R8ui  -- Capability: StorageImageExtendedFormats
-  | R8SINT
-     R8i  -- Capability: StorageImageExtendedFormats
-  | R16UINT
-     R16ui  -- Capability: StorageImageExtendedFormats
-  | R16SINT
-     R16i  -- Capability: StorageImageExtendedFormats
-  | R16FLOAT
-     R16f  -- Capability: StorageImageExtendedFormats
-  | RG8UNORM
-     Rg8  -- Capability: StorageImageExtendedFormats
-  | RG8SNORM
-     Rg8Snorm  -- Capability: StorageImageExtendedFormats
-  | RG8UINT
-     Rg8ui  -- Capability: StorageImageExtendedFormats
-  | RG8SINT
-     Rg8i  -- Capability: StorageImageExtendedFormats
-  | R32UINT
-     R32ui
-  | R32SINT
-     R32i
-  | R32FLOAT
-     R32f
-  | RG16UINT
-     Rg16ui  -- Capability: StorageImageExtendedFormats
-  | RG16SINT
-     Rg16i  -- Capability: StorageImageExtendedFormats
-  | RG16FLOAT
-     Rg16f  -- Capability: StorageImageExtendedFormats
-  | RGBA8UNORM
-     Rgba8
-  | RGBA8UNORM-SRGB
-     ???
-  | RGBA8SNORM
-     Rgba8Snorm
-  | RGBA8UINT
-     Rgba8ui
-  | RGBA8SINT
-     Rgba8i
-  | BGRA8UNORM
-     Rgba8  ???
-  | BGRA8UNORM-SRGB
-     ???
-  | RGB10A2UNORM
-     Rgb10A2  -- Capability: StorageImageExtendedFormats
-  | RG11B10FLOAT
-     R11fG11fB10f  -- Capability: StorageImageExtendedFormats
-  | RG32UINT
-     Rg32ui  -- Capability: StorageImageExtendedFormats
-  | RG32SINT
-     Rg32i  -- Capability: StorageImageExtendedFormats
-  | RG32FLOAT
-     Rg32f  -- Capability: StorageImageExtendedFormats
-  | RGBA16UINT
-     Rgba16ui
-  | RGBA16SINT
-     Rgba16i
-  | RGBA16FLOAT
-     Rgba16f
-  | RGBA32UINT
-     Rgba32ui
-  | RGBA32SINT
-     Rgba32i
-  | RGBA32FLOAT
-     Rgba32f
-
-
- -## Atomic Types ## {#atomic-types} - -Operations on atomic objects in [SHORTNAME] are mutually ordered for each object. -That is, during execution of a shader stage, for each atomic object A, all -agents observe the same order of operations applied to A. -The ordering for distinct atomic objects may not be related in any way; no -causality is implied. -Note that variables in [=storage classes/workgroup=] storage are shared within a -[=compute shader stage/workgroup=], but are not shared between different -workgroups. - -Atomic objects may only be operated on by the -[[#atomic-builtin-functions|atomic builtin functions]]. - -Atomic types may only be instantiated by variables in the [=storage -classes/workgroup=] storage class or `read_write` [=attribute/access=] variables in the -[=storage classes/storage=] storage class. - - - - -
TypeDescription -
atomic<|T|> - Atomic of type |T|. |T| must be either [=u32=] or [=i32=]. -
- -TODO: Add links to the eventual memory model descriptions. - -
-  
-    [[block]] struct S {
-      a : atomic<i32>;
-      b : atomic<u32>;
-    };
-    
-    [[group(0), binding(0)]]
-    var<storage> x : [[access(read_write)]] S;
-    
-    // Maps to the following SPIR-V:
-    // - When atomic types are members of a struct, the Volatile decoration
-    //   is annotated on the member.
-    // OpDecorate %S Block
-    // OpMemberDecorate %S 0 Volatile
-    // OpMemberDecorate %S 1 Volatile
-    // ...
-    // %i32 = OpTypeInt 32 1
-    // %u32 = OpTypeInt 32 0
-    // %S = OpTypeStruct %i32 %u32
-    // %ptr_storage_S = OpTypePointer StorageBuffer %S
-    // %x = OpVariable %ptr_storage_S StorageBuffer
-  
-
- -
-  
-    var<workgroup> x : atomic<u32>;
-    
-    // Maps to the following SPIR-V:
-    // - When atomic types are directly instantiated by a variable,  the Volatile
-    //   decoration is annotated on the OpVariable.
-    // OpDecorate %x Volatile
-    // ...
-    // %u32 = OpTypeInt 32 0
-    // %ptr_workgroup_u32 = OpTypePointer Workgroup %u32
-    // %x = OpVariable %ptr_workgroup_u32 Workgroup
-  
-
- -## Type Aliases TODO ## {#type-aliases} - -
-type_alias
-  : TYPE IDENT EQUAL type_decl
-
- -
- - type Arr = array<i32, 5>; - - type RTArr = [[stride(16)]] array<vec4<f32>>; - -
- -## Type Declaration Grammar ## {#type-declarations} - -
-type_decl
-  : IDENT
-  | BOOL
-  | FLOAT32
-  | INT32
-  | UINT32
-  | VEC2 LESS_THAN type_decl GREATER_THAN
-  | VEC3 LESS_THAN type_decl GREATER_THAN
-  | VEC4 LESS_THAN type_decl GREATER_THAN
-  | POINTER LESS_THAN storage_class COMMA type_decl GREATER_THAN
-  | attribute_list* ARRAY LESS_THAN type_decl (COMMA INT_LITERAL)? GREATER_THAN
-  | MAT2x2 LESS_THAN type_decl GREATER_THAN
-  | MAT2x3 LESS_THAN type_decl GREATER_THAN
-  | MAT2x4 LESS_THAN type_decl GREATER_THAN
-  | MAT3x2 LESS_THAN type_decl GREATER_THAN
-  | MAT3x3 LESS_THAN type_decl GREATER_THAN
-  | MAT3x4 LESS_THAN type_decl GREATER_THAN
-  | MAT4x2 LESS_THAN type_decl GREATER_THAN
-  | MAT4x3 LESS_THAN type_decl GREATER_THAN
-  | MAT4x4 LESS_THAN type_decl GREATER_THAN
-  | texture_sampler_types
-
- -When the type declaration is an identifier, then the expression must be in scope of a -declaration of the identifier as a type alias or structure type. - -
- - identifier - Allows to specify types created by the type command - - bool - %1 = OpTypeBool - - f32 - %2 = OpTypeFloat 32 - - i32 - %3 = OpTypeInt 32 1 - - u32 - %4 = OpTypeInt 32 0 - - vec2<f32> - %7 = OpTypeVector %float 2 - - array<f32, 4> - %uint_4 = OpConstant %uint 4 - %9 = OpTypeArray %float %uint_4 - - [[stride(32)]] array<f32, 4> - OpDecorate %9 ArrayStride 32 - %uint_4 = OpConstant %uint 4 - %9 = OpTypeArray %float %uint_4 - - array<f32> - %rtarr = OpTypeRuntimeArray %float - - mat2x3<f32> - %vec = OpTypeVector %float 3 - %6 = OpTypeMatrix %vec 2 - -
- -
- - // Storage buffers - [[group(0), binding(0)]] - var<storage> buf1 : [[access(read)]] Buffer; // Can read, cannot write. - [[group(0), binding(1)]] - var<storage> buf2 : [[access(read_write)]] Buffer; // Can both read and write. - - // Uniform buffer. Always read-only, and has more restrictive layout rules. - struct ParamsTable {}; - [[group(0), binding(2)]] - var<uniform> params : ParamsTable; - -
- -# `var` and `let` # {#variables} - -TODO: *Stub* (describe what a constant is): A constant is a name for a value, declared via a `let` declaration. -What types are permitted? Storable, plus pointer to store type. - -TODO(dneto): A `let` may not be of type pointer-to-handle. A function parameter may not have type pointer-to-handle. -Otherwise we'd have a need to make a pointer-to-handle type expression. But we've reserved the [=storage classes/handle=] keyword. -When translating from SPIR-V, you must trace through -the OpCopyObject (or no-index OpAccessChain) instructions that might be between -the pointer-to-array and the pointer-to-struct. - -A variable is a named reference to storage that can contain a value of a -particular storable type. - -Two types are associated with a variable: its [=store type=] (the type of value -that may be placed in the referenced storage) and its [=reference type=] (the type -of the variable itself). If a variable has store type *T* and [=storage class=] *S*, -then its reference type is pointer-to-*T*-in-*S*. - -A variable declaration: - -* Determines the variable’s name, storage class, and store type (and hence its [=reference type=]). -* Ensures the execution environment allocates storage for a value of the store type, for the lifetime of the variable. -* Optionally has an *initializer* expression, if the variable is in the [=storage classes/private=] or [=storage classes/function=] storage classes. - If present, the initializer's type must match the store type of the variable. - -See [[#module-scope-variables]] and [[#function-scope-variables]] for rules about where -a variable in a particular storage class can be declared, -and when the storage class decoration is required, optional, or forbidden. - -
-variable_statement
-  : variable_decl
-  | variable_decl EQUAL short_circuit_or_expression
-  | LET (IDENT | variable_ident_decl) EQUAL short_circuit_or_expression
-
-variable_decl
-  : VAR variable_storage_decoration? variable_ident_decl
-
-variable_ident_decl
-  : IDENT COLON attribute_list* type_decl
-
-variable_storage_decoration
-  : LESS_THAN storage_class GREATER_THAN
-
-
- -The `let` identifiers denote values that are immutable. -When a `let` identifier is declared without the corresponding type, -e.g. `let foo = 4`, the type is automatically inferred from the expression to the right of `=`. -If the type is provided, e.g. `let foo: i32 = 4`, it must exactly match the type of the initializer expression. - -Variables in the [=storage classes/storage=] storage class and variables with a -[storage texture](#texture-storage) type must have an [=access=] attribute -applied to the store type. - -Two variables with overlapping lifetimes will not have overlapping storage. - -When a variable is created, its storage contains an initial value as follows: - -* For variables in the [=storage classes/private=] or [=storage classes/function=] storage classes: - * The zero value for the store type, if the variable declaration has no initializer. - * Otherwise, it is the result of evaluating the initializer expression at that point in the program execution. -* For variables in other storage classes, the execution environment provides the initial value. - -Consider the following snippet of WGSL: -
- - var i: i32; // Initial value is 0. Not recommended style. - loop { - var twice: i32 = 2 * i; // Re-evaluated each iteration. - i = i + 1; - break if (i == 5); - } - -
-The loop body will execute five times. -Variable `i` will take on values 0, 1, 2, 3, 4, 5, and variable `twice` will take on values 0, 2, 4, 6, 8. - -Consider the following snippet of WGSL: -
- - var x : f32 = 1.0; - let y = x * x + x + 1.0; - -
-Because `x` is a variable, all accesses to it turn into load and store operations. -If this snippet were compiled to SPIR-V, it would be represented as -
- - %temp_1 = OpLoad %float %x - %temp_2 = OpLoad %float %x - %temp_3 = OpFMul %float %temp_1 %temp_2 - %temp_4 = OpLoad %float %x - %temp_5 = OpFAdd %float %temp_3 %temp_4 - %y = OpFAdd %float %temp_5 %one - -
-However, it is expected that either the browser or the driver optimizes this intermediate representation -such that the redundant loads are eliminated. - -## Module Scope Variables ## {#module-scope-variables} - -A variable or constant declared outside a function is at [=module scope=]. -The name is available for use immediately after its declaration statement, until the end -of the program. - -Variables at [=module scope=] are restricted as follows: - -* The variable must not be in the [=storage classes/function=] storage class. -* A variable in the [=storage classes/private=], [=storage classes/workgroup=], [=storage classes/uniform=], or [=storage classes/storage=] storage classes: - * Must be declared with an explicit storage class decoration. - * Must use a [=store type=] as described in [[#storage-class]]. -* If the [=store type=] is a texture type or a sampler type, then the variable declaration must not - have a storage class decoration. The storage class will always be [=storage classes/handle=]. - -A variable in the [=storage classes/uniform=] storage class is a uniform buffer variable. -Its [=store type=] must be a [=host-shareable=] structure type with [=attribute/block=] attribute, -satisfying the [storage class constraints](#storage-class-constraints). - -A variable in the [=storage classes/storage=] storage class is a storage buffer variable. -Its [=store type=] must be a [=host-shareable=] structure type with [=attribute/block=] attribute, -satisfying the [storage class constraints](#storage-class-constraints). - -As described in [[#resource-interface]], -uniform buffers, storage buffers, textures, and samplers form the -[=resource interface of a shader=]. -Such variables are declared with [=attribute/group=] and [=attribute/binding=] decorations. - -
- - var<private> decibels: f32; - var<workgroup> worklist: array<i32,10>; - - [[block]] struct Params { - specular: f32; - count: i32; - }; - [[group(0), binding(2)]] - var<uniform> param: Params; // A uniform buffer - - [[block]] struct PositionsBuffer { - pos: array<vec2<f32>>; - }; - [[group(0), binding(0)]] - var<storage> pbuf: [[access(read_write)]] PositionsBuffer; // A storage buffer - - [[group(0), binding(1)]] - var filter_params: sampler; // Textures and samplers are always in "handle" storage. - -
- -
-global_variable_decl
-  : attribute_list* variable_decl (EQUAL const_expr)?
-
- -
- - [[location(2)]] - OpDecorate %variable Location 2 - - [[group(4), binding(3)]] - OpDecorate %variable DescriptorSet 4 - OpDecorate %variable Binding 3 - -
- -[SHORTNAME] defines the following attributes that can be applied to global variables: - * [=attribute/binding=] - * [=attribute/group=] - -## Module Constants ## {#module-constants} - -A *module constant* declares a name for a value, outside of all function declarations. -The name is available for use after the end of the declaration, -until the end of the [SHORTNAME] program. - -When the declaration has no attributes, an initializer expression must be present, -and the name denotes the value of that expression. - -
- - let golden : f32 = 1.61803398875; // The golden ratio - let e2 : vec3<i32> = vec3<i32>(0,1,0); // The second unit vector for three dimensions. - -
- -When the declaration uses the [=constant_id=] attribute, -the constant is pipeline-overridable. In this case: - - * The type must be one of the [=scalar=] types. - * The initializer expression is optional. - * The attribute's literal operand is known as the pipeline constant ID, - and must be a non-negative integer value representable in 32 bits. - * Pipeline constant IDs must be unique within the [SHORTNAME] program: Two module constants - must not use the same pipeline constant ID. - * The application can specify its own value for the name at pipeline-creation time. - The pipeline creation API accepts a mapping from the pipeline constant ID - to a value of the constant's type. - If the mapping has an entry for the ID, the value in the mapping is used. - Otherwise, the initializer expression must be present, and its value is used. - -Issue(dneto): What happens if the application supplies a constant ID that is not in the program? -Proposal: pipeline creation fails with an error. - -
- - [[constant_id(0)]] let has_point_light : bool = true; // Algorithmic control - [[constant_id(1200)]] let specular_param : f32 = 2.3; // Numeric control - [[constant_id(1300)]] let gain : f32; // Must be overridden - -
- -When a variable or feature is used within control flow that depends on the -value of a constant, then that variable or feature is considered to be used by the -program. -This is true regardless of the value of the constant, whether that value -is the one from the constant's declaration or from a pipeline override. - -
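The rule above can be illustrated with a small sketch. The names `FogParams`, `fog`, `enable_fog`, and `apply_fog`, as well as the binding numbers and constant ID, are hypothetical and chosen only for illustration. The uniform buffer `fog` is referenced only inside a branch guarded by a pipeline-overridable constant, yet it would still count as used by the program whether `enable_fog` ends up `true` or `false`.

```wgsl
[[block]] struct FogParams {
  density : f32;
};
[[group(0), binding(3)]]
var<uniform> fog : FogParams;   // Hypothetical uniform buffer.

// Hypothetical pipeline-overridable constant; defaults to false.
[[constant_id(7)]] let enable_fog : bool = false;

fn apply_fog(color : vec3<f32>) -> vec3<f32> {
  var result : vec3<f32> = color;
  if (enable_fog) {
    // `fog` appears only in control flow that depends on `enable_fog`,
    // but it is still considered used by the program, regardless of
    // the default value or any pipeline override.
    result = result * (1.0 - fog.density);
  }
  return result;
}
```

Under this reading, a pipeline built from this module would still need to account for `fog` in its resource interface even if the application never overrides `enable_fog`.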
-global_constant_decl
-  : attribute_list* LET variable_ident_decl global_const_initializer?
-
-global_const_initializer
-  : EQUAL const_expr
-
-const_expr
-  : type_decl PAREN_LEFT (const_expr COMMA)* const_expr PAREN_RIGHT
-  | const_literal
-
- -
- - -1 - %a = OpConstant %int -1 - - 2u - %b = OpConstant %uint 2 - - 3.2 - %c = OpConstant %float 3.2 - - true - %d = OpConstantTrue - - false - %e = OpConstantFalse - - vec4<f32>(1.2, 2.3, 3.4, 2.3) - %f0 = OpConstant %float 1.2 - %f1 = OpConstant %float 2.3 - %f2 = OpConstant %float 3.4 - %f = OpConstantComposite %v4float %f0 %f1 %f2 %f1 - -
- -Issue(dneto): The WebGPU pipeline creation API must specify how API-supplied values are mapped to -shader scalar values. For booleans, I suggest using a 32-bit integer, where only 0 maps to `false`. -If [SHORTNAME] gains non-32-bit numeric scalars, I recommend overridable constants continue being 32-bit -numeric types. - -## Function Scope Variables and Constants ## {#function-scope-variables} - -A variable or constant declared in a declaration statement in a function body is in function scope. -The name is available for use immediately after its declaration statement, -and until the end of the brace-delimited list of statements immediately enclosing the declaration. - -A variable declared in function scope is always in the [=storage classes/function=] storage class. -The variable storage decoration is optional. -The variable's [=store type=] must be [=storable=]. - -
- - fn f() { - var<function> count : u32; // A variable in function storage class. - var delta : i32; // Another variable in the function storage class. - var sum : f32 = 0.0; // A function storage class variable with initializer. - let unit : i32 = 1; // A constant. Let declarations don't use a storage class. - } - -
- -A variable or constant declared in the first clause of a `for` statement is available for use in the second -and third clauses and in the body of the `for` statement. - - -## Never-alias assumption TODO ## {#never-alias-assumption} - -# Expressions TODO # {#expressions} - -## Literal Expressions TODO ## {#literal-expressions} - - - - - -
Scalar literal type rules
PreconditionConclusionNotes -
`true` : boolOpConstantTrue %bool -
`false` : boolOpConstantFalse %bool -
*INT_LITERAL* : i32OpConstant %int *literal* -
*UINT_LITERAL* : u32OpConstant %uint *literal* -
*FLOAT_LITERAL* : f32OpConstant %float *literal* -
- -## Type Constructor Expressions TODO ## {#type-constructor-expr} - - - - - -
Scalar constructor type rules
PreconditionConclusionNotes -
*e* : bool`bool(e)` : boolIdentity.
In the SPIR-V translation, the ID of this expression reuses the ID of the operand. -
*e* : i32`i32(e)` : i32Identity.
In the SPIR-V translation, the ID of this expression reuses the ID of the operand. -
*e* : u32`u32(e)` : u32Identity.
In the SPIR-V translation, the ID of this expression reuses the ID of the operand. -
*e* : f32`f32(e)` : f32Identity.
In the SPIR-V translation, the ID of this expression reuses the ID of the operand. -
- - - - - - - - - - - - - - - - -
Vector constructor type rules, where *T* is a scalar type
PreconditionConclusionNotes -
|e| : |T| - `vec`|N|`<`|T|`>(`|e|`)` : vec|N|<|T|> - Evaluates |e| once. Results in the |N|-element vector where each component has the value of |e|. -
*e1* : *T*
- *e2* : *T* -
`vec2(e1,e2)` : vec2<*T*> - OpCompositeConstruct -
*e* : vec2<T> - `vec2(e)` : vec2<*T*> - Identity. The result is |e|. -
*e1* : *T*
- *e2* : *T*
- *e3* : *T* -
`vec3(e1,e2,e3)` : vec3<*T*> - OpCompositeConstruct -
*e1* : *T*
- *e2* : vec2<*T*> -
`vec3(e1,e2)` : vec3<*T*>
- `vec3(e2,e1)` : vec3<*T*> -
OpCompositeConstruct -
*e* : vec3<T> - `vec3(e)` : vec3<*T*> - Identity. The result is |e|. -
*e1* : *T*
- *e2* : *T*
- *e3* : *T*
- *e4* : *T* -
`vec4(e1,e2,e3,e4)` : vec4<*T*> - OpCompositeConstruct -
*e1* : *T*
- *e2* : *T*
- *e3* : vec2<*T*> -
`vec4(e1,e2,e3)` : vec4<*T*>
- `vec4(e1,e3,e2)` : vec4<*T*>
- `vec4(e3,e1,e2)` : vec4<*T*> -
OpCompositeConstruct -
*e1* : vec2<*T*>
- *e2* : vec2<*T*> -
`vec4(e1,e2)` : vec4<*T*> - OpCompositeConstruct -
*e1* : *T*
- *e2* : vec3<*T*> -
`vec4(e1,e2)` : vec4<*T*>
- `vec4(e2,e1)` : vec4<*T*> -
OpCompositeConstruct -
*e* : vec4<T> - `vec4(e)` : vec4<*T*> - Identity. The result is |e|. -
- - - - - - - - -
Matrix constructor type rules
PreconditionConclusionNotes -
*e1* : vec2
- *e2* : vec2
- *e3* : vec2
- *e4* : vec2 -
`mat2x2(e1,e2)` : mat2x2
- `mat3x2(e1,e2,e3)` : mat3x2
- `mat4x2(e1,e2,e3,e4)` : mat4x2 -
Column by column construction.
- OpCompositeConstruct -
*e1* : vec3
- *e2* : vec3
- *e3* : vec3
- *e4* : vec3 -
`mat2x3(e1,e2)` : mat2x3
- `mat3x3(e1,e2,e3)` : mat3x3
- `mat4x3(e1,e2,e3,e4)` : mat4x3 -
Column by column construction.
- OpCompositeConstruct -
*e1* : vec4
- *e2* : vec4
- *e3* : vec4
- *e4* : vec4 -
`mat2x4(e1,e2)` : mat2x4
- `mat3x4(e1,e2,e3)` : mat3x4
- `mat4x4(e1,e2,e3,e4)` : mat4x4 -
Column by column construction.
- OpCompositeConstruct -
- - - - - - -
Array constructor type rules
PreconditionConclusionNotes -
*e1* : *T*
- ...
- *eN* : *T*
-
`array<`*T*,*N*`>(e1,...,eN)` : array<*T*, *N*> - Construction of an array from elements -
-TODO: Should this only work for storable sized arrays? https://github.com/gpuweb/gpuweb/issues/982 - - - - - - - -
Structure constructor type rules
PreconditionConclusionNotes -
*e1* : *T1*
- ...
- *eN* : *TN*
- *T1* is storable
- ...
- *TN* is storable
- S is a structure type with members having types *T1* ... *TN*.
- The expression is in the scope of declaration of S. -
`S(e1,...,eN)` : S - Construction of a structure from members -
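The constructor rules in the tables above can be sketched in one place. This is an illustrative example only: the structure `Light` and the function name `constructor_examples` are hypothetical, and the snippet assumes the `let` and type syntax used elsewhere in this document.

```wgsl
struct Light {
  position : vec3<f32>;
  intensity : f32;
};

fn constructor_examples() {
  // Scalar constructor applied to its own type: identity.
  let unit : f32 = f32(1.0);

  // Vector from a single scalar: every component takes that value.
  let splat : vec3<f32> = vec3<f32>(1.0);

  // Vectors from scalars, and from a mix of scalars and smaller vectors.
  let xy : vec2<f32> = vec2<f32>(1.0, 2.0);
  let xyz : vec3<f32> = vec3<f32>(xy, 3.0);
  let xyzw : vec4<f32> = vec4<f32>(xy, xy);

  // Matrix built column by column from column vectors.
  let m : mat2x2<f32> = mat2x2<f32>(xy, xy);

  // Array built from its elements.
  let arr : array<i32, 3> = array<i32, 3>(1, 2, 3);

  // Structure built from its members, in declaration order.
  let light : Light = Light(xyz, 0.5);
}
```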
- - -## Zero Value Expressions ## {#zero-value-expr} - -Each storable type *T* has a unique *zero value*, written in WGSL as the type followed by an empty pair of parentheses: *T* `()`. - -Issue: We should disallow writing the zero value for a [=runtime-sized=] array. https://github.com/gpuweb/gpuweb/issues/981 - -The zero values are as follows: - -* `bool()` is `false` -* `i32()` is 0 -* `u32()` is 0 -* `f32()` is 0.0 -* The zero value for an *N*-element vector of type *T* is the *N*-element vector of the zero value for *T*. -* The zero value for an *N*-column *M*-row matrix of `f32` is the matrix of those dimensions filled with 0.0 entries. -* The zero value for an *N*-element array with storable element type *E* is an array of *N* elements of the zero value for *E*. -* The zero value for a storable structure type *S* is the structure value *S* with zero-valued members. - - - - - -
Scalar zero value type rules
PreconditionConclusionNotes -
`bool()` : boolfalse
Zero value (OpConstantNull for bool) -
`i32()` : i320
Zero value (OpConstantNull for i32) -
`u32()` : u320u
Zero value (OpConstantNull for u32) -
`f32()` : f320.0
Zero value (OpConstantNull for f32) -
- - - - - - - - - - - -
Vector zero type rules, where *T* is a scalar type
PreconditionConclusionNotes -
- `vec2()` : vec2<*T*> - Zero value (OpConstantNull) -
- `vec3()` : vec3<*T*> - Zero value (OpConstantNull) -
- `vec4()` : vec4<*T*> - Zero value (OpConstantNull) -
- - -
- - vec2<f32>() // The zero-valued vector of two f32 elements. - vec2<f32>(0.0, 0.0) // The same value, written explicitly. - - vec3<i32>() // The zero-valued vector of three i32 elements. - vec3<i32>(0, 0, 0) // The same value, written explicitly. - -
- - - - - - - - - - -
Matrix zero type rules
PreconditionConclusionNotes -
- `mat2x2()` : mat2x2
- `mat3x2()` : mat3x2
- `mat4x2()` : mat4x2 -
Zero value (OpConstantNull) -
- `mat2x3()` : mat2x3
- `mat3x3()` : mat3x3
- `mat4x3()` : mat4x3 -
Zero value (OpConstantNull) -
- `mat2x4()` : mat2x4
- `mat3x4()` : mat3x4
- `mat4x4()` : mat4x4 -
Zero value (OpConstantNull) -
- - - - - - -
Array zero type rules
PreconditionConclusionNotes -
*T* is storable - `array<`*T*,*N*`>()` : array<*T*, *N*> - Zero-valued array (OpConstantNull) -
- -
- - array<bool, 2>() // The zero-valued array of two booleans. - array<bool, 2>(false, false) // The same value, written explicitly. - -
- - - - - - - -
Structure zero type rules
PreconditionConclusionNotes -
`S` is a storable structure type.
- The expression is in the scope of declaration of S. -
`S()` : S - Zero-valued structure: a structure of type S where each member is the zero value for its member type. -
- (OpConstantNull) -
- -
- - struct Student { - grade : i32; - GPA : f32; - attendance : array<bool,4>; - }; - - fn func() { - var s : Student; - - // The zero value for Student - s = Student(); - - // The same value, written explicitly. - s = Student(0, 0.0, array<bool,4>(false, false, false, false)); - - // The same value, written with zero-valued members. - s = Student(i32(), f32(), array<bool,4>()); - } - -
- - -## Conversion Expressions ## {#conversion-expr} - - - - - - - - - - - - - - -
Scalar conversion type rules
PreconditionConclusionNotes -
|e| : u32`bool(`|e|`)` : bool - Coercion to boolean.
- The result is false if |e| is 0, and true otherwise.
- (Use OpINotEqual to compare |e| against 0.) -
|e| : i32`bool(`|e|`)` : bool - Coercion to boolean.
- The result is false if |e| is 0, and true otherwise.
- (Use OpINotEqual to compare |e| against 0.) -
|e| : f32`bool(`|e|`)` : bool - Coercion to boolean.
- The result is false if |e| is 0.0 or -0.0, and true otherwise. - In particular NaN and infinity values map to true.
- (Use OpFUnordNotEqual to compare |e| against `0.0`.) -
|e| : u32`i32(`|e|`)` : i32 - Reinterpretation of bits.
- The result is the unique value in [=i32=] that is equal to (|e| mod 2<sup>32</sup>).
- (OpBitcast) -
|e| : f32`i32(`|e|`)` : i32Value conversion, including invalid cases. (OpConvertFToS) -
|e| : i32`u32(`|e|`)` : u32 - Reinterpretation of bits.
- The result is the unique value in [=u32=] that is equal to (|e| mod 2<sup>32</sup>).
- (OpBitcast) -
|e| : f32`u32(`|e|`)` : u32 - Value conversion, including invalid cases. (OpConvertFToU) -
|e| : i32`f32(`|e|`)` : f32Value conversion, including invalid cases. (OpConvertSToF) -
|e| : u32`f32(`|e|`)` : f32Value conversion, including invalid cases. (OpConvertUToF) -
- -Details of conversion to and from floating point are explained in [[#floating-point-conversion]]. - - - - - - - - - - - - - - -
Vector conversion type rules
PreconditionConclusionNotes -
|e| : vec|N|<u32> - `vec`|N|<`bool`>`(`|e|`)` : vec|N|<bool> - Component-wise coercion of an unsigned integer vector to a boolean vector.
- Component |i| of the result is `bool(`|e|`[`|i|`])`
- (OpINotEqual to compare |e| against a zero vector.) - -
|e| : vec|N|<i32> - `vec`|N|<`bool`>`(`|e|`)` : vec|N|<bool> - Component-wise coercion of a signed integer vector to a boolean vector.
- Component |i| of the result is `bool(`|e|`[`|i|`])`
- (OpINotEqual to compare |e| against a zero vector.) - -
|e| : vec|N|<f32> - `vec`|N|<`bool`>`(`|e|`)` : vec|N|<bool> - Component-wise coercion of a floating point vector to a boolean vector.
- Component |i| of the result is `bool(`|e|`[`|i|`])`
- (OpFUnordNotEqual to compare |e| against a zero vector.) - -
|e| : vec|N|<u32> - `vec`|N|<`i32`>`(`|e|`)` : vec|N|<i32> - Component-wise reinterpretation of bits.
- Component |i| of the result is `i32(`|e|`[`|i|`])`
- (OpBitcast) - -
|e| : vec|N|<f32> - `vec`|N|<`i32`>`(`|e|`)` : vec|N|<i32> - Component-wise value conversion to signed integer, including invalid cases.
- Component |i| of the result is `i32(`|e|`[`|i|`])`
- (OpConvertFToS) - -
|e| : vec|N|<i32> - `vec`|N|<`u32`>`(`|e|`)` : vec|N|<u32> - Component-wise reinterpretation of bits.
- Component |i| of the result is `u32(`|e|`[`|i|`])`
- (OpBitcast) - -
|e| : vec|N|<f32> - `vec`|N|<`u32`>`(`|e|`)` : vec|N|<u32> - Component-wise value conversion to unsigned integer, including invalid cases.
- Component |i| of the result is `u32(`|e|`[`|i|`])`
- (OpConvertFToU) - -
|e| : vec|N|<i32> - `vec`|N|<`f32`>`(`|e|`)` : vec|N|<f32> - Component-wise value conversion to floating point, including invalid cases.
- Component |i| of the result is `f32(`|e|`[`|i|`])`
- (OpConvertSToF) - -
|e| : vec|N|<u32> - `vec`|N|<`f32`>`(`|e|`)` : vec|N|<f32> - Component-wise value conversion to floating point, including invalid cases.
- Component |i| of the result is `f32(`|e|`[`|i|`])`
- (OpConvertUToF) - -
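A few of the scalar and vector conversion rules above, written out as a sketch. The function and identifier names are hypothetical; the values in the comments follow directly from the rules in the tables, and only exactly representable values are used so that no invalid conversion cases arise.

```wgsl
fn conversion_examples() {
  // u32 -> i32 keeps the bits: 4294967295 and -1 are equal mod 2^32.
  let all_ones : u32 = 4294967295u;
  let as_signed : i32 = i32(all_ones);        // -1

  // Coercion to bool: only zero maps to false.
  let from_int : bool = bool(-3);             // true
  let from_float : bool = bool(0.0);          // false

  // Integer -> floating point value conversion.
  let seven : f32 = f32(7u);                  // 7.0

  // Vector conversions apply the scalar rule to each component.
  let flags : vec3<bool> = vec3<bool>(vec3<i32>(0, 1, 2));   // (false, true, true)
  let whole : vec2<i32> = vec2<i32>(vec2<f32>(1.0, 2.0));    // (1, 2)
}
```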
- -## Reinterpretation of Representation Expressions ## {#bitcast-expr} - -A `bitcast` expression is used to reinterpret the bit representation of a -value in one type as a value in another type. - - - - - - - - - -
Scalar bitcast type rules
PreconditionConclusionNotes -
|e| : |T|,
- |T| is one of i32, u32, f32 -
bitcast<|T|>(|e|) : |T| - Identity transform.
- The result is |e|. -
In the SPIR-V translation, the ID of this expression reuses the ID of the operand. -
|e| : |T|,
- |T| is one of u32, f32 -
bitcast<i32>(|e|) : i32 - Reinterpretation of bits as a signed integer.
- The result is the reinterpretation of the 32 bits in the representation of |e| as an [=i32=] value. - (OpBitcast) -
|e| : |T|,
- |T| is one of i32, f32 -
bitcast<u32>(|e|) : u32 - Reinterpretation of bits as an unsigned integer.
- The result is the reinterpretation of the 32 bits in the representation of |e| as a [=u32=] value. - (OpBitcast) -
|e| : |T|,
- |T| is one of i32, u32 -
bitcast<f32>(|e|) : f32 - Reinterpretation of bits as a floating point value.
- The result is the reinterpretation of the 32 bits in the representation of |e| as an [=f32=] value. - (OpBitcast) -
- - - - - - - - - -
Vector bitcast type rules
PreconditionConclusionNotes -
|e| : vec|N|<|T|>,
- |T| is one of i32, u32, f32 -
bitcast<vec|N|<|T|>>(|e|) : vec|N|<|T|> - Identity transform.
- The result is |e|. -
In the SPIR-V translation, the ID of this expression reuses the ID of the operand. -
|e| : vec|N|<|T|>,
- |T| is one of u32, f32 -
bitcast<vec|N|<i32>>(|e|) : vec|N|<i32> - Component-wise reinterpretation of bits.
- Component |i| of the result is `bitcast(`|e|`[`|i|`])`
- (OpBitcast) -
|e| : vec|N|<|T|>,
- |T| is one of i32, f32 -
bitcast<vec|N|<u32>>(|e|) : vec|N|<u32> - Component-wise reinterpretation of bits.
- Component |i| of the result is `bitcast(`|e|`[`|i|`])`
- (OpBitcast) -
|e| : vec|N|<|T|>,
- |T| is one of i32, u32 -
bitcast<vec|N|<f32>>(|e|) : vec|N|<f32> - Component-wise reinterpretation of bits.
- Component |i| of the result is `bitcast(`|e|`[`|i|`])`
- (OpBitcast) - -
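The difference between a value conversion and a `bitcast` can be seen side by side in the following sketch. The function and identifier names are hypothetical. The bit patterns in the comments assume that `f32` uses the IEEE-754 binary32 representation, in which 1.0 is encoded as 0x3F800000.

```wgsl
fn bitcast_examples() {
  let one : f32 = 1.0;

  // Value conversion: the numeric value is preserved.
  let converted : i32 = i32(one);             // 1

  // Bitcast: the 32 bits are preserved and reread as an integer.
  let bits : i32 = bitcast<i32>(one);         // 1065353216 (0x3F800000)

  // Signed -> unsigned bitcast keeps the bit pattern.
  let minus_one : i32 = -1;
  let as_unsigned : u32 = bitcast<u32>(minus_one);   // 4294967295u

  // Vector bitcasts reinterpret each component independently.
  let lanes : vec2<u32> = bitcast<vec2<u32>>(vec2<f32>(0.0, 1.0));  // (0u, 1065353216u)
}
```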
- -## Composite Value Decomposition Expressions ## {#composite-value-decomposition-expr} - -### Vector Access Expression ### {#vector-access-expr} - -Accessing members of a vector can be done either using array subscripting (e.g. `a[2]`) or using a sequence of convenience names, each mapping to an element of the source vector. - -
    -
  • The colour set of convenience names: `r`, `g`, `b`, `a` for vector elements 0, 1, 2, and 3 respectively. -
  • The dimensional set of convenience names: `x`, `y`, `z`, `w` for vector elements 0, 1, 2, and 3, respectively. -
- -The convenience names are accessed using the `.` notation (e.g. `color.bgra`). - -NOTE: The convenience letterings cannot be mixed (i.e. you cannot use `rybw`). - -Using a convenience letter, or array subscript, which accesses an element past the end of the vector is an error. - -The convenience letterings can be applied in any order, including duplicating letters as needed. You can provide 1 to 4 letters when extracting components from a vector. Providing more than 4 letters is an error. - -The result type depends on the number of letters provided. Assuming a `vec4`: - - - -
AccessorResult type -
r`f32` -
rg`vec2` -
rgb`vec3` -
rgba`vec4` -
- -
- - var a : vec3<f32> = vec3<f32>(1., 2., 3.); - var b : f32 = a.y; // b = 2.0 - var c : vec2<f32> = a.bb; // c = (3.0, 3.0) - var d : vec3<f32> = a.zyx; // d = (3.0, 2.0, 1.0) - var e : f32 = a[1]; // e = 2.0 - -
- -#### Vector single component selection #### {#vector-single-component} - - - - - -
Vector decomposition: single component selection
PreconditionConclusionDescription -
|e| : vec|N|<|T|>
-
- |e|`.x` : |T|
- |e|`.r` : |T| -
Select the first component of |e|
- (OpCompositeExtract with selection index 0) -
|e| : vec|N|<|T|>
-
- |e|`.y` : |T|
- |e|`.g` : |T| -
Select the second component of |e|
- (OpCompositeExtract with selection index 1) -
|e| : vec|N|<|T|>
- |N| is 3 or 4 -
- |e|`.z` : |T|
- |e|`.b` : |T| -
Select the third component of |e|
- (OpCompositeExtract with selection index 2) -
|e| : vec4<|T|> - - |e|`.w` : |T|
- |e|`.a` : |T| -
Select the fourth component of |e|
- (OpCompositeExtract with selection index 3) -
|e| : vec|N|<|T|>
- |i| : *Int* -
- |e|[|i|] : |T| - Select the |i|'th component of vector
- The first component is at index |i|=0.
- If |i| is outside the range [0,|N|-1], then an index in the range [0, |N|-1] is used instead. - (OpVectorExtractDynamic) -
- -Issue: Which index is used when it's out of bounds? - -#### Vector multiple component selection #### {#vector-multi-component} - - - - - - - - - - -
Vector decomposition: multiple component selection -
PreconditionConclusionDescription -
- |e| : vec|N|<|T|>
- |I| is the letter `x`, `y`, `z`, or `w`
- |J| is the letter `x`, `y`, `z`, or `w`
-
- |e|`.`|I||J| : vec2<|T|>
-
Computes the two-element vector with first component |e|.|I|, and second component |e|.|J|.
- Letter `z` is valid only when |N| is 3 or 4.
- Letter `w` is valid only when |N| is 4.
- (OpVectorShuffle) -
- |e| : vec|N|<|T|>
- |I| is the letter `r`, `g`, `b`, or `a`
- |J| is the letter `r`, `g`, `b`, or `a`
-
- |e|`.`|I||J| : vec2<|T|>
-
Computes the two-element vector with first component |e|.|I|, and second component |e|.|J|.
- Letter `b` is valid only when |N| is 3 or 4.
- Letter `a` is valid only when |N| is 4.
- (OpVectorShuffle) -
- |e| : vec|N|<|T|>
- |I| is the letter `x`, `y`, `z`, or `w`
- |J| is the letter `x`, `y`, `z`, or `w`
- |K| is the letter `x`, `y`, `z`, or `w`
-
- |e|`.`|I||J||K| : vec3<|T|>
-
Computes the three-element vector with first component |e|.|I|, second component |e|.|J|, and third component |e|.|K|.
- Letter `z` is valid only when |N| is 3 or 4.
- Letter `w` is valid only when |N| is 4.
- (OpVectorShuffle) -
- |e| : vec|N|<|T|>
- |I| is the letter `r`, `g`, `b`, or `a`
- |J| is the letter `r`, `g`, `b`, or `a`
- |K| is the letter `r`, `g`, `b`, or `a`
-
- |e|`.`|I||J||K| : vec3<|T|>
-
Computes the three-element vector with first component |e|.|I|, second component |e|.|J|, and third component |e|.|K|.
- Letter `b` is only valid when |N| is 3 or 4.
- Letter `a` is only valid when |N| is 4.
- (OpVectorShuffle) -
- |e| : vec|N|<|T|>
- |I| is the letter `x`, `y`, `z`, or `w`
- |J| is the letter `x`, `y`, `z`, or `w`
- |K| is the letter `x`, `y`, `z`, or `w`
- |L| is the letter `x`, `y`, `z`, or `w`
-
- |e|`.`|I||J||K||L| : vec4<|T|>
-
Computes the four-element vector with first component |e|.|I|, second component |e|.|J|, third component |e|.|K|, and fourth component |e|.|L|.
- Letter `z` is valid only when |N| is 3 or 4.
- Letter `w` is valid only when |N| is 4.
- (OpVectorShuffle) -
- |e| : vec|N|<|T|>
- |I| is the letter `r`, `g`, `b`, or `a`
- |J| is the letter `r`, `g`, `b`, or `a`
- |K| is the letter `r`, `g`, `b`, or `a`
- |L| is the letter `r`, `g`, `b`, or `a`
-
- |e|`.`|I||J||K||L| : vec4<|T|>
-
Computes the four-element vector with first component |e|.|I|, second component |e|.|J|, third component |e|.|K|, and fourth component |e|.|L|.
- Letter `b` is only valid when |N| is 3 or 4.
- Letter `a` is only valid when |N| is 4.
- (OpVectorShuffle) -
- -#### Component reference from vector reference #### {#component-reference-from-vector-reference} - - - - - - - - - - -
Getting a reference to a component from a reference to a vector
PreconditionConclusionDescription -
|r| : ref<|SC|,vec|N|<|T|>>
-
- |r|`.x` : ref<|SC|,|T|>
- |r|`.r` : ref<|SC|,|T|>
-
Compute a reference to the first component of the vector referenced by the reference |r|.
- The [=originating variable=] of the resulting reference is - the same as the originating variable of |r|.
- (OpAccessChain with index value 0) -
|r| : ref<|SC|,vec|N|<|T|>>
-
- |r|`.y` : ref<|SC|,|T|>
- |r|`.g` : ref<|SC|,|T|>
-
Compute a reference to the second component of the vector referenced by the reference |r|.
- The [=originating variable=] of the resulting reference is - the same as the originating variable of |r|.
- (OpAccessChain with index value 1) -
|r| : ref<|SC|,vec|N|<|T|>>
- |N| is 3 or 4 -
- |r|`.z` : ref<|SC|,|T|>
- |r|`.b` : ref<|SC|,|T|>
-
Compute a reference to the third component of the vector referenced by the reference |r|.
- The [=originating variable=] of the resulting reference is - the same as the originating variable of |r|.
- (OpAccessChain with index value 2) -
|r| : ref<|SC|,vec4<|T|>>
-
- |r|`.w` : ref<|SC|,|T|>
- |r|`.a` : ref<|SC|,|T|>
-
Compute a reference to the fourth component of the vector referenced by the reference |r|.
- The [=originating variable=] of the resulting reference is - the same as the originating variable of |r|.
- (OpAccessChain with index value 3) -
|r| : ref<|SC|,vec|N|<|T|>>
- |i| : *Int* -
- |r|[|i|] : ref<|SC|,|T|> - Compute a reference to the |i|'th component of the vector referenced by the reference |r|.
- If |i| is outside the range [0,|N|-1], then an index in the range [0, |N|-1] is used instead.
- The [=originating variable=] of the resulting reference is - the same as the originating variable of |r|.
- (OpAccessChain) -
- -### Matrix Access Expression ### {#matrix-access-expr} - - - - - - -
Column vector extraction
PreconditionConclusionDescription -
- |e| : mat|N|x|M|<|T|>
- |i| : *Int* -
- |e|[|i|] : vec|M|<|T|> - The result is the |i|'th column vector of |e|.
- If |i| is outside the range [0,|N|-1], then an index in the range [0, |N|-1] is used instead.
- (OpCompositeExtract) -
- - - - - - -
Getting a reference to a column vector from a reference to a matrix
PreconditionConclusionDescription -
- |r| : ref<|SC|,mat|N|x|M|<|T|>>
- |i| : *Int* -
- |r|[|i|] : ref<|SC|,vec|M|<|T|>> - Compute a reference to the |i|'th column vector of the matrix referenced by the reference |r|.
- If |i| is outside the range [0,|N|-1], then an index in the range [0, |N|-1] is used instead.
- The [=originating variable=] of the resulting reference is - the same as the originating variable of |r|.
- (OpAccessChain) -
- -### Array Access Expression ### {#array-access-expr} - - - - - - -
Array element extraction
PreconditionConclusionDescription -
- |e| : array<|T|,|N|>
- |i| : *Int* -
- |e|[|i|] : |T| - The result is the value of the |i|'th element of the array value |e|.
- If |i| is outside the range [0,|N|-1], then an index in the range [0, |N|-1] is used instead.
- (OpCompositeExtract) -
- - - - - - - -
Getting a reference to an array element from a reference to an array
PreconditionConclusionDescription -
- |r| : ref<|SC|,array<|T|,|N|>>
- |i| : *Int* -
- |r|[|i|] : ref<|SC|,|T|> - Compute a reference to the |i|'th element of the array referenced by the reference |r|.
- If |i| is outside the range [0,|N|-1], then an index in the range [0, |N|-1] is used instead.
- The [=originating variable=] of the resulting reference is - the same as the originating variable of |r|.
- (OpAccessChain) -
|r| : ref<|SC|,array<|T|>>
- |i| : *Int* -
- |r|[|i|] : ref<|SC|,|T|> - Compute a reference to the |i|'th element of the runtime-sized array referenced by the reference |r|.
- If at runtime the array has |N| elements, and |i| is outside the range [0,|N|-1], then an index in the - range [0, |N|-1] is used instead.
- The [=originating variable=] of the resulting reference is - the same as the originating variable of |r|.
- (OpAccessChain) -
- -### Structure Access Expression ### {#struct-access-expr} - - - - - - -
Structure member extraction
PreconditionConclusionDescription -
- |S| is a structure type
- |M| is the identifier name of a member of |S|, having type |T|
- |e| : |S|
-
- |e|.|M| : |T| - The result is the value of the member with name |M| from the structure value |e|.
- (OpCompositeExtract, using the member index) -
- - - - - - -
Getting a reference to a structure member from a reference to a structure
PreconditionConclusionDescription -
- |S| is a structure type
- |M| is the name of a member of |S|, having type |T|
- |r| : ref<|SC|,|S|>
-
- |r|.|M| : ref<|SC|,|T|> - Given a reference to a structure, the result is a reference to the structure member with identifier name |M|.
- The [=originating variable=] of the resulting reference is - the same as the originating variable of |r|.
- (OpAccessChain, using the index of the structure member) -
- -## Logical Expressions TODO ## {#logical-expr} - - - - -
Unary logical operations
PreconditionConclusionNotes -
|e| : bool`!`|e| : *bool* - Logical negation. Yields true when |e| is false, and false when |e| is true.
(OpLogicalNot) -
|e| : vec|N|<bool>`!`|e| : vec|N|<bool> - Component-wise logical negation. Component |i| of the result is `!(`|e|`[`|i|`])`.
(OpLogicalNot) -
- - - - - -
Binary logical expressions
PreconditionConclusionNotes -
*e1* : bool
*e2* : bool
`e1 || e2` : bool - Short-circuiting "or". Yields `true` if either `e1` or `e2` is true; evaluates `e2` only if `e1` is false. -
*e1* : bool
*e2* : bool
`e1 && e2` : bool - Short-circuiting "and". Yields `true` if both `e1` and `e2` are true; evaluates `e2` only if `e1` is true. -
*e1* : bool
*e2* : bool
`e1 | e2` : bool - Logical "or". Evaluates both `e1` and `e2`; yields `true` if either is `true`. -
*e1* : bool
*e2* : bool
`e1 & e2` : bool - Logical "and". Evaluates both `e1` and `e2`; yields `true` if both are `true`. -
*e1* : *T*
*e2* : *T*
*T* is *BoolVec*
`e1 | e2` : *T*Component-wise logical "or" -
*e1* : *T*
*e2* : *T*
*T* is *BoolVec*
`e1 & e2` : *T*Component-wise logical "and" -
- - -## Arithmetic Expressions TODO ## {#arithmetic-expr} - - - - - -
Unary arithmetic expressions
PreconditionConclusionNotes -
*e* : *T*, *T* is *SignedIntegral*`-e` : *T*Signed integer negation. OpSNegate -
*e* : *T*, *T* is *Floating*`-e` : *T*Floating point negation. OpFNegate -
- - - - - -
Binary arithmetic expressions over scalars
PreconditionConclusionNotes -
*e1* : u32
*e2* : u32
`e1 + e2` : u32Integer addition, modulo 2<sup>32</sup> (OpIAdd) -
*e1* : i32
*e2* : i32
`e1 + e2` : i32Integer addition, modulo 2<sup>32</sup> (OpIAdd) -
*e1* : f32
*e2* : f32
`e1 + e2` : f32Floating point addition (OpFAdd) -
*e1* : u32
*e2* : u32
`e1 - e2` : u32Integer subtraction, modulo 2<sup>32</sup> (OpISub) -
*e1* : i32
*e2* : i32
`e1 - e2` : i32Integer subtraction, modulo 2<sup>32</sup> (OpISub) -
*e1* : f32
*e2* : f32
`e1 - e2` : f32Floating point subtraction (OpFSub) -
*e1* : u32
*e2* : u32
`e1 * e2` : u32Integer multiplication, modulo 2<sup>32</sup> (OpIMul) -
*e1* : i32
*e2* : i32
`e1 * e2` : i32Integer multiplication, modulo 2<sup>32</sup> (OpIMul) -
*e1* : f32
*e2* : f32
`e1 * e2` : f32Floating point multiplication (OpFMul) -
*e1* : u32
*e2* : u32
`e1 / e2` : u32Unsigned integer division (OpUDiv) -
*e1* : i32
*e2* : i32
`e1 / e2` : i32Signed integer division (OpSDiv) -
*e1* : f32
*e2* : f32
`e1 / e2` : f32Floating point division (OpFDiv) -
*e1* : u32
*e2* : u32
`e1 % e2` : u32Unsigned integer modulus (OpUMod) -
*e1* : i32
*e2* : i32
`e1 % e2` : i32Signed integer remainder, where sign of non-zero result matches sign of *e2* (OpSMod) -
*e1* : f32
*e2* : f32
`e1 % e2` : f32Floating point modulus, where sign of non-zero result matches sign of *e2* (OpFMod) -
- - - - - -
Binary arithmetic expressions over vectors
PreconditionConclusionNotes -
*e1* : *T*
*e2* : *T*
*T* is *IntVec*
`e1 + e2` : *T*Component-wise integer addition (OpIAdd) -
*e1* : *T*
*e2* : *T*
*T* is *FloatVec*
`e1 + e2` : *T*Component-wise floating point addition (OpFAdd) -
*e1* : *T*
*e2* : *T*
*T* is *IntVec*
`e1 - e2` : *T*Component-wise integer subtraction (OpISub) -
*e1* : *T*
*e2* : *T*
*T* is *FloatVec*
`e1 - e2` : *T*Component-wise floating point subtraction (OpFSub) -
*e1* : *T*
*e2* : *T*
*T* is *IntVec*
`e1 * e2` : *T*Component-wise integer multiplication (OpIMul) -
*e1* : *T*
*e2* : *T*
*T* is *FloatVec*
`e1 * e2` : *T*Component-wise floating point multiplication (OpFMul) -
*e1* : *T*
*e2* : *T*
*T* is *IntVec* with unsigned component
`e1 / e2` : *T*Component-wise unsigned integer division (OpUDiv) -
*e1* : *T*
*e2* : *T*
*T* is *IntVec* with signed component
`e1 / e2` : *T*Component-wise signed integer division (OpSDiv) -
*e1* : *T*
*e2* : *T*
*T* is *FloatVec*
`e1 / e2` : *T*Component-wise floating point division (OpFDiv) -
*e1* : *T*
*e2* : *T*
*T* is *IntVec* with unsigned component
`e1 % e2` : *T*Component-wise unsigned integer modulus (OpUMod) -
*e1* : *T*
*e2* : *T*
*T* is *IntVec* with signed component
`e1 % e2` : *T*Component-wise signed integer remainder (OpSMod) -
*e1* : *T*
*e2* : *T*
*T* is *FloatVec*
`e1 % e2` : *T*Component-wise floating point modulus (OpFMod) -
- - - - - - - - - - - - - - - - -
Binary arithmetic expressions with mixed scalar and vector operands
PreconditionsConclusionsSemantics -
|S| is one of f32, i32, u32
- |V| is vec|N|<|S|>
- |es|: |S|
- |ev|: |V| -
|ev| `+` |es|: |V| - |ev| `+` |V|(|es|) -
|es| `+` |ev|: |V| - |V|(|es|) `+` |ev| -
|ev| `-` |es|: |V| - |ev| `-` |V|(|es|) -
|es| `-` |ev|: |V| - |V|(|es|) `-` |ev| -
|ev| `*` |es|: |V| - |ev| `*` |V|(|es|) -
|es| `*` |ev|: |V| - |V|(|es|) `*` |ev| -
|ev| `/` |es|: |V| - |ev| `/` |V|(|es|) -
|es| `/` |ev|: |V| - |V|(|es|) `/` |ev| -
|S| is one of i32, u32
- |V| is vec|N|<|S|>
- |es|: |S|
- |ev|: |V| -
|ev| `%` |es|: |V| - |ev| `%` |V|(|es|) -
|es| `%` |ev|: |V| - |V|(|es|) `%` |ev| -
- - - - - - - - - - - - -
Matrix arithmetic
PreconditionsConclusionsSemantics -
|e1|, |e2|: mat|M|x|N|<f32> - |e1| `+` |e2|: mat|M|x|N|<f32>
-
Matrix addition: column |i| of the result is |e1|[|i|] + |e2|[|i|] -
|e1| `-` |e2|: mat|M|x|N|<f32> - Matrix subtraction: column |i| of the result is |e1|[|i|] - |e2|[|i|] -
|m|: mat|M|x|N|<f32>
- |s|: f32 -
|m| `*` |s| : mat|M|x|N|<f32>
-
Component-wise scaling: (|m| `*` |s|)[|i|][|j|] is |m|[|i|][|j|] `*` |s| -
|s| `*` |m| : mat|M|x|N|<f32>
-
Component-wise scaling: (|s| `*` |m|)[|i|][|j|] is |m|[|i|][|j|] `*` |s| -
|m|: mat|M|x|N|<f32>
- |v|: vec|M|<f32> -
|m| `*` |v| : vec|N|<f32>
-
Linear algebra matrix-column-vector product: - Component |i| of the result is `dot`(transpose(|m|)[|i|],|v|) -
OpMatrixTimesVector -
- |m|: mat|M|x|N|<f32>
- |v|: vec|N|<f32> -
|v| `*` |m| : vec|M|<f32>
-
Linear algebra row-vector-matrix product:
- [=transpose=](transpose(|m|) `*` transpose(|v|)) -
OpVectorTimesMatrix -
|e1|: mat|K|x|N|<f32>
- |e2|: mat|M|x|K|<f32> -
|e1| `*` |e2| : mat|M|x|N|<f32>
-
Linear algebra matrix product.
OpMatrixTimesMatrix - -
- -## Comparison Expressions TODO ## {#comparison-expr} - - - - - -
Comparisons over scalars
PreconditionConclusionNotes -
*e1* : bool
- *e2* : bool
-
`e1 == e2` : bool - Equality (OpLogicalEqual) -
*e1* : bool
- *e2* : bool
-
`e1 != e2` : bool - Inequality (OpLogicalNotEqual) -
*e1* : i32
- *e2* : i32
-
`e1 == e2` : bool - Equality (OpIEqual) -
*e1* : i32
- *e2* : i32
-
`e1 != e2` : bool - Inequality (OpINotEqual) -
*e1* : i32
- *e2* : i32
-
`e1 < e2` : bool - Less than (OpSLessThan) -
*e1* : i32
- *e2* : i32
-
`e1 <= e2` : bool - Less than or equal (OpSLessThanEqual) -
*e1* : i32
- *e2* : i32
-
`e1 >= e2` : bool - Greater than or equal (OpSGreaterThanEqual) -
*e1* : i32
- *e2* : i32
-
`e1 > e2` : bool - Greater than (OpSGreaterThan) -
*e1* : u32
- *e2* : u32
-
`e1 == e2` : bool - Equality (OpIEqual) -
*e1* : u32
- *e2* : u32
-
`e1 != e2` : bool - Inequality (OpINotEqual) -
*e1* : u32
- *e2* : u32
-
`e1 < e2` : bool - Less than (OpULessThan) -
*e1* : u32
- *e2* : u32
-
`e1 <= e2` : bool - Less than or equal (OpULessThanEqual) -
*e1* : u32
- *e2* : u32
-
`e1 >= e2` : bool - Greater than or equal (OpUGreaterThanEqual) -
*e1* : u32
- *e2* : u32
-
`e1 > e2` : bool - Greater than (OpUGreaterThan) -
*e1* : f32
- *e2* : f32
-
`e1 == e2` : bool - Equality (OpFOrdEqual) -
*e1* : f32
- *e2* : f32
-
`e1 != e2` : bool - Inequality (OpFOrdNotEqual) -
*e1* : f32
- *e2* : f32
-
`e1 < e2` : bool - Less than (OpFOrdLessThan) -
*e1* : f32
- *e2* : f32
-
`e1 <= e2` : bool - Less than or equal (OpFOrdLessThanEqual) -
*e1* : f32
- *e2* : f32
-
`e1 >= e2` : bool - Greater than or equal (OpFOrdGreaterThanEqual) -
*e1* : f32
- *e2* : f32
-
`e1 > e2` : bool - Greater than (OpFOrdGreaterThan) -
- - - - - -
Comparisons over vectors
PreconditionConclusionNotes -
*e1* : *T*
- *e2* : *T*
- *T* is vec*N*<bool> -
`e1 == e2` : vec*N*<bool> - Component-wise equality
- Component |i| of the result is `(`|e1|`[`|i|`] == `|e2|`[`|i|`])`
- (OpLogicalEqual) -
*e1* : *T*
- *e2* : *T*
- *T* is vec*N*<bool> -
`e1 != e2` : vec*N*<bool> - Component-wise inequality
- Component |i| of the result is `(`|e1|`[`|i|`] != `|e2|`[`|i|`])`
- (OpLogicalNotEqual) -
*e1* : *T*
- *e2* : *T*
- *T* is vec*N*<i32> -
`e1 == e2` : vec*N*<bool> - Component-wise equality (OpIEqual) -
*e1* : *T*
- *e2* : *T*
- *T* is vec*N*<i32> -
`e1 != e2` : vec*N*<bool> - Component-wise inequality (OpINotEqual) -
*e1* : *T*
- *e2* : *T*
- *T* is vec*N*<i32> -
`e1 < e2` : vec*N*<bool> - Component-wise less than (OpSLessThan) -
*e1* : *T*
- *e2* : *T*
- *T* is vec*N*<i32> -
`e1 <= e2` : vec*N*<bool> - Component-wise less than or equal (OpSLessThanEqual) -
*e1* : *T*
- *e2* : *T*
- *T* is vec*N*<i32> -
`e1 >= e2` : vec*N*<bool> - Component-wise greater than or equal (OpSGreaterThanEqual) -
*e1* : *T*
- *e2* : *T*
- *T* is vec*N*<i32> -
`e1 > e2` : vec*N*<bool> - Component-wise greater than (OpSGreaterThan) -
*e1* : *T*
- *e2* : *T*
- *T* is vec*N*<u32> -
`e1 == e2` : vec*N*<bool> - Component-wise equality (OpIEqual) -
*e1* : *T*
- *e2* : *T*
- *T* is vec*N*<u32> -
`e1 != e2` : vec*N*<bool> - Component-wise inequality (OpINotEqual) -
*e1* : *T*
- *e2* : *T*
- *T* is vec*N*<u32> -
`e1 < e2` : vec*N*<bool> - Component-wise less than (OpULessThan) -
*e1* : *T*
- *e2* : *T*
- *T* is vec*N*<u32> -
`e1 <= e2` : vec*N*<bool> - Component-wise less than or equal (OpULessThanEqual) -
*e1* : *T*
- *e2* : *T*
- *T* is vec*N*<u32> -
`e1 >= e2` : vec*N*<bool> - Component-wise greater than or equal (OpUGreaterThanEqual) -
*e1* : *T*
- *e2* : *T*
- *T* is vec*N*<u32> -
`e1 > e2` : vec*N*<bool> - Component-wise greater than (OpUGreaterThan) -
*e1* : *T*
- *e2* : *T*
- *T* is vec*N*<f32> -
`e1 == e2` : vec*N*<bool> - Component-wise equality (OpFOrdEqual) -
*e1* : *T*
- *e2* : *T*
- *T* is vec*N*<f32> -
`e1 != e2` : vec*N*<bool> - Component-wise inequality (OpFOrdNotEqual) -
*e1* : *T*
- *e2* : *T*
- *T* is vec*N*<f32> -
`e1 < e2` : vec*N*<bool> - Component-wise less than (OpFOrdLessThan) -
*e1* : *T*
- *e2* : *T*
- *T* is vec*N*<f32> -
`e1 <= e2` : vec*N*<bool> - Component-wise less than or equal (OpFOrdLessThanEqual) -
*e1* : *T*
- *e2* : *T*
- *T* is vec*N*<f32> -
`e1 >= e2` : vec*N*<bool> - Component-wise greater than or equal (OpFOrdGreaterThanEqual) -
*e1* : *T*
- *e2* : *T*
- *T* is vec*N*<f32> -
`e1 > e2` : vec*N*<bool> - Component-wise greater than (OpFOrdGreaterThan) -
- -## Bit Expressions TODO ## {#bit-expr} - - - - - - - - - -
Unary bitwise operations
PreconditionConclusionNotes -
|e| : u32
-
`~`|e| : u32 - Bitwise complement on unsigned integers. Result is the mathematical value (2<sup>32</sup> - 1 - |e|). -
OpNot -
|e| : vec|N|<u32> - `~`|e| : vec|N|<u32> - Component-wise unsigned complement. Component |i| of the result is `~(`|e|`[`|i|`])`. -
OpNot -
|e| : i32
-
`~`|e| : i32 - Bitwise complement on signed integers. Result is i32(~u32(|e|)). -
OpNot -
|e| : vec|N|<i32> - `~`|e| : vec|N|<i32> - Component-wise signed complement. Component |i| of the result is `~(`|e|`[`|i|`])`. -
OpNot -
- - - - - -
Binary bitwise operations
PreconditionConclusionNotes -
*e1* : *T*
- *e2* : *T*
- *T* is *Integral* -
`e1 | e2` : *T* - Bitwise-or -
*e1* : *T*
- *e2* : *T*
- *T* is *Integral* -
`e1 & e2` : *T* - Bitwise-and -
*e1* : *T*
- *e2* : *T*
- *T* is *Integral* -
`e1 ^ e2` : *T* - Bitwise-exclusive-or -
- - - - - - -
Bit shift expressions
PreconditionConclusionNotes -
|e1| : |T|
- |e2| : u32
- |T| is *Int* -
|e1| `<<` |e2| : |T| - Shift left:
- Shift |e1| left, inserting zero bits at the least significant positions, - and discarding the most significant bits. - The number of bits to shift is the value of |e2| modulo the bit width of |e1|.
- (OpShiftLeftLogical) -
|e1| : vec|N|<|T|>
- |e2| : vec|N|<u32>
- |T| is *Int* -
|e1| `<<` |e2| : vec|N|<|T|> - Component-wise shift left:
- Component |i| of the result is `(`|e1|`[`|i|`] << `|e2|`[`|i|`])`
- (OpShiftLeftLogical) -
|e1| : u32
- |e2| : u32
-
|e1| `>>` |e2| `: u32` - Logical shift right:
- Shift |e1| right, inserting zero bits at the most significant positions, - and discarding the least significant bits. - The number of bits to shift is the value of |e2| modulo the bit width of |e1|. - (OpShiftRightLogical) -
|e1| : vec|N|<u32>
- |e2| : vec|N|<u32>
-
|e1| `>>` |e2| : vec|N|<u32> - Component-wise logical shift right:
- Component |i| of the result is `(`|e1|`[`|i|`] >> `|e2|`[`|i|`])` - (OpShiftRightLogical) -
|e1| : i32
- |e2| : u32
-
|e1| `>>` |e2| : i32 - Arithmetic shift right:
- Shift |e1| right, copying the sign bit of |e1| into the most significant positions, - and discarding the least significant bits. - The number of bits to shift is the value of |e2| modulo the bit width of |e1|. - (OpShiftRightArithmetic) -
|e1| : vec|N|<i32>
- |e2| : vec|N|<u32>
-
|e1| `>>` |e2| : vec|N|<i32> - Component-wise arithmetic shift right:
- Component |i| of the result is `(`|e1|`[`|i|`] >> `|e2|`[`|i|`])` - (OpShiftRightArithmetic) -
- -## Function Call Expression TODO ## {#function-call-expr} - -TODO: *Stub*. Call to function that has a [=return type=] is an expression. - -## Variable Identifier Expression ## {#var-identifier-expr} - - - - - - -
Getting a reference from a variable name
PreconditionConclusionDescription -
- |v| is an identifier [=resolves|resolving=] to - an [=in scope|in-scope=] variable declared in [=storage class=] |SC| - with [=store type=] |T| - - |v| : ref<|SC|,|T|> - Result is a reference to the storage for the named variable |v|. -
- -## Formal Parameter Expression ## {#formal-parameter-expr} - - - - - - -
Getting the value of an identifier declared as a formal parameter to a function
PreconditionConclusionDescription -
- |a| is an identifier [=resolves|resolving=] to - an [=in scope|in-scope=] formal parameter declaration with type |T| - - |a| : |T| - Result is the value supplied for the corresponding function call operand at the call site - invoking this instance of the function. -
- -## Address-Of Expression ## {#address-of-expr} - -The address-of operator converts a reference to its corresponding pointer. - - - - - - -
Getting a pointer from a reference
PreconditionConclusionDescription -
- |r| : ref<|SC|,|T|> - - `&`|r| : ptr<|SC|,|T|> - Result is the pointer value corresponding to the - same [=memory view=] as the reference value |r|. -
- -## Indirection Expression ## {#indirection-expr} - -The indirection operator converts a pointer to its corresponding reference. - - - - - - -
Getting a reference from a pointer
PreconditionConclusionDescription -
- |p| : ptr<|SC|,|T|> - - `*`|p| : ref<|SC|,|T|> - Result is the reference value corresponding to the - same [=memory view=] as the pointer value |p|. -
- -## Constant Identifier Expression ## {#constant-identifier-expr} - - - - - - - -
Getting the value of a `let`-declared identifier
PreconditionConclusionDescription -
- |c| is an identifier [=resolves|resolving=] to - an [=in scope|in-scope=] [=pipeline-overridable=] `let` declaration with type |T| - - |c| : |T| - If pipeline creation specified a value for the [=pipeline constant ID|constant ID=], - then the result is that value. - This value may be different for different pipeline instances.
- Otherwise, the result is the value computed for the initializer expression. - Pipeline-overridable constants appear at module-scope, so evaluation occurs - before the shader begins execution.
- Note: Pipeline creation fails if no initial value was specified in the API call - and the `let`-declaration has no initializer expression. -
- |c| is an identifier [=resolves|resolving=] to - an [=in scope|in-scope=] `let` declaration with type |T|, - and is not pipeline-overridable - - |c| : |T| - Result is the value computed for the initializer expression.
- For a `let` declaration at module scope, evaluation occurs before the shader begins execution.
- For a `let` declaration inside a function, evaluation occurs each time control reaches - the declaration.
-
- - -## Expression Grammar Summary ## {#expression-grammar} - -
-primary_expression
-  : IDENT argument_expression_list?
-  | type_decl argument_expression_list
-  | const_literal
-  | paren_rhs_statement
-  | BITCAST LESS_THAN type_decl GREATER_THAN paren_rhs_statement
-      OpBitcast
-
-argument_expression_list
-  : PAREN_LEFT ((short_circuit_or_expression COMMA)* short_circuit_or_expression)? PAREN_RIGHT
-
-postfix_expression
-  :
-  | BRACKET_LEFT short_circuit_or_expression BRACKET_RIGHT postfix_expression
-  | PERIOD IDENT postfix_expression
-
-unary_expression
-  : singular_expression
-  | MINUS unary_expression
-      OpSNegate
-      OpFNegate
-  | BANG unary_expression
-      OpLogicalNot
-  | TILDE unary_expression
-      OpNot
-  | STAR unary_expression
-  | AND unary_expression
-
-singular_expression
-  : primary_expression postfix_expression
-
-multiplicative_expression
-  : unary_expression
-  | multiplicative_expression STAR unary_expression
-      OpVectorTimesScalar
-      OpMatrixTimesScalar
-      OpVectorTimesMatrix
-      OpMatrixTimesVector
-      OpMatrixTimesMatrix
-      OpIMul
-      OpFMul
-  | multiplicative_expression FORWARD_SLASH unary_expression
-      OpUDiv
-      OpSDiv
-      OpFDiv
-  | multiplicative_expression MODULO unary_expression
-      OpUMod
-      OpSMod
-      OpFMod
-
-additive_expression
-  : multiplicative_expression
-  | additive_expression PLUS multiplicative_expression
-      OpIAdd
-      OpFAdd
-  | additive_expression MINUS multiplicative_expression
-      OpFSub
-      OpISub
-
-shift_expression
-  : additive_expression
-  | shift_expression SHIFT_LEFT additive_expression
-        OpShiftLeftLogical
-  | shift_expression SHIFT_RIGHT additive_expression
-        OpShiftRightLogical or OpShiftRightArithmetic
-
-relational_expression
-  : shift_expression
-  | relational_expression LESS_THAN shift_expression
-        OpULessThan
-        OpFOrdLessThan
-  | relational_expression GREATER_THAN shift_expression
-        OpUGreaterThan
-        OpFOrdGreaterThan
-  | relational_expression LESS_THAN_EQUAL shift_expression
-        OpULessThanEqual
-        OpFOrdLessThanEqual
-  | relational_expression GREATER_THAN_EQUAL shift_expression
-        OpUGreaterThanEqual
-        OpFOrdGreaterThanEqual
-
-equality_expression
-  : relational_expression
-  | relational_expression EQUAL_EQUAL relational_expression
-        OpIEqual
-        OpFOrdEqual
-  | relational_expression NOT_EQUAL relational_expression
-        OpINotEqual
-        OpFOrdNotEqual
-
-and_expression
-  : equality_expression
-  | and_expression AND equality_expression
-
-exclusive_or_expression
-  : and_expression
-  | exclusive_or_expression XOR and_expression
-
-inclusive_or_expression
-  : exclusive_or_expression
-  | inclusive_or_expression OR exclusive_or_expression
-
-short_circuit_and_expression
-  : inclusive_or_expression
-  | short_circuit_and_expression AND_AND inclusive_or_expression
-
-short_circuit_or_expression
-  : short_circuit_and_expression
-  | short_circuit_or_expression OR_OR short_circuit_and_expression
-
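As a reading aid for the grammar summary above, the following sketch shows how a few unparenthesized expressions group under the productions as they are written in this section. The function name `grouping` and its parameters are hypothetical, and other revisions of the grammar may require explicit parentheses for some of these combinations.

```wgsl
// Hypothetical function; it exists only to hold the sample expressions.
fn grouping(a : i32, b : i32, c : i32) -> bool {
  // multiplicative_expression binds tighter than additive_expression.
  let sum : i32 = a + b * c;              // groups as a + (b * c)

  // additive_expression binds tighter than shift_expression.
  let shifted : u32 = 1u << 2u + 1u;      // groups as 1u << (2u + 1u)

  // relational_expression binds tighter than and_expression.
  let in_order : bool = a < b & b < c;    // groups as (a < b) & (b < c)

  // short_circuit_and_expression binds tighter than short_circuit_or_expression.
  // Groups as in_order || ((sum > 0) && (shifted == 8u)).
  return in_order || sum > 0 && shifted == 8u;
}
```

Writing the parentheses explicitly yields the same grouping and is usually easier to read.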
- - -# Statements TODO # {#statements} - -## Compound Statement ## {#compound-statement} - -A compound statement is a brace-enclosed group of zero or more statements. -When a declaration is one of those statements, its identifier is [=in scope=] -from the start of the next statement until the end of the compound statement. - -
-compound_statement
-  : BRACE_LEFT statements BRACE_RIGHT
-
- -## Assignment Statement ## {#assignment} - -An assignment statement replaces the contents of a variable, -or a portion of a variable, with a new value. - -The -expression to the left of the equals token is the left-hand side, -and the -expression to the right of the equals token is the right-hand side. - - - - - -
PreconditionStatementDescription -
|r| : ref<|SC|,|T|>,
- |e| : |T|,
- |T| is [=storable=],
- |SC| is a writable [=storage class=] -
|r| = |e|; - Evaluates |e|, evaluates |r|, then writes the value computed for |e| into - the [=memory locations=] referenced by |r|.
- (OpStore) -
- -The [=originating variable=] of the left-hand side must not have an `access(read)` access attribute. - -In the simplest case, the left hand side of the assignment statement is the -name of a variable. See [[#forming-references-and-pointers]] for other cases. - -
- - struct S { - age: i32; - weight: f32; - }; - var<private> person : S; - - fn f() { - var a: i32 = 20; - a = 30; // Replace the contents of 'a' with 30. - - person.age = 31; // Write 31 into the age field of the person variable. - - var uv: vec2<f32>; - uv.y = 1.25; // Place 1.25 into the second component of uv. - - let uv_x_ptr: ptr<function,f32> = &uv.x; - *uv_x_ptr = 2.5; // Place 2.5 into the first component of uv. - - var friend : S; - // Copy the contents of the 'person' variable into the 'friend' variable. - friend = person; - } - -
- -
-assignment_statement
-  : singular_expression EQUAL short_circuit_or_expression
-      If singular_expression is a variable, this maps to OpStore to the variable.
-      Otherwise, singular expression is a pointer expression in an Assigning (L-value) context
-      which maps to OpAccessChain followed by OpStore
-
- -## Control flow TODO ## {#control-flow} - -### Sequence TODO ### {#sequence-statement} - -### If/elseif/else Statement TODO ### {#if-statement} - -
-if_statement
-  : IF paren_rhs_statement compound_statement elseif_statement? else_statement?
-
-elseif_statement
-  : ELSE_IF paren_rhs_statement compound_statement elseif_statement?
-
-else_statement
-  : ELSE compound_statement
-
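The grammar above can be illustrated with a minimal sketch. It assumes the conventional meaning of `if`, `elseif`, and `else` (the first clause whose condition holds is executed), which this section does not yet spell out. The function name `sign_of` is hypothetical.

```wgsl
// Hypothetical helper: returns -1, 0, or 1 depending on the sign of x.
fn sign_of(x : i32) -> i32 {
  var result : i32 = 0;
  if (x < 0) {            // IF paren_rhs_statement compound_statement
    result = -1;
  } elseif (x == 0) {     // elseif_statement
    result = 0;
  } else {                // else_statement
    result = 1;
  }
  return result;
}
```

Each clause body is a brace-enclosed compound statement, so the braces are required even for a single statement.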
- - -### Switch Statement ### {#switch-statement} - -
-switch_statement
-  : SWITCH paren_rhs_statement BRACE_LEFT switch_body+ BRACE_RIGHT
-
-switch_body
-  : CASE case_selectors COLON BRACE_LEFT case_body BRACE_RIGHT
-  | DEFAULT COLON BRACE_LEFT case_body BRACE_RIGHT
-
-case_selectors
-  : const_literal (COMMA const_literal)*
-
-case_body
-  :
-  | statement case_body
-  | FALLTHROUGH SEMICOLON
-
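The following sketch illustrates the grammar above together with the rules stated in this section: an integer selector, literal case selectors of the same type, exactly one `default` clause, and a `fallthrough` that is not in the last clause. The function name `bucket` and the chosen values are hypothetical.

```wgsl
// Hypothetical helper: maps a selector value to a result code.
fn bucket(x : i32) -> i32 {
  var result : i32 = 0;
  switch (x) {
    case 0: {
      result = 10;        // Control then leaves the switch.
    }
    case 1, 2: {          // One clause may list several selectors.
      result = 20;
      fallthrough;        // Continue into the body of the next clause.
    }
    case 3: {
      result = result + 1;  // Reached for x == 3, and for x == 1 or 2 via fallthrough.
    }
    default: {
      result = -1;        // Taken when no case selector matches.
    }
  }
  return result;
}
```

Under these rules, `bucket(1)` would yield 21: the `case 1, 2` body stores 20, and the `fallthrough` then runs the `case 3` body.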
- -A switch statement transfers control to one of a set of case clauses, or to the `default` clause, -depending on the evaluation of a selector expression. - -The selector expression must be of a scalar integer type. -If the selector value equals a value in a case selector list, then control is transferred to -the body of that case clause. -If the selector value does not equal any of the case selector values, then control is -transferred to the `default` clause. - -Each switch statement must have exactly one default clause. - -The case selector values must have the same type as the selector expression. - -A literal value must not appear more than once in the case selectors for a switch statement. - -Note: The value of the literal is what matters, not the spelling. -For example `0`, `00`, and `0x0000` all denote the zero value. - -When control reaches the end of a case body, control normally transfers to the first statement -after the switch statement. -Alternately, executing a `fallthrough` statement transfers control to the body of the next case clause or -default clause, whichever appears next in the switch body. -A `fallthrough` statement must not appear as the last statement in the last clause of a switch. -When a declaration appears in a case body, its identifier is [=in scope=] from -the start of the next statement until the end of the case body. - -Note: Identifiers declared in a case body are not [=in scope=] of case bodies -which are reachable via a `fallthrough` statement. - - -### Loop Statement ### {#loop-statement} - -
-loop_statement
-  : LOOP BRACE_LEFT statements continuing_statement? BRACE_RIGHT
-
- -The loop body is a special form of [compound -statement](#compound-statement) that executes repeatedly. -Each execution of the loop body is called an iteration. - -The identifier of a declaration in a loop is [=in scope=] from the start of the -next statement until the end of the loop body. -The declaration is executed each time it is reached, so each new iteration -creates a new instance of the variable or constant, and re-initializes it. - -This repetition can be interrupted by a [[#break-statement]], `return`, or -`discard`. - -Optionally, the last statement in the loop body may be a -[[#continuing-statement]]. - -Note: The loop statement is one of the biggest differences from other shader -languages. - -This design directly expresses loop idioms commonly found in compiled code. -In particular, placing the loop update statements at the end of the loop body -allows them to naturally use values defined in the loop body. - -
- - int a = 2; - for (int i = 0; i < 4; i++) { - a *= 2; - } - -
- -
- - var a : i32 = 2; - var i : i32 = 0; // <1> - loop { - if (i >= 4) { break; } - - a = a * 2; - - i = i + 1; - } - -
-* <1> The initialization is listed before the loop. - -
- - int a = 2; - const int step = 1; - for (int i = 0; i < 4; i += step) { - if (i % 2 == 0) continue; - a *= 2; - } - -
- -
- - var a : i32 = 2; - var i : i32 = 0; - loop { - if (i >= 4) { break; } - - let step : i32 = 1; - - i = i + 1; - if (i % 2 == 0) { continue; } - - a = a * 2; - } - -
- -
- - var a : i32 = 2; - var i : i32 = 0; - loop { - if (i >= 4) { break; } - - let step : i32 = 1; - - if (i % 2 == 0) { continue; } - - a = a * 2; - - continuing { // <2> - i = i + step; - } - } - -
-* <2> The `continuing` construct is placed at the end of the `loop` - -### For Statement ### {#for-statement} - -
-for_statement
-  : FOR PAREN_LEFT for_header PAREN_RIGHT compound_statement
-
-for_header
-  : (variable_statement | assignment_statement | func_call_statement)? SEMICOLON
-     short_circuit_or_expression? SEMICOLON
-     (assignment_statement | func_call_statement)?
-
- -The `for(initializer; condition; continuing) { body }` statement is syntactic sugar on top of a [[#loop-statement]] with the same `body`. Additionally: -* If `initializer` is non-empty, it is executed inside an additional scope before the first iteration. -* If `condition` is non-empty, it is checked at the beginning of the loop body and if unsatisfied then a [[#break-statement]] is executed. -* If `continuing` is non-empty, it becomes a [[#continuing-statement]] at the end of the loop body. - -The `initializer` of a for loop is executed once prior to executing the loop. -When a declaration appears in the initializer, its identifier is [=in scope=] until the end of the `body`. -Unlike declarations in the `body`, the declaration is not re-initialized each iteration. - -The `condition`, `body` and `continuing` execute in that order to form a loop [=iteration=]. -The `body` is a special form of [compound statement](#compound-statement). -The identifier of a declaration in the `body` is [=in scope=] from the start of -the next statement until the end of the `body`. -The declaration is executed each time it is reached, so each new iteration -creates a new instance of the variable or constant, and re-initializes it. - -
- - for(var i : i32 = 0; i < 4; i = i + 1) { - if (a == 0) { - continue; - } - a = a + 2; - } - -
- -Converts to: - -
- - { // Introduce new scope for loop variable i - var i : i32 = 0; - var a : i32 = 0; - loop { - if (!(i < 4)) { - break; - } - - if (a == 0) { - continue; - } - a = a + 2; - - continuing { - i = i + 1; - } - } - } - -
- - -### Break ### {#break-statement} - -
-break_statement
-  : BREAK
-
- -Use a `break` statement to transfer control to the first statement -after the body of the nearest-enclosing [[#loop-statement]] -or [[#switch-statement]]. - -When a `break` statement is placed such that it would exit from a loop's [[#continuing-statement]], -then: - -* The `break` statement must appear as either: - * The only statement in the true-branch clause of an `if` that has: - * no `else` clause or an empty `else` clause - * no `elseif` clauses - * The only statement in the `else` clause of an `if` that has an empty true-branch clause and no `elseif` clauses. -* That `if` statement must appear last in the `continuing` clause. - -
- - var a : i32 = 2; - var i : i32 = 0; - loop { - let step : i32 = 1; - - if (i % 2 == 0) { continue; } - - a = a * 2; - - continuing { - i = i + step; - if (i >= 4) { break; } - } - } - -
- -
- - var a : i32 = 2; - var i : i32 = 0; - loop { - let step : i32 = 1; - - if (i % 2 == 0) { continue; } - - a = a * 2; - - continuing { - i = i + step; - if (i < 4) {} else { break; } - } - } - -
- -
- - var a : i32 = 2; - var i : i32 = 0; - - loop { - let step : i32 = 1; - - if (i % 2 == 0) { continue; } - - a = a * 2; - - continuing { - i = i + step; - break; // Invalid: too early - if (i < 4) { i = i + 1; } else { break; } // Invalid: if is too complex, and too early - if (i >= 4) { break; } else { i = i + 1; } // Invalid: if is too complex - } - } - -
- -### Continue ### {#continue-statement} - -
-continue_statement
-  : CONTINUE
-
- -Use a `continue` statement to transfer control in the nearest-enclosing [[#loop-statement]]: - -* forward to the [[#continuing-statement]] at the end of the body of that loop, if it exists. -* otherwise backward to the first statement in the loop body, starting the next iteration. - -A `continue` statement must not be placed such that it would transfer -control to an enclosing [[#continuing-statement]]. -(It is a *forward* branch when branching to a `continuing` statement.) - -A `continue` statement must not be placed such that it would transfer -control past a declaration used in the targeted continuing construct. - -
- - var i : i32 = 0; - loop { - if (i >= 4) { break; } - if (i % 2 == 0) { continue; } // <3> - - let step : i32 = 2; - - continuing { - i = i + step; - } - } - -
-* <3> The `continue` is invalid because it bypasses the declaration of `step` used in the `continuing` construct - -### Continuing Statement ### {#continuing-statement} - -
-continuing_statement
-  : CONTINUING compound_statement
-
- -A *continuing* construct is a block of statements to be executed at the end of a loop iteration. -The construct is optional. - -The block of statements must not contain a return or discard statement. - -### Return Statement ### {#return-statement} - -
-return_statement
-  : RETURN short_circuit_or_expression?
-
- -A return statement ends execution of the current function. -If the function is an [=entry point=], then the current shader invocation -is terminated. -Otherwise, evaluation continues with the next expression or statement after -the evaluation of the call site of the current function invocation. - -If the function doesn't have a [=return type=], then the return statement is -optional. If the return statement is provided for such a function, it must not -supply a value. -Otherwise the expression must be present, and is called the *return value*. -In this case the call site of this function invocation evaluates to the return value. -The type of the return value must match the return type of the function. - - -### Discard Statement ### {#discard-statement} - -The `discard` statement must only be used in a [=fragment=] shader stage. -Executing a `discard` statement will: - -* immediately terminate the current invocation, and -* prevent evaluation and generation of a return value for the [=entry point=], and -* prevent the current fragment from being processed downstream in the [=GPURenderPipeline=]. - -Only statements -executed prior to the `discard` statement will have observable effects. - -Note: A `discard` statement may be executed by any -[=functions in a shader stage|function in a fragment stage=] and the effect is the same: -immediate termination of the invocation. - -After a `discard` statement is executed, control flow is non-uniform for the -duration of the entry point. - -Issue: [[#uniform-control-flow]] needs to state whether all invocations being discarded maintains uniform control flow. - -
- - var<private> will_emit_color: bool = false; - - fn discard_if_shallow(pos: vec4<f32>) { - if (pos.z < 0.001) { - // If this is executed, then the will_emit_color flag will - // never be set to true. - discard; - } - will_emit_color = true; - } - - [[stage(fragment)]] - fn main([[builtin(position)]] coord_in: vec4<f32>) - -> [[location(0)]] vec4<f32> - { - discard_if_shallow(coord_in); - - // Set the flag and emit red, but only if the helper function - // did not execute the discard statement. - will_emit_color = true; - return vec4<f32>(1.0, 0.0, 0.0, 1.0); - } - -
- -## Function Call Statement TODO ## {#function-call-statement} - -
-func_call_statement
-  : IDENT argument_expression_list
-
- -## Statements Grammar Summary ## {#statements-summary} - -
-compound_statement
-  : BRACE_LEFT statements BRACE_RIGHT
-
-paren_rhs_statement
-  : PAREN_LEFT short_circuit_or_expression PAREN_RIGHT
-
-statements
-  : statement*
-
-statement
-  : SEMICOLON
-  | return_statement SEMICOLON
-  | if_statement
-  | switch_statement
-  | loop_statement
-  | for_statement
-  | func_call_statement SEMICOLON
-  | variable_statement SEMICOLON
-  | break_statement SEMICOLON
-  | continue_statement SEMICOLON
-  | DISCARD SEMICOLON
-  | assignment_statement SEMICOLON
-  | compound_statement
-
- - -# Functions # {#functions} - -A function performs computational work when invoked. - -A function is invoked in one of the following ways: -* By evaluating a function call expression. See [[#function-call-expr]]. -* By executing a function call statement. See [[#function-call-statement]]. -* An [=entry point=] function is invoked by the WebGPU implementation to perform - the work of a [=shader stage=] in a [=pipeline=]. See [[#entry-points]] - -There are two kinds of functions: -* A [=built-in function=] is provided by the [SHORTNAME] implementation, - and is always available to a [SHORTNAME] program. - See [[#builtin-functions]]. -* A user-defined function is declared in a [SHORTNAME] program. - -## Declaring a user-defined function ## {#function-declaration-sec} - -A function declaration creates a user-defined function, by specifying: -* An optional set of attributes. -* The name of the function. -* The formal parameter list: an ordered sequence of zero - or more [=formal parameter=] declarations, - separated by commas, and - surrounded by parentheses. -* An optional, possibly decorated, return type. -* The function body. - -A function declaration must only occur at [=module scope=]. -The function name is [=in scope=] from the start of the formal parameter list -until the end of the program. - -A formal parameter declaration specifies an identifier name and a type for a value that must be -provided when invoking the function. -A formal parameter may have attributes. -See [[#function-calls]]. -The identifier is [=in scope=] until the end of the function. -Two formal parameters for a given function must not have the same name. - -If the return type is specified, then: -* The return type must be a [=plain type=]. -* The last statement in the function body must be a [=return=] statement. - -
-function_decl
-  : attribute_list* function_header compound_statement
-
-function_header
-  : FN IDENT PAREN_LEFT param_list PAREN_RIGHT function_return_type_decl_optional
-
-function_return_type_decl_optional
-  :
-  | ARROW attribute_list* type_decl
-
-param_list
-  :
-  | (param COMMA)* param
-
-param
-  : attribute_list* variable_ident_decl
-
- -[SHORTNAME] defines the following attributes that can be applied to function declarations: - * [=attribute/stage=] - * [=attribute/workgroup_size=] - -[SHORTNAME] defines the following attributes that can be applied to function -parameters and return types: - * [=attribute/builtin=] - * [=attribute/location=] - -
- - // Declare the add_two function. - // It has two formal parameters, i and b. - // It has a return type of i32. - // It has a body with a return statement. - fn add_two(i: i32, b: f32) -> i32 { - return i + 2; // A formal parameter is available for use in the body. - } - - // A compute shader entry point function, 'main'. - // It has no specified return type. - // It invokes the add_two function, and captures - // the resulting value in the named value 'six'. - [[stage(compute)]] fn main() { - let six: i32 = add_two(4, 5.0); - } - -
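The return rules for functions with and without a return type can be contrasted in a small sketch. The names `counter`, `bump`, and `bump_and_read` are hypothetical.

```wgsl
var<private> counter : i32 = 0;

// No return type: a return statement is optional and, when present,
// must not supply a value.
fn bump(limit : i32) {
  if (counter >= limit) {
    return;
  }
  counter = counter + 1;
}

// Return type i32: the last statement in the body is a return statement
// whose value matches the return type.
fn bump_and_read(limit : i32) -> i32 {
  bump(limit);  // Invoked through a function call statement.
  return counter;
}
```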
- -## Function calls TODO ## {#function-calls} - -A function call is a statement or expression which invokes a function. - -TODO: explain how invocation works: supply operands matching formal parameter types, -"suspend" execution of the caller, then resume after -the callee is done (unless discard). Describe return value. - -TODO: A function call site is a dynamic context. This matters when discussing the originating -variable for the dynamic value provided as the operand for a formal argument having pointer type. - -The names in the parameter list of a function definition are available for use in the body -of the function. -During a particular function evaluation, -the parameter names denote the values specified to the function call expression or statement -which initiated the function evaluation; -the names and values are associated by position. - -TODO: define 'formal parameter'. - -## Function calls TODO ## {#func-call-semantics} - -## Restrictions TODO ## {#function-restriction} -TODO: *This is a stub* - -* Recursion is not permitted. (No cycle in the call graph.) -* Function call parameters - * Match type and number - * Restrictions on pointers - * Aliasing (?) - - -# Entry Points TODO # {#entry-points} - -## Shader Stages ## {#shader-stages-sec} - -WebGPU issues work to the GPU in the form of [=draw command|draw=] or [=dispatch commands=]. -These commands execute a pipeline in the context of a set of -[=pipeline input|inputs=], [=pipeline output|outputs=], and attached [=resources=]. - -A pipeline describes the behaviour to be performed on the GPU, as a sequence -of stages, some of which are programmable. -In WebGPU, a pipeline is created before scheduling a draw or dispatch command for execution. -There are two kinds of pipelines: GPUComputePipeline, and GPURenderPipeline. 
- -A [=dispatch command=] uses a GPUComputePipeline to run a -compute shader stage over a logical -grid of points with a controllable amount of parallelism, -while reading and possibly updating buffer and image resources. - -A [=draw command=] uses a GPURenderPipeline to run a multi-stage process with -two programmable stages among other fixed-function stages: - -* A vertex shader stage maps input attributes for a single vertex into - output attributes for the vertex. -* Fixed-function stages map vertices into graphic primitives (such as triangles) - which are then rasterized to produce fragments. -* A fragment shader stage processes each fragment, - possibly producing a fragment output. -* Fixed-function stages consume a fragment output, possibly updating external state - such as color attachments and depth and stencil buffers. - -The WebGPU specification describes pipelines in greater detail. - -[SHORTNAME] defines three shader stages, corresponding to the -programmable parts of pipelines: - -* compute -* vertex -* fragment - -Each shader stage has its own set of features and constraints, described elsewhere. - -## Entry point declaration ## {#entry-point-decl} - -An entry point is a [=user-defined function=] that is invoked to perform -the work for a particular [=shader stage=]. - -Specify a `stage` attribute on a [=function declaration=] to declare that function -as an entry point. - -When configuring the stage in the pipeline, the entry point is specified by providing -the [SHORTNAME] module and the entry point's function name. - -Each parameter of an entry point must be of an [=Entry point IO type=]. -The return type of an entry point, if specified, must also be of an [=Entry point IO type=]. - -Note: compute entry points never have a return type. - -
- - [[stage(vertex)]] - fn vert_main() -> [[builtin(position)]] vec4<f32> { - return vec4<f32>(0.0, 0.0, 0.0, 1.0); - } - // OpEntryPoint Vertex %vert_main "vert_main" %return_value - // OpDecorate %return_value BuiltIn Position - // %float = OpTypeFloat 32 - // %v4float = OpTypeVector %float 4 - // %ptr = OpTypePointer Output %v4float - // %return_value = OpVariable %ptr Output - - [[stage(fragment)]] - fn frag_main([[builtin(position)]] coord_in: vec4<f32>) -> [[location(0)]] vec4<f32> { - return vec4<f32>(coord_in.x, coord_in.y, 0.0, 1.0); - } - // OpEntryPoint Fragment %frag_main "frag_main" %return_value %coord_in - // OpDecorate %return_value Location 0 - // %float = OpTypeFloat 32 - // %v4float = OpTypeVector %float 4 - // %ptr = OpTypePointer Output %v4float - // %return_value = OpVariable %ptr Output - - [[stage(compute)]] - fn comp_main() { } - // OpEntryPoint GLCompute %comp_main "comp_main" - -
- -The set of functions in a shader stage is the union of: - -* The entry point function for the stage. -* The targets of function calls from within the body of a function - in the shader stage, whether or not that call is executed. - -The union is applied repeatedly until it stabilizes. -It will stabilize in a finite number of steps. - -### Function attributes for entry points ### {#entry-point-attributes} - -[SHORTNAME] defines the following attributes that can be applied to entry point declarations: - * [=attribute/stage=] - * [=attribute/workgroup_size=] - -ISSUE: Can we query upper bounds on workgroup size dimensions? Is it independent of the shader, or - a property to be queried after creating the shader module? - -
- - [[ stage(compute), workgroup_size(8,1,1) ]] - fn sorter() { } - // OpEntryPoint GLCompute %sorter "sorter" - // OpExecutionMode %sorter LocalSize 8 1 1 - - [[ stage(compute), workgroup_size(8) ]] - fn reverser() { } - // OpEntryPoint GLCompute %reverser "reverser" - // OpExecutionMode %reverser LocalSize 8 1 1 - - [[ stage(compute) ]] - fn do_nothing() { } - // OpEntryPoint GLCompute %do_nothing "do_nothing" - // OpExecutionMode %do_nothing LocalSize 1 1 1 - -
- -## Shader Interface ## {#shader-interface} - -The shader interface is the set of objects -through which the shader accesses data external to the [=shader stage=], -either for reading or writing. -The interface includes: - -* Pipeline inputs and outputs -* Buffer resources -* Texture resources -* Sampler resources - -These objects are represented by module-scope variables in certain [=storage classes=]. - -We say a variable is statically accessed by a function if any subexpression -in the body of the function uses the variable's identifier, -and that subexpression is [=in scope=] of the variable's declaration. -Static access of a `let`-declared constant is defined similarly. -Note that being statically accessed is independent of whether an execution of the shader -will actually evaluate the subexpression, or even execute the enclosing statement. - -More precisely, the interface of a shader stage consists of: - - all parameters of the entry point - - the result value of the entry point - - all [=module scope=] variables that are [=statically accessed=] by [=functions in a shader stage|functions in the shader stage=], - and which are in storage classes [=storage classes/uniform=], [=storage classes/storage=], or [=storage classes/handle=]. - -### Pipeline Input and Output Interface ### {#pipeline-inputs-outputs} - -The Entry point IO types include the following: - - Built-in variables. See [[#builtin-inputs-outputs]]. - - User-defined IO. See [[#user-data-attributes]] - - Structures containing only built-in variables and user-defined IO. - The structure must not contain a nested structure. - -A pipeline input is data provided to the shader stage from upstream in the pipeline. -A pipeline input is denoted by the arguments of the entry point. - -A pipeline output is data the shader provides for further processing downstream in the pipeline. -A pipeline output is denoted by the return type of the entry point. 
- -Each pipeline input or output is one of: - -* A built-in variable. See [[#builtin-inputs-outputs]]. -* A user data attribute. See [[#user-data-attributes]]. - -#### Built-in inputs and outputs #### {#builtin-inputs-outputs} - -A built-in input variable provides access to system-generated control information. -The set of built-in inputs is listed in [[#builtin-variables]]. - -To declare a variable for accessing a particular input built-in *X* from an entry point: - -* Declare a parameter of the entry point function, - where the [=store type=] is the listed store type for *X*. -* Apply a `builtin(`*X*`)` attribute to the parameter. - -A built-in output variable is used by the shader to convey -control information to later processing steps in the pipeline. -The set of built-in outputs is listed in [[#builtin-variables]]. - -To declare a variable for accessing a particular output built-in *Y* from an entry point: - -* Add a variable to the result of the entry point, where [=store type=] is the listed store type for *Y*: - * If there is no result type for the entry point, change it to the variable type. - * Otherwise, make the result type a structure, where one of the fields is the new variable. -* Apply a `builtin(`*Y*`)` attribute to the result variable. - -The `builtin` attribute must not be applied to a variable in [=module scope=], -or to a local variable in function scope. - -A variable must not have more than one `builtin` attribute. - -Each built-in variable has an associated shader stage, as described in [[#builtin-variables]]. -If a built-in variable has stage *S* and is used by a function *F*, as either an argument or the -result type, then *F* must be a [=functions in a shader stage|function in a shader=] for stage *S*. - -Issue: in Vulkan, builtin variables occupy I/O location slots counting toward limits. 
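The two ways of declaring a built-in output described above can be sketched as follows. The example uses only the `position`, `front_facing`, and `frag_depth` built-ins that appear elsewhere in this document; the function, structure, and member names are hypothetical.

```wgsl
// The built-in output is the only result, so the result type is simply
// the store type of the built-in, decorated with the builtin attribute.
[[stage(fragment)]]
fn depth_only([[builtin(position)]] coord : vec4<f32>)
    -> [[builtin(frag_depth)]] f32
{
  return coord.z;
}

// The entry point also produces a color, so the result type is a structure
// and the built-in output is one of its members.
struct FragResult {
  [[builtin(frag_depth)]] depth : f32;
  [[location(0)]] color : vec4<f32>;
};

[[stage(fragment)]]
fn depth_and_color([[builtin(position)]] coord : vec4<f32>,
                   [[builtin(front_facing)]] is_front : bool)
    -> FragResult
{
  var result : FragResult;
  result.depth = coord.z;
  if (is_front) {
    result.color = vec4<f32>(1.0, 0.0, 0.0, 1.0);
  } else {
    result.color = vec4<f32>(0.0, 0.0, 1.0, 1.0);
  }
  return result;
}
```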
- -#### User Data Attribute TODO #### {#user-data-attributes} - -User-defined data can be passed as input to the start of a pipeline, passed -between stages of a pipeline or output from the end of a pipeline. -User-defined IO must not be passed to [=compute=] shader entry points. -User-defined IO must be [=numeric scalar=] or [=numeric vector=] types. -All user-defined IO must be assigned locations (see [[#input-output-locations]]). - -#### Interpolation #### {#interpolation} - -Authors can control how user-defined IO data is interpolated through the use of -the [=attribute/interpolate=] attribute. -[SHORTNAME] offers two aspects of interpolation to control: the type of -interpolation, and the sampling of the interpolation. - -The interpolation type must be one of: -* `perspective` - Values are interpolated in a perspective correct manner. -* `linear` - Values are interpolated in a linear, non-perspective correct manner. -* `flat` - Values are not interpolated. - Interpolation sampling is not used with `flat` interpolation. - -The interpolation sampling must be one of: -* `center` - Interpolation is performed at the center of the pixel. -* `centroid` - Interpolation is performed at a point that lies within all the - samples covered by the fragment within the current primitive. - This value is the same for all samples in the primitive. -* `sample` - Interpolation is performed per sample. - The [=fragment=] shader is invoked once per sample when this attribute is - applied. - -The default interpolation of user-defined IO of scalar or vector floating-point -type is `[[interpolate(perspective, center)]]`. -User-defined IO of scalar or vector integer type is always -`[[interpolate(flat)]]` and, therefore, the attribute must not be specified for such IO in a [SHORTNAME] program. - -Interpolation attributes must match between [=vertex=] outputs and [=fragment=] -inputs with the same [=attribute/location=] assignment within the same [=pipeline=]. 
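As a sketch of these rules, one structure can serve as both the vertex output and the fragment input, which keeps the interpolation attributes matched at every location. The structure name `Varyings` and its member names are hypothetical.

```wgsl
struct Varyings {
  [[builtin(position)]] pos : vec4<f32>;
  // No attribute: floating-point IO defaults to interpolate(perspective, center).
  [[location(0)]] color : vec4<f32>;
  // Explicit interpolation type and sampling.
  [[location(1), interpolate(linear, centroid)]] uv : vec2<f32>;
  // Flat interpolation takes no sampling argument.
  [[location(2), interpolate(flat)]] material : f32;
  // Integer IO is always flat, so no interpolate attribute is written.
  [[location(3)]] id : u32;
};

[[stage(vertex)]]
fn vert_main() -> Varyings {
  var v : Varyings;
  v.pos = vec4<f32>(0.0, 0.0, 0.0, 1.0);
  v.color = vec4<f32>(1.0, 0.0, 0.0, 1.0);
  return v;
}

[[stage(fragment)]]
fn frag_main(v : Varyings) -> [[location(0)]] vec4<f32> {
  return v.color;
}
```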
- -#### Input-output Locations #### {#input-output-locations} - -Each location can store a value up to 16 bytes in size. -The byte size of a type is defined using the *SizeOf* column in [[#alignment-and-size]]. -For example, a four-element vector of floating-point values occupies a single location. - -Locations are specified via the [=attribute/location=] attribute. - -Every user-defined input and output must have a fully specified set of -locations. -Each structure member in the entry point IO must be one of either a builtin variable -(see [[#builtin-inputs-outputs]]), or assigned a location. - -For a given entry point, the locations of the return type are distinct from -the locations of the function parameters. -Within each set of locations, there must be no overlap. - -Note: the number of available locations for an entry point is defined by the WebGPU API. - -
- - struct A { - [[location(0)]] x : f32; - // Despite locations being 16-bytes, x and y cannot share a location - [[location(1)]] y : f32; - }; - - // in1 occupies locations 0 and 1. - // in2 occupies location 2. - // The return value occupies location 0. - [[stage(fragment)]] - fn fragShader(in1 : A, [[location(2)]] in2 : f32) -> [[location(0)]] vec4<f32> { - // ... - } - -
- -User-defined IO can be mixed with builtin variables in the same structure. For example, - -
- - // Mixed builtins and user-defined inputs. - struct MyInputs { - [[location(0)]] x : vec4<f32>; - [[builtin(front_facing)]] y : bool; - [[location(1)]] z : u32; - }; - - struct MyOutputs { - [[builtin(frag_depth)]] x : f32; - [[location(0)]] y : vec4<f32>; - }; - - [[stage(fragment)]] - fn fragShader(in1 : MyInputs) -> MyOutputs { - // ... - } - -
- -
- - struct A { - [[location(0)]] x : u32; - // Invalid, x and y cannot share a location. - [[location(0)]] y : u32; - }; - - struct B { - [[location(0)]] x : f32; - }; - - struct C { - // Invalid, structures with user-defined IO cannot be nested. - b : B; - }; - - struct D { - x : vec4<f32>; - }; - - [[stage(fragment)]] - // Invalid, location cannot be applied to a structure type. - fn fragShader1([[location(0)]] in1 : D) { - // ... - } - - [[stage(fragment)]] - // Invalid, in1 and in2 cannot share a location. - fn fragShader2([[location(0)]] in1 : f32, [[location(0)]] in2 : f32) { - // ... - } - - [[stage(fragment)]] - // Invalid, location cannot be applied to a structure. - fn fragShader3([[location(0)]] in1 : vec4<f32>) -> [[location(0)]] D { - // ... - } - -
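For comparison, one possible way to repair each invalid declaration is sketched below. It is an illustration of the location rules, not the only valid arrangement, and the names are reused from the invalid example.

```wgsl
struct A {
  [[location(0)]] x : u32;
  // y now has its own location.
  [[location(1)]] y : u32;
};

// The member of B is declared directly in C instead of nesting B.
struct C {
  [[location(0)]] x : f32;
};

struct D {
  // The location is applied to the member, not to uses of the structure.
  [[location(0)]] x : vec4<f32>;
};

[[stage(fragment)]]
fn fragShader1(in1 : D) {
  // ...
}

[[stage(fragment)]]
fn fragShader2([[location(0)]] in1 : f32, [[location(1)]] in2 : f32) {
  // ...
}

// Input location 0 and output location 0 do not conflict, because
// parameter and return locations form distinct sets.
[[stage(fragment)]]
fn fragShader3([[location(0)]] in1 : vec4<f32>) -> D {
  var result : D;
  result.x = in1;
  return result;
}
```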
- -### Resource interface ### {#resource-interface} - -A resource is an object, -other than a [[#pipeline-inputs-outputs|pipeline input or output]], -which provides access to data external to a [=shader stage=]. -Resources are shared by all invocations of the shader. - -There are four kinds of resources: - -* [=uniform buffers=] -* [=storage buffers=] -* textures -* samplers - -The resource interface of a shader is the set of module-scope -resource variables [=statically accessed=] by -[=functions in a shader stage|functions in the shader stage=]. - -Each resource variable must be declared with both [=group=] and [=binding=] -attributes. -Together with the shader's stage, these identify the binding address -of the resource on the shader's pipeline. -See [[WebGPU#pipeline-layout|WebGPU § GPUPipelineLayout]]. - -Bindings must not alias within a shader stage: -two different variables in the resource interface of a given -shader must not have the same group and binding values, when considered as a pair of values. - -### Resource layout compatibility ### {#resource-layout-compatibility} - -WebGPU requires that a shader's resource interface match the [[WebGPU#pipeline-layout|layout of the pipeline]] -using the shader. - -Each [SHORTNAME] variable in a resource interface must be bound to a WebGPU resource with -a compatible -[[WebGPU#enumdef-gpubindingtype|GPUBindingType]], -where compatibility is defined by the following table. - - - - -
WebGPU binding type compatibility
[SHORTNAME] resource - WebGPU [[WebGPU#enumdef-gpubindingtype|GPUBindingType]] -
[=uniform buffer=] - [[WebGPU#dom-gpubindingtype-uniform-buffer|uniform-buffer]] -
read-write [=storage buffer=] - [[WebGPU#dom-gpubindingtype-storage-buffer|storage-buffer]] -
read-only [=storage buffer=] - [[WebGPU#dom-gpubindingtype-readonly-storage-buffer|readonly-storage-buffer]] -
sampler - [[WebGPU#dom-gpubindingtype-sampler|sampler]] -
sampler_comparison - [[WebGPU#dom-gpubindingtype-comparison-sampler|comparison-sampler]] -
sampled texture - [[WebGPU#dom-gpubindingtype-sampled-texture|sampled-texture]] or - [[WebGPU#dom-gpubindingtype-multisampled-texture|multisampled-texture]] -
[=read-only storage texture=] - [[WebGPU#dom-gpubindingtype-readonly-storage-texture|readonly-storage-texture]] -
[=write-only storage texture=] - [[WebGPU#dom-gpubindingtype-writeonly-storage-texture|writeonly-storage-texture]] -
- - -TODO: Rewrite the phrases 'read-only storage buffer' and 'read-write storage buffer' after -we settle on how to express those concepts. -See https://github.com/gpuweb/gpuweb/pull/1183 - -If |B| is a [=uniform buffer=] variable in a resource interface, -and |WB| is the [[WebGPU#buffer-interface|WebGPU GPUBuffer]] bound to |B|, then: -* The size of |WB| must be at least as large as the size of the [=store type=] - of |B| in the [=storage classes/storage=] storage class. - -If |B| is a [=storage buffer=] variable in a resource interface, -and |WB| is the [[WebGPU#buffer-interface|WebGPU GPUBuffer]] bound to |B|, then: -* If the [=store type=] |S| of |B| does not contain a [=runtime-sized=] array, then - the size of |WB| must be at least as large as the size - of |S| in the [=storage classes/storage=] storage class. -* If the [=store type=] |S| of |B| contains a [=runtime-sized=] array as its last member, - then: - * The runtime-determined array length of that member must be at least 1. - * The size of |WB| must be at least as large as the size in - storage class [=storage classes/storage=] of the value stored in |B|. - -Note: Recall that a [=runtime-sized=] array may only appear as the last element in the structure -type that is the store type of a storage buffer variable. - -TODO: Describe other interface matching requirements, e.g. for images? - -## Pipeline compatibility TODO ## {#pipeline-compatibility} - -TODO: match flat attribute - -TODO: user data inputs of fragment stage must be subset of user data outputs of vertex stage - -### Input-output matching rules TODO ### {#input-output-matching} - -# Language extensions # {#language-extensions} - -The [SHORTNAME] language is expected to evolve over time. 
- -An extension is a named grouping for a coherent -set of modifications to a particular version of the [SHORTNAME] specification, consisting of any combination of: -* Addition of new concepts and behaviours via new syntax, including: - * declarations, statements, attributes, and built-in functions. -* Removal of restrictions in the current specification or in previously published extensions. -* Syntax for reducing the set of permissible behaviours. -* Syntax for limiting the features available to a part of the program. -* A description of how the extension interacts with the existing specification, and optionally with other extensions. - -Hypothetically, extensions could be used to: -* Add numeric scalar types, such as 16-bit integers. -* Add syntax to constrain floating point rounding mode. -* Add syntax to signal that a shader does not use atomic types. -* Add new kinds of statements. -* Add new built-in functions. -* Add constraints on how shader invocations execute. -* Add new shader stages. - -## Enable Directive ## {#enable-directive-section} - -An enable directive indicates that the functionality -described by a particular named -[=extension=] may be used in the source text after the directive itself. -That is, language functionality described by the extension may be used in any -source text after the `enable` directive. - -The directive must not appear inside the text of any [=declaration=]. -(If it were a declaration, it would be at [=module scope=].) - -The directive uses an identifier to name the extension, but does not -create a [=scope=] for the identifier. -Use of the identifier by the directive does not conflict with the -use of that identifier as the name in any [=declaration=]. - -
-enable_directive
-  : ENABLE IDENT SEMICOLON
-
- -Note: The grammar rule includes the terminating semicolon token, -ensuring the additional functionality is usable only after that semicolon. -Therefore any [SHORTNAME] implementation can parse the entire `enable` directive. -When an implementation encounters an enable directive for an unsupported extension, -the implementation can issue a clear diagnostic. - -
- - // Enable a hypothetical IEEE binary16 floating point extension. - enable f16; - - // Assuming the f16 extension enables use of the f16 type: - // - as function return value - // - as the type for let declaration - // - as a type constructor, with an i32 argument - // - as operands to the division operator: / - fn halve_it(x: f16) -> f16 { - let two: f16 = f16(2); - return x / two; - }; - - enable f16; // A redundant enable directive is ok. - // Enable a hypothetical extension adding syntax for controlling - // the rounding mode on f16 arithmetic. - enable rounding_mode_f16; - - [[round_to_even_f16]] // Attribute enabled by the rounding_mode_f16 extension - fn triple_it(x: f16) -> f16 { - return x * f16(3); // Uses round-to-even. - }; - -
- - -# WGSL program TODO # {#wgsl-module} - -TODO: *Stub* A WGSL program is a sequence of [=directives=] and [=module scope=] [=declarations=]. - -
-translation_unit
-  : global_decl_or_directive* EOF
-
- -
-global_decl_or_directive
-  : SEMICOLON
-  | global_variable_decl SEMICOLON
-  | global_constant_decl SEMICOLON
-  | type_alias SEMICOLON
-  | struct_decl SEMICOLON
-  | function_decl
-  | enable_directive
-
- -# Execution TODO # {#execution} - -## Invocation of an entry point TODO ## {#invocation-of-an-entry-point} - -### Before an entry point begins TODO ### {#before-entry-point-begins} - -TODO: *Stub* - -* Setting values of builtin variables -* External-interface variables have initialized backing storage -* Internal module-scope variables have backing storage - * Initializers evaluated in textual order -* No two variables have overlapping storage (might already be covered earlier?) - -### Program order (within an invocation) TODO ### {#program-order} - -#### Function-scope variable lifetime and initialization TODO #### {#function-scope-variable-lifetime} - -#### Statement order TODO #### {#statement-order} - -#### Intra-statement order (or lack) TODO #### {#intra-statement-order} - -TODO: *Stub*: Expression evaluation - -## Uniformity TODO ## {#uniformity} - -### Uniform control flow TODO ### {#uniform-control-flow} - -### Divergence and reconvergence TODO ### {#divergence-reconvergence} - -### Uniformity restrictions TODO ### {#uniformity-restrictions} - -## Compute Shaders and Workgroups ## {#compute-shader-workgroups} - -A workgroup is a set of invocations which -concurrently execute a [=compute shader stage=] [=entry point=], -and share access to shader variables in the [=storage classes/workgroup=] storage class. - -The workgroup grid for a compute shader is the set of points -with integer coordinates *(i,j,k)* with: - -* 0 ≤ i < workgroup_size_x -* 0 ≤ j < workgroup_size_y -* 0 ≤ k < workgroup_size_z - -where *(workgroup_size_x, workgroup_size_y, workgroup_size_z)* is -the value specified for the [=workgroup_size=] attribute of the -entry point, or (1,1,1) if the entry point has no such attribute. - -There is exactly one invocation in a workgroup for each point in the workgroup grid. - -An invocation's local invocation ID is the coordinate -triple for the invocation's corresponding workgroup grid point. 
- -When an invocation has [=local invocation ID=] (i,j,k), then its -local invocation index is - - i + - (j * workgroup_size_x) + - (k * workgroup_size_x * workgroup_size_y) - -

Note that if a workgroup has |W| invocations, -then each invocation |I| in the workgroup has a unique local invocation index |L|(|I|) -such that 0 ≤ |L|(|I|) < |W|, -and that entire range is covered.
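A worked instance of the local invocation index formula, for a hypothetical entry point with workgroup size (8, 4, 2), so that |W| = 8 × 4 × 2 = 64. The local invocation ID is written out by hand as constants to keep the arithmetic visible.

```wgsl
[[stage(compute), workgroup_size(8, 4, 2)]]
fn index_example() {
  // Local invocation ID (i, j, k) = (3, 2, 1).
  let i : u32 = 3u;
  let j : u32 = 2u;
  let k : u32 = 1u;

  // i + (j * workgroup_size_x) + (k * workgroup_size_x * workgroup_size_y)
  //   = 3 + (2 * 8) + (1 * 8 * 4) = 3 + 16 + 32 = 51
  let index : u32 = i + (j * 8u) + (k * 8u * 4u);

  // The first invocation (0, 0, 0) has index 0, and the last one (7, 3, 1)
  // has index 7 + 24 + 32 = 63, which is W - 1.
}
```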

- -A compute shader begins execution when a WebGPU implementation -removes a dispatch command from a queue and begins the specified work on the GPU. -The dispatch command specifies a dispatch size, -which is an integer triple *(group_count_x, group_count_y, group_count_z)* -indicating the number of workgroups to be executed, as described in the following. - -The compute shader grid for a particular dispatch -is the set of points with integer coordinates *(CSi,CSj,CSk)* with: - -* 0 ≤ CSi < workgroup_size_x × group_count_x -* 0 ≤ CSj < workgroup_size_y × group_count_y -* 0 ≤ CSk < workgroup_size_z × group_count_z - -where *workgroup_size_x*, -*workgroup_size_y*, and -*workgroup_size_z* are as above for the compute shader entry point. - -The work to be performed by a compute shader dispatch is to execute exactly one -invocation of the entry point for each point in the compute shader grid. - -An invocation's global invocation ID is the coordinate -triple for the invocation's corresponding compute shader grid point. - -The invocations are organized into workgroups, so that each invocation -*(CSi, CSj, CSk)* is identified with the workgroup grid point - - ( *CSi* mod workgroup_size_x , - *CSj* mod workgroup_size_y , - *CSk* mod workgroup_size_z ) - -in workgroup ID - - ( ⌊ *CSi* ÷ workgroup_size_x ⌋, - ⌊ *CSj* ÷ workgroup_size_y ⌋, - ⌊ *CSk* ÷ workgroup_size_z ⌋). - -WebGPU provides no guarantees about: - -* Whether invocations from different workgroups execute concurrently. - That is, you cannot assume more than one workgroup executes at a time. -* Whether, once invocations from a workgroup begin executing, other workgroups - are blocked from execution. - That is, you cannot assume that only one workgroup executes at a time. - While a workgroup is executing, the implementation may choose to - concurrently execute other workgroups as well, or other queued but unblocked work. 
-* Whether invocations from one particular workgroup begin executing before - the invocations of another workgroup. - That is, you cannot assume that workgroups are launched in a particular order. - -Issue: [WebGPU issue 1045](https://github.com/gpuweb/gpuweb/issues/1045): -Dispatch group counts must be positive. -However, how do we handle an indirect dispatch that specifies a group count of zero? - -## Collective operations TODO ## {#collective-operations} - -### Barrier TODO ### {#barrier} - -### Image Operations Requiring Uniformity TODO ### {#image-operations-requiring-uniformity} - -### Derivatives TODO ### {#derivatives} - -### Arrayed resource access TODO ### {#arrayed-resource-access} - -## Floating Point Evaluation TODO ## {#floating-point-evaluation} - -TODO: *Stub* - -* Infinities, NaNs, negative zeros -* Denorms, flushing -* fast-math rules: e.g. reassociation, fusing -* Invariance (or is this more general than floating point?) -* Rounding -* Error bounds on basic operations - -### Floating point conversion ### {#floating-point-conversion} - -When converting a floating point scalar value to an integral type: -* If the original value is exactly representable in the destination type, then the result is that value. -* If the original value has a fractional component, then it cannot be represented exactly in the destination type, and the result is TODO. -* If the original value is out of range of the destination type, then TODO. - -When converting a value to a floating point type: -* If the original value is exactly representable in the destination type, then the result is that value. - * If the original value is zero and of integral type, then the resulting value has a zero sign bit. -* Otherwise, the original value is not exactly representable. - * If the original value is different from but lies between two adjacent values representable in the destination type, - then the result is one of those two values. 
- [SHORTNAME] does not specify whether the larger or smaller representable - value is chosen, and different instances of such a conversion may choose differently. - * Otherwise, if the original value lies outside the range of the destination type: - * This does not occur when the original type is one of [=i32=] or [=u32=] and the destination type is [=f32=]. - * This does not occur when the source type is a floating point type with fewer exponent and mantissa bits. - * If the source type is a floating point type with more mantissa bits than the destination type, then: - * The extra mantissa bits of the source value may be discarded (treated as if they are 0). - * If the resulting value is the maximum normal value of the destination type, then that is the result. - * Otherwise the result is the infinity value with the same sign as the source value. - * Otherwise, if the original value is a NaN for the source type, then the result is a NaN in the destination type. - -NOTE: An integer value may lie between two adjacent representable floating point values. -In particular, the [=f32=] type uses 23 explicit fractional bits. -Additionally, when the floating point value is in the normal range (the exponent is neither extreme value), then the mantissa is -the set of fractional bits together with an extra 1-bit at the most significant position at bit position 23. -Then, for example, integers 2<sup>28</sup> and 1+2<sup>28</sup> both map to the same floating point value: the difference in the -least significant 1 bit is not representable by the floating point format. -This kind of collision occurs for pairs of adjacent integers with a magnitude of at least 2<sup>25</sup>. - -Issue: (dneto) Default rounding mode is an implementation choice. Is that what we want? - -Issue: Check behaviour of the f32 to f16 conversion for numbers just beyond the max normal f16 values. -I've written what an NVIDIA GPU does. See https://github.com/google/amber/pull/918 for an executable test case. 
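
The integer collision described in the NOTE above can be reproduced outside a shader. The following is a minimal sketch in Python, not [SHORTNAME], and the helper name `to_f32` is hypothetical. It round-trips a value through the IEEE 754 binary32 format with the standard `struct` module. Python rounds to nearest, which is only one of the two results this section permits, so a GPU may legitimately pick the other adjacent value.

```python
import struct

def to_f32(value):
    """Convert a Python number to binary32 and back (hypothetical helper).

    struct rounds to the nearest representable value; the text above
    allows either of the two adjacent representable values.
    """
    return struct.unpack("<f", struct.pack("<f", float(value)))[0]

# Near 2**28, adjacent f32 values are 32 apart, so 2**28 and 2**28 + 1
# convert to the same floating point value.
same_value = to_f32(2**28) == to_f32(2**28 + 1)

# 2**24 fits in the 24-bit mantissa (23 explicit bits plus the implicit
# leading 1); 2**24 + 1 needs 25 bits and is not exactly representable.
exact = to_f32(2**24) == 2**24
inexact = to_f32(2**24 + 1) != 2**24 + 1
```

All three flags evaluate to true under these assumptions.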
- -# Memory Model TODO # {#memory-model} - -# Keyword and Token Summary # {#grammar} - -## Keyword Summary ## {#keyword-summary} - - - - - -
Type-defining keywords
TokenDefinition -
`ARRAY`array -
`BOOL`bool -
`FLOAT32`f32 -
`INT32`i32 -
`MAT2x2`mat2x2 // 2 column x 2 row -
`MAT2x3`mat2x3 // 2 column x 3 row -
`MAT2x4`mat2x4 // 2 column x 4 row -
`MAT3x2`mat3x2 // 3 column x 2 row -
`MAT3x3`mat3x3 // 3 column x 3 row -
`MAT3x4`mat3x4 // 3 column x 4 row -
`MAT4x2`mat4x2 // 4 column x 2 row -
`MAT4x3`mat4x3 // 4 column x 3 row -
`MAT4x4`mat4x4 // 4 column x 4 row -
`POINTER`ptr -
`SAMPLER`sampler -
`SAMPLER_COMPARISON`sampler_comparison -
`STRUCT`struct -
`TEXTURE_1D`texture_1d -
`TEXTURE_2D`texture_2d -
`TEXTURE_2D_ARRAY`texture_2d_array -
`TEXTURE_3D`texture_3d -
`TEXTURE_CUBE`texture_cube -
`TEXTURE_CUBE_ARRAY`texture_cube_array -
`TEXTURE_MULTISAMPLED_2D`texture_multisampled_2d -
`TEXTURE_STORAGE_1D`texture_storage_1d -
`TEXTURE_STORAGE_2D`texture_storage_2d -
`TEXTURE_STORAGE_2D_ARRAY`texture_storage_2d_array -
`TEXTURE_STORAGE_3D`texture_storage_3d -
`TEXTURE_DEPTH_2D`texture_depth_2d -
`TEXTURE_DEPTH_2D_ARRAY`texture_depth_2d_array -
`TEXTURE_DEPTH_CUBE`texture_depth_cube -
`TEXTURE_DEPTH_CUBE_ARRAY`texture_depth_cube_array -
`UINT32`u32 -
`VEC2`vec2 -
`VEC3`vec3 -
`VEC4`vec4 -
- - - - -
Other keywords
TokenDefinition -
`BITCAST`bitcast -
`BLOCK`block -
`BREAK`break -
`CASE`case -
`CONTINUE`continue -
`CONTINUING`continuing -
`DEFAULT`default -
`DISCARD`discard -
`ELSE`else -
`ELSE_IF`elseif -
`ENABLE`enable -
`FALLTHROUGH`fallthrough -
`FALSE`false -
`FN`fn -
`FOR`for -
`FUNCTION`function -
`IF`if -
`LET`let -
`LOOP`loop -
`PRIVATE`private -
`RETURN`return -
`STORAGE`storage -
`SWITCH`switch -
`TRUE`true -
`TYPE`type -
`UNIFORM`uniform -
`VAR`var -
`WORKGROUP`workgroup -
- - - - -
Image format keywords
TokenDefinition -
`R8UNORM`r8unorm -
`R8SNORM`r8snorm -
`R8UINT`r8uint -
`R8SINT`r8sint -
`R16UINT`r16uint -
`R16SINT`r16sint -
`R16FLOAT`r16float -
`RG8UNORM`rg8unorm -
`RG8SNORM`rg8snorm -
`RG8UINT`rg8uint -
`RG8SINT`rg8sint -
`R32UINT`r32uint -
`R32SINT`r32sint -
`R32FLOAT`r32float -
`RG16UINT`rg16uint -
`RG16SINT`rg16sint -
`RG16FLOAT`rg16float -
`RGBA8UNORM`rgba8unorm -
`RGBA8UNORM-SRGB`rgba8unorm_srgb -
`RGBA8SNORM`rgba8snorm -
`RGBA8UINT`rgba8uint -
`RGBA8SINT`rgba8sint -
`BGRA8UNORM`bgra8unorm -
`BGRA8UNORM-SRGB`bgra8unorm_srgb -
`RGB10A2UNORM`rgb10a2unorm -
`RG11B10FLOAT`rg11b10float -
`RG32UINT`rg32uint -
`RG32SINT`rg32sint -
`RG32FLOAT`rg32float -
`RGBA16UINT`rgba16uint -
`RGBA16SINT`rgba16sint -
`RGBA16FLOAT`rgba16float -
`RGBA32UINT`rgba32uint -
`RGBA32SINT`rgba32sint -
`RGBA32FLOAT`rgba32float -
- -TODO(dneto): Eliminate the image formats that are not used in storage images. -For example SRGB formats (bgra8unorm_srgb), mixed channel widths (rg11b10float), out-of-order channels (bgra8unorm) - -## Reserved Keywords ## {#reserved-keywords} -The following is a list of keywords which are reserved for future expansion. - - - - - - -
asm - bf16 - do - enum - f16 -
f64 - i8 - i16 - i64 - const -
typedef - u8 - u16 - u64 - unless -
using - while - regardless - premerge - handle -
- -## Syntactic Tokens ## {#syntactic-tokens} - -
`AND``&` -
`AND_AND``&&` -
`ARROW``->` -
`ATTR_LEFT``[[` -
`ATTR_RIGHT``]]` -
`FORWARD_SLASH``/` -
`BANG``!` -
`BRACKET_LEFT``[` -
`BRACKET_RIGHT``]` -
`BRACE_LEFT``{` -
`BRACE_RIGHT``}` -
`COLON``:` -
`COMMA``,` -
`EQUAL``=` -
`EQUAL_EQUAL``==` -
`NOT_EQUAL``!=` -
`GREATER_THAN``>` -
`GREATER_THAN_EQUAL``>=` -
`SHIFT_RIGHT``>>` -
`LESS_THAN``<` -
`LESS_THAN_EQUAL``<=` -
`SHIFT_LEFT``<<` -
`MODULO``%` -
`MINUS``-` -
`MINUS_MINUS``--` -
`PERIOD``.` -
`PLUS``+` -
`PLUS_PLUS``++` -
`OR``|` -
`OR_OR``||` -
`PAREN_LEFT``(` -
`PAREN_RIGHT``)` -
`SEMICOLON``;` -
`STAR``*` -
`TILDE``~` -
`XOR``^` -
- -Note: The `MINUS_MINUS` and `PLUS_PLUS` tokens are reserved, i.e. they are not used in any grammar productions. -For example `x--` and `++i` are not syntactically valid expressions in [SHORTNAME]. - -# Validation # {#validation} - -TODO: Move these to the subject-matter sections. - -Each validation item will be given a unique ID and a test must be provided -when the validation is added. The tests will reference the validation ID in -the test name. - -* v-0001: A declaration must not introduce a name when that name is already in scope at the start - of the declaration. -* v-0004: Recursion is not allowed. -* v-0007: Structures must be defined before use. -* v-0008: A switch statement must have exactly one default clause. -* v-0009: A break statement is only permitted in loop and switch constructs. -* v-0010: A continue statement is only permitted in a loop. -* v-0015: The last member of the structure type defining the "store type" for a variable in the - storage storage class may be a runtime-sized array. -* v-0017: Builtin decorations must have the correct types. -* v-0018: Builtin decorations must be used with the correct shader type and - storage class. -* v-0020: The pair of `` must be unique in the - module. -* v-0021: Cannot re-assign a constant. -* v-0022: Global variables must have a storage class. -* v-0025: Switch statement selector expression must be of a scalar integer type. -* v-0026: The case selector values must have the same type as the selector expression. -* v-0027: A literal value must not appear more than once in the case selectors for a switch statement. -* v-0028: A fallthrough statement must not appear as the last statement in the last clause of a switch. -* v-0029: Return must come last in its block. -* v-0030: A runtime-sized array must not be used as the store type or contained within a store type - except as allowed by v-0015. -* v-0031: The type of an expression must not be a runtime-sized array type. -* v-0032: A runtime-sized array must have a stride attribute. 
- - -# Built-in variables # {#builtin-variables} - -See [[#builtin-inputs-outputs]] for how to declare a built-in variable. - - - - - -
Built-in - Stage - Input or Output - Store type - Description -
`vertex_index` - vertex - in - u32 - Index of the current vertex within the current API-level draw command, - independent of draw instancing. - - For a non-indexed draw, the first vertex has an index equal to the `firstVertex` argument - of the draw, whether provided directly or indirectly. - The index is incremented by one for each additional vertex in the draw instance. - - For an indexed draw, the index is equal to the index buffer entry for - the vertex, plus the `baseVertex` argument of the draw, whether provided directly or indirectly. -
`instance_index` - vertex - in - u32 - Instance index of the current vertex within the current API-level draw command. - - The first instance has an index equal to the `firstInstance` argument of the draw, - whether provided directly or indirectly. - The index is incremented by one for each additional instance in the draw. - -
`position` - vertex - out - vec4<f32> - Output position of the current vertex, using homogeneous coordinates. - After homogeneous normalization (where each of the *x*, *y*, and *z* components - are divided by the *w* component), the position is in the WebGPU normalized device - coordinate space. - See [[WebGPU#coordinate-systems|WebGPU § Coordinate Systems]]. - -
`position` - fragment - in - vec4<f32> - Framebuffer position of the current fragment, using normalized homogeneous - coordinates. - (The *x*, *y*, and *z* components have already been scaled such that *w* is now 1.) - See [[WebGPU#coordinate-systems|WebGPU § Coordinate Systems]]. - -
`front_facing` - fragment - in - bool - True when the current fragment is on a front-facing primitive. - False otherwise. - See [[WebGPU#dom-gpurasterizationstatedescriptor-frontface|WebGPU § Rasterization State]]. - -
`frag_depth` - fragment - out - f32 - Updated depth of the fragment, in the viewport depth range. - See [[WebGPU#coordinate-systems|WebGPU § Coordinate Systems]]. - -
`local_invocation_id` - compute - in - vec3<u32> - The current invocation's [=local invocation ID=], - i.e. its position in the [=workgroup grid=]. - -
`local_invocation_index` - compute - in - u32 - The current invocation's [=local invocation index=], a linearized index of - the invocation's position within the [=workgroup grid=]. - -
`global_invocation_id` - compute - in - vec3<u32> - The current invocation's [=global invocation ID=], - i.e. its position in the [=compute shader grid=]. - -
`workgroup_id` - compute - in - vec3<u32> - The current invocation's [=workgroup ID=], - i.e. the position of the workgroup in the [=workgroup grid=]. - -
`workgroup_size` - compute - in - vec3<u32> - The [=workgroup_size=] of the current entry point. - -
`sample_index` - fragment - in - u32 - Sample index for the current fragment. - The value is at least 0 and at most `sampleCount`-1, where - [[WebGPU#dom-gpurenderpipelinedescriptor-samplecount|sampleCount]] - is the number of MSAA samples specified for the GPU render pipeline. -
See [[WebGPU#gpurenderpipe|WebGPU § GPURenderPipeline]]. - -
`sample_mask` - fragment - in - u32 - Sample coverage mask for the current fragment. - It contains a bitmask indicating which samples in this fragment are covered - by the primitive being rendered. -
See [[WebGPU#sample-masking|WebGPU § Sample Masking]]. - -
`sample_mask` - fragment - out - u32 - Sample coverage mask control for the current fragment. - The last value written to this variable becomes the - [[WebGPU#shader-output-mask|shader-output mask]]. - Zero bits in the written value will cause corresponding samples in - the color attachments to be discarded. -
See [[WebGPU#sample-masking|WebGPU § Sample Masking]]. -
- -
- - struct VertexOutput { - [[builtin(position)]] my_pos: vec4<f32>; - // OpDecorate %my_pos BuiltIn Position - // %float = OpTypeFloat 32 - // %v4float = OpTypeVector %float 4 - // %ptr = OpTypePointer Output %v4float - // %my_pos = OpVariable %ptr Output - }; - - [[stage(vertex)]] - fn vs_main( - [[builtin(vertex_index)]] my_index: u32, - // OpDecorate %my_index BuiltIn VertexIndex - // %uint = OpTypeInt 32 0 - // %ptr = OpTypePointer Input %uint - // %my_index = OpVariable %ptr Input - [[builtin(instance_index)]] my_inst_index : u32, - // OpDecorate %my_inst_index BuiltIn InstanceIndex - ) -> VertexOutput; - - struct FragmentOutput { - [[builtin(frag_depth)]] depth: f32; - // OpDecorate %depth BuiltIn FragDepth - [[builtin(sample_mask)]] mask_out : u32; - // OpDecorate %mask_out BuiltIn SampleMask ; an output variable - }; - - [[stage(fragment)]] - fn fs_main( - [[builtin(front_facing)]] is_front : bool, - // OpDecorate %is_front BuiltIn FrontFacing - [[builtin(position)]] coord : vec4<f32>, - // OpDecorate %coord BuiltIn FragCoord - [[builtin(sample_index)]] my_sample_index : u32, - // OpDecorate %my_sample_index BuiltIn SampleId - [[builtin(sample_mask)]] mask_in : u32, - // OpDecorate %mask_in BuiltIn SampleMask ; an input variable - // OpDecorate %mask_in Flat - ) -> FragmentOutput; - - [[stage(compute)]] - fn cs_main( - [[builtin(local_invocation_id)]] local_id : vec3<u32>, - // OpDecorate %local_id BuiltIn LocalInvocationId - [[builtin(local_invocation_index)]] local_index : u32, - // OpDecorate %local_index BuiltIn LocalInvocationIndex - [[builtin(global_invocation_id)]] global_id : vec3<u32>, - // OpDecorate %global_id BuiltIn GlobalInvocationId - ); - -
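
The compute built-ins in the table above are related by simple integer arithmetic. The sketch below is Python rather than [SHORTNAME], and the function name `compute_builtins` is hypothetical. It derives `workgroup_id`, `local_invocation_id`, and `local_invocation_index` from a `global_invocation_id` and a workgroup size, assuming the local invocation index linearizes the workgroup grid with x varying fastest, then y, then z.

```python
from itertools import product

def compute_builtins(global_id, workgroup_size):
    """Derive the other compute built-ins from a global invocation ID.

    Illustrative only. Both arguments are (x, y, z) tuples of
    non-negative integers; the x-fastest linearization is an assumption
    stated in the surrounding text.
    """
    workgroup_id = tuple(g // s for g, s in zip(global_id, workgroup_size))
    local_id = tuple(g % s for g, s in zip(global_id, workgroup_size))
    size_x, size_y, _ = workgroup_size
    local_index = (local_id[0]
                   + local_id[1] * size_x
                   + local_id[2] * size_x * size_y)
    return workgroup_id, local_id, local_index

size = (4, 2, 1)
wg, local, index = compute_builtins((5, 3, 0), size)

# Enumerate one whole workgroup: its local indices are unique and
# cover every value from 0 to W-1, where W = 4 * 2 * 1 = 8.
indices = sorted(
    compute_builtins((x, y, z), size)[2]
    for x, y, z in product(range(size[0]), range(size[1]), range(size[2]))
)
```

For the sample point, the invocation falls in workgroup (1, 1, 0) at local position (1, 1, 0) with local index 5.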
- -# Built-in functions # {#builtin-functions} - -Certain functions are always available in a [SHORTNAME] program, -and are provided by the implementation. -These are called built-in functions. - -Since a built-in function is always in scope, it is an error to attempt to redefine -one or to use the name of a built-in function as an identifier for any other -kind of declaration. - -Unlike ordinary functions defined in a [SHORTNAME] program, -a built-in function may use the same function name with different -sets of parameters. -In other words, a built-in function may have more than one *overload*, -but ordinary function definitions in [SHORTNAME] may not. - -When calling a built-in function, all arguments to the function are evaluated -before function evaluation begins. - -TODO(dneto): Elaborate the descriptions of the built-in functions. So far I've only reorganized -the contents of the existing table. - -TODO: Explain the use of a function prototype in the table: provides name, formal parameter list, and return type. -That's not a full user-defined function declaration. - -## Logical built-in functions ## {#logical-builtin-functions} - - - - -
Logical built-in functions - SPIR-V -
all(BoolVec) -> bool - OpAll -
any(BoolVec) -> bool - OpAny -
select(*T*,*T*,bool) -> *T* - For scalar or vector type *T*. - `select(a,b,c)` evaluates to *a* when *c* is true, and *b* otherwise.
- OpSelect -
select(vec*N*<*T*>,vec*N*<*T*>,vec*N*<bool>) -> vec*N*<*T*> - For scalar type *T*. - `select(a,b,c)` evaluates to a vector with component *i* being `select(a[i], b[i], c[i])`.
- OpSelect -
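
The two `select` overloads in the table above can be modelled as follows. This is a Python sketch, not [SHORTNAME]: vectors are represented as Python lists, and the argument order follows this draft's description (`a` is chosen when the condition is true).

```python
def select(a, b, c):
    """Model of select(a, b, c): a when c is true, b otherwise.

    A list-valued condition selects component by component; a single
    bool selects the whole of a or the whole of b.
    """
    if isinstance(c, list):
        return [select(x, y, z) for x, y, z in zip(a, b, c)]
    return a if c else b

scalar_result = select(1.0, 2.0, True)
vector_result = select([1, 2, 3], [4, 5, 6], [True, False, True])
whole_vector = select([1, 2], [3, 4], False)
```

Here `vector_result` takes components 0 and 2 from the first argument and component 1 from the second.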
- -## Value-testing built-in functions ## {#value-testing-builtin-functions} - - - - - - - - - - - - - -
Unary operators
Precondition - Conclusion - Notes -
|e| : f32 - `isNan(`|e|`)` : bool - Returns true if |e| is NaN according to IEEE 754. (OpIsNan) -
|e| : |T|, |T| is *FloatVec* - `isNan(`|e|`)` : vec|N|<bool>, where |N| = *Arity(*|T|*)* - Component-wise test for NaN. Component *i* of the result is *isNan(e[i])*. (OpIsNan) -
|e| : f32 - `isInf(`|e|`)` : bool - Returns true if |e| is an infinity according to IEEE 754. (OpIsInf) -
|e| : |T|, |T| is *FloatVec* - `isInf(`|e|`)` : vec|N|<bool>, where |N| = *Arity(*|T|*)* - Component-wise test for infinity. Component *i* of the result is *isInf(e[i])*. (OpIsInf) -
|e| : f32 - `isFinite(`|e|`)` : bool - Returns true if |e| is finite according to IEEE 754. (emulated) -
|e| : |T|, |T| is *FloatVec* - `isFinite(`|e|`)` : vec|N|<bool>, where |N| = *Arity(*|T|*)* - Component-wise finite value test. Component *i* of the result is *isFinite(e[i])*. (emulated) -
|e| : f32 - `isNormal(`|e|`)` : bool - Returns true if |e| is a normal number according to IEEE 754. (emulated) -
|e| : |T|, |T| is *FloatVec* - `isNormal(`|e|`)` : vec|N|<bool>, where |N| = *Arity(*|T|*)* - Component-wise test for normal number. Component *i* of the result is *isNormal(e[i])*. (emulated) -
|e| : ptr<storage,array<|T|>> - `arrayLength(`|e|`)` : u32 - Returns the number of elements in the runtime array.
- (OpArrayLength, but you have to trace back to get the pointer to the enclosing struct.) -
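
The table marks `isFinite` and `isNormal` as emulated. One possible emulation inspects the exponent and fraction fields of the binary32 encoding. The sketch below is Python, not [SHORTNAME]; the helper names are hypothetical and it is not claimed to be how any implementation performs the emulation.

```python
import struct

def f32_bits(value):
    """Bit pattern of value after conversion to IEEE 754 binary32."""
    return struct.unpack("<I", struct.pack("<f", value))[0]

def exponent(value):
    return (f32_bits(value) >> 23) & 0xFF      # 8 exponent bits

def fraction(value):
    return f32_bits(value) & 0x7FFFFF          # 23 explicit fraction bits

def is_nan(value):
    return exponent(value) == 0xFF and fraction(value) != 0

def is_inf(value):
    return exponent(value) == 0xFF and fraction(value) == 0

def is_finite(value):
    return exponent(value) != 0xFF

def is_normal(value):
    # Excludes zero and subnormals (exponent 0) as well as
    # infinities and NaNs (exponent 0xFF).
    return 0 < exponent(value) < 0xFF
```

Note that `struct.pack` raises `OverflowError` for finite Python floats too large for binary32, so this sketch only accepts values that are representable, infinite, or NaN.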
- -## Float built-in functions ## {#float-builtin-functions} - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Precondition - Built-in - Description -
|T| is f32 - `abs(`|e|`:` |T| `) -> ` |T| - (GLSLstd450FAbs) -
|T| is f32 - `abs(`|e|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450FAbs) -
|T| is f32 - `acos(`|e|`:` |T| `) -> ` |T| - (GLSLstd450Acos) -
|T| is f32 - `acos(`|e|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450Acos) -
|T| is f32 - `asin(`|e|`:` |T| `) -> ` |T| - (GLSLstd450Asin) -
|T| is f32 - `asin(`|e|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450Asin) -
|T| is f32 - `atan(`|e|`:` |T| `) -> ` |T| - (GLSLstd450Atan) -
|T| is f32 - `atan(`|e|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450Atan) -
|T| is f32 - `atan2(`|e1|`:` |T| `, `|e2|`:` |T| `) -> ` |T| - (GLSLstd450Atan2) -
|T| is f32 - `atan2(`|e1|`:` vec|N|<|T|> `, `|e2|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450Atan2) -
|T| is f32 - `ceil(`|e|`:` |T| `) -> ` |T| - (GLSLstd450Ceil) -
|T| is f32 - `ceil(`|e|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450Ceil) -
|T| is f32 - `clamp(`|e1|`:` |T| `, `|e2|`:` |T| `, `|e3|`:` |T|`) -> ` |T| - (GLSLstd450NClamp) -
|T| is f32 - `clamp(`|e1|`:` vec|N|<|T|> `, `|e2|`:` vec|N|<|T|>`, `|e3|`:` vec|N|<|T|>`) -> ` vec|N|<|T|> - (GLSLstd450NClamp) -
|T| is f32 - `cos(`|e|`:` |T| `) -> ` |T| - (GLSLstd450Cos) -
|T| is f32 - `cos(`|e|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450Cos) -
|T| is f32 - `cosh(`|e|`:` |T| `) -> ` |T| - (GLSLstd450Cosh) -
|T| is f32 - `cosh(`|e|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450Cosh) -
|T| is f32 - `cross(`|e1|`:` vec3<|T|> `, `|e2|`:` vec3<|T|>`) -> ` vec3<|T|> - (GLSLstd450Cross) -
|T| is f32 - `distance(`|e1|`:` |T| `, `|e2|`:` |T| `) -> ` |T| - (GLSLstd450Distance) -
|T| is f32 - `distance(`|e1|`:` vec|N|<|T|> `, `|e2|`:` vec|N|<|T|>`) -> ` |T| - (GLSLstd450Distance) -
|T| is f32 - `exp(`|e|`:` |T| `) -> ` |T| - (GLSLstd450Exp) -
|T| is f32 - `exp(`|e|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450Exp) -
|T| is f32 - `exp2(`|e|`:` |T| `) -> ` |T| - (GLSLstd450Exp2) -
|T| is f32 - `exp2(`|e|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450Exp2) -
|T| is f32 - `faceForward(`|e1|`:` |T| `, `|e2|`:` |T| `, `|e3|`:` |T| `) -> ` |T| - (GLSLstd450FaceForward) -
|T| is f32 - `faceForward(`|e1|`:` vec|N|<|T|> `, `|e2|`:` vec|N|<|T|>`, `|e3|`:` vec|N|<|T|>`) -> ` vec|N|<|T|> - (GLSLstd450FaceForward) -
|T| is f32 - `floor(`|e|`:` |T| `) -> ` |T| - (GLSLstd450Floor) -
|T| is f32 - `floor(`|e|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450Floor) -
|T| is f32 - `fma(`|e1|`:` |T| `, `|e2|`:` |T| `, `|e3|`:` |T| `) -> ` |T| - (GLSLstd450Fma) -
|T| is f32 - `fma(`|e1|`:` vec|N|<|T|> `, `|e2|`:` vec|N|<|T|>`, `|e3|`:` vec|N|<|T|>`) -> ` vec|N|<|T|> - (GLSLstd450Fma) -
|T| is f32 - `fract(`|e|`:` |T| `) -> ` |T| - (GLSLstd450Fract) -
|T| is f32 - `fract(`|e|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450Fract) -
|T| is f32
- |I| is i32 or u32 -
`frexp(`|e1|`:` |T| `, `|e2|`:` ptr<|I|> `) -> ` |T| - (GLSLstd450Frexp) -
|T| is f32
- |I| is i32 or u32 -
`frexp(`|e1|`:` vec|N|<|T|> `, `|e2|`:` ptr<vec|N|<|I|>>`) -> ` vec|N|<|T|> - (GLSLstd450Frexp) -
|T| is f32 - `inverseSqrt(`|e|`:` |T| `) -> ` |T| - (GLSLstd450InverseSqrt) -
|T| is f32 - `inverseSqrt(`|e|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450InverseSqrt) -
|T| is f32
- |I| is i32 or u32 -
`ldexp(`|e1|`:` |T| `, `|e2|`:` |I| `) -> ` |T| - (GLSLstd450Ldexp) -
|T| is f32
- |I| is i32 or u32 -
`ldexp(`|e1|`:` vec|N|<|T|> `, `|e2|`:` vec|N|<|I|>`) -> ` vec|N|<|T|> - (GLSLstd450Ldexp) -
|T| is f32 - `length(`|e|`:` |T| `) -> ` |T| - (GLSLstd450Length) -
|T| is f32 - `length(`|e|`:` vec|N|<|T|> `) -> ` |T| - (GLSLstd450Length) -
|T| is f32 - `log(`|e|`:` |T| `) -> ` |T| - (GLSLstd450Log) -
|T| is f32 - `log(`|e|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450Log) -
|T| is f32 - `log2(`|e|`:` |T| `) -> ` |T| - (GLSLstd450Log2) -
|T| is f32 - `log2(`|e|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450Log2) -
|T| is f32 - `max(`|e1|`:` |T| `, `|e2|`:` |T| `) -> ` |T| - (GLSLstd450NMax) -
|T| is f32 - `max(`|e1|`:` vec|N|<|T|> `, `|e2|`:` vec|N|<|T|>`) -> ` vec|N|<|T|> - (GLSLstd450NMax) -
|T| is f32 - `min(`|e1|`:` |T| `, `|e2|`:` |T| `) -> ` |T| - (GLSLstd450NMin) -
|T| is f32 - `min(`|e1|`:` vec|N|<|T|> `, `|e2|`:` vec|N|<|T|>`) -> ` vec|N|<|T|> - (GLSLstd450NMin) -
|T| is f32 - `mix(`|e1|`:` |T| `, `|e2|`:` |T| `, `|e3|`:` |T|`) -> ` |T| - (GLSLstd450FMix) -
|T| is f32 - `mix(`|e1|`:` vec|N|<|T|> `, `|e2|`:` vec|N|<|T|>`, `|e3|`:` vec|N|<|T|>`) -> ` vec|N|<|T|> - (GLSLstd450FMix) -
|T| is f32
-
`modf(`|e1|`:` |T| `, `|e2|`:` ptr<|T|> `) -> ` |T| - (GLSLstd450Modf) -
|T| is f32 - `modf(`|e1|`:` vec|N|<|T|> `, `|e2|`:` ptr<vec|N|<|T|>>`) -> ` vec|N|<|T|> - (GLSLstd450Modf) -
|T| is f32 - `normalize(`|e|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450Normalize) -
|T| is f32 - `pow(`|e1|`:` |T| `, `|e2|`:` |T| `) -> ` |T| - (GLSLstd450Pow) -
|T| is f32 - `pow(`|e1|`:` vec|N|<|T|> `, `|e2|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450Pow) -
|T| is f32 - `reflect(`|e1|`:` |T| `, `|e2|`:` |T| `) -> ` |T| - (GLSLstd450Reflect) -
|T| is f32 - `reflect(`|e1|`:` vec|N|<|T|> `, `|e2|`:` vec|N|<|T|>`) -> ` vec|N|<|T|> - (GLSLstd450Reflect) -
|T| is f32 - `round(`|e|`:` |T| `) -> ` |T| - Result is the integer |k| nearest to |e|, as a floating point value.
- When |e| lies halfway between integers |k| and |k|+1, - the result is |k| when |k| is even, and |k|+1 when |k| is odd.
- (GLSLstd450RoundEven) -
|T| is f32 - `round(`|e|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - Component-wise rounding.
- Component |i| of the result is `round`(|e|[|i|])
- (GLSLstd450RoundEven) -
|T| is f32 - `sign(`|e|`:` |T| `) -> ` |T| - (GLSLstd450FSign) -
|T| is f32 - `sign(`|e|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450FSign) -
|T| is f32 - `sin(`|e|`:` |T| `) -> ` |T| - (GLSLstd450Sin) -
|T| is f32 - `sin(`|e|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450Sin) -
|T| is f32 - `sinh(`|e|`:` |T| `) -> ` |T| - (GLSLstd450Sinh) -
|T| is f32 - `sinh(`|e|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450Sinh) -
|T| is f32 - `smoothStep(`|e1|`:` |T| `, `|e2|`:` |T| `, `|e3|`:` |T| `) -> ` |T| - (GLSLstd450SmoothStep) -
|T| is f32 - `smoothStep(`|e1|`:` vec|N|<|T|> `, `|e2|`:` vec|N|<|T|>`, `|e3|`:` vec|N|<|T|>`) -> ` vec|N|<|T|> - (GLSLstd450SmoothStep) -
|T| is f32 - `sqrt(`|e|`:` |T| `) -> ` |T| - (GLSLstd450Sqrt) -
|T| is f32 - `sqrt(`|e|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450Sqrt) -
|T| is f32 - `step(`|e1|`:` |T| `, `|e2|`:` |T| `) -> ` |T| - (GLSLstd450Step) -
|T| is f32 - `step(`|e1|`:` vec|N|<|T|> `, `|e2|`:` vec|N|<|T|>`) -> ` vec|N|<|T|> - (GLSLstd450Step) -
|T| is f32 - `tan(`|e|`:` |T| `) -> ` |T| - (GLSLstd450Tan) -
|T| is f32 - `tan(`|e|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450Tan) -
|T| is f32 - `tanh(`|e|`:` |T| `) -> ` |T| - (GLSLstd450Tanh) -
|T| is f32 - `tanh(`|e|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450Tanh) -
|T| is f32 - `trunc(`|e|`:` |T| `) -> ` |T| - (GLSLstd450Trunc) -
|T| is f32 - `trunc(`|e|`:` vec|N|<|T|> `) -> ` vec|N|<|T|> - (GLSLstd450Trunc) -
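
The tie-breaking rule given for `round` in the table above (round half to even) is easy to get wrong, so here is a minimal sketch of it. It is written in Python, the function name `round_half_even` is hypothetical, and it does not model NaN, infinities, or the sign of zero.

```python
import math

def round_half_even(e):
    """Nearest integer to e as a float; ties go to the even integer."""
    k = math.floor(e)          # e lies in the interval [k, k + 1)
    distance = e - k
    if distance < 0.5:
        result = k
    elif distance > 0.5:
        result = k + 1
    else:
        # Exactly halfway between k and k + 1: choose the even one.
        result = k if k % 2 == 0 else k + 1
    return float(result)

samples = [0.5, 1.5, 2.5, 3.5, -0.5, -1.5, -2.5, 1.4, 1.6]
rounded = [round_half_even(x) for x in samples]
```

For these samples the result agrees with Python's built-in `round`, which uses the same tie-breaking rule; for example 2.5 rounds to 2.0 and 3.5 rounds to 4.0.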
- -## Integer built-in functions ## {#integer-builtin-functions} - - - - - - - - - - - - - - - - - - - - - - - - -
Precondition - Built-in - Description -
- `abs`(|e|: i32 ) -> i32 - The absolute value of |e|.
- (GLSLstd450SAbs) -
- `abs`(|e| : vec|N|<i32> ) -> vec|N|<i32> - Component-wise absolute value: - Component |i| of the result is `abs(`|e|`[`|i|`])`
- (GLSLstd450SAbs) -
- `abs`(|e| : u32 ) -> u32 - Result is |e|. This is provided for symmetry with `abs` for signed integers. -
- `abs(`|e|`:` vec|N|<u32> `) ->` vec|N|<u32> - Result is |e|. This is provided for symmetry with `abs` for signed integer vectors. -
|T| is u32 - `clamp(`|e1|`:` |T| `, `|e2|`:` |T|`, `|e3|`:` |T|`) ->` |T| - (GLSLstd450UClamp) -
|T| is u32 - `clamp(`|e1|`:` vec|N|<|T|> `, `|e2|`:` vec|N|<|T|>`, `|e3|`:`vec|N|<|T|> `) ->` vec|N|<|T|> - (GLSLstd450UClamp) -
|T| is i32 - `clamp(`|e1|`:` |T| `, `|e2|`:` |T|`, `|e3|`:` |T|`) ->` |T| - (GLSLstd450SClamp) -
|T| is i32 - `clamp(`|e1|`:` vec|N|<|T|> `, `|e2|`:` vec|N|<|T|>`, `|e3|`:`vec|N|<|T|> `) ->` vec|N|<|T|> - (GLSLstd450SClamp) -
|T| is u32 or i32
-
`countOneBits(`|e|`:` |T| `) ->` |T| - The number of 1 bits in the representation of |e|.
- Also known as "population count".
- (SPIR-V OpBitCount) -
|T| is u32 or i32 - `countOneBits(`|e|`:` vec|N|<|T|>`) ->` vec|N|<|T|>
-
Component-wise population count: - Component |i| of the result is `countOneBits(`|e|`[`|i|`])`
- (SPIR-V OpBitCount) -
|T| is u32 - `max(`|e1|`:` |T| `, `|e2|`:` |T|`) ->` |T| - (GLSLstd450UMax) -
|T| is u32 - `max(`|e1|`:` vec|N|<|T|> `, `|e2|`:` vec|N|<|T|>`) ->` vec|N|<|T|> - (GLSLstd450UMax) -
|T| is i32 - `max(`|e1|`:` |T| `, `|e2|`:` |T|`) ->` |T| - (GLSLstd450SMax) -
|T| is i32 - `max(`|e1|`:` vec|N|<|T|> `, `|e2|`:` vec|N|<|T|>`) ->` vec|N|<|T|> - (GLSLstd450SMax) -
|T| is u32 - `min(`|e1|`:` |T| `, `|e2|`:` |T|`) ->` |T| - (GLSLstd450UMin) -
|T| is u32 - `min(`|e1|`:` vec|N|<|T|> `, `|e2|`:` vec|N|<|T|>`) ->` vec|N|<|T|> - (GLSLstd450UMin) -
|T| is i32 - `min(`|e1|`:` |T| `, `|e2|`:` |T|`) ->` |T| - (GLSLstd450SMin) -
|T| is i32 - `min(`|e1|`:` vec|N|<|T|> `, `|e2|`:` vec|N|<|T|>`) ->` vec|N|<|T|> - (GLSLstd450SMin) -
|T| is u32 or i32
-
`reverseBits(`|e|`:` |T| `) ->` |T| - Reverses the bits in |e|: The bit at position |k| of the result equals the - bit at position 31-|k| of |e|.
- (SPIR-V OpBitReverse) -
|T| is u32 or i32 - `reverseBits(`|e|`:` vec|N|<|T|> `) ->` vec|N|<|T|>
-
Component-wise bit reversal: - Component |i| of the result is `reverseBits(`|e|`[`|i|`])`
- (SPIR-V OpBitReverse) -
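
The descriptions of `countOneBits` and `reverseBits` above translate directly into bit manipulation on a 32-bit value. The following Python sketch uses hypothetical function names and masks its input to 32 bits, so a negative Python integer is treated as its two's complement i32 bit pattern; results are returned as unsigned bit patterns.

```python
MASK32 = 0xFFFFFFFF

def count_one_bits(e):
    """Population count of the 32-bit representation of e."""
    return bin(e & MASK32).count("1")

def reverse_bits(e):
    """Bit k of the result equals bit 31-k of e."""
    e &= MASK32
    result = 0
    for k in range(32):
        if (e >> (31 - k)) & 1:
            result |= 1 << k
    return result

ones_in_byte = count_one_bits(0xF0)
ones_in_minus_one = count_one_bits(-1)
top_bit = reverse_bits(1)
```

Reversing twice returns the original bit pattern, which is a convenient self-check for any implementation of this operation.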
- -## Matrix built-in functions ## {#matrix-builtin-functions} - - - - -
Precondition - Built-in - Description -
|T| is f32 - `determinant(`|e|`:` mat|N|x|N|<|T|> `) -> ` |T| - (GLSLstd450Determinant) -
- -## Vector built-in functions ## {#vector-builtin-functions} - - - - -
Vector built-in functions - SPIR-V -
dot(vecN<f32>, vecN<f32>) -> f32 - OpDot -
- -## Derivative built-in functions ## {#derivative-builtin-functions} - - - - -
Precondition - Derivative built-in functions - SPIR-V -
|T| is f32 or vecN<f32> - dpdx(T) -> T - OpDPdx -
dpdxCoarse(T) -> T - OpDPdxCoarse -
dpdxFine(T) -> T - OpDPdxFine -
dpdy(T) -> T - OpDPdy -
dpdyCoarse(T) -> T - OpDPdyCoarse -
dpdyFine(T) -> T - OpDPdyFine -
fwidth(T) -> T - OpFwidth -
fwidthCoarse(T) -> T - OpFwidthCoarse -
fwidthFine(T) -> T - OpFwidthFine -
- -## Texture built-in functions ## {#texture-builtin-functions} - -### `textureDimensions` ### {#texturedimensions} - -Returns the dimensions of a texture, or of one of its mip levels, in texels. - -```rust -textureDimensions(t : texture_1d<T>) -> i32 -textureDimensions(t : texture_2d<T>) -> vec2<i32> -textureDimensions(t : texture_2d<T>, level : i32) -> vec2<i32> -textureDimensions(t : texture_2d_array<T>) -> vec2<i32> -textureDimensions(t : texture_2d_array<T>, level : i32) -> vec2<i32> -textureDimensions(t : texture_3d<T>) -> vec3<i32> -textureDimensions(t : texture_3d<T>, level : i32) -> vec3<i32> -textureDimensions(t : texture_cube<T>) -> vec3<i32> -textureDimensions(t : texture_cube<T>, level : i32) -> vec3<i32> -textureDimensions(t : texture_cube_array<T>) -> vec3<i32> -textureDimensions(t : texture_cube_array<T>, level : i32) -> vec3<i32> -textureDimensions(t : texture_multisampled_2d<T>) -> vec2<i32> -textureDimensions(t : texture_multisampled_2d_array<T>) -> vec2<i32> -textureDimensions(t : texture_depth_2d) -> vec2<i32> -textureDimensions(t : texture_depth_2d, level : i32) -> vec2<i32> -textureDimensions(t : texture_depth_2d_array) -> vec2<i32> -textureDimensions(t : texture_depth_2d_array, level : i32) -> vec2<i32> -textureDimensions(t : texture_depth_cube) -> vec3<i32> -textureDimensions(t : texture_depth_cube, level : i32) -> vec3<i32> -textureDimensions(t : texture_depth_cube_array) -> vec3<i32> -textureDimensions(t : texture_depth_cube_array, level : i32) -> vec3<i32> -textureDimensions(t : texture_storage_1d<F>) -> i32 -textureDimensions(t : texture_storage_2d<F>) -> vec2<i32> -textureDimensions(t : texture_storage_2d_array<F>) -> vec2<i32> -textureDimensions(t : texture_storage_3d<F>) -> vec3<i32> -``` - -**Parameters:** - - -
`t` - The [sampled](#sampled-texture-type), - [multisampled](#multisampled-texture-type), [depth](#texture-depth), or - [storage](#texture-storage) texture. -
`level` - The mip level, with level 0 containing a full size version of the texture.
- If omitted, the dimensions of level 0 are returned. -
- -**Returns:** - -The dimensions of the texture in texels.
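
This section does not state how the dimensions of a mip level relate to those of level 0. The sketch below assumes the convention used by the WebGPU API, where each successive level halves every extent, rounding down, with a minimum of 1 texel. It is written in Python and the function name `mip_dimensions` is hypothetical.

```python
def mip_dimensions(base_size, level):
    """Extent of a mip level, given the level-0 extent as a tuple.

    Assumes the halve-and-clamp-to-1 rule described above; this rule
    is not stated by the text of this section.
    """
    return tuple(max(1, extent >> level) for extent in base_size)

level0 = mip_dimensions((256, 64), 0)
level3 = mip_dimensions((256, 64), 3)
level7 = mip_dimensions((256, 64), 7)
```

Under this assumption a 256 by 64 texture is 32 by 8 texels at level 3, and its shorter side is already clamped to 1 texel by level 7.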
- - -### `textureLoad` ### {#textureload} - -Reads a single texel from a texture without sampling or filtering. - -```rust -textureLoad(t : texture_1d<T>, coords : i32, level : i32) -> vec4<T> -textureLoad(t : texture_2d<T>, coords : vec2<i32>, level : i32) -> vec4<T> -textureLoad(t : texture_2d_array<T>, coords : vec2<i32>, array_index : i32, level : i32) -> vec4<T> -textureLoad(t : texture_3d<T>, coords : vec3<i32>, level : i32) -> vec4<T> -textureLoad(t : texture_multisampled_2d<T>, coords : vec2<i32>, sample_index : i32) -> vec4<T> -textureLoad(t : texture_multisampled_2d_array<T>, coords : vec2<i32>, array_index : i32, sample_index : i32) -> vec4<T> -textureLoad(t : texture_depth_2d, coords : vec2<i32>, level : i32) -> f32 -textureLoad(t : texture_depth_2d_array, coords : vec2<i32>, array_index : i32, level : i32) -> f32 -textureLoad(t : [[access(read)]] texture_storage_1d<F>, coords : i32) -> vec4<T> -textureLoad(t : [[access(read)]] texture_storage_2d<F>, coords : vec2<i32>) -> vec4<T> -textureLoad(t : [[access(read)]] texture_storage_2d_array<F>, coords : vec2<i32>, array_index : i32) -> vec4<T> -textureLoad(t : [[access(read)]] texture_storage_3d<F>, coords : vec3<i32>) -> vec4<T> -``` - -For [read-only storage textures](#texture-storage) the returned channel format `T` -depends on the texel format `F`. -[See the texel format table](#storage-texel-formats) for the mapping of texel -format to channel format. - -**Parameters:** - - -
`t` - The [sampled](#sampled-texture-type), - [multisampled](#multisampled-texture-type), [depth](#texture-depth) or - [read-only storage](#texture-storage) texture. -
`coords` - The 0-based texel coordinate. -
`array_index` - The 0-based texture array index. -
`level` - The mip level, with level 0 containing a full size version of the texture. -
`sample_index` - The 0-based sample index of the multisampled texture. -
- -**Returns:** - -If all the parameters are within bounds, the unfiltered texel data.
-If any of the parameters are out of bounds, then zero in all components. - - -### `textureNumLayers` ### {#texturenumlayers} - -Returns the number of layers (elements) of an array texture. - -```rust -textureNumLayers(t : texture_2d_array) -> i32 -textureNumLayers(t : texture_cube_array) -> i32 -textureNumLayers(t : texture_multisampled_2d_array) -> i32 -textureNumLayers(t : texture_depth_2d_array) -> i32 -textureNumLayers(t : texture_depth_cube_array) -> i32 -textureNumLayers(t : texture_storage_2d_array) -> i32 -``` - -**Parameters:** - - -
`t` - The [sampled](#sampled-texture-type), - [multisampled](#multisampled-texture-type), [depth](#texture-depth) or - [storage](#texture-storage) array texture. -
- -**Returns:** - -The number of layers (elements) of the array texture. - - -### `textureNumLevels` ### {#texturenumlevels} - -Returns the number of mip levels of a texture. - -```rust -textureNumLevels(t : texture_2d) -> i32 -textureNumLevels(t : texture_2d_array) -> i32 -textureNumLevels(t : texture_3d) -> i32 -textureNumLevels(t : texture_cube) -> i32 -textureNumLevels(t : texture_cube_array) -> i32 -textureNumLevels(t : texture_depth_2d) -> i32 -textureNumLevels(t : texture_depth_2d_array) -> i32 -textureNumLevels(t : texture_depth_cube) -> i32 -textureNumLevels(t : texture_depth_cube_array) -> i32 -``` - -**Parameters:** - - -
`t` - The [sampled](#sampled-texture-type) or [depth](#texture-depth) texture. -
- -**Returns:** - -The number of mip levels for the texture. - - -### `textureNumSamples` ### {#texturenumsamples} - -Returns the number of samples per texel in a multisampled texture. - -```rust -textureNumSamples(t : texture_multisampled_2d) -> i32 -textureNumSamples(t : texture_multisampled_2d_array) -> i32 -``` - -**Parameters:** - - -
`t` - The [multisampled](#multisampled-texture-type) texture. -
- -**Returns:** - -The number of samples per texel in the multisampled texture. - - -### `textureSample` ### {#texturesample} - -Samples a texture. -Must only be used in a [=fragment=] shader stage. - -```rust -textureSample(t : texture_1d, s : sampler, coords : f32) -> vec4 -textureSample(t : texture_2d, s : sampler, coords : vec2) -> vec4 -textureSample(t : texture_2d, s : sampler, coords : vec2, offset : vec2) -> vec4 -textureSample(t : texture_2d_array, s : sampler, coords : vec2, array_index : i32) -> vec4 -textureSample(t : texture_2d_array, s : sampler, coords : vec2, array_index : i32, offset : vec2) -> vec4 -textureSample(t : texture_3d, s : sampler, coords : vec3) -> vec4 -textureSample(t : texture_3d, s : sampler, coords : vec3, offset : vec3) -> vec4 -textureSample(t : texture_cube, s : sampler, coords : vec3) -> vec4 -textureSample(t : texture_cube_array, s : sampler, coords : vec3, array_index : i32) -> vec4 -textureSample(t : texture_depth_2d, s : sampler, coords : vec2) -> f32 -textureSample(t : texture_depth_2d, s : sampler, coords : vec2, offset : vec2) -> f32 -textureSample(t : texture_depth_2d_array, s : sampler, coords : vec2, array_index : i32) -> f32 -textureSample(t : texture_depth_2d_array, s : sampler, coords : vec2, array_index : i32, offset : vec2) -> f32 -textureSample(t : texture_depth_cube, s : sampler, coords : vec3) -> f32 -textureSample(t : texture_depth_cube_array, s : sampler, coords : vec3, array_index : i32) -> f32 -``` - -**Parameters:** - - -
`t` - The [sampled](#sampled-texture-type) or [depth](#texture-depth) texture to - sample. -
`s` - The [sampler type](#sampler-type). -
`coords` - The texture coordinates used for sampling. -
`array_index` - The 0-based texture array index to sample. -
`offset` - The optional texel offset applied to the unnormalized texture coordinate - before sampling the texture. This offset is applied before applying any - texture wrapping modes.
- `offset` must be a compile-time constant, and may only be provided as a - [literal](#literals) or `const_expr` expression (e.g. `vec2(1, 2)`).
- Each `offset` component must be at least `-8` and at most `7`. Values outside - of this range will be treated as a compile time error. -
- -**Returns:** - -The sampled value. - - -### `textureSampleBias` ### {#texturesamplebias} - -Samples a texture with a bias to the mip level. -Must only be used in a [=fragment=] shader stage. - -```rust -textureSampleBias(t : texture_2d, s : sampler, coords : vec2, bias : f32) -> vec4 -textureSampleBias(t : texture_2d, s : sampler, coords : vec2, bias : f32, offset : vec2) -> vec4 -textureSampleBias(t : texture_2d_array, s : sampler, coords : vec2, array_index : i32, bias : f32) -> vec4 -textureSampleBias(t : texture_2d_array, s : sampler, coords : vec2, array_index : i32, bias : f32, offset : vec2) -> vec4 -textureSampleBias(t : texture_3d, s : sampler, coords : vec3, bias : f32) -> vec4 -textureSampleBias(t : texture_3d, s : sampler, coords : vec3, bias : f32, offset : vec3) -> vec4 -textureSampleBias(t : texture_cube, s : sampler, coords : vec3, bias : f32) -> vec4 -textureSampleBias(t : texture_cube_array, s : sampler, coords : vec3, array_index : i32, bias : f32) -> vec4 -``` - -**Parameters:** - - -
`t` - The [texture](#sampled-texture-type) to sample. -
`s` - The [sampler type](#sampler-type). -
`coords` - The texture coordinates used for sampling. -
`array_index` - The 0-based texture array index to sample. -
`bias` - The bias to apply to the mip level before sampling. - `bias` must be between `-16.0` and `15.99`. -
`offset` - The optional texel offset applied to the unnormalized texture coordinate - before sampling the texture. This offset is applied before applying any - texture wrapping modes.
- `offset` must be a compile-time constant, and may only be provided as a - [literal](#literals) or `const_expr` expression (e.g. `vec2(1, 2)`).
- Each `offset` component must be at least `-8` and at most `7`. Values outside - of this range will be treated as a compile time error. -
- -**Returns:** - -The sampled value. - - -### `textureSampleCompare` ### {#texturesamplecompare} - -Samples a depth texture and compares the sampled depth values against a reference value. - -```rust -textureSampleCompare(t : texture_depth_2d, s : sampler_comparison, coords : vec2, depth_ref : f32) -> f32 -textureSampleCompare(t : texture_depth_2d, s : sampler_comparison, coords : vec2, depth_ref : f32, offset : vec2) -> f32 -textureSampleCompare(t : texture_depth_2d_array, s : sampler_comparison, coords : vec2, array_index : i32, depth_ref : f32) -> f32 -textureSampleCompare(t : texture_depth_2d_array, s : sampler_comparison, coords : vec2, array_index : i32, depth_ref : f32, offset : vec2) -> f32 -textureSampleCompare(t : texture_depth_cube, s : sampler_comparison, coords : vec3, depth_ref : f32) -> f32 -textureSampleCompare(t : texture_depth_cube_array, s : sampler_comparison, coords : vec3, array_index : i32, depth_ref : f32) -> f32 -``` - -**Parameters:** - - -
`t` - The [depth](#texture-depth) texture to sample. -
`s` - The [sampler comparison](#sampler-type) type. -
`coords` - The texture coordinates used for sampling. -
`array_index` - The 0-based texture array index to sample. -
`depth_ref` - The reference value to compare the sampled depth value against. -
`offset` - The optional texel offset applied to the unnormalized texture coordinate - before sampling the texture. This offset is applied before applying any - texture wrapping modes.
- `offset` must be a compile-time constant, and may only be provided as a - [literal](#literals) or `const_expr` expression (e.g. `vec2(1, 2)`).
- Each `offset` component must be at least `-8` and at most `7`. Values outside - of this range will be treated as a compile time error. -
- -**Returns:** - -A value in the range `[0.0..1.0]`. - -Each sampled texel is compared against the reference value using the comparison -operator defined by the `sampler_comparison`, resulting in either a `0` or `1` -value for each texel. - -If the `sampler_comparison` uses bilinear filtering then the returned value is -the filtered average of these values, otherwise the comparison result of a -single texel is returned. - - -### `textureSampleGrad` ### {#texturesamplegrad} - -Samples a texture using explicit gradients. - -```rust -textureSampleGrad(t : texture_2d, s : sampler, coords : vec2, ddx : vec2, ddy : vec2) -> vec4 -textureSampleGrad(t : texture_2d, s : sampler, coords : vec2, ddx : vec2, ddy : vec2, offset : vec2) -> vec4 -textureSampleGrad(t : texture_2d_array, s : sampler, coords : vec2, array_index : i32, ddx : vec2, ddy : vec2) -> vec4 -textureSampleGrad(t : texture_2d_array, s : sampler, coords : vec2, array_index : i32, ddx : vec2, ddy : vec2, offset : vec2) -> vec4 -textureSampleGrad(t : texture_3d, s : sampler, coords : vec3, ddx : vec3, ddy : vec3) -> vec4 -textureSampleGrad(t : texture_3d, s : sampler, coords : vec3, ddx : vec3, ddy : vec3, offset : vec3) -> vec4 -textureSampleGrad(t : texture_cube, s : sampler, coords : vec3, ddx : vec3, ddy : vec3) -> vec4 -textureSampleGrad(t : texture_cube_array, s : sampler, coords : vec3, array_index : i32, ddx : vec3, ddy : vec3) -> vec4 -``` - -**Parameters:** - - -
`t` - The [texture](#sampled-texture-type) to sample. -
`s` - The [sampler type](#sampler-type). -
`coords` - The texture coordinates used for sampling. -
`array_index` - The 0-based texture array index to sample. -
`ddx` - The x direction derivative vector used to compute the sampling locations. -
`ddy` - The y direction derivative vector used to compute the sampling locations. -
`offset` - The optional texel offset applied to the unnormalized texture coordinate - before sampling the texture. This offset is applied before applying any - texture wrapping modes.
- `offset` must be a compile-time constant, and may only be provided as a - [literal](#literals) or `const_expr` expression (e.g. `vec2(1, 2)`).
- Each `offset` component must be at least `-8` and at most `7`. Values outside - of this range will be treated as a compile time error. -
- -**Returns:** - -The sampled value. - - -### `textureSampleLevel` ### {#texturesamplelevel} - -Samples a texture using an explicit mip level. - -```rust -textureSampleLevel(t : texture_2d, s : sampler, coords : vec2, level : f32) -> vec4 -textureSampleLevel(t : texture_2d, s : sampler, coords : vec2, level : f32, offset : vec2) -> vec4 -textureSampleLevel(t : texture_2d_array, s : sampler, coords : vec2, array_index : i32, level : f32) -> vec4 -textureSampleLevel(t : texture_2d_array, s : sampler, coords : vec2, array_index : i32, level : f32, offset : vec2) -> vec4 -textureSampleLevel(t : texture_3d, s : sampler, coords : vec3, level : f32) -> vec4 -textureSampleLevel(t : texture_3d, s : sampler, coords : vec3, level : f32, offset : vec3) -> vec4 -textureSampleLevel(t : texture_cube, s : sampler, coords : vec3, level : f32) -> vec4 -textureSampleLevel(t : texture_cube_array, s : sampler, coords : vec3, array_index : i32, level : f32) -> vec4 -textureSampleLevel(t : texture_depth_2d, s : sampler, coords : vec2, level : i32) -> f32 -textureSampleLevel(t : texture_depth_2d, s : sampler, coords : vec2, level : i32, offset : vec2) -> f32 -textureSampleLevel(t : texture_depth_2d_array, s : sampler, coords : vec2, array_index : i32, level : i32) -> f32 -textureSampleLevel(t : texture_depth_2d_array, s : sampler, coords : vec2, array_index : i32, level : i32, offset : vec2) -> f32 -textureSampleLevel(t : texture_depth_cube, s : sampler, coords : vec3, level : i32) -> f32 -textureSampleLevel(t : texture_depth_cube_array, s : sampler, coords : vec3, array_index : i32, level : i32) -> f32 -``` - -**Parameters:** - - -
`t` - The [sampled](#sampled-texture-type) or [depth](#texture-depth) texture to - sample. -
`s` - The [sampler type](#sampler-type). -
`coords` - The texture coordinates used for sampling. -
`array_index` - The 0-based texture array index to sample. -
`level` - The mip level, with level 0 containing a full size version of the texture. - For the functions where `level` is a `f32`, fractional values may interpolate - between two levels if the format is filterable according to the - [Texture Format Capabilities](https://gpuweb.github.io/gpuweb/#texture-format-caps). -
`offset` - The optional texel offset applied to the unnormalized texture coordinate - before sampling the texture. This offset is applied before applying any - texture wrapping modes.
- `offset` must be a compile-time constant, and may only be provided as a - [literal](#literals) or `const_expr` expression (e.g. `vec2(1, 2)`).
- Each `offset` component must be at least `-8` and at most `7`. Values outside - of this range will be treated as a compile time error. -
- -**Returns:** - -The sampled value. - - -### `textureStore` ### {#texturestore} - -Writes a single texel to a texture. - -```rust -textureStore(t : [[access(write)]] texture_storage_1d, coords : i32, value : vec4) -textureStore(t : [[access(write)]] texture_storage_2d, coords : vec2, value : vec4) -textureStore(t : [[access(write)]] texture_storage_2d_array, coords : vec2, array_index : i32, value : vec4) -textureStore(t : [[access(write)]] texture_storage_3d, coords : vec3, value : vec4) -``` - -The channel format `T` depends on the storage texel format `F`. -[See the texel format table](#storage-texel-formats) for the mapping of texel -format to channel format. - -**Parameters:** - - -
`t` - The [write-only storage texture](#texture-storage). -
`coords` - The 0-based texel coordinate.
-
`array_index` - The 0-based texture array index. -
`value` - The new texel value.
-
- -**Note:** - -If any of the parameters are out of bounds, then the call to `textureStore()` -does nothing. - - -**TODO:** - -
-TODO(dsinclair): Need gather operations
-
- -## Atomic built-in functions ## {#atomic-builtin-functions} - -Atomic built-in functions can be used to read/write/read-modify-write atomic -objects. They are the only operations allowed on [[#atomic-types]]. - -All atomic built-in functions use a `relaxed` memory ordering (**0**-value -integral constant in SPIR-V for all `Memory Semantics` operands). -This means synchronization and ordering guarantees only apply among atomic -operations acting on the same [=memory locations=]. -No synchronization or ordering guarantees apply between atomic and -non-atomic memory accesses, or between atomic accesses acting on different -memory locations. - -Atomic built-in functions `must` not be used in a [=vertex=] shader stage. - -The storage class `SC` of the `atomic_ptr` parameter in all atomic built-in -functions `must` be either [=storage classes/storage=] or [=storage -classes/workgroup=]. [=storage classes/workgroup=] atomics have a **Workgroup** -memory scope in SPIR-V, while [=storage classes/storage=] atomics have a -**Device** memory scope in SPIR-V. - -TODO: Add links to the eventual memory model descriptions. - -### Atomic Load ### {#atomic-load} - -```rust -atomicLoad(atomic_ptr : ptr<SC, atomic<T>>) -> T - -// Maps to the SPIR-V instruction OpAtomicLoad. -``` - -Returns the atomically loaded value pointed to by `atomic_ptr`. - -### Atomic Store ### {#atomic-store} - -```rust -atomicStore(atomic_ptr : ptr<SC, atomic<T>>, v : T) - -// Maps to the SPIR-V instruction OpAtomicStore. -``` - -Atomically stores the value `v` in the atomic object pointed to by `atomic_ptr`. 
- -### Atomic Read-Modify-Write ### {#atomic-rmw} - -```rust -atomicAdd(atomic_ptr : ptr<SC, atomic<T>>, v : T) -> T -atomicMax(atomic_ptr : ptr<SC, atomic<T>>, v : T) -> T -atomicMin(atomic_ptr : ptr<SC, atomic<T>>, v : T) -> T -atomicAnd(atomic_ptr : ptr<SC, atomic<T>>, v : T) -> T -atomicOr(atomic_ptr : ptr<SC, atomic<T>>, v : T) -> T -atomicXor(atomic_ptr : ptr<SC, atomic<T>>, v : T) -> T - -// Mappings to SPIR-V instructions: -// atomicAdd -> OpAtomicIAdd -// atomicMax -> OpAtomicSMax or OpAtomicUMax (depending on the signedness of T) -// atomicMin -> OpAtomicSMin or OpAtomicUMin (depending on the signedness of T) -// atomicAnd -> OpAtomicAnd -// atomicOr -> OpAtomicOr -// atomicXor -> OpAtomicXor -``` -Each function performs the following steps atomically: - -1. Load the original value pointed to by `atomic_ptr`. -2. Obtain a new value by performing the operation (e.g. max) from the function - name with the value |v|. -3. Store the new value using `atomic_ptr`. - -Each function returns the original value stored in the atomic object. - -```rust -atomicExchange(atomic_ptr : ptr<SC, atomic<T>>, v : T) -> T - -// Maps to the SPIR-V instruction OpAtomicExchange. -``` - -Atomically stores the value `v` in the atomic object pointed to by -`atomic_ptr` and returns the original value stored in the atomic object. - -```rust -atomicCompareExchangeWeak(atomic_ptr : ptr<SC, atomic<T>>, cmp : T, v : T) -> vec2<T> - -// Maps to the SPIR-V instruction OpAtomicCompareExchange. -``` - -Performs the following steps atomically: - -1. Load the original value pointed to by `atomic_ptr`. -2. Compare the original value to the value `cmp` using an equality operation. -3. Store the value `v` `only if` the result of the equality comparison was **true**. - -Returns a two-element vector, where the first element is the original value of -the atomic object and the second element is whether or not the comparison -succeeded (**1** if successful, **0** otherwise). - -Note: the equality comparison may spuriously fail on some implementations.
That -is, the second element of the result vector may be **0** even if the first -element of the result vector equals `cmp`. - -## Data packing built-in functions ## {#pack-builtin-functions} - -Data packing builtin functions can be used to encode values using data formats that -do not correspond directly to types in [SHORTNAME]. -This enables a program to write many densely packed values to memory, which can -reduce a shader's memory bandwidth demand. - - - - - - - - - -
Built-inDescription -
`pack4x8snorm`(|e|: vec4<f32>) -> u32 - Converts four normalized floating point values to 8-bit signed integers, and then combines them - into one `u32` value.
- Component |e|[|i|] of the input is converted to an 8-bit twos complement integer value - ⌊ 0.5 + 127 × min(1, max(-1, |e|[|i|])) ⌋ which is then placed in bits - 8 × |i| through - 8 × |i| + 7 of the result. - -
`pack4x8unorm`(|e|: vec4<f32>) -> u32 - Converts four normalized floating point values to 8-bit unsigned integers, and then combines them - into one `u32` value.
- Component |e|[|i|] of the input is converted to an 8-bit unsigned integer value - ⌊ 0.5 + 255 × min(1, max(0, |e|[|i|])) ⌋ which is then placed in bits - 8 × |i| through - 8 × |i| + 7 of the result. - -
`pack2x16snorm`(|e|: vec2<f32>) -> u32 - Converts two normalized floating point values to 16-bit signed integers, and then combines them - into one `u32` value.
- Component |e|[|i|] of the input is converted to a 16-bit twos complement integer value - ⌊ 0.5 + 32767 × min(1, max(-1, |e|[|i|])) ⌋ which is then placed in bits - 16 × |i| through - 16 × |i| + 15 of the result. - -
`pack2x16unorm`(|e|: vec2<f32>) -> u32 - Converts two normalized floating point values to 16-bit unsigned integers, and then combines them - into one `u32` value.
- Component |e|[|i|] of the input is converted to a 16-bit unsigned integer value - ⌊ 0.5 + 65535 × min(1, max(0, |e|[|i|])) ⌋ which is then placed in bits - 16 × |i| through - 16 × |i| + 15 of the result. - -
`pack2x16float`(|e|: vec2<f32>) -> u32 - Converts two floating point values to half-precision floating point numbers, and then combines - them into one `u32` value.
- Component |e|[|i|] of the input is converted to an IEEE 754 binary16 value, which is then - placed in bits - 16 × |i| through - 16 × |i| + 15 of the result. - See [[#floating-point-conversion]] for edge case behaviour. -
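
The packing formulas in the table above can be checked with a small reference model. The following is a minimal Python sketch, not part of the specification: the function names mirror the built-ins, but the Python list representation of the input vector is an assumption made for illustration, and only the `snorm` 4x8 and `unorm` 2x16 variants are shown.

```python
import math


def pack4x8snorm(e):
    """Model of pack4x8snorm: e is a list of four floats (stand-in for vec4<f32>)."""
    result = 0
    for i, x in enumerate(e):
        clamped = min(1.0, max(-1.0, x))
        v = math.floor(0.5 + 127.0 * clamped)  # signed integer in [-127, 127]
        # Keep the low 8 bits (two's complement) and place them in bits 8*i .. 8*i+7.
        result |= (v & 0xFF) << (8 * i)
    return result


def pack2x16unorm(e):
    """Model of pack2x16unorm: e is a list of two floats (stand-in for vec2<f32>)."""
    result = 0
    for i, x in enumerate(e):
        clamped = min(1.0, max(0.0, x))
        v = math.floor(0.5 + 65535.0 * clamped)  # unsigned integer in [0, 65535]
        # Place the value in bits 16*i .. 16*i+15.
        result |= v << (16 * i)
    return result
```

In this model, component 0 always lands in the least significant bits of the result, which is what the "bits 8 × |i| through 8 × |i| + 7" wording describes.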
- -## Data unpacking built-in functions ## {#unpack-builtin-functions} - -Data unpacking builtin functions can be used to decode values in -data formats that do not correspond directly to types in [SHORTNAME]. -This enables a program to read many densely packed values from memory, which can -reduce a shader's memory bandwidth demand. - - - - - - - - - -
Built-inDescription -
`unpack4x8snorm`(|e|: u32) -> vec4<f32> - Decomposes a 32-bit value into four 8-bit chunks, then reinterprets - each chunk as a signed normalized floating point value.
- Component |i| of the result is max(|v| ÷ 127, -1), where |v| is the interpretation of - bits 8×|i| through 8×|i|+7 of |e| as a twos-complement signed integer. - -
`unpack4x8unorm`(|e|: u32) -> vec4<f32> - Decomposes a 32-bit value into four 8-bit chunks, then reinterprets - each chunk as an unsigned normalized floating point value.
- Component |i| of the result is |v| ÷ 255, where |v| is the interpretation of - bits 8×|i| through 8×|i|+7 of |e| as an unsigned integer. - -
`unpack2x16snorm`(|e|: u32) -> vec2<f32> - Decomposes a 32-bit value into two 16-bit chunks, then reinterprets - each chunk as a signed normalized floating point value.
- Component |i| of the result is max(|v| ÷ 32767, -1), where |v| is the interpretation of - bits 16×|i| through 16×|i|+15 of |e| as a twos-complement signed integer. - -
`unpack2x16unorm`(|e|: u32) -> vec2<f32> - Decomposes a 32-bit value into two 16-bit chunks, then reinterprets - each chunk as an unsigned normalized floating point value.
- Component |i| of the result is |v| ÷ 65535, where |v| is the interpretation of - bits 16×|i| through 16×|i|+15 of |e| as an unsigned integer. - -
`unpack2x16float`(|e|: u32) -> vec2<f32> - Decomposes a 32-bit value into two 16-bit chunks, and reinterprets each chunk - as a floating point value.
- Component |i| of the result is the f32 representation of |v|, - where |v| is the interpretation of bits 16×|i| through 16×|i|+15 of |e| - as an IEEE 754 binary16 value. - See [[#floating-point-conversion]] for edge case behaviour. -
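
The unpacking formulas in the table above can be modelled the same way. This is a hedged Python sketch, not normative text: the function names mirror the built-ins, and returning a Python list in place of a `vec4<f32>` or `vec2<f32>` is an assumption made for illustration.

```python
def unpack4x8snorm(e):
    """Model of unpack4x8snorm: e is an unsigned 32-bit integer; returns four floats."""
    out = []
    for i in range(4):
        byte = (e >> (8 * i)) & 0xFF                # bits 8*i .. 8*i+7
        v = byte - 256 if byte >= 128 else byte     # reinterpret as two's complement
        out.append(max(v / 127.0, -1.0))            # -128 clamps to -1.0
    return out


def unpack2x16unorm(e):
    """Model of unpack2x16unorm: e is an unsigned 32-bit integer; returns two floats."""
    return [((e >> (16 * i)) & 0xFFFF) / 65535.0 for i in range(2)]
```

The `max(..., -1)` clamp matters only for the bit pattern `0x80` (or `0x8000` in the 16-bit variants), which has no counterpart produced by the packing formula.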
- -## Synchronization built-in functions ## {#sync-builtin-functions} - -[SHORTNAME] provides the following synchronization functions: - -```rust -fn storageBarrier() -fn workgroupBarrier() -``` - -All synchronization functions execute a control barrier with Acquire/Release -memory ordering. That is, all synchronization functions, and affected memory -and atomic operations are ordered in [[#program-order]] relative to the -synchronization function. Additionally, the affected memory and atomic -operations program-ordered before the synchronization function must be visible -to all other threads in the workgroup before any affected memory or atomic -operation program-ordered after the synchronization function is executed by a -member of the workgroup. - -storageBarrier affects memory and atomic operations in the [=storage -classes/storage=] storage class. - -workgroupBarrier affects memory and atomic operations in the [=storage -classes/workgroup=] storage class. - -TODO: Add links to the eventual memory model. - -
- - storageBarrier(); - // Maps to: - // Execution Scope is Workgroup = %uint_2 - // Memory Scope is Device = %uint_1 - // Memory Semantics are AcquireRelease | UniformMemory (0x8 | 0x40) = %uint_72 - // OpControlBarrier %uint_2 %uint_1 %uint_72 - - workgroupBarrier(); - // Maps to: - // Execution and Memory Scope are Workgroup = %uint_2 - // Memory semantics are AcquireRelease | WorkgroupMemory (0x8 | 0x100) = %uint_264 - // OpControlBarrier %uint_2 %uint_2 %uint_264 - - workgroupBarrier(); - storageBarrier(); - // Or, equivalently: - storageBarrier(); - workgroupBarrier(); - // Could be mapped to a single OpControlBarrier: - // Execution scope is Workgroup = %uint_2 - // Memory Scope is Device = %uint_1 - // Memory semantics are AcquireRelease | UniformMemory | WorkgroupMemory - // (0x8 | 0x40 | 0x100) = %uint_328 - // OpControlBarrier %uint_2 %uint_1 %uint_328 - -
- -## Value-steering functions ## {#value-steering-functions} - - - - - -
Built-inDescription -
- `ignore`(|e|: |T|) - Evaluates |e|, and then ignores the result.
- Type |T| is any type that can be returned from a function.
-
- -# Glossary # {#glossary} - -TODO: Remove terms unused in the rest of the specification. - - - - -
TermDefinition -
Dominates - Basic block `A` *dominates* basic block `B` if: - * `A` and `B` are both in the same function `F` - * Every control flow path in `F` that goes to `B` must also go through `A` -
Strictly dominates - `A` *strictly dominates* `B` if `A` dominates `B` and `A != B` -
DomBy(A) - The basic blocks dominated by `A` -
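
The dominance terms above can be illustrated with a small fixed-point computation over a control flow graph. This is a sketch under stated assumptions, not text from the specification: the graph encoding (a dict from block name to successor list), the helper names `dominators`, `strictly_dominates`, and `dom_by`, and the requirement that every block is a key of the dict and reachable from the entry block are all hypothetical choices made for the example.

```python
def dominators(cfg, entry):
    """Return {block: set of blocks that dominate it} for cfg = {block: [successors]}."""
    blocks = set(cfg)
    preds = {b: set() for b in blocks}
    for b, succs in cfg.items():
        for s in succs:
            preds[s].add(b)
    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in blocks - {entry}:
            incoming = [dom[p] for p in preds[b]]
            new = {b} | (set.intersection(*incoming) if incoming else set())
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom


def strictly_dominates(dom, a, b):
    """A strictly dominates B if A dominates B and A != B."""
    return a in dom[b] and a != b


def dom_by(dom, a):
    """DomBy(A): the basic blocks dominated by A."""
    return {b for b in dom if a in dom[b]}


# A diamond-shaped function: entry branches to then/else, which both reach merge.
cfg = {"entry": ["then", "else"], "then": ["merge"], "else": ["merge"], "merge": []}
dom = dominators(cfg, "entry")
```

In the diamond example, `entry` dominates every block, while `then` dominates only itself, because the path through `else` reaches `merge` without passing through `then`.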
- -# MATERIAL TO BE MOVED TO A NEW HOME OR DELETED # {#junkyard} - - -[SHORTNAME] has operations for: - -* extracting one of the components of a composite value -* creating a new composite value from an old one by replacing one of its components -* creating a new composite value from components - -## Type Promotions ## {#type-promotions} -There are no implicit type promotions in [SHORTNAME]. If you want to convert between -types you must use the cast syntax to do it. - -
- - var e : f32 = 3; // error: literal is the wrong type - - var f : f32 = 1.0; - - var t : i32 = i32(f); - -
- -The non-promotion extends to vector classes as well. There are no overrides to -shorten vector declarations based on the type or number of elements provided. -If you want `vec4` you must provide 4 float values in the constructor. - -## Precedence ## {#precedence} - -Issue: (dsinclair) Write out precedence rules. Matches c and glsl rules .... diff --git a/wgsl/keywords b/wgsl/keywords new file mode 100755 index 0000000000..feba403158 --- /dev/null +++ b/wgsl/keywords @@ -0,0 +1,498 @@ +#!/usr/bin/env perl +use strict; + +# A script to print the contents of a grammar rule for a list +# of reserved words. + +# C++ keywords +# Extracted from working draft at https://eel.is/c++draft/ +# https://eel.is/c++draft/gram.lex#nt:keyword +# "Any identifier listed in Table 5" plus import, module, export +# https://eel.is/c++draft/tab:lex.key +# + +my @cpp = qw( +export +import +module + +alignas +alignof +asm +auto +bool +break +case +catch +char +char16_t +char32_t +char8_t +class +co_await +co_return +co_yield +concept +const +const_cast +consteval +constexpr +constinit +continue +decltype +default +delete +do +double +dynamic_cast +else +enum +explicit +export +extern +false +float +for +friend +goto +if +inline +int +long +mutable +namespace +new +noexcept +nullptr +operator +private +protected +public +register +reinterpret_cast +requires +return +short +signed +sizeof +static +static_assert +static_cast +struct +switch +template +this +thread_local +throw +true +try +typedef +typeid +typename +union +unsigned +using +virtual +void +volatile +wchar_t +while +); + +# Rust +# https://doc.rust-lang.org/reference/keywords.html#reserved-keywords +# We include strict, reserved, and weak +# Excludes 'strict because it starts with a single quote. 
+ +my @rust = qw( + as + break + const + continue + crate + else + enum + extern + false + fn + for + if + impl + in + let + loop + match + mod + move + mut + pub + ref + return + self + Self + static + struct + super + trait + true + type + unsafe + use + where + while + + abstract + become + box + do + final + macro + override + priv + typeof + unsized + virtual + yield + + macro_rules + union +); + + +my @smalltalk = qw( + nil + self + super + true + false ); + + + +# ECMAScript +# https://262.ecma-international.org/5.1/#sec-7.6.1.1 +# Keywords, Reserved, FutureReserved + +my @ecmascript_5_1 = qw( + break + case + catch + continue + debugger + default + delete + do + else + finally + for + function + if + in + instanceof + new + return + switch + this + throw + try + typeof + var + void + while + with + + +class +const +enum +export +extends +import +super + + +implements +interface +let +package +private +protected +public +static +yield + +); + +# https://tc39.es/ecma262/ retrieved 2022-02-24 +my @ecmascript_2022 = qw( +break case catch class const continue debugger default delete +do else enum export extends false finally for function if import +in instanceof new null return super switch this throw true try +typeof var void while with + +await yield +let static implements interface package private protected public +as async from get meta of set target +); + +# GLSL 4.6 +my @glsl = qw( +const uniform buffer shared attribute varying +coherent volatile restrict readonly writeonly +atomic_uint +layout +centroid flat smooth noperspective +patch sample +invariant precise +break continue do for while switch case default +if else +subroutine +in out inout +int void bool true false float double +discard return +vec2 vec3 vec4 ivec2 ivec3 ivec4 bvec2 bvec3 bvec4 +uint uvec2 uvec3 uvec4 +dvec2 dvec3 dvec4 +mat2 mat3 mat4 +mat2x2 mat2x3 mat2x4 +mat3x2 mat3x3 mat3x4 +mat4x2 mat4x3 mat4x4 +dmat2 dmat3 dmat4 +dmat2x2 dmat2x3 dmat2x4 +dmat3x2 dmat3x3 dmat3x4 +dmat4x2 dmat4x3 
dmat4x4 +lowp mediump highp precision +sampler1D sampler1DShadow sampler1DArray sampler1DArrayShadow +isampler1D isampler1DArray usampler1D usampler1DArray +sampler2D sampler2DShadow sampler2DArray sampler2DArrayShadow +isampler2D isampler2DArray usampler2D usampler2DArray +sampler2DRect sampler2DRectShadow isampler2DRect usampler2DRect +sampler2DMS isampler2DMS usampler2DMS +sampler2DMSArray isampler2DMSArray usampler2DMSArray +sampler3D isampler3D usampler3D +samplerCube samplerCubeShadow isamplerCube usamplerCube +samplerCubeArray samplerCubeArrayShadow +isamplerCubeArray usamplerCubeArray +samplerBuffer isamplerBuffer usamplerBuffer +image1D iimage1D uimage1D +image1DArray iimage1DArray uimage1DArray +image2D iimage2D uimage2D +image2DArray iimage2DArray uimage2DArray +image2DRect iimage2DRect uimage2DRect +image2DMS iimage2DMS uimage2DMS +image2DMSArray iimage2DMSArray uimage2DMSArray +image3D iimage3D uimage3D +imageCube iimageCube uimageCube +imageCubeArray iimageCubeArray uimageCubeArray +imageBuffer iimageBuffer uimageBuffer +struct +texture1D texture1DArray +itexture1D itexture1DArray utexture1D utexture1DArray +texture2D texture2DArray +itexture2D itexture2DArray utexture2D utexture2DArray +texture2DRect itexture2DRect utexture2DRect +texture2DMS itexture2DMS utexture2DMS +texture2DMSArray itexture2DMSArray utexture2DMSArray +texture3D itexture3D utexture3D +textureCube itextureCube utextureCube +textureCubeArray itextureCubeArray utextureCubeArray +textureBuffer itextureBuffer utextureBuffer +sampler samplerShadow +subpassInput isubpassInput usubpassInput +subpassInputMS isubpassInputMS usubpassInputMS +); + +# GLSL 4.6 +my @glsl_reserved = qw( +common partition active +asm +class union enum typedef template this +resource +goto +inline noinline public static extern external interface +long short half fixed unsigned superp +input output +hvec2 hvec3 hvec4 fvec2 fvec3 fvec4 +filter +sizeof cast +namespace using +sampler3DRect +); + +# HLSL +# 
https://docs.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-appendix-keywords +# retrieved 2022-02-24 +my @hlsl = qw( +AppendStructuredBuffer asm asm_fragment +BlendState bool break Buffer ByteAddressBuffer +case cbuffer centroid class column_major compile compile_fragment CompileShader const continue ComputeShader ConsumeStructuredBuffer +default DepthStencilState DepthStencilView discard do double DomainShader dword +else export extern +false float for fxgroup +GeometryShader groupshared +half Hullshader +if in inline inout InputPatch int interface +line lineadj linear LineStream +matrix min16float min10float min16int min12int min16uint +namespace nointerpolation noperspective NULL +out OutputPatch +packoffset pass pixelfragment PixelShader point PointStream precise +RasterizerState RenderTargetView return register row_major RWBuffer RWByteAddressBuffer RWStructuredBuffer RWTexture1D RWTexture1DArray RWTexture2D RWTexture2DArray RWTexture3D +sample sampler SamplerState SamplerComparisonState shared snorm stateblock stateblock_state static string struct switch StructuredBuffer +tbuffer technique technique10 technique11 texture Texture1D Texture1DArray Texture2D Texture2DArray Texture2DMS Texture2DMSArray Texture3D TextureCube TextureCubeArray true typedef triangle triangleadj TriangleStream +uint uniform unorm unsigned +vector vertexfragment VertexShader void volatile +while +texture +samper +); + +foreach my $type (qw(float int uint bool + min10float min16float + min12int min16int + min16uint)) { + push @hlsl, $type; + foreach my $i (qw(1 2 3 4)) { + push @hlsl, "$type$i"; + } + foreach my $row (qw(1 2 3 4)) { + foreach my $col (qw(1 2 3 4)) { + push @hlsl, "$type${row}x$col"; + } + } +} + +# Already in use by WGSL +my @wgsl = qw( +array +atomic +bitcast +bool +break +case +continue +continuing +default +discard +else +enable +f32 +fallthrough +false +fn +for +function +i32 +if +let +loop +mat2x2 +mat2x3 +mat2x4 +mat3x2 +mat3x3 +mat3x4 +mat4x2 
+mat4x3 +mat4x4 +override +private +ptr +return +sampler +sampler_comparison +storage +struct +switch +texture_1d +texture_2d +texture_2d_array +texture_3d +texture_cube +texture_cube_array +texture_depth_2d +texture_depth_2d_array +texture_depth_cube +texture_depth_cube_array +texture_depth_multisampled_2d +texture_multisampled_2d +texture_storage_1d +texture_storage_2d +texture_storage_2d_array +texture_storage_3d +true +type +u32 +uniform +var +vec2 +vec3 +vec4 +while +workgroup +); + + +# Deliberately reserved by WGSL +my @wgsl_reserved = qw( +asm +bf16 +const +demote +demote_to_helper +do +enum +f16 +f64 +handle +i16 +i64 +i8 +mat +null +premerge +regardless +std +typedef +u16 +u64 +u8 +unless +using +vec +void +wgsl +); + + +# Key is a keyword. +# Value is a list of languages that reserve the word in some way. +my %words = (); + +# Adds a list of keywords, from a given language. +# $lang should be empty when the given language is WGSL. +sub add_from($@) { + my ($lang, @keywords) = @_; + foreach my $word (@keywords) { + $words{$word} = [] unless defined $words{$word}; + push(@{$words{$word}}, $lang); + } +} + +add_from('C++', @cpp); +add_from('ECMAScript2022', @ecmascript_2022); +add_from('Rust', @rust); +add_from('Smalltalk', @smalltalk); +add_from('HLSL', @hlsl); +add_from('GLSL', @glsl); +add_from('GLSL(reserved)', @glsl_reserved); +add_from('WGSL', @wgsl_reserved); + +# Remove keywords already used in WGSL. +foreach my $word (@wgsl) { + delete $words{$word}; +} + +# Print the contents of the _reserved grammar rule. +foreach my $word (sort {$a cmp $b} keys %words) { + print " | `'$word'` \n\n"; +} diff --git a/wgsl/wgsl_spec_style_guide.md b/wgsl/wgsl_spec_style_guide.md index a373efca25..1b85ff7c86 100644 --- a/wgsl/wgsl_spec_style_guide.md +++ b/wgsl/wgsl_spec_style_guide.md @@ -1,5 +1,7 @@ # WSGL spec writing style guide +## Style + Goal: Avoid possibly being misunderstood. * Tradeoff: The text might read as stilted. 
**Precision is better than flair.** @@ -80,3 +82,98 @@ Use the [serial comma](https://en.wikipedia.org/wiki/Serial_comma), also known a In Markdown, no two sentences (or parts of sentences) should be on the same text line. This makes it easier to edit and review changes. + +The WGSL grammar's syntactic rules are presented as a set of cross-referenced Bikeshed +definitions. There is one Bikeshed definition for each grammar token or non-terminal. +These definitions are contained in `` div `` elements in the `` syntax `` class. + +Authoring syntactic rules: +* Each syntactic rule should start with a line which only contains ``
`` +and end with a line which only contains ``
``. There must be only one +syntactic rule between these lines. +* Each syntactic rule must define itself for Bikeshed. Each syntactic rule definition must start with two spaces +and then place the rule name between `` `` and `` : `` on the same line. +* Each syntactic rule item must start with four spaces and then list members after `` | `` followed by a space. + * Syntactic rule items can be split across multiple lines. For this, start the next line with six spaces. +* Each syntactic rule item must be surrounded by only a space before and after, +trailing space at the end of the line being redundant. +* Members of syntactic rule items can be references to existing rules. These must be placed between +`` [=syntax/ `` and `` =] ``. +* Members of syntactic rule items can contain groups, which should contain the group members between `` ( `` and `` ) ``. +* Members of syntactic rule items which denote a string should start with `` `' `` +and end with `` '` `` and not contain any space character or line break between these two. +* Members of syntactic rule items which denote a regular expression should start with `` `/ `` +and end with `` /` `` and not contain any space character or line break between these two. +* If a member is optional, then it must be followed by a `` ? `` member token. +* If a member can repeat and must appear at least once, then it must be followed by a `` + `` member token. +* If a member can repeat and does not have to appear, then it must be followed by a `` * `` member token. + +## Tagging conventions + +Several tools process the specification source, extracting things for further processing. +Those tools rely on attributes on certain elements, as described here. + +### Algorithms + +In [Bikeshed][] source, an [algorithm](https://tabatkins.github.io/bikeshed/#algorithms) +attribute on an element does two things: + +1. It specifies a unique human-readable name for the thing being defined by the element. +1. It scopes variables to that element. 
In a browser, clicking on one use of a variable + will highlight all the uses of that variable in the same scope. + +For example, the definition of a matrix type has two parameters: _N_ and _M_. +The uses of `|N|` and `|M|` are scoped to the `tr` element having the `algorithm` attribute: + + + mat|N|x|M|<f32> + Matrix of |N| columns and |M| rows, where |N| and |M| are both in {2, 3, 4}. + Equivalently, it can be viewed as |N| column vectors of type vec|M|<f32>. + +The following kinds of document elements should have an `algorithm` attribute: + +* Types: Tag the `tr` element in the table describing the type. +* Each row (`tr` element) in a [type rule table](https://w3.org/TR/WGSL#typing-tables-section): + * This applies to the tables describing expressions and built-in functions. +* Parameterized definitions, equations, or rules that have variables: + * These use `p`, `blockquote`, or `div` elements. + +### Code samples + +Code samples should have a `class` attribute starting with `example`. + +For WGSL code samples, specify a `class` attribute whose value is three space-separated words: +* `example` indicating this is a code example +* `wgsl` indicating the code is in WGSL +* a word indicating what kind of code snippet it is, or where it should appear, one of: + * `expect-error`: The code snippet is invalid. + * `global-scope`: The code snippet is assumed to appear at module-scope, i.e. outside all other declarations. + * `type-scope`: The code snippet shows the WGSL spelling of a type, independent of other context. + * `function-scope`: The code snippet is assumed to appear inside a function body, but the function declaration + and surrounding braces are not shown. + +For example: + +
+ + @stage(fragment) + fn main() -> @location(0) vec4<f32> { + return vec4<f32>(0.4,0.4,0.8,1.0); + } + +
+ + +Code samples in languages other than WGSL should name the language, for example: + +
+ + +## In-progress extensions are developed outside the main spec text + +Once a language extension is fully developed, it should be described in the WGSL specification. + +Before then, an extension is considered to be "in progress", and should not be discussed in the main WGSL specification. +Extensions which require a lot of iteration can be freely developed in a branch or some other document. + +[Bikeshed]: https://tabatkins.github.io/bikeshed "Bikeshed"
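The `class` convention for WGSL code samples described under "Code samples" lends itself to a mechanical check. The following is a minimal Python sketch, not part of the repository's tooling: the function name `is_valid_wgsl_example_class` and the constant `ALLOWED_KINDS` are hypothetical, and the sketch assumes the three words appear in the order the guide lists them.

```python
# Hypothetical checker for the `class` attribute of a WGSL code sample.
# It is an illustration of the convention in the style guide, not an
# existing tool. Assumption: the words must appear in the listed order.

# The four snippet kinds named in the style guide.
ALLOWED_KINDS = {"expect-error", "global-scope", "type-scope", "function-scope"}


def is_valid_wgsl_example_class(class_value: str) -> bool:
    """Return True if class_value is `example wgsl <kind>`."""
    # str.split() with no arguments splits on any run of whitespace.
    words = class_value.split()
    return (
        len(words) == 3
        and words[0] == "example"
        and words[1] == "wgsl"
        and words[2] in ALLOWED_KINDS
    )
```

A tool that extracts code samples could call this on each `class` value and report the samples for which it returns `False`.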