From 14737d3e53753d44d0fda7c04cd2bd5271056507 Mon Sep 17 00:00:00 2001
From: Jim Blandy
Date: Sun, 17 Apr 2022 14:56:37 -0700
Subject: [PATCH] [hlsl-out] Flesh out matrix handling documentation.

---
 src/back/hlsl/mod.rs | 99 +++++++++++++++++++++++++++++++++++++-------
 1 file changed, 85 insertions(+), 14 deletions(-)

diff --git a/src/back/hlsl/mod.rs b/src/back/hlsl/mod.rs
index f51d49db39..7ecfc0fc92 100644
--- a/src/back/hlsl/mod.rs
+++ b/src/back/hlsl/mod.rs
@@ -6,29 +6,100 @@ Backend for [HLSL][hlsl] (High-Level Shading Language).
 - 5.1
 - 6.0
 
-# General Matrix Note
+# Layout of values in `uniform` buffers
 
-All matrix construction/deconstruction is row based in HLSL. This means that when
-we construct a matrix from column vectors, our matrix will be implicitly transposed.
-The inverse transposition happens when we call `[0]` to get the zeroth column vector.
+WGSL's ["Internal Layout of Values"][ilov] rules specify how each WGSL
+type should be stored in `uniform` and `storage` buffers. The HLSL we
+generate must access values in that form, even when it is not what
+HLSL would use normally.
 
-Because all of our matrices are implicitly transposed, we flip arguments to `mul`. `mat * vec`
-becomes `vec * mat`, etc. This acts as the inverse transpose making the results identical.
+The rules described here only apply to WGSL `uniform` variables. WGSL
+`storage` buffers are translated as HLSL `ByteAddressBuffers`, for
+which we generate `Load` and `Store` method calls with explicit byte
+offsets. WGSL pipeline inputs must be scalars or vectors; they cannot
+be matrices, which is where the interesting problems arise.
 
-The only time we don't get this implicit transposition is when reading matrices from Uniforms/Push Constants.
-To deal with this, we add `row_major` to all declarations of matrices in Uniforms/Push Constants.
+## Row- and column-major ordering for matrices
 
-Finally because all of our matrices are transposed, if you use `mat3x4`, it'll become `float3x4` in HLSL
-(HLSL has inverted col/row notation).
+WGSL specifies that matrices in uniform buffers are stored in
+column-major order. This matches HLSL's default, so one might expect
+things to be straightforward. Unfortunately, WGSL and HLSL disagree on
+what indexing a matrix means: in WGSL, `m[i]` retrieves the `i`'th
+*column* of `m`, whereas in HLSL it retrieves the `i`'th *row*. We
+want to avoid translating `m[i]` into some complicated reassembly of a
+vector from individually fetched components, so this is a problem.
 
-# Matrix struct member of the form `matCx2` Note
+However, with a bit of trickery, it is possible to use HLSL's `m[i]`
+as the translation of WGSL's `m[i]`:
 
-Struct member matrices of the form `matCx2` are translated to a sequence of C `vec2`s due to
-differences in alignment between WGSL and HLSL for uniform buffers.
+- We declare all matrices in uniform buffers in HLSL with the
+  `row_major` qualifier, and transpose the row and column counts: a
+  WGSL `mat3x4`, say, becomes an HLSL `row_major float3x4`. (Note
+  that WGSL and HLSL type names put the row and column in reverse
+  order.) Since the HLSL type is the transpose of how WebGPU directs
+  the user to store the data, HLSL will load all matrices transposed.
 
-Accesses to these matrices are handled by injected functions.
+- Since matrices are transposed, an HLSL indexing expression retrieves
+  the "columns" of the intended WGSL value, as desired.
+
+- For vector-matrix multiplication, since `mul(transpose(m), v)` is
+  equivalent to `mul(v, m)` (note the reversal of the arguments), and
+  `mul(v, transpose(m))` is equivalent to `mul(m, v)`, we can
+  translate WGSL `m * v` and `v * m` to HLSL by simply reversing the
+  arguments to `mul`.
+
+## Padding in two-row matrices
+
+An HLSL `row_major floatKx2` matrix has padding between its rows that
+the WGSL `matKx2` matrix it represents does not. HLSL stores all
+matrix rows [aligned on 16-byte boundaries][16bb], whereas WGSL says
+that the columns of a `matKx2` need only be [aligned as required
+for `vec2`][ilov], which is [eight-byte alignment][8bb].
+
+To compensate for this, any time a `matKx2` appears in a WGSL
+`uniform` variable, whether directly as the variable's type or as a
+struct member, we actually emit `K` separate `float2` members, and
+assemble/disassemble the matrix from its columns (in WGSL; rows in
+HLSL) upon load and store.
+
+For example, the following WGSL struct type:
+
+```ignore
+struct Baz {
+    m: mat3x2<f32>,
+}
+```
+
+is rendered as the HLSL struct type:
+
+```ignore
+struct Baz {
+    float2 m_0; float2 m_1; float2 m_2;
+};
+```
+
+The `wrapped_struct_matrix` functions in `help.rs` generate HLSL
+helper functions to access such members, converting between the stored
+form and the HLSL matrix types appropriately. For example, for reading
+the member `m` of the `Baz` struct above, we emit:
+
+```ignore
+float3x2 GetMatmOnBaz(Baz obj) {
+    return float3x2(obj.m_0, obj.m_1, obj.m_2);
+}
+```
+
+We also emit an analogous `Set` function, as well as functions for
+accessing individual columns by dynamic index.
+
+At present, we do not generate correct HLSL when `matCx2` is used
+directly as the type of a WGSL `uniform` global ([#1837]).
 
 [hlsl]: https://docs.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl
+[ilov]: https://gpuweb.github.io/gpuweb/wgsl/#internal-value-layout
+[16bb]: https://github.com/microsoft/DirectXShaderCompiler/wiki/Buffer-Packing#constant-buffer-packing
+[8bb]: https://gpuweb.github.io/gpuweb/wgsl/#alignment-and-size
+[#1837]: https://github.com/gfx-rs/naga/issues/1837
 */
 
 mod conv;
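
Reviewer note, not part of the patch: the argument-reversal rule for `mul` and the stride mismatch for two-row matrices described in the new documentation can be checked with plain arithmetic. The following is a minimal Rust sketch under stated assumptions. Every name in it (`Rows`, `transpose`, `mul_mat_vec`, `mul_vec_mat`, `wgsl_column_offset`, `hlsl_row_offset`) is hypothetical and does not exist in naga; the two `mul_*` functions only model the column-vector and row-vector forms of HLSL's `mul`, and the offset functions simply encode the 8-byte and 16-byte strides quoted in the documentation.

```rust
// Hypothetical stand-alone check; none of these names exist in naga.

/// A matrix held as a list of rows, i.e. what HLSL's `m[i]` returns.
type Rows = Vec<Vec<f32>>;

/// Swap rows and columns.
fn transpose(m: &Rows) -> Rows {
    let cols = m[0].len();
    (0..cols)
        .map(|c| m.iter().map(|row| row[c]).collect())
        .collect()
}

/// Models HLSL `mul(m, v)`: `v` is treated as a column vector.
fn mul_mat_vec(m: &Rows, v: &[f32]) -> Vec<f32> {
    m.iter()
        .map(|row| row.iter().zip(v).map(|(a, b)| a * b).sum::<f32>())
        .collect()
}

/// Models HLSL `mul(v, m)`: `v` is treated as a row vector.
fn mul_vec_mat(v: &[f32], m: &Rows) -> Vec<f32> {
    let cols = m[0].len();
    (0..cols)
        .map(|c| m.iter().zip(v).map(|(row, a)| a * row[c]).sum::<f32>())
        .collect()
}

/// Byte offset of column `i` of a WGSL `matKx2<f32>` (8-byte column stride).
fn wgsl_column_offset(i: usize) -> usize {
    i * 8
}

/// Byte offset of row `i` of an HLSL `row_major floatKx2` in a constant
/// buffer (16-byte row stride).
fn hlsl_row_offset(i: usize) -> usize {
    i * 16
}

fn main() {
    // The intended WGSL value: a mat3x2<f32> (3 columns, 2 rows),
    // written here as its 2 mathematical rows.
    let wgsl_m: Rows = vec![vec![1.0, 3.0, 5.0], vec![2.0, 4.0, 6.0]];
    // What the generated HLSL sees after the transposed load: 3 rows of 2.
    let hlsl_m = transpose(&wgsl_m);

    // HLSL row 1 is WGSL column 1, so `m[i]` can be translated directly.
    assert_eq!(hlsl_m[1], vec![3.0, 4.0]);

    // WGSL `m * v` matches HLSL `mul(v, m)` on the transposed matrix.
    let v3 = [1.0, 10.0, 100.0];
    assert_eq!(mul_mat_vec(&wgsl_m, &v3), mul_vec_mat(&v3, &hlsl_m));

    // WGSL `v * m` matches HLSL `mul(m, v)` on the transposed matrix.
    let v2 = [1.0, 10.0];
    assert_eq!(mul_vec_mat(&v2, &wgsl_m), mul_mat_vec(&hlsl_m, &v2));

    // The strides disagree from the second column onward, which is why the
    // matrix is split into separate `float2` members.
    for i in 0..3 {
        println!(
            "column {i}: WGSL offset {}, HLSL row offset {}",
            wgsl_column_offset(i),
            hlsl_row_offset(i)
        );
    }
    assert_ne!(wgsl_column_offset(1), hlsl_row_offset(1));
}
```

The test values are small integers so that the `f32` comparisons are exact. A real check against naga's output would have to compile the generated HLSL, which this sketch does not attempt.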