i32x4.dot_i16x8_s and i32x4.dot_i16x8_add_s instructions

Maratyszcza · Maratyszcza · commit bb31a5a4b6ac · 2020-02-07T13:27:57.000-08:00
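
The semantics that this commit adds to `SIMD.md` can be sketched in plain Python as a rough reference model. This is an illustrative sketch, not text from the proposal: the function names `i32x4_dot_i16x8_s`, `i32x4_dot_i16x8_add_s`, and the helper `to_signed` are hypothetical, vectors are modelled as Python lists of lane values, and wrapping the pairwise sum to 32 bits is an assumption made here so that the one overflowing case (`-32768 * -32768` in both lanes of a pair) has a defined result.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def i32x4_dot_i16x8_s(a, b):
    """Model of i32x4.dot_i16x8_s: a and b each hold eight 16-bit lanes."""
    assert len(a) == 8 and len(b) == 8
    result = []
    for i in range(4):
        # Full-width products of the two adjacent signed 16-bit lane pairs.
        lo = to_signed(a[2 * i], 16) * to_signed(b[2 * i], 16)
        hi = to_signed(a[2 * i + 1], 16) * to_signed(b[2 * i + 1], 16)
        # Assumption: the pairwise sum wraps to a signed 32-bit lane.
        result.append(to_signed(lo + hi, 32))
    return result


def i32x4_dot_i16x8_add_s(a, b, c):
    """Model of i32x4.dot_i16x8_add_s as i32x4.add(i32x4.dot_i16x8_s(a, b), c)."""
    assert len(c) == 4
    dot = i32x4_dot_i16x8_s(a, b)
    # The final add is modelled as a wrapping 32-bit lane-wise add.
    return [to_signed(d + x, 32) for d, x in zip(dot, c)]
```

For instance, with `a = [1, 2, 3, 4, 5, 6, 7, 8]` and `b = [1] * 8`, the sketch yields `[3, 7, 11, 15]`; accumulating into `c = [10, 20, 30, 40]` then yields `[13, 27, 41, 55]`.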
diff --git a/proposals/simd/BinarySIMD.md b/proposals/simd/BinarySIMD.md
@@ -199,3 +199,5 @@ The `v8x16.shuffle` instruction has 16 bytes after `simdop`.
 | `v128.andnot`              |    `0xd8`| -                  |
 | `i8x16.avgr_u`             |    `0xd9`|                    |
 | `i16x8.avgr_u`             |    `0xda`|                    |
+| `i32x4.dot_i16x8_s`        |    `0xdb`| -                  |
+| `i32x4.dot_i16x8_add_s`    |    `0xdc`| -                  |
diff --git a/proposals/simd/ImplementationStatus.md b/proposals/simd/ImplementationStatus.md
@@ -123,6 +123,8 @@
 | `i32x4.min_u`              |               `-msimd128` |    :heavy_check_mark: | :heavy_check_mark: |                    |
 | `i32x4.max_s`              |               `-msimd128` |    :heavy_check_mark: | :heavy_check_mark: |                    |
 | `i32x4.max_u`              |               `-msimd128` |    :heavy_check_mark: | :heavy_check_mark: |                    |
+| `i32x4.dot_i16x8_s`        |                           |                       |                    |                    |
+| `i32x4.dot_i16x8_add_s`    |                           |                       |                    |                    |
 | `i64x2.neg`                |               `-msimd128` |    :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
 | `i64x2.shl`                |               `-msimd128` |    :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
 | `i64x2.shr_s`              |               `-msimd128` |    :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
diff --git a/proposals/simd/SIMD.md b/proposals/simd/SIMD.md
@@ -380,6 +380,17 @@ def S.mul(a, b):
     return S.lanewise_binary(mul, a, b)
 ```
 
+### Integer dot product
+* `i32x4.dot_i16x8_s(a: v128, b: v128) -> v128`
+
+Multiply corresponding signed 16-bit lanes of `a` and `b` to full 32-bit products, then add each adjacent pair of products to produce the four 32-bit lanes of the result.
+
+### Integer dot product with accumulation
+
+* `i32x4.dot_i16x8_add_s(a: v128, b: v128, c: v128) -> v128`
+
+Lane-wise multiply signed 16-bit integers in the two input vectors, add adjacent pairs of the full 32-bit results, and add each sum to the corresponding 32-bit lane of `c`. This operation is equivalent to `i32x4.add(i32x4.dot_i16x8_s(a, b), c)`.
+
 ### Integer negation
 * `i8x16.neg(a: v128) -> v128`
 * `i16x8.neg(a: v128) -> v128`