Add SIMD operations that use f16 and f128 #125440

tgross35 · 2024-05-23T06:50:57Z

Eventually we will want to be able to make use of SIMD operations for f16 and f128, now that we have primitives to represent them. Possibilities that I know of:

- AArch64 Neon supports float16x{4,8}: https://developer.arm.com/architectures/instruction-sets/intrinsics/#f:@navigationhierarchiessimdisa=[Neon]&f:@navigationhierarchiesreturnbasetype=[float]&f:@navigationhierarchieselementbitsize=[16]&q=
- Arm SVE supports float16x{1,2}: https://developer.arm.com/architectures/instruction-sets/intrinsics/#f:@navigationhierarchiesreturnbasetype=[float]&f:@navigationhierarchieselementbitsize=[16]&f:@navigationhierarchiessimdisa=[sve2,sve]&q=
- RISC-V apparently has both f16 and f128: https://five-embeddev.com/riscv-user-isa-manual/riscv-user-2.2/v.html
- NVIDIA PTX has f16 SIMD
  - Implementation: NVPTX: Add f16 SIMD intrinsics stdarch#1626
  - Submodule: Update stdarch submodule #128866
  - Tracking issue: Tracking Issue for NVPTX arch intrinsics #111199
- x86 with +avx512fp16
  - Implementation: Implement AVX512_FP16 stdarch#1605
  - Submodule: Update the stdarch submodule #128466
  - Tracking issue: Tracking Issue for AVX512_FP16 intrinsics #127213
- Portable SIMD should eventually be able to support these operations

Probably some work/research overlap with adding assembly #125398

Tracking issue: #116909


tgross35 · 2024-05-23T06:51:48Z

@rustbot label +A-simd +T-libs +F-f16_and_f128 +E-help-wanted +C-feature-request -needs-triage

kjetilkjeka · 2024-05-24T08:28:17Z

NVIDIA PTX (--target nvptx64-nvidia-cuda) also supports arithmetic instructions for f16 and f16x2 SIMD.

Making this work is an important step for making the ptx target "feature complete" with languages traditionally used for GPGPU. Let me know if there's anything I can do to support this.
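For readers unfamiliar with the f16x2 shape, here is a rough software sketch of what a lanewise f16x2 add means: two binary16 values packed into one 32-bit word, added lane by lane. Everything in it (`F16x2`, `lanewise_add`, the conversion helpers, and the choice of lane 0 in the low 16 bits) is hypothetical and chosen for illustration. It is not the stdarch API, and on nvptx the real operation would be a hardware instruction rather than this emulation.

```rust
/// Two IEEE 754 binary16 values packed into one 32-bit word.
/// Assumption for this sketch: lane 0 lives in the low 16 bits.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct F16x2(pub u32);

/// 2^e as an f32, valid for normal f32 exponents (-126..=127).
fn pow2(e: i32) -> f32 {
    f32::from_bits(((e + 127) as u32) << 23)
}

/// Decode binary16 bits into an f32. Every f16 value fits in an f32.
pub fn f16_bits_to_f32(h: u16) -> f32 {
    let sign = if h & 0x8000 != 0 { -1.0f32 } else { 1.0f32 };
    let exp = ((h >> 10) & 0x1f) as i32;
    let man = (h & 0x03ff) as f32;
    let magnitude = match exp {
        0 => man * pow2(-24), // zero and subnormals
        0x1f if man == 0.0 => f32::INFINITY,
        0x1f => f32::NAN,
        _ => (1.0 + man / 1024.0) * pow2(exp - 15),
    };
    sign * magnitude
}

/// Encode an f32 as binary16 bits, rounding to nearest, ties to even.
pub fn f32_to_f16_bits(x: f32) -> u16 {
    let bits = x.to_bits();
    let sign = ((bits >> 16) & 0x8000) as u16;
    let exp = ((bits >> 23) & 0xff) as i32;
    let man = bits & 0x007f_ffff;
    if exp == 0xff {
        // Infinity stays infinity; any NaN becomes a quiet NaN.
        let quiet: u16 = if man != 0 { 0x0200 } else { 0 };
        return sign | 0x7c00 | quiet;
    }
    let e = exp - 127 + 15; // re-bias the exponent for binary16
    if e >= 0x1f {
        return sign | 0x7c00; // too large: overflow to infinity
    }
    if e <= 0 {
        if e < -10 {
            return sign; // too small: rounds to signed zero
        }
        // Subnormal result: shift the 24-bit significand into place.
        let m = man | 0x0080_0000;
        let shift = (14 - e) as u32;
        let half = m >> shift;
        let rem = m & ((1u32 << shift) - 1);
        let halfway = 1u32 << (shift - 1);
        let round = (rem > halfway || (rem == halfway && half & 1 == 1)) as u32;
        return sign | (half + round) as u16;
    }
    // Normal result: drop 13 mantissa bits. A rounding carry may bump
    // the exponent, which is the desired behaviour.
    let half = ((e as u32) << 10) | (man >> 13);
    let rem = man & 0x1fff;
    let round = (rem > 0x1000 || (rem == 0x1000 && half & 1 == 1)) as u32;
    sign | (half + round) as u16
}

impl F16x2 {
    /// Pack two values, rounding each to binary16.
    pub fn new(lane0: f32, lane1: f32) -> Self {
        let lo = f32_to_f16_bits(lane0) as u32;
        let hi = f32_to_f16_bits(lane1) as u32;
        F16x2(lo | (hi << 16))
    }

    /// Unpack both lanes as f32 values.
    pub fn lanes(self) -> [f32; 2] {
        [
            f16_bits_to_f32(self.0 as u16),
            f16_bits_to_f32((self.0 >> 16) as u16),
        ]
    }

    /// Add lane by lane: compute in f32, then round back to binary16.
    pub fn lanewise_add(self, rhs: Self) -> Self {
        let [a0, a1] = self.lanes();
        let [b0, b1] = rhs.lanes();
        F16x2::new(a0 + b0, a1 + b1)
    }
}
```

With this sketch, `F16x2::new(1.5, 0.5).lanewise_add(F16x2::new(2.25, 1.0))` yields lanes `[3.75, 1.5]`. The point of native support is that the whole `lanewise_add` body would collapse into one operation on a single 32-bit register.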

tgross35 · 2024-05-24T08:35:59Z

Thanks, I'll add that to the top list.

It looks like it might not be too hard to add new SIMD intrinsics on that platform? I have no clue, but https://github.com/rust-lang/stdarch/blob/df3618d9f35165f4bc548114e511c49c29e1fd9b/crates/core_arch/src/nvptx/mod.rs is pretty straightforward if you want to give it a shot at some point.

kjetilkjeka · 2024-05-24T10:36:31Z

I just tested f16 on nvptx and I don't think I had realized how many of the pieces were already put together. That's great!

I looked around a bit at SIMD instructions for other arches and I think this is, as you say, pretty straightforward. I will give it a shot. Hopefully I will get around to creating a PR next week.

tgross35 · 2024-05-24T16:51:35Z

That is great news! Note that unfortunately math symbols aren’t yet available on all targets so testing with the new types is kind of weird sometimes, but hopefully that will be resolved in a week or so with a compiler_builtins update.

kjetilkjeka · 2024-08-20T15:43:20Z

Took me a bit longer than I originally hoped for, but I ended up creating a PR for (most) nvptx f16x2 intrinsics and getting it merged: rust-lang/stdarch#1626

I have also noticed that we're lacking portable_simd variants of f16 and it's not being tracked by this issue. Is that outside the scope of this issue or just not added yet? Is anyone already coordinating with the portable_simd project, or is it simply blocked by other features that need to land first?

tgross35 · 2024-08-20T18:09:36Z

> Took me a bit longer than I originally hoped for but I ended up creating a PR for (most) nvptx f16x2 intrinsics and getting it merged. rust-lang/stdarch#1626

Awesome news, thanks for the update! Looks like there is an open PR to pull in the new changes: #128866.

> I have also noticed that we're lacking portable_simd variants of f16 and it's not being tracked by this issue. Is that outside the scope of this issue or just not added yet? Is anyone already coordinating with the portable_simd project or is it simply being blocked by other features that needs to land first?

I'll add it to the issue, no particular reason outside of being lower priority than the intrinsics.

rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label May 23, 2024

tgross35 mentioned this issue May 23, 2024

Tracking Issue for f16 and f128 float types #116909

Open

84 tasks

tgross35 mentioned this issue Jun 7, 2024

Enable f16 in assembly on aarch64 platforms that support it #126070

Closed

tgross35 mentioned this issue Jul 1, 2024

f16/half-float support rust-lang/stdarch#344

Open

rustbot added the E-help-wanted Call for participation: Help is requested to fix this issue. label Jul 12, 2024

kjetilkjeka mentioned this issue Aug 7, 2024

NVPTX: Add f16 SIMD intrinsics rust-lang/stdarch#1626

Merged
