Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Portable SIMD support #1912

Closed
RalfJung opened this issue Nov 14, 2021 · 15 comments
Closed

Portable SIMD support #1912

RalfJung opened this issue Nov 14, 2021 · 15 comments
Labels
A-shims Area: This affects the external function shims C-enhancement Category: a PR with an enhancement or an issue tracking an accepted enhancement E-good-first-issue A good way to start contributing, mentoring is available

Comments

@RalfJung
Copy link
Member

It'd be great if Miri would support at least a reasonable subset of Rust's portable SIMD API -- and from what I heard in the past, that is much more feasible than supporting things from core::arch. Now that rust-lang/rust#89167 has landed I'd like to add corresponding tests in Miri to ensure we cover the required intrinsics.

@workingjubilee what would be a good small-ish set of tests to start with here, that cover as much of the underlying support infrastructure as possible?

@RalfJung RalfJung added A-shims Area: This affects the external function shims C-enhancement Category: a PR with an enhancement or an issue tracking an accepted enhancement labels Nov 14, 2021
@RalfJung
Copy link
Member Author

RalfJung commented Nov 14, 2021

I guess we can start with the hello world example. :D

#![feature(portable_simd)]
use std::simd::*;

fn main() {
    let a = f32x4::splat(10.0);
    let b = f32x4::from_array([1.0, 2.0, 3.0, 4.0]);
    println!("{:?}", a + b);
}
error: unsupported operation: unimplemented intrinsic: simd_add
   --> /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/../../portable-simd/crates/core_simd/src/ops.rs:644:1
    |
644 | impl_float_ops! { f32, f64 }
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ unimplemented intrinsic: simd_add
    |
    = help: this is likely not a bug in the program; it indicates that the program performed an operation that the interpreter does not support
            
    = note: inside `core::core_simd::ops::<impl std::ops::Add for std::simd::Simd<f32, 4_usize>>::add` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/../../portable-simd/crates/core_simd/src/ops.rs:230:25
note: inside `main` at tests/run-pass/portable-simd.rs:7:22
   --> tests/run-pass/portable-simd.rs:7:22
    |
7   |     println!("{:?}", a + b);
    |                      ^^^^^

I also found this list of intrinsics so I guess once the basics work we'll want to ensure that we have a test that calls all these intrinsics.

@RalfJung RalfJung added the E-good-first-issue A good way to start contributing, mentoring is available label Nov 14, 2021
@workingjubilee
Copy link
Member

workingjubilee commented Nov 15, 2021

Yes, if you support that suite of LLVM-esque intrinsics, you will support essentially all of the higher-level constructs we derive from that, at least for the moment.

In general, we don't get very far with integers without the SIMD arithmetic and bitops.

Almost all higher-level constructs after there then gate on the availability of the masks, which require simd_lt and simd_{eq,ne} to construct, and then are used with simd_select.

@RalfJung
Copy link
Member Author

Sounds good. Supporting most of that list should not be too hard -- we can basically apply the primitive MIR operators in a loop, I think (unless there are extra restrictions for the SIMD intrinsics that we have to check for).

at least for the moment.

Now I am worried. ;)

Is there a longer-term plan to rely more directly on the arch intrinsics, or significantly grow the set of these simd_* intrinsics?

@workingjubilee
Copy link
Member

Is there a longer-term plan to rely more directly on the arch intrinsics,

No.

or significantly grow the set of these simd_* intrinsics?

Significantly? Not really. We will only add instructions to this "Rust SIMD Extension ISA" with great reluctance since we know the burden of supporting them is high. When we do, it will be aimed mostly at "we know this is fairly simple and there is a clear, direct lowering for it", e.g. we have discussed simd_min and simd_max before.

@workingjubilee
Copy link
Member

workingjubilee commented Nov 15, 2021

Also by the time we are seriously looking at adding to this list we likely will have finally advanced on our plan of adding a """library""" in Rust that can handle these instructions to the compiler (by converting them to scalarized loops on arrays, yes), so that there is always a valid lowering a backend can use.

workingjubilee added a commit to workingjubilee/rustc that referenced this issue Nov 15, 2021
…orkingjubilee

disable portable SIMD tests in Miri

Until rust-lang/miri#1912 is resolved, we'll have to skip these tests in Miri.
@RalfJung
Copy link
Member Author

RalfJung commented Nov 15, 2021

I see, sounds great! Miri being an interpreter it has slightly different needs than regular codegen backends, but that infrastructure can probably be made to handle both. :) (And I know a lot of formal verification people are also interested in having these things lowered away before their MIR/LLVM-based tools run.)

workingjubilee added a commit to workingjubilee/rustc that referenced this issue Nov 15, 2021
…orkingjubilee

disable portable SIMD tests in Miri

Until rust-lang/miri#1912 is resolved, we'll have to skip these tests in Miri.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Nov 15, 2021
…orkingjubilee

disable portable SIMD tests in Miri

Until rust-lang/miri#1912 is resolved, we'll have to skip these tests in Miri.
JohnTitor added a commit to JohnTitor/rust that referenced this issue Nov 16, 2021
…orkingjubilee

disable portable SIMD tests in Miri

Until rust-lang/miri#1912 is resolved, we'll have to skip these tests in Miri.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Nov 18, 2021
fix CTFE/Miri simd_insert/extract on array-style repr(simd) types

The changed test would previously fail since `place_index` would just return the only field of `f32x4`, i.e., the array -- rather than *indexing into* the array which is what we have to do.

The new helper methods will also be needed for rust-lang/miri#1912.

r? `@oli-obk`
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Nov 19, 2021
fix CTFE/Miri simd_insert/extract on array-style repr(simd) types

The changed test would previously fail since `place_index` would just return the only field of `f32x4`, i.e., the array -- rather than *indexing into* the array which is what we have to do.

The new helper methods will also be needed for rust-lang/miri#1912.

r? ``@oli-obk``
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Nov 19, 2021
fix CTFE/Miri simd_insert/extract on array-style repr(simd) types

The changed test would previously fail since `place_index` would just return the only field of `f32x4`, i.e., the array -- rather than *indexing into* the array which is what we have to do.

The new helper methods will also be needed for rust-lang/miri#1912.

r? ```@oli-obk```
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Nov 19, 2021
fix CTFE/Miri simd_insert/extract on array-style repr(simd) types

The changed test would previously fail since `place_index` would just return the only field of `f32x4`, i.e., the array -- rather than *indexing into* the array which is what we have to do.

The new helper methods will also be needed for rust-lang/miri#1912.

r? ````@oli-obk````
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Nov 19, 2021
fix CTFE/Miri simd_insert/extract on array-style repr(simd) types

The changed test would previously fail since `place_index` would just return the only field of `f32x4`, i.e., the array -- rather than *indexing into* the array which is what we have to do.

The new helper methods will also be needed for rust-lang/miri#1912.

r? `````@oli-obk`````
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Nov 20, 2021
fix CTFE/Miri simd_insert/extract on array-style repr(simd) types

The changed test would previously fail since `place_index` would just return the only field of `f32x4`, i.e., the array -- rather than *indexing into* the array which is what we have to do.

The new helper methods will also be needed for rust-lang/miri#1912.

r? ``````@oli-obk``````
bors added a commit that referenced this issue Nov 21, 2021
portable SIMD: basic binops

First steps towards #1912. Requires rust-lang/rust#90999.
@RalfJung
Copy link
Member Author

RalfJung commented Feb 4, 2022

Looks like we need to implement simd_eq to get division to work again... @workingjubilee what is the behavior of that intrinsic? If I interpret the SimdElement instances correctly, then this returns a vector of iN where N is the bitwidth of the input type... at least if full_masks.rs is used. What is not clear to me is if the vector elements will be 0x1 or all-bits-set-to-1 in order to represent true.

But there also is bitmask.rs? And this is switched based on all(target_arch = "x86_64", target_feature = "avx512f")... but I can't believe that the behavior and type of the intrinsic depends on a cfg flag so I notice that I am confused and stopped my investigation here. Some hints would be appreciated. :)

(Unfortunately the file at https://github.com/rust-lang/portable-simd/blob/master/crates/core_simd/src/intrinsics.rs is not very useful for people not already intimately familiar with SIMD intrinsics -- it does neither sufficiently declare the types nor the behavior of these intrinsics. Is there some documentation of this elsewhere that I am missing?)

bors added a commit that referenced this issue Feb 4, 2022
rustup; implement simd_and/or

I had to disable the integer division tests since they now require simd_eq, which seems [non-trivial to implement](#1912 (comment)).

Cc rust-lang/rust#93619
@workingjubilee
Copy link
Member

workingjubilee commented Feb 6, 2022

Yes, that is how simd_eq works for that. SIMD-land is upside down, so true is -1 (i.e. "all bits set to 1").

On machines that have SIMD architectures, there is a concept of a "mask". This is essentially a "vector of bools". On many, this is an unspecialized concept: it is merely a vector of the same size as any other vector, which has had its lanes set to -1 (i.e. 0b1111_1111_...etc.). Some instructions then perform the "select" operation, also known as "blend": "if lane is true, pick A, if lane is false, pick B, write the blended result to target register."

On some, like AVX512F, and in other, newer SIMD architectures, this type is conceptually identical but has a different size and uses special registers: it is in fact exactly 1 bit per "lane", so it's an integer used as a bitmask... except... well, it gets a bit more complicated from there onwards.

What we would like to attempt to do is implement a single Mask type that abstracts over both those hardware versions and allows conversions where necessary (since almost all architectures that have "bitmasks" actually have both kinds). This has been... one of the more thorny points.

@RalfJung
Copy link
Member Author

RalfJung commented Feb 6, 2022

Wow so the very type of these intrinsics depends not only on the architecture but on the exact feature flags used for compilation? We have these hardware details leak all the way out to the language and Abstract Machine itself? What a mess. :(

@bjorn3
Copy link
Member

bjorn3 commented Feb 6, 2022

I believe simd_eq always returns a wide mask, but there is the simd_bitmask intrinsic to convert to a bitmask. See https://github.com/rust-lang/rust/blob/a41a6925badac7508d7a72cc1fc20f43dc6ad75e/compiler/rustc_codegen_ssa/src/base.rs#L121-L125 and https://github.com/rust-lang/rust/blob/6250d5a08cf0870d3655fa98b83718bc01ff6f45/compiler/rustc_codegen_llvm/src/intrinsic.rs#L1063-L1071

@RalfJung
Copy link
Member Author

RalfJung commented Feb 6, 2022

Ah you are right, I had overlooked that the return value of simd_eq is passed to Mask::from_int_unchecked... so I guess I will hope that in Miri we always end up with the full_mask implementation and then we don't have to worry about this. (We could add some cfg(miri), too, if we want to ensure this to be the case.)

@workingjubilee
Copy link
Member

Sure, I'll make sure Miri always hits the full_mask path for now.

Also I have started the process rolling of adding many more comments to the intrinsics.rs module in question that you already found. I will find a better way to expose and document them somewhere, but feel free to pepper me with any other questions you have.

@RalfJung
Copy link
Member Author

@workingjubilee sounds great, thanks. :-)

bors added a commit that referenced this issue Mar 7, 2022
implement more SIMD intrinsics

Requires rust-lang/rust#94681

With this, the cast, i32_ops, and f32_ops test suites of portable-simd pass. :)

Cc #1912
bors added a commit that referenced this issue Mar 10, 2022
implement simd_{shuffle,gather,scatter}

This makes portable-simd doctests pass. :)

Cc #1912
bors added a commit that referenced this issue Mar 16, 2022
implement SIMD float rounding functions

Cc #1912
bors added a commit that referenced this issue Mar 16, 2022
implement SIMD float rounding functions

Cc #1912
bors added a commit that referenced this issue Mar 17, 2022
implement SIMD sqrt and fma

Cc #1912
@RalfJung
Copy link
Member Author

Once #2029 lands (which requires sorting out rust-lang/portable-simd#267), all intrinsics are implemented and the entire portable-simd test suite passes. :-)

However, @workingjubilee already indicated new intrinsics will probably be added in the future (unless I misunderstood), so it probably makes sense to keep this issue open for now.

bors added a commit that referenced this issue Mar 17, 2022
implement simd bitmask intrinsics

Cc #1912
@RalfJung
Copy link
Member Author

RalfJung commented Jun 6, 2022

As far as I know we are currently implementing all intrinsics used by portable-simd. So I'll close this; let's reopen when a new one is introduced. @workingjubilee I'd appreciate a ping when that happens. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-shims Area: This affects the external function shims C-enhancement Category: a PR with an enhancement or an issue tracking an accepted enhancement E-good-first-issue A good way to start contributing, mentoring is available
Projects
None yet
Development

No branches or pull requests

3 participants