-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Potentially unsound uses of simd_select_bitmask in stdarch #137942
Comments
This is a similar case: /// For each packed 32-bit integer maps the value to the number of logical 1 bits.
///
/// Uses the writemask in k - elements are zeroed in the result if the corresponding mask bit is not set.
/// Otherwise the computation result is written into the result.
///
/// [Intel's documentation](https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=_mm_maskz_popcnt_epi32)
#[inline]
#[target_feature(enable = "avx512vpopcntdq,avx512vl")]
#[unstable(feature = "stdarch_x86_avx512", issue = "111137")]
#[cfg_attr(test, assert_instr(vpopcntd))]
pub fn _mm_maskz_popcnt_epi32(k: __mmask8, a: __m128i) -> __m128i {
unsafe {
transmute(simd_select_bitmask(
k,
simd_ctpop(a.as_i32x4()),
i32x4::ZERO,
))
}
} I did not do an exhaustive search. |
(misread your comment) |
Right, I believe the intent is to ignore the unused bits here. |
A quick look at Intel's docs confirms this. So, the question is, can we change the implementation to use Or do we have to dig into the |
That is also my understanding of the llvm implementation. It will truncate to an integer with the number of bits corresponding to the number of lanes, then bitcast that to a vector: rust/compiler/rustc_codegen_llvm/src/intrinsic.rs Line 1284 in 81d8edc
So any higher bits will be ignored. |
I wonder if our other backends work the same; Cc @bjorn3 @GuillaumeGomez |
cg_clif currently just checks if the respective lane in the mask is equal to 0 or not: https://github.com/rust-lang/rustc_codegen_cranelift/blob/0f9c09fb3a64ff11ea81446a96907cd5e86490c2/src/intrinsics/simd.rs#L788-L790 |
Okay so that is also fine with arbitrary data in the "extra" bits. |
Yeah, extra bits are ignored. |
So it seems we could just change the |
We haven't heard about the GCC backend yet, but yeah it seems like that should work. Only Miri and the docs would need adjustments for that. |
Looking for potential violations of
simd_*
intrinsic preconditions, I found this in stdarch:simd_select_bitmask
is documented to require that all the "extra"/"padding" bits in the mask (not corresponding to a vector element) must be 0. Here,rst
andzero
are vectors of length 4, and the maskk
is au8
, meaning there are 4 bits ink
that must be 0. However, nothing in the function actually ensures that.I don't know the intended behavior of the intrinsic for that case (probably intel promises to just ignore the extra bits?), but this function recently got marked as safe (in rust-lang/stdarch#1714) and that is clearly in contradiction with our intrinsic docs. I assume the safety is correct as probably the intrinsic should have no precondition; in that case we have to
simd_select_bitmask
Cc @usamoi @Amanieu @workingjubilee
The text was updated successfully, but these errors were encountered: