std::intrinsics::simd::simd_reduce_add_unordered generates inefficient code for floating-point numbers #130028
Labels
A-codegen: Area: Code generation
A-SIMD: Area: SIMD (Single Instruction Multiple Data)
C-bug: Category: This is a bug.
O-AArch64: Armv8-A or later processors in AArch64 mode
T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
The code generated for std::intrinsics::simd::simd_reduce_add_unordered contains an extra floating-point add that adds +0.0 to the result: https://godbolt.org/z/Y496nxv3E

The problem seems to be that the compiler uses +0.0 instead of -0.0 as the starting value of @llvm.vector.reduce.fadd.*. Comparing LLVM code generation for the two cases, we get the more efficient version when using -0.0: https://godbolt.org/z/fhaz7ced6

This generates the following assembly for AArch64:
To me, this behaviour seems to be caused by using +0.0 instead of -0.0 here in the compiler:
rust/compiler/rustc_codegen_llvm/src/intrinsic.rs
Lines 2095 to 2101 in a3af208
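
To illustrate why the starting value matters, here is a minimal scalar sketch in stable Rust. The helper reduce_fadd is hypothetical and only models a reduction with an explicit start value; it is not the compiler's implementation and does not call the intrinsic. It assumes default IEEE 754 round-to-nearest behaviour, under which -0.0 is the additive identity for every value, while +0.0 changes the sign of a -0.0 input. That is presumably why the add of +0.0 cannot be optimised away.

```rust
/// Hypothetical scalar model of an fadd reduction with an explicit start
/// value. Illustration only; this is not what rustc or LLVM actually emit.
fn reduce_fadd(start: f32, lanes: &[f32]) -> f32 {
    lanes.iter().fold(start, |acc, &x| acc + x)
}

fn main() {
    // x + (-0.0) leaves every non-NaN value bit-for-bit unchanged.
    for x in [1.5_f32, -2.25, 0.0, -0.0, f32::INFINITY, f32::NEG_INFINITY] {
        assert_eq!((x + -0.0_f32).to_bits(), x.to_bits());
    }

    // x + (+0.0) is not an identity: it turns -0.0 into +0.0.
    assert_ne!((-0.0_f32 + 0.0_f32).to_bits(), (-0.0_f32).to_bits());

    // The same difference shows up in a reduction over all-negative-zero lanes.
    let lanes = [-0.0_f32; 4];
    assert!(reduce_fadd(-0.0, &lanes).is_sign_negative());
    assert!(reduce_fadd(0.0, &lanes).is_sign_positive());

    println!("start -0.0: {:?}", reduce_fadd(-0.0, &lanes));
    println!("start +0.0: {:?}", reduce_fadd(0.0, &lanes));
}
```

With a start value of -0.0 the initial add is a no-op for every input, so a backend is free to drop it; with +0.0 the add has to be kept to preserve the sign of zero.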