Add simd_relaxed_fma intrinsic #133395

Merged
merged 2 commits into from
Dec 3, 2024
3 changes: 2 additions & 1 deletion compiler/rustc_codegen_cranelift/src/intrinsics/simd.rs
@@ -415,7 +415,8 @@ pub(super) fn codegen_simd_intrinsic_call<'tcx>(
});
}

-sym::simd_fma => {
+// FIXME: simd_relaxed_fma doesn't relax to non-fused multiply-add
+sym::simd_fma | sym::simd_relaxed_fma => {
intrinsic_args!(fx, args => (a, b, c); intrinsic);

if !a.layout().ty.is_simd() {
1 change: 1 addition & 0 deletions compiler/rustc_codegen_gcc/src/intrinsic/simd.rs
@@ -772,6 +772,7 @@ pub fn generic_simd_intrinsic<'a, 'gcc, 'tcx>(
sym::simd_flog => "log",
sym::simd_floor => "floor",
sym::simd_fma => "fma",
+sym::simd_relaxed_fma => "fma", // FIXME: this should relax to non-fused multiply-add when necessary
sym::simd_fpowi => "__builtin_powi",
sym::simd_fpow => "pow",
sym::simd_fsin => "sin",
2 changes: 2 additions & 0 deletions compiler/rustc_codegen_llvm/src/intrinsic.rs
@@ -1534,6 +1534,7 @@ fn generic_simd_intrinsic<'ll, 'tcx>(
sym::simd_flog => ("log", bx.type_func(&[vec_ty], vec_ty)),
sym::simd_floor => ("floor", bx.type_func(&[vec_ty], vec_ty)),
sym::simd_fma => ("fma", bx.type_func(&[vec_ty, vec_ty, vec_ty], vec_ty)),
+sym::simd_relaxed_fma => ("fmuladd", bx.type_func(&[vec_ty, vec_ty, vec_ty], vec_ty)),
sym::simd_fpowi => ("powi", bx.type_func(&[vec_ty, bx.type_i32()], vec_ty)),
sym::simd_fpow => ("pow", bx.type_func(&[vec_ty, vec_ty], vec_ty)),
sym::simd_fsin => ("sin", bx.type_func(&[vec_ty], vec_ty)),
@@ -1572,6 +1573,7 @@ fn generic_simd_intrinsic<'ll, 'tcx>(
| sym::simd_fpowi
| sym::simd_fsin
| sym::simd_fsqrt
+| sym::simd_relaxed_fma
| sym::simd_round
| sym::simd_trunc
) {
4 changes: 3 additions & 1 deletion compiler/rustc_hir_analysis/src/check/intrinsic.rs
@@ -641,7 +641,9 @@ pub fn check_intrinsic_type(
| sym::simd_round
| sym::simd_trunc => (1, 0, vec![param(0)], param(0)),
sym::simd_fpowi => (1, 0, vec![param(0), tcx.types.i32], param(0)),
-sym::simd_fma => (1, 0, vec![param(0), param(0), param(0)], param(0)),
+sym::simd_fma | sym::simd_relaxed_fma => {
+(1, 0, vec![param(0), param(0), param(0)], param(0))
+}
sym::simd_gather => (3, 0, vec![param(0), param(1), param(2)], param(0)),
sym::simd_masked_load => (3, 0, vec![param(0), param(1), param(2)], param(2)),
sym::simd_masked_store => (3, 0, vec![param(0), param(1), param(2)], tcx.types.unit),
1 change: 1 addition & 0 deletions compiler/rustc_span/src/symbol.rs
@@ -1840,6 +1840,7 @@ symbols! {
simd_reduce_mul_unordered,
simd_reduce_or,
simd_reduce_xor,
+simd_relaxed_fma,
simd_rem,
simd_round,
simd_saturating_add,
10 changes: 10 additions & 0 deletions library/core/src/intrinsics/simd.rs
@@ -612,6 +612,16 @@ extern "rust-intrinsic" {
#[rustc_nounwind]
pub fn simd_fma<T>(x: T, y: T, z: T) -> T;

+/// Computes `(x*y) + z` for each element, with unspecified rounding.
+///
+/// This may be equivalent to `simd_fma`, or it may relax to rounding each
+/// operation if that's more efficient.
+///
+/// `T` must be a vector of floats.
+#[cfg(not(bootstrap))]
+#[rustc_nounwind]
+pub fn simd_relaxed_fma<T>(x: T, y: T, z: T) -> T;
Member

So this is like fmuladd on scalars? That should be mentioned, and probably it makes sense to copy the doc comment from there.

Member Author

Ah I didn't realize there was a scalar version... is it used anywhere?

Member (@programmerjake, Nov 24, 2024)

to match the scalar version, imo it should be renamed to simd_fmuladd, also to avoid confusion with any fast-math semantics

Member

It's not used yet, it was added in preparation for exposing corresponding methods on the float types.

Member

> to match the scalar version, imo it should be renamed to simd_fmuladd, also to avoid confusion with any fast-math semantics

That's a pretty bad name though IMO; it is used for the scalar version only because that's what LLVM calls it.

I like relaxed_fma. I don't think it is confusing with fast-math semantics, we don't call those "relaxed" after all.

Member

hmm, maybe...though I want to say I've seen relaxed suggested for fast math functions...

Member

though, since it isn't always fused, I think simd_something_mul_add is better than simd_something_fma

Member

simd_mul_add? simd_relaxed_mul_add?

Member

yeah, either of those would be ok-ish -- simd_mul_add makes one think of f32::mul_add which is always fused (imo naming f32::mul_add mul_add is a mistake, but there's nothing we can do now...).
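
For readers following the naming debate, here is a minimal scalar sketch of why "fused" and "not always fused" are observably different. It uses only the stable `f32::mul_add` (always fused, as noted in the thread) and plain `x * y + z` (two roundings). The function names `fused_vs_unfused` and `is_valid_relaxed_result` are hypothetical and not part of this PR; the second one merely restates the doc comment's "either rounding is acceptable" contract and is not how the intrinsic is implemented.

```rust
/// Returns `(fused, unfused)` for inputs chosen so the two roundings differ.
/// (Hypothetical helper, for illustration only.)
fn fused_vs_unfused() -> (f32, f32) {
    // x * x = 1 + 2^-11 + 2^-24 exactly. The 2^-24 term is half an ulp,
    // so it is rounded away when the product is rounded to f32 by itself.
    let x = 1.0_f32 + 1.0 / 4096.0; // 1 + 2^-12
    let z = -(1.0_f32 + 1.0 / 2048.0); // -(1 + 2^-11)

    let fused = x.mul_add(x, z); // one rounding, after the addition
    let unfused = x * x + z; // product rounded first, then the sum
    (fused, unfused)
}

/// Hypothetical check: a relaxed multiply-add may return either result.
fn is_valid_relaxed_result(r: f32, x: f32, y: f32, z: f32) -> bool {
    r == x.mul_add(y, z) || r == x * y + z
}

fn main() {
    let (fused, unfused) = fused_vs_unfused();
    println!("fused   = {fused:e}");
    println!("unfused = {unfused:e}");
    // The two strategies disagree on these inputs.
    assert!(fused != unfused);
}
```

With these inputs the fused path keeps the 2^-24 residue while the unfused path cancels to zero, so a caller of a relaxed operation has to tolerate both answers.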


/// Computes the sine of each element.
///
/// `T` must be a vector of floats.
4 changes: 4 additions & 0 deletions tests/ui/simd/intrinsic/float-math-pass.rs
@@ -23,6 +23,7 @@ extern "rust-intrinsic" {
fn simd_fexp<T>(x: T) -> T;
fn simd_fexp2<T>(x: T) -> T;
fn simd_fma<T>(x: T, y: T, z: T) -> T;
+fn simd_relaxed_fma<T>(x: T, y: T, z: T) -> T;
fn simd_flog<T>(x: T) -> T;
fn simd_flog10<T>(x: T) -> T;
fn simd_flog2<T>(x: T) -> T;
@@ -77,6 +78,9 @@ fn main() {
let r = simd_fma(x, h, h);
assert_approx_eq!(x, r);

+let r = simd_relaxed_fma(x, h, h);
+assert_approx_eq!(x, r);

let r = simd_fsqrt(x);
assert_approx_eq!(x, r);
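
The test compares with a tolerance rather than bit-for-bit, which suits an operation whose rounding is unspecified. A rough stable-Rust sketch of the same check on plain arrays follows. `F32x4`, `mul_add_lanes`, and `approx_eq_lanes` are hypothetical stand-ins and not the test's actual types or macro, and the input values (all lanes `1.0` and `0.5`) are an assumption, since the hunk does not show how `x` and `h` are defined.

```rust
/// Hypothetical stand-in for a 4-lane float vector.
type F32x4 = [f32; 4];

/// Lane-wise `x * y + z`, written unfused here; a relaxed FMA would be
/// free to fuse each lane instead.
fn mul_add_lanes(x: F32x4, y: F32x4, z: F32x4) -> F32x4 {
    std::array::from_fn(|i| x[i] * y[i] + z[i])
}

/// True if every lane of `a` is within `tol` of the matching lane of `b`.
fn approx_eq_lanes(a: F32x4, b: F32x4, tol: f32) -> bool {
    a.iter().zip(b.iter()).all(|(p, q)| (p - q).abs() <= tol)
}

fn main() {
    // Assumed inputs: 1.0 * 0.5 + 0.5 should land on (or near) 1.0 per lane.
    let x = [1.0_f32; 4];
    let h = [0.5_f32; 4];
    let r = mul_add_lanes(x, h, h);
    assert!(approx_eq_lanes(x, r, 1e-6));
}
```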
