Add simd_relaxed_fma intrinsic #133395
@@ -612,6 +612,20 @@ extern "rust-intrinsic" {

```rust
    #[rustc_nounwind]
    pub fn simd_fma<T>(x: T, y: T, z: T) -> T;

    /// Computes `(x*y) + z` for each element, non-deterministically executing either
    /// a fused multiply-add or two operations with rounding of the intermediate result.
    ///
    /// The operation is fused if the code generator determines that the target instruction
    /// set has support for a fused operation, and that the fused operation is more efficient
    /// than the equivalent, separate pair of mul and add instructions. It is unspecified
    /// whether or not a fused operation is selected, and that may depend on optimization
    /// level and context, for example.
    ///
    /// `T` must be a vector of floats.
    #[cfg(not(bootstrap))]
    #[rustc_nounwind]
    pub fn simd_relaxed_fma<T>(x: T, y: T, z: T) -> T;
```
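To see why the fused and unfused paths can produce different results, here is a minimal scalar sketch using the stable `f64::mul_add` (which, unlike this intrinsic, is guaranteed fused). `simd_relaxed_fma` may return either of these two values per element:

```rust
fn main() {
    // Two separately rounded operations: the intermediate product
    // 0.1 * 10.0 rounds to exactly 1.0, so the sum is 0.0.
    let separate = 0.1_f64 * 10.0 - 1.0;

    // Fused multiply-add: the exact product 1 + 2^-54 (0.1 is not
    // exactly representable) survives into the final rounding.
    let fused = 0.1_f64.mul_add(10.0, -1.0);

    assert_eq!(separate, 0.0);
    assert_eq!(fused, 2.0_f64.powi(-54));
}
```

Because `simd_relaxed_fma` is allowed to pick either behavior (and, per the discussion below, possibly differently per lane), code using it must tolerate both results.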
Review comments on this hunk:

"So this is like […]"

"Ah I didn't realize there was a scalar version... is it used anywhere?"

"to match the scalar version, imo it should be renamed to […]"

"It's not used yet, it was added in preparation for exposing corresponding methods on the float types."

"That's a pretty bad name though IMO, it is used for the scalar version only because that's how LLVM calls them. I like […]"

"hmm, maybe... though I want to say I've seen […]"

"though, since it isn't always fused, I think […]"

"yeah, either of those would be ok-ish -- […]"
```rust
    /// Computes the sine of each element.
    ///
    /// `T` must be a vector of floats.
```
Review comment:
Do we guarantee that the choice made is consistent across all lanes? Or could it happen that some lanes get fused and others not?
I assume this can happen because we want to allow the backend to split a big SIMD op into multiple smaller SIMD ops, and then some may be fused and some may not.