-
Notifications
You must be signed in to change notification settings - Fork 277
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
added f32 and f64 unaligned stores and loads from avx512f set #873
added f32 and f64 unaligned stores and loads from avx512f set #873
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @gnzlbg (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
051049d
to
eeadf05
Compare
crates/core_arch/src/x86/avx512f.rs
Outdated
// This intrinsic has no corresponding instruction. | ||
pub unsafe fn _mm512_undefined_ps() -> __m512 { | ||
// FIXME: this function should return MaybeUninit<__m512> | ||
mem::MaybeUninit::<__m512>::uninit().assume_init() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is UB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The undefined
intrinsics should return zero-initialized vectors like clang does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_mm_undefined_ps and _mm_256_undefined_p{d,s} do same thing, should i change them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes those should be fixed as well.
crates/core_arch/src/x86/avx512f.rs
Outdated
&mut dst as *mut __m512d as *mut u8, | ||
mem::size_of::<__m512d>(), | ||
); | ||
dst |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does mem::transmute
work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i used Amanieu version
crates/core_arch/src/x86/avx512f.rs
Outdated
&mut dst as *mut __m512d as *mut u8, | ||
mem::size_of::<__m512d>(), | ||
); | ||
dst |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simpler version: ptr::read_unaligned(mem_addr as *const __m512d)
thanks, updated with feedback addressed |
#[target_feature(enable = "avx512f")] | ||
// This intrinsic has no corresponding instruction. | ||
pub unsafe fn _mm512_undefined_pd() -> __m512d { | ||
_mm512_set1_pd(0.0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a zero float. I think it should be literal zero bytes. Not sure though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A zero float happens to be encoded with all bits zeroed.
crates/core_arch/src/x86/avx512f.rs
Outdated
&a as *const __m512d as *const u8, | ||
mem_addr as *mut u8, | ||
mem::size_of::<__m512d>(), | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use ptr::write_unaligned
here just like the loads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks
e5986e5
to
75c427e
Compare
For the CI failures just change the |
interesting, there are these comments: stdarch/crates/core_arch/src/x86/avx.rs Line 1678 in a371069
stdarch/crates/core_arch/src/x86/avx.rs Line 1697 in a371069
|
Thanks! |
No description provided.