-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
For small types, just swap directly #43351
Conversation
Don't ptr::swap_nonoverlapping for things smaller than its block size. That simpler for LLVM -- particularly in debug -- and means the load/store instructions are generated with full alignment information.
r? @BurntSushi (rust_highfive has picked a reviewer for you, use r? to override) |
@@ -548,8 +548,19 @@ pub unsafe fn uninitialized<T>() -> T { | |||
#[inline] | |||
#[stable(feature = "rust1", since = "1.0.0")] | |||
pub fn swap<T>(x: &mut T, y: &mut T) { | |||
unsafe { | |||
ptr::swap_nonoverlapping(x, y, 1); | |||
if size_of::<T>() < 32 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't 32 a bit magical? Can we get the "correct" size from some platform info?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
32 is the block size used in ptr::swap_nonoverlapping_bytes
, so is the point where the simd loop there starts to potentially do something interesting.
That choice is from PR #40454, which has history and a bunch of perf numbers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to make it so this function and swap_nonoverlapping_bytes
are always in sync? e.g., If the latter is updated, can we also guarantee that the former also must be updated?
Are there benchmarks for this? |
ping @scottmcm, thoughts on the benchmarks @BurntSushi mentioned? |
I'm going to close this to keep the queue clean, but we'd love to take this so if you manage to come back to it @scottmcm just let us know and we'll reopen! |
In
mem::swap
, don'tptr::swap_nonoverlapping
for things smaller than its block size. That lets debug mode not run the simd loop and means the load/store instructions in release are generated with full alignment information. (Swap on u64 in stable has align 8, but in nightly it has align 1.)Inspired by @oyvindln in #43299 (comment).