Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up ptr::replace #98197

Closed
wants to merge 3 commits into from
Closed

Conversation

lukas-code
Copy link
Member

I noticed, that ptr::replace and mem::replace are implemented differently, even though they should probably be the same.

Looking at the goldbolt output, it seems that mem::replace does only 2 memcpys and ptr::replace does 2 memcpys with an inlined copy in between.

I ran some benchmarks on my machine, in which mem::replace was more than 2x faster than ptr::replace.

Benchmark

Code

#![feature(test)]
#![cfg(test)]

extern crate test;

use rand::Rng;
use std::{mem, ptr};
use test::Bencher;

#[bench]
fn mem_replace(b: &mut Bencher) {
    let mut rng = rand::thread_rng();

    let mut dst = [0u8; 256];
    rng.fill(&mut dst);

    let mut src = [0u8; 256];
    rng.fill(&mut src);

    b.iter(|| {
        let mut dst_copy = dst;
        let old = mem::replace(&mut dst_copy, src);
        (old, dst_copy)
    });
}

#[bench]
fn ptr_replace(b: &mut Bencher) {
    let mut rng = rand::thread_rng();

    let mut dst = [0u8; 256];
    rng.fill(&mut dst);

    let mut src = [0u8; 256];
    rng.fill(&mut src);

    b.iter(|| {
        let mut dst_copy = dst;
        let old = unsafe { ptr::replace(&mut dst_copy, src) };
        (old, dst_copy)
    });
}

Output

test mem_replace ... bench:          11 ns/iter (+/- 0)
test ptr_replace ... bench:          25 ns/iter (+/- 0)

In this PR i have replaced the implementation of ptr::replace to call mem::replace instead of mem::swap.

@rustbot rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Jun 17, 2022
@rust-highfive
Copy link
Collaborator

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

@rust-highfive
Copy link
Collaborator

r? @kennytm

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 17, 2022
@Dylan-DPC
Copy link
Member

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 21, 2022
@bors
Copy link
Contributor

bors commented Jun 21, 2022

⌛ Trying commit bc0788b with merge fca4bf64b483deb3f9c962caf2ddf63c79124b81...

@bors
Copy link
Contributor

bors commented Jun 21, 2022

☀️ Try build successful - checks-actions
Build commit: fca4bf64b483deb3f9c962caf2ddf63c79124b81 (fca4bf64b483deb3f9c962caf2ddf63c79124b81)

@rust-timer
Copy link
Collaborator

Queued fca4bf64b483deb3f9c962caf2ddf63c79124b81 with parent 0887113, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (fca4bf64b483deb3f9c962caf2ddf63c79124b81): comparison url.

Instruction count

  • Primary benchmarks: no relevant changes found
  • Secondary benchmarks: 😿 relevant regression found
mean1 max count2
Regressions 😿
(primary)
N/A N/A 0
Regressions 😿
(secondary)
0.8% 0.8% 1
Improvements 🎉
(primary)
N/A N/A 0
Improvements 🎉
(secondary)
N/A N/A 0
All 😿🎉 (primary) N/A N/A 0

Max RSS (memory usage)

Results
  • Primary benchmarks: 😿 relevant regression found
  • Secondary benchmarks: no relevant changes found
mean1 max count2
Regressions 😿
(primary)
7.4% 7.4% 1
Regressions 😿
(secondary)
N/A N/A 0
Improvements 🎉
(primary)
N/A N/A 0
Improvements 🎉
(secondary)
N/A N/A 0
All 😿🎉 (primary) 7.4% 7.4% 1

Cycles

Results
  • Primary benchmarks: 😿 relevant regressions found
  • Secondary benchmarks: no relevant changes found
mean1 max count2
Regressions 😿
(primary)
2.8% 2.9% 2
Regressions 😿
(secondary)
N/A N/A 0
Improvements 🎉
(primary)
N/A N/A 0
Improvements 🎉
(secondary)
N/A N/A 0
All 😿🎉 (primary) 2.8% 2.9% 2

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

Footnotes

  1. the arithmetic mean of the percent change 2 3

  2. number of relevant changes 2 3

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 21, 2022
@scottmcm
Copy link
Member

cc @m-ou-se, who did #83022 and thus probably has thoughts here.

also cc @RalfJung, who might have thoughts about the typed-ness of this like they did in #97712

@m-ou-se
Copy link
Member

m-ou-se commented Jun 24, 2022

Looks good to me, but I'm a bit surprised with the benchmarking results.

@RalfJung
Copy link
Member

RalfJung commented Jun 24, 2022

also cc @RalfJung, who might have thoughts about the typed-ness of this like they did in #97712

This operation takes and returns a T, so it has no choice but to do a typed copy.

It doesn't seem to actually help in rustc though? Not sure how representative that one microbenchmark is.

Comment on lines 1018 to 1019
// and cannot overlap `src` since `dst` must point to a distinct
// allocated object.
Copy link
Member

@RalfJung RalfJung Jun 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// and cannot overlap `src` since `dst` must point to a distinct
// allocated object.
// and cannot overlap `src` since `src` is a freshly stack-allocated local variable.

Not that we even still need this part of the argument with mem::replace.

@lukas-code
Copy link
Member Author

It probably doesn't improve performance in rustc a lot (at all), because it barely seems to be used in the Rust repository. All uses of ptr::replace I could find are in the implementation of mpsc in the standard library.

None => match ptr::replace(self.upgrade.get(), SendUsed) {

ptr::replace(self.upgrade.get(), prev);

match ptr::replace(self.upgrade.get(), SendUsed) {

let steals = ptr::replace(self.steals.get(), 0);

let steals = unsafe { ptr::replace(self.queue.consumer_addition().steals.get(), 0) };

Additionally I found one instance of mem::replace that should probably be a ptr::replace:

// SAFETY:
//
// note that this can in theory just be `*ptr = Some(value)`, but due to
// the compiler will currently codegen that pattern with something like:
//
// ptr::drop_in_place(ptr)
// ptr::write(ptr, Some(value))
//
// Due to this pattern it's possible for the destructor of the value in
// `ptr` (e.g., if this is being recursively initialized) to re-access
// TLS, in which case there will be a `&` and `&mut` pointer to the same
// value (an aliasing violation). To avoid setting the "I'm running a
// destructor" flag we just use `mem::replace` which should sequence the
// operations a little differently and make this safe to call.
//
// The precondition also ensures that we are the only one accessing
// `self` at the moment so replacing is fine.
unsafe {
let _ = mem::replace(&mut *ptr, Some(value));
}

@@ -811,7 +811,7 @@ mod lazy {

// SAFETY:
//
// note that this can in theory just be `*ptr = Some(value)`, but due to
// Note that this can in theory just be `*ptr = Some(value)`, but due to
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to what?

@JohnCSimon
Copy link
Member

ping from triage:
@lukas-code
looks like this PR is still in progress - so returning to author

FYI: when a PR is ready for review, send a message containing
@rustbot ready to switch to S-waiting-on-review so the PR is in the reviewer's backlog.

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 24, 2022
@lukas-code
Copy link
Member Author

Since this doesn't appear to increase performance at all, I'll just close this.

@lukas-code lukas-code closed this Jul 24, 2022
@lukas-code lukas-code deleted the ptr_replace branch July 24, 2022 11:28
taiki-e added a commit to taiki-e/portable-atomic that referenced this pull request Oct 29, 2023
There was a PR to improve it, but it was not merged for some reason.
rust-lang/rust#98197
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.