-
Notifications
You must be signed in to change notification settings - Fork 13.5k
Use zero for initialized Once state #143881
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
One nit, could you add a note to these |
To test the queue implementation: @bors2 try jobs=x86_64-msvc-* |
Use zero for initialized Once state By re-labeling which integer represents which internal state for `Once` we can ensure that the initialized state is the all-zero state. This is beneficial because some CPU architectures (such as Arm) have specialized instructions to specifically branch on non-zero, and checking for the initialized state is by far the most important operation. As an example, take this: ```rust use std::sync::atomic::{AtomicU32, Ordering}; const INIT: u32 = 3; #[inline(never)] #[cold] pub fn slow(state: &AtomicU32) { state.store(INIT, Ordering::Release); } pub fn ensure_init(state: &AtomicU32) { if state.load(Ordering::Acquire) != INIT { slow(state) } } ``` If `INIT` is 3 (as is currently the state for `Once`), we see the following assembly on `aarch64-apple-darwin`: ```asm example::ensure_init::h332061368366e313: ldapr w8, [x0] cmp w8, #3 b.ne LBB1_2 ret LBB1_2: b example::slow::ha042bd6a4f33724e ``` By changing the `INIT` state to zero we get the following: ```asm example::ensure_init::h332061368366e313: ldapr w8, [x0] cbnz w8, LBB1_2 ret LBB1_2: b example::slow::ha042bd6a4f33724e ``` So this PR saves 1 instruction every time a `LazyLock` gets accessed on platforms such as these. try-job: x86_64-msvc-*
@tgross35 Done. |
@bors r=tgross35 rollup |
Use zero for initialized Once state By re-labeling which integer represents which internal state for `Once` we can ensure that the initialized state is the all-zero state. This is beneficial because some CPU architectures (such as Arm) have specialized instructions to specifically branch on non-zero, and checking for the initialized state is by far the most important operation. As an example, take this: ```rust use std::sync::atomic::{AtomicU32, Ordering}; const INIT: u32 = 3; #[inline(never)] #[cold] pub fn slow(state: &AtomicU32) { state.store(INIT, Ordering::Release); } pub fn ensure_init(state: &AtomicU32) { if state.load(Ordering::Acquire) != INIT { slow(state) } } ``` If `INIT` is 3 (as is currently the state for `Once`), we see the following assembly on `aarch64-apple-darwin`: ```asm example::ensure_init::h332061368366e313: ldapr w8, [x0] cmp w8, #3 b.ne LBB1_2 ret LBB1_2: b example::slow::ha042bd6a4f33724e ``` By changing the `INIT` state to zero we get the following: ```asm example::ensure_init::h332061368366e313: ldapr w8, [x0] cbnz w8, LBB1_2 ret LBB1_2: b example::slow::ha042bd6a4f33724e ``` So this PR saves 1 instruction every time a `LazyLock` gets accessed on platforms such as these.
Rollup of 16 pull requests Successful merges: - #142885 (core: Add `BorrowedCursor::with_unfilled_buf`) - #143217 (Port #[link_ordinal] to the new attribute parsing infrastructure) - #143355 (wrapping shift: remove first bitmask and table) - #143448 (remote-test-client: Exit code `128 + <signal-number>` instead of `3`) - #143592 (UWP: link ntdll functions using raw-dylib) - #143681 (bootstrap/miri: avoid rebuilds for test builds) - #143710 (Updates to random number generation APIs) - #143724 (Tidy cleanup) - #143820 (Fixed a core crate compilation failure when enabling the `optimize_for_size` feature on some targets) - #143850 (Compiletest: Simplify {Html,Json}DocCk directive handling) - #143855 (Port `#[omit_gdb_pretty_printer_section]` to the new attribute parsing) - #143868 (warn on align on fields to avoid breaking changes) - #143875 (update issue number for `const_trait_impl`) - #143881 (Use zero for initialized Once state) - #143887 (Run bootstrap tests sooner in the `x test` pipeline) - #143893 (Don't require `eh_personality` lang item on targets that have a personality) Failed merges: - #143878 (Port `#[pointee]` to the new attribute parsing infrastructure) - #143891 (Port `#[coverage]` to the new attribute system) r? `@ghost` `@rustbot` modify labels: rollup
Use zero for initialized Once state By re-labeling which integer represents which internal state for `Once` we can ensure that the initialized state is the all-zero state. This is beneficial because some CPU architectures (such as Arm) have specialized instructions to specifically branch on non-zero, and checking for the initialized state is by far the most important operation. As an example, take this: ```rust use std::sync::atomic::{AtomicU32, Ordering}; const INIT: u32 = 3; #[inline(never)] #[cold] pub fn slow(state: &AtomicU32) { state.store(INIT, Ordering::Release); } pub fn ensure_init(state: &AtomicU32) { if state.load(Ordering::Acquire) != INIT { slow(state) } } ``` If `INIT` is 3 (as is currently the state for `Once`), we see the following assembly on `aarch64-apple-darwin`: ```asm example::ensure_init::h332061368366e313: ldapr w8, [x0] cmp w8, rust-lang#3 b.ne LBB1_2 ret LBB1_2: b example::slow::ha042bd6a4f33724e ``` By changing the `INIT` state to zero we get the following: ```asm example::ensure_init::h332061368366e313: ldapr w8, [x0] cbnz w8, LBB1_2 ret LBB1_2: b example::slow::ha042bd6a4f33724e ``` So this PR saves 1 instruction every time a `LazyLock` gets accessed on platforms such as these.
Rollup of 17 pull requests Successful merges: - #142885 (core: Add `BorrowedCursor::with_unfilled_buf`) - #143217 (Port #[link_ordinal] to the new attribute parsing infrastructure) - #143355 (wrapping shift: remove first bitmask and table) - #143448 (remote-test-client: Exit code `128 + <signal-number>` instead of `3`) - #143681 (bootstrap/miri: avoid rebuilds for test builds) - #143710 (Updates to random number generation APIs) - #143724 (Tidy cleanup) - #143738 (Move several float tests to floats/mod.rs) - #143820 (Fixed a core crate compilation failure when enabling the `optimize_for_size` feature on some targets) - #143850 (Compiletest: Simplify {Html,Json}DocCk directive handling) - #143855 (Port `#[omit_gdb_pretty_printer_section]` to the new attribute parsing) - #143868 (warn on align on fields to avoid breaking changes) - #143875 (update issue number for `const_trait_impl`) - #143881 (Use zero for initialized Once state) - #143887 (Run bootstrap tests sooner in the `x test` pipeline) - #143893 (Don't require `eh_personality` lang item on targets that have a personality) - #143901 (Region constraint nits) Failed merges: - #143878 (Port `#[pointee]` to the new attribute parsing infrastructure) - #143891 (Port `#[coverage]` to the new attribute system) r? `@ghost` `@rustbot` modify labels: rollup
Rollup of 10 pull requests Successful merges: - #143217 (Port #[link_ordinal] to the new attribute parsing infrastructure) - #143681 (bootstrap/miri: avoid rebuilds for test builds) - #143724 (Tidy cleanup) - #143733 (Change bootstrap's `tool.TOOL_NAME.features` to work on any subcommand) - #143850 (Compiletest: Simplify {Html,Json}DocCk directive handling) - #143875 (update issue number for `const_trait_impl`) - #143881 (Use zero for initialized Once state) - #143887 (Run bootstrap tests sooner in the `x test` pipeline) - #143917 (Change "allocated object" to "allocation".) - #143918 (Tier check cleanup) r? `@ghost` `@rustbot` modify labels: rollup
Rollup merge of #143881 - orlp:once-state-repr, r=tgross35 Use zero for initialized Once state By re-labeling which integer represents which internal state for `Once` we can ensure that the initialized state is the all-zero state. This is beneficial because some CPU architectures (such as Arm) have specialized instructions to specifically branch on non-zero, and checking for the initialized state is by far the most important operation. As an example, take this: ```rust use std::sync::atomic::{AtomicU32, Ordering}; const INIT: u32 = 3; #[inline(never)] #[cold] pub fn slow(state: &AtomicU32) { state.store(INIT, Ordering::Release); } pub fn ensure_init(state: &AtomicU32) { if state.load(Ordering::Acquire) != INIT { slow(state) } } ``` If `INIT` is 3 (as is currently the state for `Once`), we see the following assembly on `aarch64-apple-darwin`: ```asm example::ensure_init::h332061368366e313: ldapr w8, [x0] cmp w8, #3 b.ne LBB1_2 ret LBB1_2: b example::slow::ha042bd6a4f33724e ``` By changing the `INIT` state to zero we get the following: ```asm example::ensure_init::h332061368366e313: ldapr w8, [x0] cbnz w8, LBB1_2 ret LBB1_2: b example::slow::ha042bd6a4f33724e ``` So this PR saves 1 instruction every time a `LazyLock` gets accessed on platforms such as these.
By re-labeling which integer represents which internal state for
Once
we can ensure that the initialized state is the all-zero state. This is beneficial because some CPU architectures (such as Arm) have specialized instructions to specifically branch on non-zero, and checking for the initialized state is by far the most important operation.As an example, take this:
If
INIT
is 3 (as is currently the state forOnce
), we see the following assembly onaarch64-apple-darwin
:By changing the
INIT
state to zero we get the following:So this PR saves 1 instruction every time a
LazyLock
gets accessed on platforms such as these.