The current code for OnceNonZeroUsize::get_or_try_init() looks like this:
```rust
pub fn get_or_try_init<F, E>(&self, f: F) -> Result<NonZeroUsize, E>
where
    F: FnOnce() -> Result<NonZeroUsize, E>,
{
    let val = self.inner.load(Ordering::Acquire);
    let res = match NonZeroUsize::new(val) {
        Some(it) => it,
        None => {
            let mut val = f()?.get();
            let exchange =
                self.inner.compare_exchange(0, val, Ordering::AcqRel, Ordering::Acquire);
            if let Err(old) = exchange {
                val = old;
            }
            unsafe { NonZeroUsize::new_unchecked(val) }
        }
    };
    Ok(res)
}
```
In typical use, the Some branch is taken extremely frequently, while the None branch is taken only once (or a very small number of times) at startup. If get_or_try_init is in a performance-sensitive segment of code, then it is important that get_or_try_init be inlined and that the compiler understands that the Some branch is (much) more likely than the None branch. When this happens, the call site becomes essentially a load followed by a conditional jump that is almost never taken, which is ideal.
When get_or_try_init is used in many places in the user's code, it is important to avoid inlining any of the None branch into the call sites.
Unfortunately, the Rust compiler is sometimes not good at recognizing that code that calls a #[cold] function unconditionally must be cold itself. So, sometimes it isn't enough to mark our f as #[cold] #[inline(never)].
It may be better to instead put the entire body of the None branch into a function that is marked #[cold] #[inline(never)]. The #[inline(never)] is often needed because some post-inlining optimization passes in the compiler seem to not understand #[cold], and because we don't want any part of that branch to be in the calling code.
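To make the proposal concrete, here is a minimal sketch of what the split could look like. It is not the crate's actual code: the struct below is a reduced stand-in with only the `inner` field the snippet uses, and the helper name `init` is hypothetical.

```rust
use std::num::NonZeroUsize;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Reduced stand-in for the crate's type (assumption: only `inner` matters here).
pub struct OnceNonZeroUsize {
    inner: AtomicUsize,
}

impl OnceNonZeroUsize {
    pub const fn new() -> Self {
        Self { inner: AtomicUsize::new(0) }
    }

    /// Hot path: a single load and a branch that is almost never taken.
    #[inline]
    pub fn get_or_try_init<F, E>(&self, f: F) -> Result<NonZeroUsize, E>
    where
        F: FnOnce() -> Result<NonZeroUsize, E>,
    {
        let val = self.inner.load(Ordering::Acquire);
        match NonZeroUsize::new(val) {
            Some(it) => Ok(it),
            None => self.init(f),
        }
    }

    /// Cold path (hypothetical helper): the entire former `None` branch,
    /// kept out of line so none of it lands in the call sites.
    #[cold]
    #[inline(never)]
    fn init<F, E>(&self, f: F) -> Result<NonZeroUsize, E>
    where
        F: FnOnce() -> Result<NonZeroUsize, E>,
    {
        let mut val = f()?.get();
        let exchange =
            self.inner.compare_exchange(0, val, Ordering::AcqRel, Ordering::Acquire);
        if let Err(old) = exchange {
            // Another thread initialized the cell first; use its value.
            val = old;
        }
        // SAFETY: `val` either came from a `NonZeroUsize`, or it is the stored
        // value that made `compare_exchange(0, ..)` fail, which cannot be 0.
        Ok(unsafe { NonZeroUsize::new_unchecked(val) })
    }
}
```

With this shape, the only code that can be inlined into a caller is the load, the zero check, and a call to `init`. Whether this actually improves the generated code would need to be confirmed by inspecting the assembly at real call sites.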
If you agree this is reasonable, I can submit a PR to this effect.
briansmith changed the title from "Initialization branch of `" to "Initialization branch of OnceNonZeroUsize::get_or_try_init should be #[cold]" on Feb 6, 2025