Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initialization branch of OnceNonZeroUsize::get_or_try_init should be #[cold] #272

Closed
briansmith opened this issue Feb 6, 2025 · 3 comments

Comments

@briansmith
Copy link
Contributor

The current code for OnceNonZeroUsize::get_or_try_init() looks like this:

 pub fn get_or_try_init<F, E>(&self, f: F) -> Result<NonZeroUsize, E>
    where
        F: FnOnce() -> Result<NonZeroUsize, E>,
    {
        let val = self.inner.load(Ordering::Acquire);
        let res = match NonZeroUsize::new(val) {
            Some(it) => it,
            None => {
                let mut val = f()?.get();
                let exchange =
                    self.inner.compare_exchange(0, val, Ordering::AcqRel, Ordering::Acquire);
                if let Err(old) = exchange {
                    val = old;
                }
                unsafe { NonZeroUsize::new_unchecked(val) }
            }
        };
        Ok(res)

In typical use, the Some branch is going to be taken extremely frequently but the None branch is only going to be taken once (or a very small number of times) at startup. If get_or_try_init is in a performance-sensitive segment of code, then it is important that get_or_try_init be inlined and that the compiler understands that the Some branch is (much) more likely than the None branch. When this happens, the call site basically becomes a load followed by a conditional jump that is basically never taken; which is ideal.

When get_or_try_init is used in many places in the user's code, it is important to avoid inlining any of the None branch into the call sites.

Unfortunately, the Rust compiler is sometimes not good at recognizing that code that calls a #[cold] function unconditionally must be cold itself. So, sometimes it isn't enough to mark our f as #[cold] #[inline(never)].

It may be better to instead put the entire body of the None branch into a function that is marked #[cold] #[inline(never)]. The #[inline(never)] is often needed because some post-inlining optimization passes in the compiler seem to not understand #[cold], and because we don't want any part of that branch to be in the calling code.

If you agree this is reasonable, I can submit a PR to this effect.

@briansmith briansmith changed the title Initialization branch of ` Initialization branch of OnceNonZeroUsize::get_or_try_init should be #[cold] Feb 6, 2025
@matklad
Copy link
Owner

matklad commented Feb 6, 2025

Yeah, totally that's a performance bug, the non branch should be outlined and marked as #[cold], thanks for spotting this!

@briansmith
Copy link
Contributor Author

PR #273 implements this.

@matklad
Copy link
Owner

matklad commented Feb 7, 2025

@matklad matklad closed this as completed Feb 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants