Skip to content

compiler-builtins: plumb LSE support for aarch64 on linux/gnu when optimized-compiler-builtins not enabled #144705

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

pmur
Copy link

@pmur pmur commented Jul 30, 2025

Add dynamic support for aarch64 LSE atomic ops on linux/gnu targets when optimized-compiler-builtins is not enabled.

Enabling LSE is the primary motivator for #143689, though extending the rust version doesn't seem too farfetched. Are there more details which I have overlooked which make this impractical? I've tested this on an aarch64 host with LSE.

r? @tgross35

@rustbot rustbot added A-compiler-builtins Area: compiler-builtins (https://github.com/rust-lang/compiler-builtins) S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jul 30, 2025
@tgross35
Copy link
Contributor

Agreed that we should be able to support these in compiler-builtins. I think one problem with the current implementation is linker errors for no_std binaries on the aarch64-unknown-linux-gnu target if getauxval isn't supplied somewhere (whether or not we want to support this usecase is still up for debate).

Would things work if the check+init routine was moved into std::sys rather than within builtins? This should avoid any no-std concerns, and is_aarch64_feature_detected could be used rather than directly calling getauxval.

I'm certain @Amanieu has some thoughts here

@pmur
Copy link
Author

pmur commented Jul 31, 2025

Would things work if the check+init routine was moved into std::sys rather than within builtins? This should avoid any no-std concerns, and is_aarch64_feature_detected could be used rather than directly calling getauxval.

Hrm, the check does need to move out. Where in std::sys would the check belong?

Technically, I don't think the switchover to LSE has to happen very early, but earlier is probably better for performance.

@pmur
Copy link
Author

pmur commented Jul 31, 2025

Looking closer, if optimized-compiler-builtins are enabled and no-std is used, the user must supply a getauxval if they use the compiler-builtins on a gnu target (at least with the recent stable compiler I tried).

How much trouble would it be to explicitly require a user to supply getauxval in these cases?

@tgross35
Copy link
Contributor

Hrm, the check does need to move out. Where in std::sys would the check belong?

There probably isn't a good fit currently. Maybe a new module like configure_builtins.rs?

Technically, I don't think the switchover to LSE has to happen very early, but earlier is probably better for performance.

Would the library where the configuration code lives make a difference for timing? I think init_array could still be used, we have another instance at

#[unsafe(link_section = ".init_array.00099")]
.

Looking closer, if optimized-compiler-builtins are enabled and no-std is used, the user must supply a getauxval if they use the compiler-builtins on a gnu target (at least with the recent stable compiler I tried).

How much trouble would it be to explicitly require a user to supply getauxval in these cases?

Good point, that part probably doesn't actually matter then.

It is likely still beneficial to keep compiler-rt code that needs the C standard library in std if possible, I assume compiler-rt just has everything together because it doesn't have any control over libc. Not as relevant for the one libcall here (compared to e.g. #138944), but one advantage is that we're calling getauxval anyway, may as well benefit from initializing std's is_aarch64_feature_detected cache.

Paul Murphy and others added 2 commits July 31, 2025 16:22
Add dynamic support for aarch64 LSE atomic ops on linux/gnu targets
when optimized-compiler-builtins is not enabled.

The resulting asm should exactly match that of LLVM's compiler-rt
builtins, though the symbol naming for the support function and
global does not.
Create a private module to hold the bootstrap code needed enable LSE
at startup on aarch64-*-linux-gnu* targets when rust implements the
intrinsics.

This is a bit more heavyweight than compiler-rt's LSE initialization,
but has the benefit of initializing the aarch64 cpu feature detection
for other uses.

Using the rust initialization code does use some atomic operations,
that's OK. Mixing LSE and non-LSE operations should work while the
update flag propagates.
@pmur pmur force-pushed the murp/aarch64-lse branch from ee0641a to 12b5e55 Compare July 31, 2025 22:43
@rustbot rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Jul 31, 2025
/// Call to enable LSE atomic operations.
pub fn enable_lse() {
HAVE_LSE_ATOMICS.store(1, Ordering::Relaxed);
}
Copy link
Contributor

@tgross35 tgross35 Jul 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably okay to call this directly in std, but all builtins intrinsics are called by extern so it's probably safest to do the same here (the std dependency should actually be removed in 42bf044). The intrinsics macro handles no_mangle and weak linkage:

intrinsics! {
    pub extern "C" fn __rust_enable_lse() {
        // ...
    }
}

Any reason this is a draft still? Assuming you have tested that the current setup works, it lgtm with the above change (though I'll ask Amaieu to take a look)

#[cfg(all(
target_arch = "aarch64",
target_os = "linux",
target_env = "gnu",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we actually do have this on musl since very recently 3d0dedd. Gating on target_feature = "outline-atomics" would be ideal, but that doesn't seem to work for some reason.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gating on target_feature = "outline-atomics" would be ideal, but that doesn't seem to work for some reason.

#144761

@Amanieu
Copy link
Member

Amanieu commented Aug 1, 2025

At the moment this is the only way I can see this working. Fundamentally compiler-builtins is not able to call into other libraries so LSE must be manually initialized from outside.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-compiler-builtins Area: compiler-builtins (https://github.com/rust-lang/compiler-builtins) S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants