Allow the global alloc one TLS slot with a destructor #143761

Open
wants to merge 3 commits into
base: master

orlp (Contributor) commented Jul 11, 2025

This is an improvement over #116402, which fixed an unsoundness by simply disallowing the global allocator from creating any thread-local variables with destructors. Instead of doing that, with this PR the global allocator is allowed to create exactly one thread-local variable with a destructor.

This is actually already a huge improvement over the status quo, as a single thread-local variable with a destructor is sufficient to store arbitrary amounts of thread-local data while allowing cleanup on thread exit.
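
As a rough illustration of why one slot is enough, here is a minimal sketch in ordinary (non-allocator) code. All names (`PerThread`, `STATE`, `record`, `run_worker`, and the two `FLUSHED_*` counters) are hypothetical and not part of this PR. Every piece of per-thread state is a field of one struct, so a single `Drop` impl cleans all of it up on thread exit. A real global allocator would additionally have to avoid allocating through itself inside this state.

```rust
use std::cell::Cell;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Hypothetical global totals that exiting threads flush into.
static FLUSHED_BYTES: AtomicUsize = AtomicUsize::new(0);
static FLUSHED_CALLS: AtomicUsize = AtomicUsize::new(0);

// All per-thread data is bundled into one struct, so only one
// thread-local (and therefore one destructor) is needed.
struct PerThread {
    bytes: Cell<usize>,
    calls: Cell<usize>,
}

impl Drop for PerThread {
    fn drop(&mut self) {
        // Runs once per thread on exit and cleans up every field.
        FLUSHED_BYTES.fetch_add(self.bytes.get(), Ordering::Relaxed);
        FLUSHED_CALLS.fetch_add(self.calls.get(), Ordering::Relaxed);
    }
}

thread_local! {
    static STATE: PerThread = PerThread { bytes: Cell::new(0), calls: Cell::new(0) };
}

// Record one (pretend) allocation of `bytes` bytes on the current thread.
fn record(bytes: usize) {
    STATE.with(|s| {
        s.bytes.set(s.bytes.get() + bytes);
        s.calls.set(s.calls.get() + 1);
    });
}

// Spawn a thread, record the given sizes there, and wait for it to exit.
fn run_worker(sizes: Vec<usize>) {
    thread::spawn(move || {
        for n in sizes {
            record(n);
        }
    })
    .join()
    .expect("worker thread panicked");
}
```

Calling `run_worker(vec![16, 32, 64])` and then reading `FLUSHED_BYTES` shows the per-thread totals arriving in the global counters once the worker has exited.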

rustbot (Collaborator) commented Jul 11, 2025

r? @ChrisDenton

rustbot has assigned @ChrisDenton.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

rustbot added labels Jul 11, 2025: S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties), T-libs (Relevant to the library team, which will review and decide on the PR/issue)

orlp (Contributor, Author) commented Jul 11, 2025

I've added a test which uses a GlobalAlloc which not only has a thread-local variable with a destructor in its alloc, but also in dealloc. I had to change the drop order in destructors::run slightly for this.

Writing such an allocator only works if we guarantee not to touch the GlobalAlloc anymore in a thread after calling destructors::run. From a quick glance, I believe we mostly already do this on all platforms, with two exceptions:

  • rt::thread_cleanup calls thread::drop_current which drops an Arc holding the Thread handle:
    drop(Thread::from_raw(current));
    The test currently does not catch this as the main thread holds the Thread handle alive in the .join().
  • On μITRON we drop a Box after running the thread-local destructors:
    unsafe { crate::sys::thread_local::destructors::run() };
    let old_lifecycle = inner
        .lifecycle
        .swap(LIFECYCLE_EXITED_OR_FINISHED_OR_JOIN_FINALIZE, Ordering::AcqRel);
    match old_lifecycle {
        LIFECYCLE_DETACHED => {
            // [DETACHED → EXITED]
            // No one will ever join, so we'll ask the collector task to
            // delete the task.
            // In this case, `*p_inner`'s ownership has been moved to
            // us, and we are responsible for dropping it. The acquire
            // ordering ensures that the swap operation that wrote
            // `LIFECYCLE_DETACHED` happens-before `Box::from_raw(
            // p_inner)`.
            // Safety: See above.
            let _ = unsafe { Box::from_raw(p_inner) };

Can we change/mitigate these things so that GlobalAlloc.dealloc may also access the thread-local with a destructor?

Just to clarify, the current behavior of this PR is safe: if you were to write a GlobalAlloc which accesses a thread-local with a destructor in its dealloc, you would just get a LocalKey panic-during-panic, which turns into an abort.
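
For context on that LocalKey panic: `LocalKey::with` panics once the slot's destructor is running, whereas `LocalKey::try_with` returns an `AccessError` instead. The sketch below shows this in plain code by touching a thread-local from inside its own destructor. The names `Guard`, `GUARD`, `try_count`, `demo`, and `SAW_ACCESS_ERROR` are hypothetical, and this is not code from the PR; it only illustrates the kind of fallback check a dealloc path could make under these assumptions.

```rust
use std::cell::Cell;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

// Set when an access during destruction was refused (hypothetical flag).
static SAW_ACCESS_ERROR: AtomicBool = AtomicBool::new(false);

struct Guard {
    hits: Cell<u32>,
}

thread_local! {
    static GUARD: Guard = Guard { hits: Cell::new(0) };
}

// Bump the per-thread counter if the slot is still usable.
// Returns false instead of panicking when the slot is being
// destroyed or has already been destroyed.
fn try_count() -> bool {
    GUARD.try_with(|g| g.hits.set(g.hits.get() + 1)).is_ok()
}

impl Drop for Guard {
    fn drop(&mut self) {
        // The slot's destructor is running here, so `with` would panic;
        // `try_with` reports an AccessError that we can handle.
        if !try_count() {
            SAW_ACCESS_ERROR.store(true, Ordering::Relaxed);
        }
    }
}

// Returns (access succeeded while the thread was alive,
//          access was refused during destruction).
fn demo() -> (bool, bool) {
    let alive = thread::spawn(try_count).join().expect("thread panicked");
    (alive, SAW_ACCESS_ERROR.load(Ordering::Relaxed))
}
```

An allocator written this way would take a slower, thread-agnostic path whenever `try_count` returns false, rather than panicking inside dealloc.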

ChrisDenton (Member) commented

cc @joboet: since you've done a lot in this area, did you want to take this? I'd be interested in your input even if not.

joboet (Member) commented Jul 17, 2025

Sure, I can take this. Though I'm not convinced that the current approach is a good idea – just one TLS variable with a destructor seems like an arbitrary limit and will hinder the compositionality of allocators. And this also doesn't work on platforms like GNU/Windows, where the TLS storage is always allocated. I think there are two better solutions:

  • Make the destructor list a single-linked list and store the nodes along with the data in the TLS static (this wouldn't fix the key-based TLS issue though)
  • Just switch to the System allocator for all std-internal allocations
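
One possible reading of the first option, sketched under assumptions: an intrusive singly-linked list whose nodes are stored next to the data they destroy, so registering a destructor never allocates. `DtorNode` and `DtorList` are hypothetical names, and this is not how std currently implements its destructor list.

```rust
use std::cell::Cell;
use std::ptr;

// One destructor registration, meant to be stored alongside the
// data it destroys (e.g. in the same TLS static).
struct DtorNode {
    next: Cell<*const DtorNode>,
    data: *mut u8,
    dtor: unsafe fn(*mut u8),
}

impl DtorNode {
    const fn new(data: *mut u8, dtor: unsafe fn(*mut u8)) -> Self {
        DtorNode { next: Cell::new(ptr::null()), data, dtor }
    }
}

// Per-thread list head. Registration only links pointers together.
struct DtorList {
    head: Cell<*const DtorNode>,
}

impl DtorList {
    const fn new() -> Self {
        DtorList { head: Cell::new(ptr::null()) }
    }

    // Safety: `node` must stay valid and unmoved until `run` has
    // returned, and must be registered at most once.
    unsafe fn register(&self, node: *const DtorNode) {
        unsafe { (*node).next.set(self.head.get()) };
        self.head.set(node);
    }

    // Runs destructors in reverse registration order. Each node is
    // unlinked before its destructor runs, so a destructor may
    // register further nodes.
    // Safety: every registered node and its data must still be valid.
    unsafe fn run(&self) {
        loop {
            let node = self.head.get();
            if node.is_null() {
                break;
            }
            unsafe {
                self.head.set((*node).next.get());
                ((*node).dtor)((*node).data);
            }
        }
    }
}
```

Because a node is just three words living beside its data, the list itself never calls into the global allocator. As noted above, this would not help key-based TLS, where the storage itself is allocated.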

orlp (Contributor, Author) commented Jul 17, 2025

> just one TLS variable with a destructor seems like an arbitrary limit

This PR doesn't prevent better solutions from being adopted later - it's already a huge improvement over the status quo (where an allocator is unable to register any kind of per-thread cleanup natively in Rust).

> and will hinder the compositionality of allocators

This solution (and limitation) is only for the global allocator, which is already not composable. Only a single one may be set.

> And this also doesn't work on platforms like GNU/Windows, where the TLS storage is always allocated

I don't quite follow, why would this not work on GNU/Windows?
