Who is responsible for preventing reentrancy issues through the allocator? #506

RalfJung · 2024-04-17T08:34:06Z

Calling the global allocator can in turn call arbitrary user-defined code

if there is a custom global allocator (needs unsafe)
if allocation fails, and there is one of
- a custom alloc hook (unstable)
- -Zoom=panic (unstable) plus a custom panic hook
- a custom #[alloc_error_handler] (unstable, requires no_std)
- a no_std library using alloc's default error hook (which panics) with a #[panic_handler] (may be possible on stable?)

It may just barely be the case that this cannot be triggered on stable today, depending on whether there is any way to actually run the liballoc default error hook on stable. I think the __rust_no_alloc_shim_is_unstable symbol makes that impossible though? @bjorn3 might know.

Similar to #505, this can lead to reentrancy. That has caused real soundness questions: is it the responsibility of the allocator and the alloc error hook to never call into data structures except if they are declared reentrant, or is it the responsibility of every data structure to be resilient against reentrancy through the allocator? If they anyway have to be able to deal with reentrant panic hooks (due to #505), then is the extra possibility of reentrancy via the allocator even relevant?

The text was updated successfully, but these errors were encountered:

CAD97 · 2024-04-17T09:21:54Z

Actually IIUC handle_alloc_error with -Zoom=abort still calls the panic hook, it just does so with can_unwind=false so no unwinding occurs.

A more niche but more obviously suspect case of potential surprise single thread reentrancy is signal handlers. At least there the usual model is that they're a separate logical thread of execution even if physically they run within an existing thread.

I think "global allocator implementation shouldn't perform allocation" is a common rule, which would include forbidding panics. But I think any scenario which says panicking at all is unsound without even being able to cover it with catch_unwind is untenable.

RalfJung · 2024-04-17T09:36:15Z

swc-project/swc#8362 is likely still an issue in the latest version of that crate. If there is a custom allocator, and that allocator is allowed to panic, then one can cause an aliasing violation in entirely safe code:

call AtomStoreCell::atomic
that will create an &mut and pass it to AtomStore::atom (meaning it is protected and invalidation is insta-UB)
that will call some things in the hstr crate which end up doing an allocation
that will call our allocator, which panics
that will call our panic hook, which calls AtomStoreCell::atomic a second time which creates an aliasing &mut which is immediate UB due to the protector created the first time around

Actually IIUC handle_alloc_error with -Zoom=abort still calls the panic hook, it just does so with can_unwind=false so no unwinding occurs.

No, the alloc error handler in std will call rtprintpanic! which does not call the panic hook.

signal handlers

Those are very different; they may only do async-signal-safe things and they are generally accepted as a very non-standard execution environment.

I think "global allocator implementation shouldn't perform allocation" is a common rule, which would include forbidding panics. But I think any scenario which says panicking at all is unsound without even being able to cover it with catch_unwind is untenable.

So we should forbid panics but also forbidding panics is untenable...?

CAD97 · 2024-04-22T07:34:54Z

To restate hopefully a bit clearer:

I believe "global allocators don't reentrantly call the global allocator" is a common property that people expect to be upheld. However, I don't think that treating this as a soundness property that must be upheld by the developer is a viable option (c.f. allocation size fitting isize). Since arbitrary panic hooks may of course include allocation, such a requirement means writing entirely panic-free code, elevating accidental panics from a likely fatal error to unsoundness.

There's two sides to the coin here: reentrancy in code because the allocator reentered it, and reentrancy of the allocator itself. My observation is more about the latter, not the former. Making code no-global-alloc is a reasonable ask (the allocator won't enter your data structure that uses the allocator), but no-panic isn't really, not without better compiler assistance anyway. So perhaps this ends up more about #505 (reentrancy via panic hook) than this issue (reentrancy via alloc hook).

However, it's possible to imagine the Rust runtime enforcing alloc reentrancy freedom in the global allocator dynamically, by manipulating the alloc and/or panic hook. (But if safe APIs must tolerate reentrancy from the panic hook anyway, preventing alloc reentrancy doesn't seem actually beneficial.)

jwong101 · 2024-10-13T03:25:01Z

Making the global allocator be the one for ensuring reentrancy seems to be the most reasonable option. Yes it's difficult, but it seems natural that the "code in charge of global dynamic memory allocation" should shoulder more responsibility, rather than having all other unsafe code/data structures have to worry about being reentered whenever they allocate memory.

If they anyway have to be able to deal with reentrant panic hooks (due to #505), then is the extra possibility of reentrancy via the allocator even relevant?

Yes, since the allocator currently isn't allowed to panic.

Otherwise this code:

https://github.com/rust-lang/rust/blob/ef4e8259b5016d85e261587b605028b2ff06c13d/library/alloc/src/rc.rs#L1877

is unsound if Global.alloc() is allowed to soundly call into the Rc:

use std::alloc::{GlobalAlloc, Layout, System};
use std::cell::Cell;
use std::mem::ManuallyDrop;
use std::rc::{Rc, Weak};
use std::sync::atomic::{AtomicBool, Ordering};

type RcBox = Rc<Box<i32>>;
type WeakBox = Weak<Box<i32>>;
struct Demonic;

thread_local! {
    static STRONG: Cell<Option<RcBox>> =  const { Cell::new(None) };
    static WEAK: Cell<WeakBox> =  const { Cell::new(Weak::new()) };
}

static SIDE_CHANNEL: AtomicBool = AtomicBool::new(false);

fn dupe_rc() {
    // `ManuallyDrop` used to avoid an overflow check in debug mode
    STRONG.set(ManuallyDrop::new(WEAK.take()).upgrade());
}

unsafe impl GlobalAlloc for Demonic {
    unsafe fn alloc(&self, l: Layout) -> *mut u8 {
        if SIDE_CHANNEL.load(Ordering::Relaxed) {
            SIDE_CHANNEL.store(false, Ordering::Relaxed);
            dupe_rc();
        }
        System.alloc(l)
    }

    unsafe fn dealloc(&self, c: *mut u8, l: Layout) {
        System.dealloc(c, l);
    }
}

#[global_allocator]
static D: Demonic = Demonic;

fn main() {
    let mut oblivious = RcBox::default();
    WEAK.set(Rc::downgrade(&oblivious));
    SIDE_CHANNEL.store(true, Ordering::Relaxed);
    Rc::make_mut(&mut oblivious);
    drop(oblivious);

    // double free. whose fault is it?
    STRONG.set(None);
}

miri backtrace

jwong3@JWF3QD9WNV galloc % MIRIFLAGS=-Zmiri-backtrace=full cargo miri run
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.00s
     Running `/Users/jwong3/.rustup/toolchains/nightly-aarch64-apple-darwin/bin/cargo-miri runner target/miri/aarch64-apple-darwin/debug/galloc`
error: Undefined Behavior: memory access failed: alloc1836 has been freed, so this pointer is dangling
  --> /Users/jwong3/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/std/src/sys/alloc/unix.rs:48:18
   |
48 |         unsafe { libc::free(ptr as *mut libc::c_void) }
   |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ memory access failed: alloc1836 has been freed, so this pointer is dangling
   |
   = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
   = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
help: alloc1836 was allocated here:
  --> src/main.rs:28:9
   |
28 |         System.alloc(l)
   |         ^^^^^^^^^^^^^^^
help: alloc1836 was deallocated here:
  --> src/main.rs:32:9
   |
32 |         System.dealloc(c, l);
   |         ^^^^^^^^^^^^^^^^^^^^
   = note: BACKTRACE (of the first span):
   = note: inside `std::sys::alloc::unix::<impl std::alloc::GlobalAlloc for std::alloc::System>::dealloc` at /Users/jwong3/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/std/src/sys/alloc/unix.rs:48:18: 48:54
note: inside `<Demonic as std::alloc::GlobalAlloc>::dealloc`
  --> src/main.rs:32:9
   |
32 |         System.dealloc(c, l);
   |         ^^^^^^^^^^^^^^^^^^^^
note: inside `_::__rust_dealloc`
  --> src/main.rs:37:11
   |
36 | #[global_allocator]
   | ------------------- in this procedural macro expansion
37 | static D: Demonic = Demonic;
   |           ^^^^^^^
   = note: inside `std::alloc::dealloc` at /Users/jwong3/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/alloc/src/alloc.rs:119:14: 119:64
   = note: inside `<std::alloc::Global as std::alloc::Allocator>::deallocate` at /Users/jwong3/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/alloc/src/alloc.rs:269:22: 269:51
   = note: inside `<std::boxed::Box<i32> as std::ops::Drop>::drop` at /Users/jwong3/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/alloc/src/boxed.rs:1679:17: 1679:66
   = note: inside `std::ptr::drop_in_place::<std::boxed::Box<i32>> - shim(Some(std::boxed::Box<i32>))` at /Users/jwong3/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:574:1: 574:56
   = note: inside `<std::rc::Rc<std::boxed::Box<i32>> as std::ops::Drop>::drop` at /Users/jwong3/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/alloc/src/rc.rs:2256:17: 2256:66
   = note: inside `std::ptr::drop_in_place::<std::rc::Rc<std::boxed::Box<i32>>> - shim(Some(std::rc::Rc<std::boxed::Box<i32>>))` at /Users/jwong3/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:574:1: 574:56
   = note: inside `std::ptr::drop_in_place::<std::option::Option<std::rc::Rc<std::boxed::Box<i32>>>> - shim(Some(std::option::Option<std::rc::Rc<std::boxed::Box<i32>>>))` at /Users/jwong3/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:574:1: 574:56
   = note: inside `std::cell::Cell::<std::option::Option<std::rc::Rc<std::boxed::Box<i32>>>>::set` at /Users/jwong3/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/cell.rs:428:26: 428:27
   = note: inside closure at /Users/jwong3/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/std/src/thread/local.rs:349:17: 349:45
   = note: inside `std::thread::LocalKey::<std::cell::Cell<std::option::Option<std::rc::Rc<std::boxed::Box<i32>>>>>::initialize_with::<{closure@std::thread::LocalKey<std::cell::Cell<std::option::Option<std::rc::Rc<std::boxed::Box<i32>>>>>::set::{closure#0}}, ()>` at /Users/jwong3/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/std/src/thread/local.rs:311:9: 311:27
   = note: inside `std::thread::LocalKey::<std::cell::Cell<std::option::Option<std::rc::Rc<std::boxed::Box<i32>>>>>::set` at /Users/jwong3/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/std/src/thread/local.rs:344:9: 351:11
note: inside `main`
  --> src/main.rs:47:5
   |
47 |     STRONG.set(None);
   |     ^^^^^^^^^^^^^^^^

EDIT: made the example less pathological/more realistic

CAD97 · 2024-10-13T16:24:54Z

Yes, since the allocator currently isn't allowed to panic.

This isn't fully clear in the current docs; they say that

It’s undefined behavior if global allocators unwind. This restriction may be lifted in the future, but currently a panic from any of these functions may lead to memory unsafety.

but this may refer either to any attempt to initiate a panic within the scope of allocation, or only to an attempt to unwind from (out of) an allocation function.

Otherwise this code […] is unsound if Global.alloc() is allowed to soundly call into the Rc:

Good proof of concept. It looks like Rc::make_mut should .dec_strong() earlier to ~~pre-poop its pants~~ guard against reentrancy despite the comment's assertion that this branch doesn't need panic protection.

But this does clearly illustrate the difficulty of accounting for allocation being a potential reentrancy point and I think put it on a similar footing as the isize allocation max where treating it as a global concern is basically untenable.

Additionally, I think this illustrates that if the panicking restriction isn't total and only raises library UB at the point of an unwinding out of an alloc function, then we need to have a flag in the panic count that prevents calling into the panic hook for a panic in the allocator.

It's probably also a good idea to put a helpful assert_unsafe_precondition! reentrancy guard in Global. The associated cost should be dominated easily by the actual allocation work.

RalfJung · 2024-10-13T16:35:14Z

That is a neat example. :) However, your code link points at a ptr::copy_nonoverlapping, so I was confused at first... did you mean to point at this instead?

https://github.com/rust-lang/rust/blob/ef4e8259b5016d85e261587b605028b2ff06c13d/library/alloc/src/rc.rs#L1872

That's a call to UniqueRcUninit::new, so allocation is involved.

Cc @rust-lang/wg-allocators for some more input on the intended soundness contract here.

It's probably also a good idea to put a helpful assert_unsafe_precondition! reentrancy guard in Global. The associated cost should be dominated easily by the actual allocation work.

The same example could be constructed using custom local allocators, right? We can't automatically add a reentrancy guard to those, I think...

RalfJung · 2024-10-13T16:39:45Z

but this may refer either to any attempt to initiate a panic within the scope of allocation, or only to an attempt to unwind from (out of) an allocation function.

Yeah... also, if we take away their ability to panic, we should give them some other way to halt execution. And even then it's an enormous footgun as rustc will insert a bunch of panics automatically... so probably what we should rather do is have a function like with_immediate_abort_on_panic(|| { /* code here */ }) such that if the closure starts a panic, this does not invoke the panic hook, and only calls code that we control and carefully ensure will not allocate. All allocators should then be implemented with that function.

jwong101 · 2024-10-13T18:39:21Z

However, your code link points at a ptr::copy_nonoverlapping, so I was confused at first... did you mean to point at this instead?

Yeah, sorry for the confusion. I didn't know which part to point out, so I highlighted the part where the Rc copies the Box into the new allocation, because it assumes that it's still the only strong reference after the allocation in UniqueRcUninit::new. It probably would've been clearer to highlight the allocation(and the possibility of it running arbitrary code that could make a new strong reference) instead.

bjorn3 · 2024-10-13T19:37:01Z

so probably what we should rather do is have a function like with_immediate_abort_on_panic(|| { /* code here */ }) such that if the closure starts a panic, this does not invoke the panic hook, and only calls code that we control and carefully ensure will not allocate.

There is no way to implement such a function for no_std programs. It did need a thread local variable, which requires std.

comex · 2024-10-13T22:28:09Z

There is no way to implement such a function for no_std programs. It did need a thread local variable, which requires std.

The cost of toggling a thread-local variable, while small, is not ideal for something as hot as an allocator. I can imagine schemes where it would instead just be checking a global variable, but that would still be a measurable expense.

I wish Rust had an effect system (although I like @nikomatsakis' "function flavor" term better). In this case, allocators could use a "no panic" flavor to ensure they never called into any code that could panic. There's been a drumbeat of requests for a "no panic" feature over the years, and this would be one more use case. Admittedly, a flavor system would not be a panacea; we would still be left with the problem that writing no-panic code is (currently) extremely unergonomic. But that's solvable. Without a flavor system, though, we can't even get started on solving it.

RalfJung mentioned this issue Apr 22, 2024

Better documentation needed: reentrancy through the panic hook #505

Open

alecmocatta mentioned this issue Jun 19, 2024

-Zoom=panic double panic due to panicking allocating rust-lang/rust#126683

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Who is responsible for preventing reentrancy issues through the allocator? #506

Who is responsible for preventing reentrancy issues through the allocator? #506

RalfJung commented Apr 17, 2024 •

edited

Loading

CAD97 commented Apr 17, 2024

RalfJung commented Apr 17, 2024

CAD97 commented Apr 22, 2024

jwong101 commented Oct 13, 2024 •

edited

Loading

CAD97 commented Oct 13, 2024

RalfJung commented Oct 13, 2024

RalfJung commented Oct 13, 2024 •

edited

Loading

jwong101 commented Oct 13, 2024

bjorn3 commented Oct 13, 2024

comex commented Oct 13, 2024

Who is responsible for preventing reentrancy issues through the allocator? #506

Who is responsible for preventing reentrancy issues through the allocator? #506

Comments

RalfJung commented Apr 17, 2024 • edited Loading

CAD97 commented Apr 17, 2024

RalfJung commented Apr 17, 2024

CAD97 commented Apr 22, 2024

jwong101 commented Oct 13, 2024 • edited Loading

CAD97 commented Oct 13, 2024

RalfJung commented Oct 13, 2024

RalfJung commented Oct 13, 2024 • edited Loading

jwong101 commented Oct 13, 2024

bjorn3 commented Oct 13, 2024

comex commented Oct 13, 2024

RalfJung commented Apr 17, 2024 •

edited

Loading

jwong101 commented Oct 13, 2024 •

edited

Loading

RalfJung commented Oct 13, 2024 •

edited

Loading