Is longjmp through rust code UB #404

Open
chorman0773 opened this issue May 23, 2023 · 49 comments

Comments

@chorman0773
Contributor

chorman0773 commented May 23, 2023

What is the result of using setjmp in C, entering Rust code, then calling longjmp, either from Rust or after first returning into C? Is this considered undefined behaviour or defined behaviour?

Because of Pin (and likely others, things like replace_with), this cannot be sound in general, but there is a question of whether it is considered immediately undefined behaviour or whether it is defined in Rust.

This intersects with @rust-lang/wg-ffi-unwind. From https://rust-lang.zulipchat.com/#narrow/stream/210922-project-ffi-unwind/topic/cost.20of.20supporting.20longjmp.20without.20annotations and https://rust-lang.zulipchat.com/#narrow/stream/210922-project-ffi-unwind/topic/longjmp.20rules the likely answer is "Undefined Behaviour" even through frames without destructors/catch_unwind.

In addition to determining the desired behaviour, we should consider what guarantees we've made that would make longjmp unsound (beyond Pin).

(Cc prior broader discussion: #211)

@digama0

digama0 commented May 23, 2023

I think it should be possible for at least some trivial versions of setjmp/longjmp to be sound. For example if both setjmp and longjmp calls are in C and they don't jump over any rust stack frames, then it should be possible to argue that this is just FFI having some lang-specific behavior which isn't relevant, as long as it does not explicitly manifest as "impossible behavior" from the rust code's perspective.

The next level of difficulty is jumping from C to C over a Rust stack frame which has no destructors, and here I think it's still fine, since this is as-if you just unwind those frames. (We may want to make this UB if there is a "nounwind" annotation of some kind. Possibly this is already true of all C calls, in which case this and everything after is UB at the unwind.)

If you jump from C to C over a Rust stack frame which has destructors, I still think it is fine as regards immediate UB, but this is unsound behavior and may trigger UB in subsequent rust code since a destructor was skipped.
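A minimal sketch of why a skipped destructor is a soundness problem rather than immediate UB. It does not call longjmp at all: `mem::forget` is used as a stand-in for a jump that leaves the frame without running its destructor, and `Slot`, `Restore`, `with_taken` and `invariant_holds_after` are hypothetical names made up for the illustration.

```rust
use std::mem;

/// A slot whose value is temporarily taken out and must be put back.
struct Slot {
    value: Option<String>,
}

/// Guard that restores the slot when it is dropped (on return or unwind).
struct Restore<'a> {
    slot: &'a mut Slot,
    taken: Option<String>,
}

impl Drop for Restore<'_> {
    fn drop(&mut self) {
        self.slot.value = self.taken.take();
    }
}

/// Takes the value out of the slot for the duration of the call.
/// If `skip_destructor` is true the guard is leaked with `mem::forget`,
/// which stands in for a longjmp that jumps over this frame.
fn with_taken(slot: &mut Slot, skip_destructor: bool) {
    let taken = slot.value.take();
    let guard = Restore { slot, taken };
    // ... a call into C that might longjmp would sit here ...
    if skip_destructor {
        mem::forget(guard); // stand-in: the destructor never runs
    }
    // otherwise `guard` is dropped here and the slot is restored
}

/// Returns whether the slot still holds its value afterwards.
fn invariant_holds_after(skip_destructor: bool) -> bool {
    let mut slot = Slot { value: Some(String::from("data")) };
    with_taken(&mut slot, skip_destructor);
    slot.value.is_some()
}
```

Nothing here is UB, but code that later relies on the slot being restored would misbehave; with an unsafe pattern such as replace_with in place of `Option::take`, the same skipped guard could lead to UB further down the line.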

For C to Rust or Rust to C, the question does not arise because (for a variety of reasons) the libc crate does not support setjmp or longjmp, last I checked. So you would have to go design that interface first.

@thomcc
Member

thomcc commented May 23, 2023

The next level of difficulty is jumping from C to C over a Rust stack frame which has no destructors, and here I think it's still fine, since this is as-if you just unwind those frames.
...
For C to Rust or Rust to C, the question does not arise because (for a variety of reasons) the libc crate does not support setjmp or longjmp, last I checked. So you would have to go design that interface first.

So, at work we call setjmp (sigsetjmp, really), and then immediately call into C, which may longjmp over Rust stack frames (ones which have no destructors) back to our setjmp. It's almost certainly a form of UB (one which works very reliably in practice at the moment), since we don't use the (still-unstable) "ffi_returns_twice" attribute [1].

I would strongly prefer this not be immediately UB though, since this kind of thing would be desirable to support and isn't that uncommon -- any Rust library that wants to call a C library that uses sjlj-based error-handling will want to do something like this.

Footnotes

  1. We'd probably have to redesign this interface to use it effectively without a significant performance cost: it inhibits inlining, so we'd have to use macros here instead of functions.

@Amanieu
Member

Amanieu commented May 23, 2023

For C to Rust or Rust to C, the question does not arise because (for a variety of reasons) the libc crate does not support setjmp or longjmp, last I checked. So you would have to go design that interface first.

We don't support setjmp because it has a weird calling convention that returns twice, however just calling longjmp from Rust is fine. It pretty much works like C to C over a Rust stack frame.

Calling setjmp is never going to be supported by Rust. The only way to safely use it is to call it using an inline asm wrapper, which would look something like this (credits to @nbdd0121, from Zulip):

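use core::ffi::c_int;

// Assumes x86-64 with the System V calling convention, and a `jmp_buf` type
// declared elsewhere that matches the platform's C `jmp_buf`.
unsafe fn setjmp<F: FnOnce() -> c_int>(env: *mut jmp_buf, f: F) -> c_int {
    // MaybeUninit keeps this frame from dropping `f` after `call_f` has moved it out.
    let mut f = core::mem::MaybeUninit::new(f);
    extern "C" fn call_f<F: FnOnce() -> c_int>(f: *mut F) -> c_int {
        // Move the closure out of the caller's slot and run it.
        (unsafe { f.read() })()
    }
    let ret: c_int;
    std::arch::asm!(
        "call setjmp",   // rdi = env; eax is 0 on the direct return
        "test eax, eax",
        "jne 2f",        // nonzero: we arrived here through longjmp, so skip `f`
        "mov rdi, r12",  // r12 is callee-saved, so it survives the call to setjmp
        "call {}",
        "2:",            // not `1:`: current rustc rejects labels made only of 0/1 digits on x86
        sym call_f::<F>,
        in("rdi") env,
        in("r12") f.as_mut_ptr(),
        lateout("rax") ret,
        clobber_abi("C"),
    );
    ret
}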

@chorman0773
Contributor Author

chorman0773 commented May 23, 2023

I would strongly prefer this not be immediately UB though, since this kind of thing would be desirable to support and isn't that uncommon -- any Rust library that wants to call a C library that uses sjlj-based error-handling will want to do something like this.

Note that any library that does that would have to be careful around Pin anyway, and probably also destructors (and catch_unwind) in general, since Windows and Unix are quite inconsistent wrt. longjmp evaluating destructors.

@chorman0773
Contributor Author

Things that definitely make longjmp through rust unsound:

  • Pin
  • catch_unwind and Destructors, which may be guarding unsafe code

Things that may make longjmp through certain rust frames UB (or otherwise must be unspecified):

  • catch_unwind and Destructors, which are run on Windows IIRC, and nowhere else.
  • I'd assume abort-on-unwind handlers for extern "C" fns (and into extern "C-unwind" in a panic=abort crate), for the same reason (not 100% sure on this).
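For reference, a small sketch of what a normal Rust unwind does to the two constructs listed above: the destructor of a live local runs, and catch_unwind observes the unwind. A longjmp that skips the frame (the non-Windows behaviour described in this thread) would do neither. `CountOnDrop` and `unwind_through_guarded_frame` are illustrative names, and the sketch assumes the default panic=unwind.

```rust
use std::panic::{self, AssertUnwindSafe};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Increments a counter when dropped.
struct CountOnDrop<'a>(&'a AtomicUsize);

impl Drop for CountOnDrop<'_> {
    fn drop(&mut self) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Unwinds out of a frame that holds a droppable local, and catches the
/// unwind one frame up.
/// Returns (number of destructors run, whether catch_unwind saw the unwind).
fn unwind_through_guarded_frame() -> (usize, bool) {
    let drops = AtomicUsize::new(0);
    let result = panic::catch_unwind(AssertUnwindSafe(|| {
        let _guard = CountOnDrop(&drops);
        // resume_unwind starts an unwind without invoking the panic hook.
        panic::resume_unwind(Box::new("unwinding out of the frame"));
    }));
    (drops.load(Ordering::SeqCst), result.is_err())
}
```

Unsafe code that relies on either of these effects happening is what makes jumping over such a frame unsound.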

@digama0

digama0 commented May 24, 2023

By the way, I don't think that soundness of longjmp is our (t-opsem's) concern. We are only concerned about what causes immediate UB, and soundness is a derived property of libraries that don't cause UB. So Pin basically doesn't matter to us at all. (Obviously we do need to care about it for documentation purposes, but that doesn't seem to be the title question.)

@comex

comex commented May 24, 2023

@RalfJung
Member

By the way, I don't think that soundness of longjmp is our (t-opsem's) concern. We are only concerned about what causes immediate UB, and soundness is a derived property of libraries that don't cause UB. So Pin basically doesn't matter to us at all. (Obviously we do need to care about it for documentation purposes, but that doesn't seem to be the title question.)

Well we do care about defining the semantics in a way that Pin is sound. ;)

However, setjmp/longjmp cannot be UB in the sense of "the AM stops with an error when you do this", since the AM doesn't even have such an operation. So this is really a subquestion of the general question of what FFI/asm can do to the Rust AM implementation state, which seems to live at the intersection of t-opsem and t-compiler.

@nbdd0121

We define frame deallocation of non-POFs to be undefined.

@nbdd0121

If frames to be deallocated consist of only POFs, then from Rust AM's perspective that's no different from just unwinding, given that no code is to be executed anyway?

@Amanieu
Member

Amanieu commented May 24, 2023

If frames to be deallocated consist of only POFs, then from Rust AM's perspective that's no different from just unwinding, given that no code is to be executed anyway?

The question basically boils down to: is the compiler allowed to add unwind landing pads when the original code doesn't have any? If so then it is UB to skip these with longjmp, even if the original program has no droppable types in a frame.

@digama0

digama0 commented May 24, 2023

However, setjmp/longjmp cannot be UB in the sense of "the AM stops with an error when you do this", since the AM doesn't even have such an operation.

It could, though. It would be something like an unwind from a function with a flag in the exception state to skip all destructors (call it "forced unwind"). This would then otherwise act just like an unwind, except that destructors would not be run on the way out. If you hit a nounwind function, you get UB. (There might also be functions which are no-normal-unwind which are okay with an unwind that skips destructors, in which case only "normal" unwinding is UB, not forced unwind. This is I think the common case for function calls that lack a cleanup block continuation.)

Obviously, forced unwind is unsafe since it breaks safety invariants, but it's not undefinable in the AM, and having a sane definition of what it does on the AM will avoid some of the weirder states this operation would otherwise end up in. Of course adding an operation to the AM means that it is now a thing you have to worry about happening from any opaque or external function, but you can control the fallout from that by liberally sprinkling "no-forced-unwind" on such functions until you can be sure they can be supported.
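A toy model of the "forced unwind" idea described above, only to make the proposed rules concrete. It is not how the Rust AM or any compiler is specified: `UnwindKind`, `Allows`, `Frame`, `Outcome` and `unwind` are all hypothetical names, and "UB" is represented as an ordinary return value.

```rust
/// How an unwind treats the frames it passes through (hypothetical model).
#[derive(Clone, Copy, PartialEq, Debug)]
enum UnwindKind {
    Normal, // runs destructors
    Forced, // skips destructors, as a longjmp would
}

/// What a frame permits to pass through it (hypothetical attributes).
#[derive(Clone, Copy, PartialEq, Debug)]
enum Allows {
    AnyUnwind,
    ForcedOnly, // "no-normal-unwind": no cleanup block, but may be skipped over
    Nothing,    // "nounwind": no unwind of either kind may pass
}

struct Frame {
    name: &'static str,
    pending_destructors: usize,
    allows: Allows,
}

#[derive(PartialEq, Debug)]
enum Outcome {
    /// The unwind completed; records how many destructors ran in total.
    Unwound { destructors_run: usize },
    /// The unwind hit a frame that does not permit it.
    UndefinedBehavior { at: &'static str },
}

/// Unwinds through `frames`, innermost first.
fn unwind(frames: &[Frame], kind: UnwindKind) -> Outcome {
    let mut destructors_run = 0;
    for frame in frames {
        let permitted = match (frame.allows, kind) {
            (Allows::AnyUnwind, _) => true,
            (Allows::ForcedOnly, UnwindKind::Forced) => true,
            _ => false,
        };
        if !permitted {
            return Outcome::UndefinedBehavior { at: frame.name };
        }
        if kind == UnwindKind::Normal {
            destructors_run += frame.pending_destructors;
        }
    }
    Outcome::Unwound { destructors_run }
}
```

In this model a forced unwind behaves exactly like a normal one except that no destructor runs, and the only extra source of UB is a frame that forbids the kind of unwind passing through it.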

@chorman0773
Contributor Author

chorman0773 commented May 24, 2023 via email

@digama0

digama0 commented May 24, 2023

Note that skipping destructors is not even guaranteed - on SEH destructors are run and catches will "catch" a longjmp. I do not know enough about how panics are implemented on SEH to say whether this can even be feasibly disabled.

I think that would be handled by having longjmp do a normal unwind on windows, assuming it calls all destructors and rust drop impls count as destructors for this purpose. It would still be possible to implement forced unwind on windows using inline assembly, but I don't know whether we need to declare that UB for other reasons (i.e. windows gets messed up because it needs all unwinds to go through SEH for some reason).

@bjorn3
Member

bjorn3 commented May 24, 2023

LLVM will optimize out destructors that are only reachable when unwinding from calls to nounwind functions, yet it allows longjmp out of a nounwind function just fine. This means that even on Windows it is undefined whether destructors run, and which ones do.

@digama0

digama0 commented May 24, 2023

That sounds pretty much like undefined behavior. I can't possibly see how LLVM could justify such a thing otherwise. It is quite reasonable to assert that nounwind functions can't unwind, but that means that longjmp on windows is violating LLVM assumptions by unwinding anyway so that would have to be UB when it hits a nounwind function. (This is all talking about the LLVM semantics, not rust.)

@bjorn3
Member

bjorn3 commented May 24, 2023

See rust-lang/rust#88243 (comment) and the comment after that.

@digama0

digama0 commented May 24, 2023

Ugh, this is awkward. I don't think we want to expose the LLVM behavior of evaluating a random subset of the destructors as that is impossible to code for, and for the most part this is already LLVM UB:

This function attribute indicates that the function never raises an exception. If the function does raise an exception, its runtime behavior is undefined. However, functions marked nounwind may still trap or generate asynchronous exceptions. Exception handling schemes that are recognized by LLVM to handle asynchronous exceptions, such as SEH, will still provide their implementation defined semantics.

But the last sentence here is a cop-out which exposes this subsetting behavior and classifies it as "implementation defined". I'm not sure how a longjmp can be classified as a "trap or asynchronous exception" though, since it is clearly synchronous and not a trap AFAIK. Apparently this tortured reading is used to justify using nounwind in MSVC even when longjmp is used.

I would prefer we expose two reliable behaviors: either all of the destructors are evaluated, or none of them are. If we can't ensure either of those two behaviors, it should just be declared as UB. However, I don't think we need that UB fallback, as long as rustc stops using nounwind on functions that call longjmp or external functions on windows, like @nikic was suggesting in the linked thread.

Alternatively, we could just go the route of C++ and declare it UB to jump over any destructors. That amounts to eliminating the second "no destructors are evaluated" unwind mode from the AM, but it would make longjmp and pthread_kill immediate UB in the majority of cases, which seems worse to me since I doubt that will stop people from doing it anyway.

@nbdd0121

nbdd0121 commented May 25, 2023

I would prefer we expose two reliable behaviors: either all of the destructors are evaluated, or none of them are. If we can't ensure either of those two behaviors, it should just be declared as UB.

We definitely can't guarantee all destructors to be evaluated. For example, in glibc pthread_exit will use forced unwind first, and then use longjmp to skip over the rest of the frames once the forced unwind reaches the end of the stack. So if we have a call chain like this:

start_routine -> Rust Frame 1 -> C Frame 2 -> Rust Frame 3 -> pthread_exit

then it will unwind through frame 3 and skip over frame 1 & 2. Needless to say, longjmp won't call any destructors. We can prevent destructors from being run at all by instructing the personality function to skip over destructors during forced unwind.

For Windows, we currently run all destructors for -Cpanic=unwind and skip all of them for -Cpanic=abort. This prevents the nounwind guards that we add to abort when unwinding through POFs. We certainly can try to stop running the destructors for -Cpanic=unwind, but that'll probably be undesirable performance-wise.

Alternatively, we could just go the route of C++ and declare it UB to jump over any destructors. That amounts to eliminating the second "no destructors are evaluated" unwind mode from the AM, but it would make longjmp and pthread_kill immediate UB in the majority of cases, which seems worse to me since I doubt that will stop people from doing it anyway.

I don't get the reasoning. How do we reason about code that relies on destructors on stack being run without declaring skipping destructors on stack UB? The POF requirement for ensuring longjmp/pthread_kill to be non-UB sounds very reasonable to me.

@digama0

digama0 commented May 25, 2023

Alternatively, we could just go the route of C++ and declare it UB to jump over any destructors. That amounts to eliminating the second "no destructors are evaluated" unwind mode from the AM, but it would make longjmp and pthread_kill immediate UB in the majority of cases, which seems worse to me since I doubt that will stop people from doing it anyway.

I don't get the reasoning. How do we reason about code that relies on destructors on stack being run without declaring skipping destructors on stack UB? The POF requirement for ensuring longjmp/pthread_kill to be non-UB sounds very reasonable to me.

My concern is that Rust makes use of destructors for a lot of things, and there are a lot of destructors which are probably fine to skip (either no-ops or memory leaks) and making it UB to have any destructors IMO doesn't respect an operational nature for UB.

Ideally there shouldn't be any difference between a destructor that does nothing and no destructor (especially since rustc has a lot of implementation detail smarts in the drop checker which may make it difficult as a user to determine whether in fact there is a destructor in the frame). Taking drop flags into account, you really can't even know statically whether a destructor will run, so detecting no-op destructors would require adding a magic extra op into the opsem to signal that a destructor is running, even if no drop code is present.
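A short sketch of the drop-flag point: whether a local still has a pending destructor at a given call can depend on a run-time condition, so it cannot always be read off the source. `Noisy`, `callee` and `frame` are illustrative names, and the log only records the order of events.

```rust
use std::cell::RefCell;

/// Records its drop in a shared log so the order of events can be inspected.
struct Noisy<'a>(&'a RefCell<Vec<&'static str>>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.borrow_mut().push("drop");
    }
}

/// Stand-in for a nested call that a longjmp might jump out of.
fn callee(log: &RefCell<Vec<&'static str>>) {
    log.borrow_mut().push("callee");
}

/// Whether `value` still has a pending destructor at the call to `callee`
/// depends on `moved_early`, which is only known at run time.
fn frame(moved_early: bool) -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let value = Noisy(&log);
        if moved_early {
            drop(value); // destructor runs here; the drop flag is cleared
        }
        callee(&log); // `value` is live here only if it was not moved above
    } // if the drop flag is still set, the destructor runs here
    log.into_inner()
}
```

With `moved_early` set, the frame has no pending destructor at the call; without it, the same call site sits in a frame with one.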

We definitely can't guarantee all destructors to be evaluated. For example, in glibc pthread_exit will use forced unwind first, and then use longjmp to skip over the rest of the frames once the forced unwind reaches the end of the stack. So if we have a call chain like this:

start_routine -> Rust Frame 1 -> C Frame 2 -> Rust Frame 3 -> pthread_exit

then it will unwind through frame 3 and skip over frame 1 & 2.

This might actually be okay, since it can be modeled as a normal unwind through frame 3, followed by a catch and rethrow in the C code to no-drop unwind through 1 & 2.

@BatmanAoD
Member

Deallocating frames is a very niche use-case in any language, and must always be treated with caution by users regardless of what we do to make it safer. It fundamentally cannot interact well with the concept of RAII, whether the language with RAII patterns is Rust, C++, or anything else.

The POF restriction is not just a reasonable restriction, but is the minimal restriction we can make; if we ever do make certain use-cases of longjmp formally well-defined in Rust, then it is quite possible that the restriction will be narrower than what's stated in RFC-2945.

I'm not really sure what you mean by "an operational nature for UB." But I do not see any reasonable possibility of defining any kind of "well-defined" drop/destructor behavior in the presence of longjmp or pthread_exit.

@digama0

digama0 commented May 25, 2023

To be clear, I think that the POF restriction is reasonable for the safety condition (among probably quite a few other conditions). I just don't think it's appropriate for the opsem itself.

@comex

comex commented May 25, 2023

I'm not really sure what you mean by "an operational nature for UB." But I do not see any reasonable possibility of defining any kind of "well-defined" drop/destructor behavior in the presence of longjmp or pthread_exit.

Why not? There's a middle ground between "well-defined" and "undefined" - something like "the implementation nondeterministically runs or skips the destructor". This would imply that the compiler could not perform any optimizations that assume that destructors always run (are there any?), but libraries still could do so (including the standard library).

@digama0

digama0 commented May 25, 2023

Yes, we could do that. I really hope we don't, but we could. Nondeterminism here would yield exponentially many possibilities, which is practically impossible for users to handle correctly and would also make Miri sad. Having a deterministic but platform specific behavior seems better to me.
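To put a number on "exponentially many", here is a small counting sketch. It assumes each of `n` pending destructors may independently run or be skipped, and compares that with a platform that runs either all of them or none. The function names are made up, and `n` is assumed to be below 32.

```rust
/// All outcomes if each of `n` pending destructors may independently run or
/// be skipped. Each outcome is a bitmask: bit `i` set means destructor `i` ran.
fn nondeterministic_outcomes(n: u32) -> Vec<u32> {
    (0..(1u32 << n)).collect()
}

/// Outcomes if the platform runs either every destructor or none of them.
fn all_or_nothing_outcomes(n: u32) -> Vec<u32> {
    if n == 0 {
        vec![0]
    } else {
        vec![0, (1u32 << n) - 1]
    }
}
```

Ten pending destructors already give 1024 possible states under the nondeterministic reading, against two under the all-or-nothing one.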

@RalfJung
Member

In my view, the Rust op.sem simply has no way to longjmp over a stack frame. So the question of an operational spec does not even come up. Instead, "what happens on longjmp" is similar to questions like "what happens when I change the stack pointer": this is about the (compiler-controlled) invariant that relates the real machine state with the AM state, and what we say users are required to do when modifying the real machine state in a way that is not possible with AM operations.

@comex

comex commented May 25, 2023

@RalfJung

It makes sense to me to define it as not literally "calling longjmp" but something lower-level that would also let you implement longjmp yourself, like (for the non-Windows interpretation of longjmp) –

FFI A calls into Rust code which calls back into FFI B, but then execution never returns from FFI B to the Rust code's ABI-designated return address, and instead, any or all of the following happen:

  • (a) the Rust code's stack frame(s) are overwritten by arbitrary data;
  • (b) new calls into Rust code are made with the stack pointer pointing into the same region as those frames;
  • (c) if FFI A was itself called from some other Rust code earlier in the stack, then FFI A returns to that Rust code.

In other words, FFI B longjmps to FFI A, and then FFI A either (a) makes calls to non-Rust code reusing that portion of the stack, (b) makes calls to Rust code reusing that portion of the stack, and/or (c) returns.

But we still have to define what happens to the Abstract Machine in that case, unless we want to leave longjmp as always UB.

@RalfJung
Member

Again, this is just like messing with the stack pointer: it involves changing Rust-controlled state from outside Rust.

We can of course state that certain ways of doing that are okay. But what we are looking at here is below the level of abstraction of the Abstract Machine. We don't "have to" define anything, but in this case it sounds like we want to say that if all the frames removed from the stack are POF then this is fine.

@CAD97

CAD97 commented May 27, 2023

In the case of POF, the implementation can likely decide that this is allowable completely independent from the AM op.sem, as the implementation can "merely" implement the AM semantics such that manipulating the stack pointer in a controlled fashion behaves as-if/identically-to an unwind with the same control flow w.r.t. observable effects.

Thus, "can you do this" feels like a rustc/implementation question rather than an AM/opsem question. However, it can become an opsem question if we want to guarantee that doing so is allowed no matter what target is implementing Rust, by partially specifying how the abstract semantics are lowered to target operations (here, likely by defining the concept of next operation in a controllable manner).

Given Rust defines some things roughly in terms of C (e.g. core::ffi, extern/repr "C"), we do have a soft expectation that the Rust AM is lowered to a target capable of hosting the C AM, so it is arguably relevant to at least mention exotic capabilities of the C AM (e.g. longjmp) in the opsem, even if just as a note that the Rust AM does not provide a direct equivalent and any use that impacts Rust frames is at best nonportable.

TL;DR: this is a question of the AM to target lowering, not of the AM opsem. It's debatably in the broad domain of UCG but not T-opsem.

@nbdd0121

If we allow longjmp, then we can't create landing pads out from nowhere. i.e.:

fn foo(x: &mut u32) {
	*x += 1;
	bar();
}

cannot be transformed to

fn foo(x: &mut u32) {
	defer! { *x += 1; }
	bar();
}
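A runnable sketch of the two versions, with a hand-written scope guard standing in for `defer!` (the `Defer` type, the `unwind` parameter and `observe` are assumptions added for the illustration). It shows that the two are indistinguishable under a normal return and under a normal unwind, which is what would make the transformation tempting.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Minimal scope guard standing in for the `defer!` macro (hypothetical helper).
struct Defer<F: FnMut()>(F);

impl<F: FnMut()> Drop for Defer<F> {
    fn drop(&mut self) {
        (self.0)();
    }
}

/// `bar` either returns normally or unwinds.
fn bar(unwind: bool) {
    if unwind {
        // resume_unwind starts an unwind without invoking the panic hook.
        panic::resume_unwind(Box::new("bar unwinds"));
    }
}

fn foo_direct(x: &mut u32, unwind: bool) {
    *x += 1;
    bar(unwind);
}

fn foo_deferred(x: &mut u32, unwind: bool) {
    let _guard = Defer(|| *x += 1); // the increment now lives in a landing pad
    bar(unwind);
}

/// Runs `f` on a fresh counter, catching any unwind, and returns the counter.
fn observe(f: fn(&mut u32, bool), unwind: bool) -> u32 {
    let mut x = 0;
    let _ = panic::catch_unwind(AssertUnwindSafe(|| f(&mut x, unwind)));
    x
}
```

Both versions leave the counter at 1 in both cases. A longjmp out of `bar` that skips landing pads is the one exit that would tell them apart: `foo_direct` has already incremented, while the guard in `foo_deferred` would never run.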

@thomcc
Member

thomcc commented May 27, 2023

Isn't this already true? My read of https://github.com/rust-lang/rfcs/blob/master/text/2945-c-unwind-abi.md#plain-old-frames implies that we already have a notion of a plain-old-frame as not having destructors, and thus such a transformation already being forbidden.

I could be mistaken though.

@nbdd0121

nbdd0121 commented May 27, 2023

The whole point of defining POF in the RFC is to make longjmp over code with destructors undefined. So my argument is that this is indeed an AM/opsem question.

(unless I misunderstand the scope of opsem team)

@CAD97

CAD97 commented May 28, 2023

Do note that the RFC as accepted says that POF is necessary but not sufficient for forced unwinds to be allowed, and that there is no sufficient condition as of yet. The only thing set by that RFC is that a forced unwind over a non-POF is definitely UB.

(That actually answers one of @chorman0773's questions; per that RFC, longjmp over a non-POF is considered UB.)

The opsem just defines what happens, not how it's done. longjmp is (presumably) not an operation defined by the AM, so (it's undefined) it's not in scope of the opsem to say what happens if you try to do it, because it's a non-behavior. It's the domain of the lowering from AM to target-that-defines-longjmp to define (or not) how the implementation interacts with longjmp.

This is admittedly a weird edge case, since generally the Rust AM can be thought of as capable of directly hosting the C AM (i.e. by direct translation of semantics, not by interpreter or porting), but longjmp steps outside of that homomorphism. A more drastic question of "what happens if a thread is stopped and unwound/deallocated in the middle of a Rust frame" (as opposed to at an observable/FFI point) is more obviously not an opsem question, but is fundamentally the same class of question as it happening at an observable/FFI point; it's just not something which the AM can attempt to do, so it's not in the domain of the operational semantics of the AM.

@digama0

digama0 commented May 28, 2023

I want to reiterate that I see no reason whatsoever for longjmp to be something "outside the AM". It's literally just an unwind that skips destructors. There could be a flag for this and everything. There are some platform-specific details to work out, especially on non-POF frames, but assuming we ignore that situation / declare it UB there is not much else necessary to do to make this a legal citizen.

Of course, if you longjmp anywhere other than "up the stack" then I think the comments about this being similar to directly setting the instruction pointer apply, and that should just be immediate UB. But the normal use of this in C code is not that problematic. (setjmp on the other hand is weird, but @Amanieu already mentioned how this can be fixed by some inline assembly.)

@RalfJung
Member

Sure we could add longjmp to the Rust AM but IMO we should only do that if Rust actually provides setjmp/longjmp as an operation itself. I don't see why we would carry operations in the AM that we aren't even using as part of the language.

@digama0

digama0 commented May 31, 2023

I think those are somewhat independent questions, since exposing it to rust also means designing an API for it and that can also take some time. Having it in the AM means users are allowed to implement longjmp wrappers themselves, which is generally good enough for all the users that currently want this, but sure we can try to make a version that is exposed from the standard library. It would still be unsafe though, of course.

@RalfJung
Member

RalfJung commented May 31, 2023 via email

@digama0

digama0 commented May 31, 2023

I'm not sure what you mean by that. If this is not an AM operation, then it is not allowed for FFI to do it either, unless the effect is completely unobservable from the rust side. That means no jumping over rust frames, POF or otherwise.

I generally view the set of operations available to the AM as the mechanism we use to make precise "FFI can do anything rust code could do", where "what rust code could do" is interpreted as "some legal sequence of AM operations". Adding an operation to the AM without adding rust surface syntax for it is thus saying that this is an operation that FFI and inline assembly is allowed to do.

@RalfJung
Member

RalfJung commented Jun 1, 2023

My thinking was that for POF, the effect is unobservable from the Rust side, since it's just like a regular unwind. That's why POF are okay to just jump over.

But indeed this doesn't force the compiler to preserve the property of being a POF, so I guess we do need more than that... hm.

@chorman0773
Contributor Author

On the "Can a POF become a non-POF" question, inlining is also fun: can a non-POF function be inlined into a POF (thus implicating longjmp)?

I thought of this while asking a tangentially-related question on the project-ffi-unwind zulip stream.

@digama0

digama0 commented Jun 18, 2023

I don't think inlining should pose any special issues, since the "frame" isn't just the entire stack frame but rather the set of live droppable locals at the point of a call to a nested function, and so even within the same function you could have multiple nested calls in which from the point of one call this call frame is a POF and from another it's not. From that perspective inlining can only introduce droppable locals which are not live at the point of existing nested calls, unless you move drop() past one of those calls, which AFAIK is never allowed.
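A small sketch of the "per call site" reading: the same function has no live droppable locals at its first nested call and one at its second. A normal unwind is used to observe this, since it runs exactly the destructors that are pending at the point of exit. `Tracked`, `nested`, `frame` and `destructors_run_when_leaving` are illustrative names.

```rust
use std::panic::{self, AssertUnwindSafe};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Increments a counter when dropped.
struct Tracked<'a>(&'a AtomicUsize);

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Stand-in for a nested call that may unwind out of `frame`.
fn nested(unwind: bool) {
    if unwind {
        panic::resume_unwind(Box::new("leaving the frame"));
    }
}

/// One function, two nested calls with different sets of live locals.
fn frame(drops: &AtomicUsize, leave_at_first: bool) {
    nested(leave_at_first); // no droppable locals are live here
    let _local = Tracked(drops);
    nested(!leave_at_first); // `_local` is live here
}

/// Number of destructors that ran in `frame` while unwinding out of it.
fn destructors_run_when_leaving(at_first_call: bool) -> usize {
    let drops = AtomicUsize::new(0);
    let _ = panic::catch_unwind(AssertUnwindSafe(|| frame(&drops, at_first_call)));
    drops.load(Ordering::SeqCst)
}
```

Leaving at the first call runs no destructor in `frame`; leaving at the second runs one, so only the first call site sees the frame as plain in the sense discussed here.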

Perhaps it could be allowed to extend a drop() under the as-if rule if the drop call is empty, although in that case why aren't you just deleting the call? It is not clear whether drop calls with empty body count as violating the POF property, and it seems like an optimization hazard if it does.

@BatmanAoD
Member

BatmanAoD commented Aug 23, 2023

@digama0 Sorry for the late response. I think you're exactly correct that inlining isn't a special case for POFs, and I also think it's safe to say that the compiler should always be free to delete empty drop() calls, at least as far as POFs are concerned.

It is not clear whether drop calls with empty body count as violating the POF property...

The definition of POF is:

A "POF", or "Plain Old Frame", is defined as a frame that can be trivially deallocated: returning from or unwinding a POF cannot cause any observable effects. This means that POFs do not contain any pending destructors (live Drop objects) or catch_unwind calls.

The second sentence is intended to be a pure extrapolation from the first; the "cannot cause any observable effects" is the only "normative" part of the definition, so to speak. So if calling drop() does nothing, then the compiler can treat the frame as a POF.

...that said, what we ultimately want is for users to be able to guarantee that frames are POFs (so that they can longjmp over them), and there may be situations in which the compiler generates a call to drop but can't know whether or not the function actually does anything (though I can't think of an example off-hand; e.g. Box<dyn ...> has a non-trivial drop regardless of what type is owned by the box). I suspect we'll eventually need to decide whether the compiler needs to be able to "prove" that something is a POF in order to permit longjmp, or whether we take the C/C++ route of requiring the user to ensure that no "important" drops are missed.
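As a rough illustration of the gap between "has drop glue" and "dropping has observable effects", `std::mem::needs_drop` can be queried for a few types. Note that `needs_drop` is documented as a conservative hint, not as a POF test, and `EmptyDrop`, `Plain` and `drop_glue_report` are names invented for this sketch.

```rust
use std::any::Any;
use std::mem::needs_drop;

/// A type whose `Drop` impl does nothing.
struct EmptyDrop;

impl Drop for EmptyDrop {
    fn drop(&mut self) {}
}

/// Plain data with no destructor anywhere inside it.
#[allow(dead_code)]
struct Plain {
    a: u32,
    b: [u8; 4],
}

/// Reports, for a few types, whether the compiler considers dropping them to matter.
fn drop_glue_report() -> [(&'static str, bool); 4] {
    [
        ("Plain", needs_drop::<Plain>()),
        ("String", needs_drop::<String>()),
        ("Box<dyn Any>", needs_drop::<Box<dyn Any>>()),
        ("EmptyDrop", needs_drop::<EmptyDrop>()),
    ]
}
```

`EmptyDrop` is reported as needing drop even though its destructor has no effect, which is the kind of case where a type-level check and the "no observable effects" definition disagree.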

@CAD97

CAD97 commented Sep 3, 2023

A subset of this question is whether it's permitted to pthread_exit (POSIX) / ExitThread (Win32) from Rust main. This necessarily performs a forced unwind over stack frames not controlled by the user code (the std defined #[start] frame).

Purely from an opsem perspective, it can be argued that this subset of the question is actually a T-libs question, since it falls out of the language definition of when forced unwinding is permitted and how std defines its #[start] routine.


The default #[start] routine needs a catch_unwind landing pad to implement its documented behavior when main panics. Perhaps more interesting is a thread started by std::thread::spawn; it's not as immediately clear that this involves a std stack frame around the provided closure, but it does, and that frame also includes a catch_unwind.
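The std frame around a spawned closure can be observed from safe code: an unwind out of the closure does not tear down the process but is caught and handed to `join`. A minimal sketch, assuming the default panic=unwind (`join_after_unwind` is an illustrative name):

```rust
use std::panic;
use std::thread;

/// Spawns a thread whose closure unwinds, and reports what `join` sees.
fn join_after_unwind() -> Option<&'static str> {
    let handle = thread::spawn(|| {
        // resume_unwind starts an unwind without invoking the panic hook.
        panic::resume_unwind(Box::new("closure unwound"));
    });
    match handle.join() {
        Ok(()) => None,
        // The payload was caught inside std's frame and passed to the joiner.
        Err(payload) => payload.downcast_ref::<&'static str>().copied(),
    }
}
```

A forced unwind that skips this frame would bypass exactly that catch, which is why the behaviour of `join` afterwards is in question.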

Early termination of a thread is unquestionably unsafe, but I do think it would be somewhat unfortunate if it is unavoidable UB when the thread is created by Rust. Thus I expect that what we'll want to say is that forced unwinding skips any unwind landing pads (i.e. drop glue and catch_unwind) and is unsound unless the causer of the forced unwind has a proof that doing so is sound. We'd then specify that a forced unwind through process or thread main results in target defined behavior [1] and safe misbehavior of the standard library [2]. A cooperative unwind (e.g. panic) or regular return from main remains strongly preferred.

This would prohibit the compiler from moving observable behavior into unwind handlers, but tbf it seems relatively unlikely to benefit from doing so in practice. The most reasonable thing to move is de/alloc, and that's already considered not observable behavior.

Footnotes

  1. The OS defines what it means for a thread to terminate, e.g. whether terminating the process primary thread terminates the process along with it.

  2. For example, joining an abnormally terminated thread may succeed or may block forever, dependent on how std implements thread joins. Similarly, what happens to thread-local storage is unspecified, and may include deallocation without drop [3].

  3. Which means that pinning directly in TLS is unsound. And it might be desirable to be able to consider TLS statics as pinned. There's a crate which provides references to TLS data without a closure by registering a TLS dtor which blocks until all such reference lifetimes are known to have ended. The soundness of such is already iffy (it's relying on all TLS dtors being run before any TLS is deallocated), but it's an easy example that would be broken by a choice other than "leak all TLS" or "destroy all TLS normally."

@BatmanAoD
Member

> I expect that what we'll want to say is that forced unwinding skips any unwind landing pads (i.e. drop glue and catch_unwind) and is unsound unless the causer of the forced unwind has a proof that doing so is sound.

@CAD97 Agreed. The introduction of the "POF" terminology in RFC 2945 is intended to make it possible to formalize this.

@Amanieu
Member

Amanieu commented Sep 3, 2023

> Early termination of a thread is unquestionably unsafe, but I do think it would be somewhat unfortunate if it is unavoidable UB when the thread is created by Rust.

On the other hand, I think it is entirely reasonable to require that exiting a thread with pthread_exit must only be done with a thread explicitly started with pthread_create.

@CAD97
CAD97 commented Sep 3, 2023

> require that exiting a thread with pthread_exit must only be done with a thread explicitly started with pthread_create.

That is fair, and I only really mentioned specific functions as an example of the discussed functionality.

Back on the "permit it" hand, though, Rust does document that spawned threads correspond to real OS-level threads, including exposing the pthread_t for spawned threads.
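For reference, the exposure mentioned here is the Unix-only `JoinHandleExt` extension trait. A small sketch, assuming a Unix target; the function name `spawned_thread_exposes_pthread_t` is hypothetical.

```rust
/// Spawns a thread, borrows its raw pthread handle, then joins normally.
/// Returns true when the join succeeded.
#[cfg(unix)]
fn spawned_thread_exposes_pthread_t() -> bool {
    use std::os::unix::thread::{JoinHandleExt, RawPthread};

    let handle = std::thread::spawn(|| {});
    // Borrow the raw handle; ownership stays with the JoinHandle.
    let _raw: RawPthread = handle.as_pthread_t();
    handle.join().is_ok()
}
```

Having the `pthread_t` available does not by itself settle whether passing it to functions that end the thread early is permitted.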

> the "POF" terminology

Just to be clear, while it's easy to prove a forced unwind over POFs sound (as no unwind handlers exist to skip), I expect we'll likely want to permit forced unwinds over non-POFs, with the presented motivating example being a thread exit doing a forced unwind across a catch_unwind handler, which is unambiguously not a POF. And I don't particularly see how POFs tie into specifying such behavior.

@chorman0773
Contributor Author

> I expect we'll likely want to permit forced unwinds over non-POF

If we do so, we'd have to say it's unspecified whether they run destructors, because we can't actually promise either that they do or that they don't.

@nbdd0121
nbdd0121 commented Sep 3, 2023

I don't think we should make pthread_exit legal for a thread created using std, especially since this will add a lot of complexity to the language (about when a forced unwind is legal) and is problematic w.r.t. Pin. What do we gain by allowing this?

@CAD97
CAD97 commented Sep 3, 2023

> say it's unspecified whether [forced unwinds] run destructors

I would say that if it runs destructors, it's a cooperative unwind, not a forced unwind. Although I do suppose usual usage of the term is more about how/whether the unwind is stopped rather than destructors.

If a "portable" unwind source runs destructors on some targets, then it's a cooperative unwind on those targets, and a forced unwind on the targets where it skips destructors.
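The cooperative half of this distinction is easy to observe from safe Rust: a panic runs drop glue on its way to the nearest `catch_unwind`. A minimal sketch, assuming the default `panic = "unwind"` strategy; `Guard`, `CLEANED_UP`, and `cooperative_unwind_runs_drop` are hypothetical names. The forced case is deliberately not shown, since whether it is defined at all is the open question.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

/// Set by `Guard::drop` so the caller can tell whether cleanup ran.
static CLEANED_UP: AtomicBool = AtomicBool::new(false);

struct Guard;

impl Drop for Guard {
    fn drop(&mut self) {
        CLEANED_UP.store(true, Ordering::SeqCst);
    }
}

/// Panics inside `catch_unwind` while a `Guard` is live and reports whether
/// the unwind was caught and the guard's destructor ran.
fn cooperative_unwind_runs_drop() -> bool {
    CLEANED_UP.store(false, Ordering::SeqCst);
    let result = panic::catch_unwind(|| {
        let _guard = Guard;
        panic!("cooperative unwind");
    });
    result.is_err() && CLEANED_UP.load(Ordering::SeqCst)
}
```

Under the terminology proposed above, an unwind source that left `CLEANED_UP` unset on some target would count as a forced unwind on that target.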

And to note, at least on Windows (a notable example where setjmp/longjmp runs unwind handlers), specifying /EHsc to MSVC means that objects in scope aren't destroyed by asynchronous exceptions, so we are capable of ignoring "forced" unwinds piggybacking on the same SEH mechanism.

> this will add a lot of complexity to the language [...] What do we gain by allowing this?

The additional language complexity is relatively small; it's just to define what it means for a forced unwind to cross a non-POF frame. The soundness concern is a library concern; obviously it's unsound to do a forced unwind over stack pinned values or soundness-relevant unwind cleanup. The gain is being able to do so, with the main motivating example being the ability to exit a thread early. An actual use case of which being exiting the process main thread without terminating the process.

Requiring POF and considering a forced unwind over any nontrivial unwind handlers to be UB is still a viable choice, and one mirroring the effective status of C++, but is not without its limitations.

And for full transparency: I'm not certain that the more permissive option is actually desirable. It's more that I think it's potentially desirable and worth considering on merit.

@RalfJung
Member

RalfJung commented Sep 4, 2023

> An actual use case of which being exiting the process main thread without terminating the process.

And what's a use case for that?

Right now my thinking is that I don't see sufficient motivation to allow pthread_exit on threads created by the standard library. Even if we arrange things to make it not language-UB to pthread_exit the main thread or a thread::spawn thread, it would still remain library-UB since I don't think we want to constrain which Drop types the standard library can have on those stack frames -- and that means the fact that it is not language-UB isn't actually useful.
