Use asm-based shim for sigsetjmp #1144

thomcc · 2023-05-23T23:31:31Z

See rust-lang/unsafe-code-guidelines#404 (comment) for a proof of concept. This should make our setjmp handling less UB (without requiring unstable features), at the cost of needing some inline asm.

I'd have to write one for aarch64 too, and we'd use the current approach as a fallback for targets without a shim.

thomcc · 2023-05-24T17:33:02Z

Thinking more about it, we should just call it in our C shim.

It's not just the overhead of maintaining x86_64 vs aarch64 (and macos/linux/potentially windows), it's also tricky to get the relocations right -- e.g. call setjmp vs mov _setjmp@GOTPCREL(%rip), %r14; call *%r14;.

eeeebbbbrrrr · 2023-05-24T17:33:56Z

fun fact: plrust purposely doesn't include the cshim.

thomcc · 2023-05-24T17:35:35Z

Ah, right, dang. Okay, it's probably not as bad as I made it out to be above (although both snippets are valid). Ugh.

eeeebbbbrrrr · 2023-05-24T17:39:17Z

Something I was thinking about last night but didn't know how to express in words...

We currently stash the "jumpbuf" (the first argument to sigsetjmp) in Postgres's global exception stack. My assembly is at a 3rd grade level, but I'm not seeing how we can do that with this particular combination of instructions.

thomcc · 2023-05-24T17:41:28Z

Right, that's another part of why I'd rather do this in C if possible, since our implementation would be quite a bit more involved.

eeeebbbbrrrr · 2023-05-24T17:42:42Z

Can you explain what's wrong with our current approach?

I've always thought what we're doing is fine since we're only calling an extern "C" function while the jump is set.

thomcc · 2023-05-24T17:47:33Z

Basically setjmp/sigsetjmp is a function that returns twice, and the compiler doesn't know that.

As an example of how this might be an issue, consider stack slot coloring, which would normally allow a stack slot used for a variable at the time of (sig)setjmp to be reused for another variable later in the function if, based on the function's control flow, the original variable is dead. This is a problem if we longjmp back and try to access said variable.

That's just one example, there are a number of similar issues for functions like this.

eeeebbbbrrrr · 2023-05-24T17:49:56Z

Right, I generally understand why they're UB. I guess my question is... is how we're using them really UB and we've been getting lucky, or are we actually okay because we only do 1 thing after calling sigsetjmp?

thomcc · 2023-05-24T17:52:38Z

I'm not sure. There are a few possibilities, but my belief is: Error cases are rare, and this is only really UB in the case where the error occurs (otherwise setjmp does not return twice) meaning that the case where this causes issues probably doesn't get hit a lot, and may never occur in situations where the UB manifests.

The other possibility is that LLVM internally "knows" that sigsetjmp/setjmp are symbols that refer to functions that return twice. This is also possible but I'd want to verify that before really relying on it.

thomcc · 2023-05-24T19:18:40Z

We also could use ffi_returns_twice as an unstable feature, only for plrust. I'd rather not, but given that it's already using a custom rustc driver, it wouldn't be out of the question.

I'm going to noodle on this a bit, the more I think about it the more likely it seems that this is actually a real problem we're getting away with only because hitting errors is uncommon.

thomcc · 2023-05-25T02:15:18Z

It also occurs to me that we could push the jmp_buf on the postgres error stack prior to calling into that asm shim. There would be a period of time where the postgres error stack has an invalid jmp_buf on it, but that's fine so long as we do it right before the asm shim. (We don't have to worry about concurrency and just have to be sure not to panic between the two operations)

workingjubilee · 2024-02-22T03:44:47Z

note: https://docs.rs/cee-scape/latest/cee_scape/

The [cee-scape](https://crates.io/crates/cee-scape) crate uses a hand-rolled assembly wrapper around libc's `sigsetjmp()`. This has better correctness guarantees than calling `sigsetjmp()` directly, because LLVM is not allowed to reason about / attempt to optimize inside hand-written assembly. `sigsetjmp()` is used for `pg_guard_ffi_boundary_impl()`, which is our wrapper around Postgres' exception handling, since Postgres uses `sigsetjmp()` and `siglongjmp()` to handle multiple return points. Rust's paradigm doesn't understand this - in Rust, a a function call can only return to its caller at one point in time: when it has finished executing. Whereas, sigsetjmp can return to two places _chronologically_ - and so this needs to be handled carefully. Fixes #1144

workingjubilee added the ffi-safety label Jul 24, 2023

workingjubilee mentioned this issue Mar 25, 2024

Use Ceescape's sigsetjmp() wrapper #1621

Merged

workingjubilee closed this as completed in #1621 Mar 25, 2024

tucnak mentioned this issue Oct 1, 2024

Build failed on ppc64el #1897

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use asm-based shim for sigsetjmp #1144

Use asm-based shim for sigsetjmp #1144

thomcc commented May 23, 2023

thomcc commented May 24, 2023

eeeebbbbrrrr commented May 24, 2023

thomcc commented May 24, 2023

eeeebbbbrrrr commented May 24, 2023 •

edited

Loading

thomcc commented May 24, 2023

eeeebbbbrrrr commented May 24, 2023

thomcc commented May 24, 2023

eeeebbbbrrrr commented May 24, 2023

thomcc commented May 24, 2023

thomcc commented May 24, 2023

thomcc commented May 25, 2023 •

edited

Loading

workingjubilee commented Feb 22, 2024

Use asm-based shim for sigsetjmp #1144

Use asm-based shim for sigsetjmp #1144

Comments

thomcc commented May 23, 2023

thomcc commented May 24, 2023

eeeebbbbrrrr commented May 24, 2023

thomcc commented May 24, 2023

eeeebbbbrrrr commented May 24, 2023 • edited Loading

thomcc commented May 24, 2023

eeeebbbbrrrr commented May 24, 2023

thomcc commented May 24, 2023

eeeebbbbrrrr commented May 24, 2023

thomcc commented May 24, 2023

thomcc commented May 24, 2023

thomcc commented May 25, 2023 • edited Loading

workingjubilee commented Feb 22, 2024

eeeebbbbrrrr commented May 24, 2023 •

edited

Loading

thomcc commented May 25, 2023 •

edited

Loading