Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use asm-based shim for sigsetjmp #1144

Closed
thomcc opened this issue May 23, 2023 · 12 comments · Fixed by #1621
Closed

Use asm-based shim for sigsetjmp #1144

thomcc opened this issue May 23, 2023 · 12 comments · Fixed by #1621

Comments

@thomcc
Copy link
Contributor

thomcc commented May 23, 2023

See rust-lang/unsafe-code-guidelines#404 (comment) for a proof of concept. This should make our setjmp handling less UB (without requiring unstable features), at the cost of needing some inline asm.

I'd have to write one for aarch64 too, and we'd use the current approach as a fallback for targets without a shim.

@thomcc
Copy link
Contributor Author

thomcc commented May 24, 2023

Thinking more about it, we should just call it in our C shim.

It's not just the overhead of maintaining x86_64 vs aarch64 (and macos/linux/potentially windows), it's also tricky to get the relocations right -- e.g. call setjmp vs mov _setjmp@GOTPCREL(%rip), %r14; call *%r14;.

@eeeebbbbrrrr
Copy link
Contributor

fun fact: plrust purposely doesn't include the cshim.

@thomcc
Copy link
Contributor Author

thomcc commented May 24, 2023

Ah, right, dang. Okay, it's probably not as bad as I made it out to be above (although both snippets are valid). Ugh.

@eeeebbbbrrrr
Copy link
Contributor

eeeebbbbrrrr commented May 24, 2023

Something I was thinking about last night but didn't know how to express in words...

We currently stash the "jumpbuf" (the first argument to sigsetjmp) in Postgres's global exception stack. My assembly is at a 3rd grade level, but I'm not seeing how we can do that with this particular combination of instructions.

@thomcc
Copy link
Contributor Author

thomcc commented May 24, 2023

Right, that's another part of why I'd rather do this in C if possible, since our implementation would be quite a bit more involved.

@eeeebbbbrrrr
Copy link
Contributor

Can you explain what's wrong with our current approach?

I've always thought what we're doing is fine since we're only calling an extern "C" function while the jump is set.

@thomcc
Copy link
Contributor Author

thomcc commented May 24, 2023

Basically setjmp/sigsetjmp is a function that returns twice, and the compiler doesn't know that.

As an example of how this might be an issue, consider stack slot coloring, which would normally allow a stack slot used for a variable at the time of (sig)setjmp to be reused for another variable later in the function if, based on the function's control flow, the original variable is dead. This is a problem if we longjmp back and try to access said variable.

That's just one example, there are a number of similar issues for functions like this.

@eeeebbbbrrrr
Copy link
Contributor

Right, I generally understand why they're UB. I guess my question is... is how we're using them really UB and we've been getting lucky, or are we actually okay because we only do 1 thing after calling sigsetjmp?

@thomcc
Copy link
Contributor Author

thomcc commented May 24, 2023

I'm not sure. There are a few possibilities, but my belief is: Error cases are rare, and this is only really UB in the case where the error occurs (otherwise setjmp does not return twice) meaning that the case where this causes issues probably doesn't get hit a lot, and may never occur in situations where the UB manifests.

The other possibility is that LLVM internally "knows" that sigsetjmp/setjmp are symbols that refer to functions that return twice. This is also possible but I'd want to verify that before really relying on it.

@thomcc
Copy link
Contributor Author

thomcc commented May 24, 2023

We also could use ffi_returns_twice as an unstable feature, only for plrust. I'd rather not, but given that it's already using a custom rustc driver, it wouldn't be out of the question.

I'm going to noodle on this a bit, the more I think about it the more likely it seems that this is actually a real problem we're getting away with only because hitting errors is uncommon.

@thomcc
Copy link
Contributor Author

thomcc commented May 25, 2023

It also occurs to me that we could push the jmp_buf on the postgres error stack prior to calling into that asm shim. There would be a period of time where the postgres error stack has an invalid jmp_buf on it, but that's fine so long as we do it right before the asm shim. (We don't have to worry about concurrency and just have to be sure not to panic between the two operations)

@workingjubilee
Copy link
Member

workingjubilee pushed a commit that referenced this issue Mar 25, 2024
The [cee-scape](https://crates.io/crates/cee-scape) crate uses a
hand-rolled assembly wrapper around libc's `sigsetjmp()`. This has
better correctness guarantees than calling `sigsetjmp()` directly,
because LLVM is not allowed to reason about / attempt to optimize inside
hand-written assembly. `sigsetjmp()` is used for
`pg_guard_ffi_boundary_impl()`, which is our wrapper around Postgres'
exception handling, since Postgres uses `sigsetjmp()` and `siglongjmp()`
to handle multiple return points. Rust's paradigm doesn't understand
this - in Rust, a a function call can only return to its caller at one
point in time: when it has finished executing. Whereas, sigsetjmp can
return to two places _chronologically_ - and so this needs to be handled
carefully.

Fixes #1144
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants