-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Panic while formatting panic message + always_abort leads to stack overflow #122940
Comments
we can't due to (Presumably thread locals cannot be done without allocating in rust, but can in C ?) Unless we do thread local in a .c library which is NOT guaranteed to not allocate on all platforms (but on linux and windows shouldn't allocate)
[package]
name = "c_thread_local_in_rust"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
[build-dependencies]
#cc = { version="1", path = "/tmp/cc-rs" } #used this because it doesn't do this warn: warning: c_thread_local_in_rust@0.1.0: Compiler version doesn't include clang or GCC: "cc" "--version"
cc = "1.0.90" #otherwise, using this is preferred
fn main() {
// Compile the C code into a shared library using cc crate
cc::Build::new()
.file("tl.c")
.shared_flag(true)
.compile("tl");
}
// example.c
//#include <stdio.h>
// Define a thread-local variable
__thread int thread_local_var;
// Function to access the thread-local variable
int *get_thread_local_var() {
return &thread_local_var;
}
// Function to set the value of the thread-local variable
void set_thread_local_var(int value) {
thread_local_var = value;
}
/*
In C, thread-local storage (TLS) can be implemented in different ways, and whether it involves heap allocation or new allocations depends on the specific implementation and the platform.
However, the most common and efficient way to implement thread-local storage in C is by using platform-specific mechanisms such as compiler extensions (`__thread` specifier) or standard library functions (`pthread_key_create`, `TlsAlloc`, etc.), which typically allocate thread-local storage directly from the system or using thread-specific data structures managed by the runtime environment.
For example, in the case of the `__thread` specifier in GCC/Clang, thread-local variables are often allocated directly within the thread's control block or using thread-specific registers, avoiding heap allocation entirely.
Similarly, with `pthread_key_create` or Windows-specific functions like `TlsAlloc`, the thread-local storage is managed by the operating system or the runtime environment, typically without involving heap allocation.
That said, it's crucial to understand that the exact implementation details can vary across platforms, compilers, and runtime environments. Therefore, while thread-local storage in C is often implemented efficiently without heap allocation, it's essential to consult the documentation and consider the specific context and requirements of your application to ensure optimal performance and behavior.
*/
//mod ffi;
use std::thread;
use std::time::Duration;
//so, unlike thread_local!() in rust which supposedly somehow does memory allocation(s) on heap
//upon first access of the var, this in .c file thread local does not, but cannot guarantee it
//doesn't on all other platforms
fn main() {
println!("Hello, world!");
let value = access_thread_local_var();
println!("Thread-local value: {}", value);
set_thread_local_var2(20);
let value = access_thread_local_var();
println!("Thread-local value: {}", value);
// Spawn a new thread
let handle = thread::spawn(|| {
// This closure represents the code that will run in the new thread
for i in 1..=5 {
let value = access_thread_local_var();
println!("Hello from the spawned thread! Message {} ltvar={}", i,value);
thread::sleep(Duration::from_millis(500)); // Sleep for 500 milliseconds
set_thread_local_var2(10*i);
}
});
// Main thread continues executing concurrently with the spawned thread
for i in 1..=3 {
let value = access_thread_local_var();
println!("Hello from the main thread! Message {} ltvar={}", i,value);
thread::sleep(Duration::from_millis(1000)); // Sleep for 1 second
set_thread_local_var2(i);
}
// Wait for the spawned thread to finish
handle.join().unwrap();
println!("Main thread exiting.");
}
//
// Link against the C shared library
//#[link(name = "tl")] //looks like this isn't needed!
//extern {}
// Rust code (lib.rs)
extern {
// Define an external function to access the thread-local variable
fn get_thread_local_var() -> *mut i32;
fn set_thread_local_var(value: i32);
}
// Function to access the thread-local variable from Rust
pub fn access_thread_local_var() -> i32 {
unsafe {
// Call the external function to get a pointer to the thread-local variable
let ptr = get_thread_local_var();
// Dereference the pointer to access the value of the thread-local variable
*ptr
}
}
// Function to set the value of the thread-local variable from Rust
pub fn set_thread_local_var2(value: i32) {
unsafe {
// Call the external function to set the value of the thread-local variable
set_thread_local_var(value);
}
}
output:
As an aside, if I do figure out how to get thread local storage on rust without allocating, then I might be able to fix a bug in libtest whereby tests ran in-process as threads (except for the case of |
This certainly used to be true, but nowadays EDIT: Looking at the code, Not sure how else we could detect reentrancy in the panic handler though. Maybe we can make it depend on |
I don't know what you mean, but if there is a bug here, please report it as a new issue. It's not related to this issue, so please let's avoid getting off-topic. |
You're right. (though I didn't yet scope the whole thing about that issue, I'm only aware of 3 ways that test would break out of the test harness that aren't fixed yet, so I'm postponing reporting it until I do understand it and can formulate it properly - also, perfectionism sux) But what I should've said, that is relevant to this issue(or so I think), is the following: parens(that being said, 'message' can still allocate since it's user-controlled and they may choose to do so, it breaks the 'fork' case of needing to avoid any allocations which is why we've used panic::always_abort in the first place - but this is another issue) however this is just the |
I'm not sure it's worth the effort to distinguish whether it's the panic message formatting or the panic hook that caused the recursive panic. But either way that possible improvement is orthogonal to this issue as well. |
Well no, it would prevent any infinite recursion because eventually at 3rd or whatever panic recursion it would either print a harcoded message, or if even that one panic-ed, then it wouldn't print anything just reach the abort call. That was the point, progressively back off of paths that we've been on, until the only path left is one that doesn't panic or the last one which is abort. EDIT: that being said, maybe I'm misunderstanding something and that's why the above makes sense to me but is in fact wrong (i'm attempting to figure that out and update this after) output:
for this code: #![feature(panic_always_abort)]
//
//src: https://github.com/rust-lang/rust/issues/97181
//this double panic no longer shows stacktrace due to https://github.com/rust-lang/rust/pull/110975
use std::fmt::{Display, self};
struct MyStruct;
struct MyStruct2;
impl Display for MyStruct2 {
fn fmt(&self, _: &mut fmt::Formatter<'_>) -> fmt::Result {
//todo!()
let instance = MyStruct;
panic!("oh1 no, '{}' was unexpected", instance);
}
}
impl Display for MyStruct {
fn fmt(&self, _: &mut fmt::Formatter<'_>) -> fmt::Result {
let instance2 = MyStruct2;
panic!("oh2 no, '{}' was unexpected", instance2);
//todo!()
}
}
fn main() {
let instance = MyStruct;
std::panic::always_abort();//issue: https://github.com/rust-lang/rust/issues/122940
println!("{}", instance);
//assert!(false, "oh no, '{}' was unexpected", instance);
} With these rustc changes Index: /var/tmp/portage/dev-lang/rust-1.75.0-r1/work/rustc-1.75.0-src/library/std/src/panicking.rs
===================================================================
--- .orig/var/tmp/portage/dev-lang/rust-1.75.0-r1/work/rustc-1.75.0-src/library/std/src/panicking.rs
+++ /var/tmp/portage/dev-lang/rust-1.75.0-r1/work/rustc-1.75.0-src/library/std/src/panicking.rs
@@ -739,6 +739,7 @@ fn rust_panic_with_hook(
force_no_backtrace: bool,
) -> ! {
let must_abort = panic_count::increase(true);
+ rtprintpanic!("must_abort={:?}\n", must_abort);
// Check if we need to abort immediately.
if let Some(must_abort) = must_abort {
@@ -746,11 +747,27 @@ fn rust_panic_with_hook(
panic_count::MustAbort::PanicInHook => {
// Don't try to print the message in this case
// - perhaps that is causing the recursive panics.
- rtprintpanic!("thread panicked while processing panic. aborting.\n");
- }
- panic_count::MustAbort::AlwaysAbort => {
+ //rtprintpanic!("thread panicked while processing panic. aborting. msg='{:?}' loc='{:?}'\n",message, location);
+ //FIXME: I need thread local(that doesn't do allocations when accessing it) and
+ //which will let me progressively remove things from the output which might panic
+ //like the 'message' at first, then 'location', then the backtrace dump?
+ rtprintpanic!("thread panicked while processing panic. aborting. msg='not shown' loc='{:?}'\n", location);
+ //let panicinfo = PanicInfo::internal_constructor(
+ // message,//this would infinite panic
+ // location,
+ // can_unwind,
+ // force_no_backtrace,
+ //);
+ //rtprintpanic!("{panicinfo}\nhere's a stacktrace attempt:\n");
+ rtprintpanic!("{}\n", crate::backtrace::Backtrace::force_capture());
+ }
+ panic_count::MustAbort::AlwaysAbort => {
+ rtprintpanic!("in panic_count::MustAbort::AlwaysAbort\n");
// Unfortunately, this does not print a backtrace, because creating
// a `Backtrace` will allocate, which we must to avoid here.
+ static BEEN_HERE:AtomicBool=AtomicBool::new(false);
+ //FIXME: this is the wrong way; need a thread local var, but one that doesn't alloc (for the fork case)
+ if false==BEEN_HERE.swap(true, Ordering::SeqCst) {
let panicinfo = PanicInfo::internal_constructor(
message,
location,
@@ -758,13 +775,27 @@ fn rust_panic_with_hook(
force_no_backtrace,
);
rtprintpanic!("{panicinfo}\npanicked after panic::always_abort(), aborting.\n");
+ //XXX: we'd restore this even if thread local because this whole thing may end up in a
+ //landing pad and some future new panic(like if this is ran within libtest via
+ //'cargo test' with --test-threads 1, it could be catch_unwind-ed aka
+ //caught, then the next test that panics gets back here)
+ //might get itself here and have an unclean path state from this old panic we're doing now.
+ //even the abort below might get caught by atexit hook(iirc) and end up in landing
+ //pad (well, it would with my hacky patch that's about that, iirc)
+ BEEN_HERE.store(false, Ordering::SeqCst);
+ } else {//been here, prevent recursion
+ rtprintpanic!("panicked after panic::always_abort(), aborting. (and recursion prevented) loc='{:?}'\n", location);
+ }
}
}
+ rtprintpanic!("about to abort_internal()\n");
crate::sys::abort_internal();
}
+ rtprintpanic!("about to mut info\n");
let mut info =
PanicInfo::internal_constructor(message, location, can_unwind, force_no_backtrace);
+ rtprintpanic!("about to hook read\n");
let hook = HOOK.read().unwrap_or_else(PoisonError::into_inner);
match *hook {
// Some platforms (like wasm) know that printing to stderr won't ever actually
@@ -773,31 +804,48 @@ fn rust_panic_with_hook(
// methods, this means we avoid formatting the string at all!
// (The panic runtime might still call `payload.take_box()` though and trigger
// formatting.)
- Hook::Default if panic_output().is_none() => {}
+ Hook::Default if panic_output().is_none() => {
+ rtprintpanic!("1\n");
+ }
Hook::Default => {
- info.set_payload(payload.get());
+ rtprintpanic!("2\n");
+ let p=payload.get();
+ rtprintpanic!("22\n");
+ info.set_payload(p);
+ rtprintpanic!("222\n");
default_hook(&info);
}
Hook::Custom(ref hook) => {
- info.set_payload(payload.get());
+ rtprintpanic!("3\n");
+ let p=payload.get();
+ rtprintpanic!("33\n");
+ info.set_payload(p);
+ //info.set_payload(payload.get());
+ rtprintpanic!("333\n");
hook(&info);
}
};
+ rtprintpanic!("4\n");
drop(hook);
+ rtprintpanic!("5\n");
// Indicate that we have finished executing the panic hook. After this point
// it is fine if there is a panic while executing destructors, as long as it
// it contained within a `catch_unwind`.
panic_count::finished_panic_hook();
+ rtprintpanic!("6\n");
if !can_unwind {
+ rtprintpanic!("7\n");
// If a thread panics while running destructors or tries to unwind
// through a nounwind function (e.g. extern "C") then we cannot continue
// unwinding and have to abort immediately.
rtprintpanic!("thread caused non-unwinding panic. aborting.\n");
+ rtprintpanic!("8ai\n");
crate::sys::abort_internal();
}
+ rtprintpanic!("9rp\n");
rust_panic(payload)
}
|
was added at line 315 on October 6th 2022, Would be nice if more people would please confirm whether or not that comment still applies even though it's I'm not too sure about what the docs for
use std::cell::Cell;
thread_local! {
pub static FOO: Cell<u32> = const { Cell::new(1) };
}
assert_eq!(FOO.get(), 1); (feel free to mark this comment as off-topic) |
Rather than messing with thread locals, we could just add a second bit to |
EDIT2:
If indeed it is so, as even the comments re-iterate:
then I posit that for reference: rust/library/std/src/panicking.rs Lines 751 to 760 in 9f25a04
EDIT: that usually though, if others are gonna use it(like after it's stabilized?) for other more benign purposes they won't be seeing a (proper) |
An alternative would be to set the message to In fact I could probably have done that in #122930 as well. |
The comment still applies if you replace "would" by "could". On platforms like x86_64 Linux, We could get around this by using
I'd much prefer that solution. |
Not always. 32bit GNU/Windows uses (EDIT: Actually it's
|
Hah, I found some documentation: C++ specifies it as not-signal-safe. As we rely on the same platform support as C++, I guess that makes it not-signal-safe for us as well. |
That still seems like an unfortunate restriction -- most our formatting code does not need to allocate, so it seems reasonable to say that post-fork code must ensure to not use a panic message that needs formatting. And this way we can at least produce nice panic messages. |
@joboet That is because it assumes only access to the "lowest-common-denominator" parts of the pthread model... something we rarely actually are forced to use. But as you noted, rarely is not never (e.g. GNU Windows). I don't know of a more exhaustive or up-to-date treatment of how TLS variables are handled than https://www.akkadia.org/drepper/tls.pdf unfortunately... |
Oh actually, we already rely on thread-locals being async-signal-safe in practice cause we store the stack overflow guard page location in TLS and access it in the SIGSEGV handler. In that case, the no-allocation solution would work. |
The problem is that in the general/local dynamic model, we call |
I brought this up in a recent adjacent PR and the feedback was that we should first investigate sacrificing another bit of the global count to track "is there a panic in progress in any thread". |
This means, we'll still try to print the 'message'(ie. formatted) on the first try, right?(hover mouse for tooltip(s)) and if that panics, the bit would've been set and we just abort instead of trying to print 'message' again. If so, then this still leaves the problem of user could've allocated1 in its I like your alternative(show unformatted 'message' to avoid both a possible panic and any user-made allocs, in their
(click me to expand)
|
When you |
...huh? if you fork, until you execve, while in a multithreaded program, you cannot call anything that is not async-signal-safe. my understanding is that our rust/library/std/src/io/stdio.rs Lines 904 to 915 in 0824b30
my understanding is that making the lock reentrant with respect to its own thread doesn't actually make it reentrant with respect to the sense that "async signal safe" means. |
Ah, I didn't think stderr would be Well then you can still |
indeed, that is "async signal "safe"". :^) rust/compiler/rustc_driver_impl/src/signal_handler.rs Lines 17 to 37 in d36bdd1
|
FWIW I was wrong about which Windows target does not have |
Example and initial analysis by @correabuscar:
The reason for this is that if both "panic while processing panic" and "panic after
always_abort
" apply, then we use theAlwaysAbort
code path which does format the panic message.This could be fixed by moving this check further down below the
in_panic_hook
check:rust/library/std/src/panicking.rs
Lines 393 to 395 in 9f25a04
But I don't know if this has other adverse side-effects. Can we even access thread-local variables after
fork
(i.e., when always-abort is set)?Cc @Amanieu
The text was updated successfully, but these errors were encountered: