The GetStackTrace implementation uses a global atomic to prevent multiple threads from calling into libunwind at the same time:
if (sync_val_compare_and_swap(&g_now_entering, false, true)) {
  return 0;
}
The comment, however, says that the issue it's trying to protect from is reentrancy bugs, not concurrency:
// Sometimes, we can try to get a stack trace from within a stack
// trace, because libunwind can call mmap (maybe indirectly via an
// internal mmap based memory allocator), and that mmap gets trapped
// and causes a stack-trace request. If we were to try to honor that
// recursive request, we'd end up with infinite recursion or deadlock.
// Luckily, it's safe to ignore those subsequent traces. In such
// cases, we return 0 to indicate the situation.
Given this, it makes more sense for the flag to be a threadlocal rather than a global.
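
To make the proposal concrete, here is a minimal sketch of a per-thread reentrancy guard. It is not glog's code: `t_now_entering`, `ReentrancyGuard`, `GetStackTraceSketch`, `FakeUnwind`, and `t_last_nested_result` are hypothetical names, and `FakeUnwind` is only a stand-in that simulates libunwind triggering a nested stack-trace request (as a trapped mmap would).

```cpp
// Hypothetical per-thread flag, replacing the process-wide g_now_entering.
// Each thread gets its own copy, so threads never block each other out.
thread_local bool t_now_entering = false;

// Records what the simulated nested request returned (for illustration only).
thread_local int t_last_nested_result = -1;

// RAII helper: claims the per-thread flag if it is free and releases it on
// scope exit. acquired() is false when this thread is already unwinding.
class ReentrancyGuard {
 public:
  ReentrancyGuard() : acquired_(!t_now_entering) {
    if (acquired_) t_now_entering = true;
  }
  ~ReentrancyGuard() {
    if (acquired_) t_now_entering = false;
  }
  ReentrancyGuard(const ReentrancyGuard&) = delete;
  ReentrancyGuard& operator=(const ReentrancyGuard&) = delete;
  bool acquired() const { return acquired_; }

 private:
  bool acquired_;
};

int FakeUnwind(void** result, int max_depth);

// Sketch of the guarded entry point: a recursive request on the same thread
// returns 0, matching the behavior described in the original comment.
int GetStackTraceSketch(void** result, int max_depth) {
  ReentrancyGuard guard;
  if (!guard.acquired()) {
    return 0;  // Recursive request on this thread: safe to ignore.
  }
  return FakeUnwind(result, max_depth);
}

// Stand-in for the real unwinder (an assumption, not libunwind's API).
// It first issues a nested stack-trace request, the way a trapped mmap
// would, then reports up to three placeholder frames.
int FakeUnwind(void** result, int max_depth) {
  void* inner[4];
  t_last_nested_result = GetStackTraceSketch(inner, 4);
  int depth = max_depth < 3 ? max_depth : 3;
  for (int i = 0; i < depth; ++i) {
    result[i] = nullptr;
  }
  return depth;
}
```

With this shape, the outer call on a thread still produces a trace while the nested call on that same thread returns 0. Because the flag is `thread_local`, a second thread calling `GetStackTraceSketch` at the same moment sees its own unset flag and is not turned away, which is the behavior the global atomic prevents.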
Previously we used glog's wrapper around libunwind for stack tracing.
However, that wrapper assumes that, process-wide, only one thread can
be inside libunwind at a time [1].
This restriction appears to be left over from some very old versions of
libunwind, or was unnecessarily conservative from the start. libunwind is
meant to be thread safe, and we have tests that will catch it if it is not.
This change extracts the body of the glog function we were using
and does the same work manually.
Without this fix, the "collect from all the threads at the same time"
code path resulted in most of the threads collecting an empty trace,
since they tried to call libunwind at the same time.
[1] google/glog#298
Change-Id: I3a53e55d7c4e7ee50bcac5b1e81267df56383634