Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libunwind GetStackTrace implementation doesn't allow concurrent invocations #298

Closed
toddlipcon opened this issue Feb 15, 2018 · 2 comments
Closed

Comments

@toddlipcon
Copy link

The GetStackTrace implementation uses a global atomic to prevent multiple threads from calling into libunwind at the same time:

  if (sync_val_compare_and_swap(&g_now_entering, false, true)) {
    return 0;
  }

The comment, however, says that the issue it's trying to protect from is reentrancy bugs, not concurrency:

// Sometimes, we can try to get a stack trace from within a stack
// trace, because libunwind can call mmap (maybe indirectly via an
// internal mmap based memory allocator), and that mmap gets trapped
// and causes a stack-trace request.  If were to try to honor that
// recursive request, we'd end up with infinite recursion or deadlock.
// Luckily, it's safe to ignore those subsequent traces.  In such
// cases, we return 0 to indicate the situation.

Given this, it makes more sense for the flag to be a threadlocal rather than a global.

@shinh
Copy link
Collaborator

shinh commented Feb 15, 2018

Yeah, I think you are right. Maybe we can use GLOG_THREAD_LOCAL_STORAGE, though it's not supported by configure build. PR is welcomed :)

toddlipcon added a commit to toddlipcon/kudu that referenced this issue Feb 21, 2018
Previously we used glog's wrapper around libunwind for stack tracing.
However that has a deficiency that it assumes that, process wide, only
one thread can be inside libunwind at a time[1]

It appears that this is left over from some very old versions of
libunwind, or was already unnecessarily conservative. libunwind is meant
to be thread safe, and we have tests that will trigger if it is not.

This just extracts the function body of the glog function we were using
and does the same work manually.

Without this fix, the "collect from all the threads at the same time"
code path resulted in most of the threads collecting an empty trace
since they tried to call libunwind at the same time.

[1] google/glog#298
Change-Id: I3a53e55d7c4e7ee50bcac5b1e81267df56383634
@sergiud
Copy link
Collaborator

sergiud commented Nov 2, 2019

This should have been fixed by #161. Please reopen if still an issue.

@sergiud sergiud closed this as completed Nov 2, 2019
@sergiud sergiud mentioned this issue May 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants