Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

<no info> frames on Windows when using unresolved backtraces since 0.3.16 #197

Closed
udoprog opened this issue Jun 3, 2019 · 7 comments
Closed

Comments

@udoprog
Copy link

udoprog commented Jun 3, 2019

failure::Error uses unresolved backtraces (resolved lazily) which seems to be broken in recent versions of the backtrace crate.

The following program on Windows msvc using backtrace versions 0.3.16 - 0.3.28 (rustc 1.37.0-nightly (7840a0b75 2019-05-31)):

fn main() {
    let mut bt = backtrace::Backtrace::new_unresolved();
    bt.resolve();
    println!("{:?}", bt);
}

Produces the following backtrace:

stack backtrace:
   0: <no info> (0x7ff6462820e2)
   1: <no info> (0x7ff6462820e2)
   2: <no info> (0x7ff646282010)
   3: <no info> (0x7ff646274e68)
   4: <no info> (0x7ff646274dc9)
   5: <no info> (0x7ff646271238)
   6: <no info> (0x7ff646271190)
   7: <no info> (0x7ff6462b4137)
   8: <no info> (0x7ff6462b4137)
   9: <no info> (0x7ff6462b7072)
  10: <no info> (0x7ff6462b49e2)
  11: <no info> (0x7ff6462b49e2)
  12: <no info> (0x7ff6462b49e2)
  13: <no info> (0x7ff64627116b)
  14: <no info> (0x7ff646271320)
  15: <no info> (0x7ff6462bf340)
  16: <no info> (0x7ff6462bf340)
  17: BaseThreadInitThunk (0x7ff8c1f97974)
  18: RtlUserThreadStart (0x7ff8c4c1a271)

0.3.15 produces the correct backtrace, so it seems the regression was introduced in 0.3.16:

stack backtrace:
   0: backtrace_not_working::main (0x7ff637ed1178)
             at D:\Work\Repo\backtrace-not-working\src\main.rs:3
   1: std::rt::lang_start::{{closure}}<()> (0x7ff637ed10c0)
             at /rustc/3c235d5600393dfe6c36eeed34042efad8d4f26e\src\libstd\rt.rs:64
   2: std::panicking::try::do_call<closure,i32> (0x7ff637f0feb7)
             at /rustc/3c235d5600393dfe6c36eeed34042efad8d4f26e\/src\libstd\panicking.rs:293
   3: panic_unwind::__rust_maybe_catch_panic (0x7ff637f12df2)
             at /rustc/3c235d5600393dfe6c36eeed34042efad8d4f26e\/src\libpanic_unwind\lib.rs:87
   4: std::rt::lang_start_internal (0x7ff637f10762)
             at /rustc/3c235d5600393dfe6c36eeed34042efad8d4f26e\/src\libstd\rt.rs:48
   5: std::rt::lang_start<()> (0x7ff637ed109b)
             at /rustc/3c235d5600393dfe6c36eeed34042efad8d4f26e\src\libstd\rt.rs:64
   6: main (0x7ff637ed1260)
   7: __scrt_common_main_seh (0x7ff637f1b18c)
             at d:\agent\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288   8: BaseThreadInitThunk (0x7ff8c1f97974)
alexcrichton added a commit that referenced this issue Jun 3, 2019
This commit updates the MSVC code to leak `dbghelp.dll` on Windows after
we initially load it. This allows for cheaper reuse between invocations
and avoids each backtrace call having to load/unload an entire DLL.
While not profiled just yet given #197 we're going to have to preserve
*something* across calls in and out of this library, so this is going to
be a first required step in any case to doing so.
@alexcrichton
Copy link
Member

Thanks for the report!

This regression is due to #175 I believe with some local testing. As with a number of other platforms there's an initialization and a cleanup step for the library we're using to handle debuginfo. On Windows this is SymInitializeW and SymCleanup. We previously never called SymCleanup in this library, but that caused #165 because the standard library's then-recursive call to SymInitializeW would fail because symbols had already been initialized.

The purpose of #175 was to pair initialize/cleanup calls. What's happening here though is that information (the backtrace frames) are persisted across initialize/cleanup pairs. Namely the frames captured during new_unresolved are later actually processed in a different initialize/cleanup pair. Apparently on MSVC this is invalid and causes issues (who knew?).

I'm not really sure what the best solution for this is, but I think that we need to move towards a model where the symbols are initialized once and only once. That's really the only way to get this to work, and it should also be faster in general. Coordinating with the standard library is the tricky part, however.

@alexcrichton
Copy link
Member

cc @aloucks, you may be interested in this as well

@aloucks
Copy link
Contributor

aloucks commented Jun 3, 2019

Interesting. I can't reproduce this on master or on my original cleanup branch.

@udoprog Do all of the unit tests pass for you?

Out of curiosity, if you put panic!() at the end of main, does it have any impact on the resolution? I saw some weird behavior at one point but I had assumed at the time that I had made some kind of mistake or my eyes had played a trick on me.

Finally, what version of MSVC are you using? I'm using the 2017 MSVC build tools.

@udoprog
Copy link
Author

udoprog commented Jun 3, 2019

#198 fixed it for me.

@aloucks
All tests pass for master.
panic! at the end did nothing for 0.3.28.
MSVC 2017.

@aloucks
Copy link
Contributor

aloucks commented Jun 3, 2019

@udoprog Sorry, I had mistakenly run git pull and overlooked that #198 was just merged. I can reproduce it on master prior to that change.

@alexcrichton What's the motivation behind dynamically loading dbghelp? Also, could it be unloaded with libc::atexit?

@alexcrichton
Copy link
Member

Oh wow I thought I tested it but apparently not well enough, glad that works! In that case I'll go ahead and close :)

@aloucks I tried to dig up the rationale but all I could find was the original implementation (by me). IIRC the mingw toolchain just didn't work with -ldbghelp and I think winapi has some hacks to get around it. In any case it's not a necessary runtime dependency so I think it's more flexible anyway to leave it dynamic if we can (libstd has always done this for example)

Unfortunately libc::atexit doesn't always exist for MSVC or is sorta weird, and it's also not memory safe because it runs before all threads have exited.

@udoprog
Copy link
Author

udoprog commented Jun 4, 2019

You're awesome. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants