Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support split DWARF #260

Merged
merged 1 commit into from
Jan 29, 2023
Merged

Support split DWARF #260

merged 1 commit into from
Jan 29, 2023

Conversation

khuey
Copy link
Contributor

@khuey khuey commented Jan 15, 2023

This is pretty invasive. It also depends on gimli-rs/gimli#637

To make split DWARF work, we'll ultimately need to be able to create the correct gimli::Dwarf and gimli::Units. The former is a blend of sections from the main (at a minimum, .debug_addr) and dwo files. The latter is parsed from the dwo and then has copy_relocated_attributes called with the skeleton unit from the main file.

We also don't want to force the loading of all split DWARF data up front for performance reasons.

This means we need a mechanism for lazy loading of split DWARF data (enter the new SplitDwarfLoader trait) that can be provided by the addr2line consumer. Once the consumer loads the correct gimli::Dwarf and returns it to addr2line, we can produce the correct gimli::Unit.

We then have to pass the correct gimli::Dwarf and gimli::Unit down into the machinery that actually does the work (largely function parsing). To make that possible I've split (heh) ResDwarf into a parsed component that is append only (through the various LazyCells) and a raw component which can be cheaply cloned and can have its gimli::Dwarf replaced when necessary. The single instance of the parsed component is passed around by reference as necessary. The raw component (along with the correct gimli::Unit to use) is computed by ResUnit::dwarf_and_unit_dwo as necessary.

A simple SplitDwarfLoader implementation is included in the tests. I have tested this with -Csplit-debuginfo=unpacked and -Csplit-debuginfo=packed, both with and without -Zsplit-dwarf-kind=single, and together with various outstanding cargo and rustc PRs I have all addr2line tests pass.

@khuey
Copy link
Contributor Author

khuey commented Jan 15, 2023

I suppose I should add a SplitDwarfLoader implementation to the example too.

@mstange
Copy link
Contributor

mstange commented Jan 16, 2023

I haven't looked at the patch in detail yet, but it looks like it requires synchronous loading of the dwo file. In my use of addr2line I cannot load files synchronously, so for my purposes I would need a different API.

It would be great if a lookup could return some kind of "need extra file" token, so that the caller can load the file on their own terms (for example asynchronously) and then call back into some kind of "lookup with extra file" API.

Here's how I handled the analogous macOS case in the API for wholesym: The result from an address lookup is an enum FramesLookupResult, which is either Available, External, or Unavailable. If it's External, the caller can load that extra file and then call another method to finish the lookup.

@khuey
Copy link
Contributor Author

khuey commented Jan 16, 2023

it looks like it requires synchronous loading of the dwo file

Yes that's right.

We do that kind of continuation based value injection for evaluating DWARF expressions in gimli. It's possible to do here but it would be even more invasive than the existing patch.

@philipc
Copy link
Contributor

philipc commented Jan 16, 2023

It's okay to have invasive changes, and asynchronous loading is something we should support, even if it is not initially implemented in this PR. With that in view, maybe it would be better if the loader was passed in each time, instead of storing it in Context.

@bjorn3
Copy link
Contributor

bjorn3 commented Jan 16, 2023

Split DWARF is already supported through Context::new_with_sup, right? This PR merely helps with finding the split debuginfo file as I understand it.

@philipc
Copy link
Contributor

philipc commented Jan 16, 2023

No, Context::new_with_sup is for supplementary object files, such as those created by running dwz after linking.

@bjorn3
Copy link
Contributor

bjorn3 commented Jan 16, 2023

I see

@khuey
Copy link
Contributor Author

khuey commented Jan 17, 2023

Here's a prototype implementation of what things would look at in a continuation style. This is not high enough quality to merge at this point but it'll give you an idea of what the API would have to look like. In particular, look at the documentation of LookupContinuation, the function builtin_split_dwarf_loader::SplitDwarfLoader::run, and the example and correctness test.

@mstange
Copy link
Contributor

mstange commented Jan 17, 2023

Thank you! But why is it so complicated? What are the cases where we need to load multiple extra files? Is it because we don't know for sure which unit an address is located in until we've "looked inside" that unit?

@khuey
Copy link
Contributor Author

khuey commented Jan 17, 2023

basically compiler bugs

addr2line/src/lib.rs

Lines 204 to 213 in 3a2dbaf

/// Finds the CUs for the function address given.
///
/// There might be multiple CUs whose range contains this address.
/// Weak symbols have shown up in the wild which cause this to happen
/// but otherwise this can happen if the CU has non-contiguous functions
/// but only reports a single range.
///
/// Consequently we return an iterator for all CUs which may contain the
/// address, and the caller must check if there is actually a function or
/// location in the CU for that address.

@mstange
Copy link
Contributor

mstange commented Jan 17, 2023

I see. So we could, in theory, say "we don't support that workaround in the split dwarf case" and only allow one unit to be in a different file. Or we could return information about all potential units upfront, and require them all to be loaded in the follow-up call, hoping that the case of multiple potential units is not worth optimizing for.

@khuey
Copy link
Contributor Author

khuey commented Jan 17, 2023

In theory, maybe. I think we'd need to understand rust-lang/backtrace-rs#327 more before we did that.

The way I've written this is the way you'd write it with generators (LookupContinuation<Output, R> is essentially a manually implemented Generator<R = Option<Arc<gimli::Dwarf<R>>, Yield = SplitDwarfLookup<R>, Return = Output>) so it'd be easy to switch if generators are ever stabilized. We could get cute by gathering a Vec<SplitDwarfLookup<R>> up front. I doubt it's worth it, driving it in a loop is straightforward.

@philipc
Copy link
Contributor

philipc commented Jan 17, 2023

Weak symbols have shown up in the wild which cause this to happen

I think this comment is talking about comdat symbols. Not sure if this will be a problem here.

this can happen if the CU has non-contiguous functions but only reports a single range

I think this has only ever happened on macOS, and that is unlikely to need support for split DWARF. Or if it does have split DWARF, it will be with a compiler that generates correct CU ranges.

@khuey
Copy link
Contributor Author

khuey commented Jan 17, 2023

fwiw I see identical code ranges across multiple units with RUSTFLAGS=--Clink-args=-Wl,--icf=all on programs as trivial as gimli's dwarfdump example, including at least one case where the CU claims to cover the range but there's no DW_TAG_subprogram that does inside the CU. So I don't think handling this is optional.

@khuey
Copy link
Contributor Author

khuey commented Jan 17, 2023

(That's on Linux, rustc 1.66.1, lld 10)

@mstange
Copy link
Contributor

mstange commented Jan 17, 2023

Ah, that is unfortunate. Thanks for checking!

Okay, if we need to be prepared to check multiple files anyway, I would still prefer a solution where we know about all potentially-needed files upfront (right after the lookup-address call), so that my wrapper API in wholesym can stay simple and only require one .await call.

@khuey
Copy link
Contributor Author

khuey commented Jan 17, 2023

Trivial functions like drop_in_place could be ICFed over a very large number of CUs, so the set of "potentially needed" files might be very large.

@philipc
Copy link
Contributor

philipc commented Jan 18, 2023

Instead of hiding the control flow, can we return an iterator over the units that may need loading, plus a map function, and the caller can decide whether to interleave the iteration and map calls, or to do some or all of the iteration first.

@khuey
Copy link
Contributor Author

khuey commented Jan 18, 2023

Is it possible? Yes. I don't think it's a great API design though.

I think it'd be better to offer a function that "preloads" all potentially relevant units for an address. So it'd look something like this:

pub struct PreloadInfo<R> {
    indices: Vec<usize>,
    pub loads: Vec<SplitDwarfInfo<R>>,
}

pub struct CompletedPreload<R> {
    indices: Vec<usize>,
    results: Vec<Option<gimli::Dwarf<R>>>,
}

impl<R> PreloadInfo<R> {
    fn complete(self, results: Vec<Option<gimli::Dwarf<R>>>) -> CompletedPreload<R> {
        CompletedPreload {
            indices: self.indices,
            results,
        }
    }
}

impl<R: gimli::Reader> Context<R> {
    fn start_preload(&self, addr: u64) -> PreloadInfo<R>;
    fn complete_preload(&self, preload_info: PreloadInfo<R>) {
        for (index, dwarf) in preload_info.indices.into_iter().zip(preload_info.results.into_iter()) {
            let unit = &self.parsed_dwarf.units[index];
            unit.dwo = do_the_thing(dwarf);
        }
    }
}

fn consumer_api(addr: u64) -> FramesIter {
    let mut preload = ctx.start_preload(addr);
    let loads_to_do = mem::take(preload.loads());
    let results = /* API consumer does the loads */;
    ctx.complete_preload(preload.complete(results));
    let continuation = ctx.find_frames(addr);
    match continuation.state() {
        ControlFlow::Break(v) => v,
        ControlFlow::Continue(_) => unreachable!("addr2line's API promises this won't happen!!!"),
    }
}

@philipc
Copy link
Contributor

philipc commented Jan 19, 2023

Some sort of preload API is fine too. I'd prefer either Iterator<Item=PreloadInfo> or Vec<PreloadInfo> instead of having Vec internal to PreloadInfo and CompletedPreload, but that's a minor difference.

@khuey
Copy link
Contributor Author

khuey commented Jan 19, 2023

The tricky thing there is that the context needs to be able to match the gimli::Dwarf's it gets back with the correct units.

@philipc
Copy link
Contributor

philipc commented Jan 19, 2023

That's done via the index though, which we can still have:

pub struct PreloadInfo<R> {
    index: usize,
    pub load: SplitDwarfInfo<R>,
}

pub struct CompletedPreload<R> {
    index: usize,
    result: Option<gimli::Dwarf<R>>,
}

impl<R> PreloadInfo<R> {
    fn complete(self, result: Option<gimli::Dwarf<R>>) -> CompletedPreload<R> {
        CompletedPreload {
            index: self.index,
            result,
        }
    }
}

@khuey
Copy link
Contributor Author

khuey commented Jan 19, 2023

Oh, I see. I didn't understand what you meant the first time but yeah, that would work.

@khuey
Copy link
Contributor Author

khuey commented Jan 19, 2023

If you're otherwise happy with the API I'll clean things up to a reviewable state this weekend.

@khuey
Copy link
Contributor Author

khuey commented Jan 19, 2023

Also one question I did have in the back of mind while writing this: is code in addr2line allowed to panic? Especially since its presumably running in the panic handler machinery ...

@philipc
Copy link
Contributor

philipc commented Jan 19, 2023

I'm happy with the API if it meets @mstange's needs.

I think it's the standard library's responsibility to handle what happens if this code panics in the panic handler machinery, but we still avoid panics. So only use debug_assert and only do an unwrap if it is logically impossible to fail.

@khuey
Copy link
Contributor Author

khuey commented Jan 19, 2023

The draft commits have unwraps that can be triggered by API misuse. Is that unacceptable?

@philipc
Copy link
Contributor

philipc commented Jan 19, 2023

Ideally we would rework the API so that they are impossible, but it won't block landing this.

@khuey
Copy link
Contributor Author

khuey commented Jan 19, 2023

Ok, I have an idea of how we might accomplish that.

@khuey
Copy link
Contributor Author

khuey commented Jan 20, 2023

Ok I found time sooner than expected. What do you think of that?

@mstange
Copy link
Contributor

mstange commented Jan 20, 2023

Trivial functions like drop_in_place could be ICFed over a very large number of CUs, so the set of "potentially needed" files might be very large.

In this case, does every potential file contain the debug info for drop_in_place? So would the first lookup in practice be successful in this case?

All this makes me thankful for the mach-O OSO stab symbol implementation, where the linked file tells you per function which original object file the function came from.

@philipc
Copy link
Contributor

philipc commented Jan 20, 2023

That looks good assuming it's possible for the consumer to easily store the callbacks for use after the asynchronous load (it's probably okay, just not something I'm sure about).

@khuey
Copy link
Contributor Author

khuey commented Jan 20, 2023

In this case, does every potential file contain the debug info for drop_in_place? So would the first lookup in practice be successful in this case?

I saw at least one file that did not actually contain debug info. So we're not guaranteed that the first lookup would be successful, though in practice I think we would find one pretty quickly.

@philipc
Copy link
Contributor

philipc commented Jan 20, 2023

All this makes me thankful for the mach-O OSO stab symbol implementation, where the linked file tells you per function which original object file the function came from.

That wouldn't help here, because the function came from multiple original object files. There is no single true source, they are all valid, and so you must look at the caller to disambiguate. http://wiki.dwarfstd.org/index.php?title=ICF describes some DWARF extensions to handle this, but I haven't looked to see the status of these, or to see if there are alternative solutions.

@khuey
Copy link
Contributor Author

khuey commented Jan 20, 2023

I'm pretty confident that that particular proposal never went anywhere.

@khuey
Copy link
Contributor Author

khuey commented Jan 20, 2023

That looks good assuming it's possible for the consumer to easily store the callbacks for use after the asynchronous load (it's probably okay, just not something I'm sure about).

This compiles as expected

async fn do_loads<R>(loads: Vec<SplitDwarfLoad<R>>) -> Vec<Option<Arc<gimli::Dwarf<R>>>> {
    unimplemented!();
}

#[tokio::main]
async fn main() {
    let map = find_debuginfo();
    let file = &object::File::parse(&*map).unwrap();
    let ctx = Context::new(file).unwrap();
    let (loads, continuations) = ctx.preload_units(0xdeadbeef).fold((Vec::new(), Vec::new()), |(mut vec1, mut vec2), (thing1, thing2)| {
        vec1.push(thing1);
        vec2.push(thing2);
        (vec1, vec2)
    });
    let load_results = do_loads(loads).await;
    //mem::drop(ctx); // E0505 can't move out of ctx because it's borrowed (by continuations)
    continuations.into_iter().zip(load_results.into_iter()).try_for_each(|(callback, result)| callback(result)).unwrap();
}

@khuey
Copy link
Contributor Author

khuey commented Jan 20, 2023

I think in theory it might be possible to use the DWARF 5 call site information to pick the correct ICFed subprogram but neither gdb nor lldb do this and it would require a different addr2line API anyways (because you'd need to infer which subprogram you're looking at based off its caller).

@khuey
Copy link
Contributor Author

khuey commented Jan 20, 2023

(This will need a gimli release for me to point the PR at before CI will pass, if that's not clear)

@philipc
Copy link
Contributor

philipc commented Jan 23, 2023

I've released gimli 0.27.1

@khuey
Copy link
Contributor Author

khuey commented Jan 23, 2023

Ok, the final issue remaining is the MSRV test. You're testing for an MSRV of 1.42. The ControlFlow enum this PR is using was introduced in 1.55. So we either need to bump the MSRV or replace the enum with our own copy. I don't know how or why the MSRV was chosen so I don't know whether or not we can bump the MSRV.

@philipc
Copy link
Contributor

philipc commented Jan 23, 2023

Bumping the MSRV is ok if it's not too recent. We don't make any promises about it yet. 1.55 is well over a year old, and object is already using 1.52, so 1.55 seems fine.

@khuey
Copy link
Contributor Author

khuey commented Jan 23, 2023

Alright I think we're good to go here.

We can't actually test split DWARF support in CI until rust-lang/cargo#11572, rust-lang/rust#106811, and rust-lang/rust#106904 are merged and percolate into releases but I have tested this myself and it works.

If you're happy with everything I'll squash it down into one commit with a sensible commit message.

@philipc
Copy link
Contributor

philipc commented Jan 24, 2023

This looks fine but I haven't done a detailed review yet. I'll try to soon.

src/lib.rs Outdated
.map(|s| s.to_string_lossy())
.transpose()?
{
path_push(&mut dwo_name, &dwo_suffix);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should be treating paths as UTF-8 strings here. It's passable (but not ideal) in render_file because that can be used simply for display purposes, but here the path is only used for file system access. Can we pass the raw comp_dir and dwo_name instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made this change. It has downstream effects on builtin_split_dwarf_loader. It's not entirely clear to me what the right thing to do is on Windows (and I don't have a Windows setup to test on) so I've restricted builtin_split_dwarf_loader to unix platforms now.

src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Show resolved Hide resolved
src/function.rs Outdated Show resolved Hide resolved
src/lib.rs Show resolved Hide resolved
src/lib.rs Show resolved Hide resolved
src/lib.rs Show resolved Hide resolved
@khuey
Copy link
Contributor Author

khuey commented Jan 27, 2023

Coverage is busted again but everything else is green.

@khuey khuey requested a review from philipc January 27, 2023 18:30
Copy link
Contributor

@philipc philipc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@khuey
Copy link
Contributor Author

khuey commented Jan 28, 2023

Do you want me to squash everything?

@philipc
Copy link
Contributor

philipc commented Jan 28, 2023

Yes I'll wait for you to squash.

This changes most of the addr2line API entry points to take a continuation-
based approach that returns either the result or a continuation that can be
resumed once the appropriate split DWARF file is loaded. Actually loading
the split DWARF is deferred to the caller, just as it is for the main and
supplementary debug files.

Because the `gimli::Dwarf` for split DWARF varies by unit, the `Context::dwarf`
accessor is now gone. Instead `Context::find_dwarf_and_unit` returns the
correct `gimli::Dwarf` and `gimli::Unit` pair for a given address. The
`find_frames` API is more or less the same, just in continuation form now.

Callers that do not care to support split DWARF can simply import the
`LookupResultExt` trait and call `skip_all_loads` on the results of
continuation-based APIs to skip loading any split DWARF and immediately
retrieve whatever value can be produced in the absence of that split DWARF.

Internally there have been significant refactorings.

The existing treatment of a supplementary file as a full `ResDwarf` is
unnecessary. Supplementary files are only relevant to addr2line because they
may provide function names. Supplementary files by definition *cannot*
contain any information related to the layout of the binary, since they
are shared among multiple binaries. (In other words the `unit_ranges`
for a supplementary file is always empty). In practice the full `ResUnit`
for a partial compliation units in the supplementary file is also unnecessary,
though that's not changed here.

Split the parsed components from the gimli components, effectively undoing
the introduction of `ResDwarf`. The parsed components for the primary debug
file live in `ParsedDwarf`, along with an optional vector of `ResUnit`s from
the supplementary file. The gimli components now live in `RawDwarf`.

This allows `RawDwarf` to be trivially clonable, and `ParsedDwarf` is completely
independent of anything that would be found in a DWO/DWP.
`ResUnit::dwarf_and_unit_dwo` handles all of the details of invoking the loader
when appropriate and caching its results. Code paths that are going to examine
data that is affected by split DWARF (e.g. anything with functions) call this
function to get the correct `RawDwarf` and `gimli::Unit` to use.

The MSRV is now 1.55.0.
@khuey
Copy link
Contributor Author

khuey commented Jan 28, 2023

Squashed.

@philipc philipc merged commit d48b026 into gimli-rs:master Jan 29, 2023
@philipc
Copy link
Contributor

philipc commented Jan 29, 2023

Thanks for the detailed commit message.

The existing treatment of a supplementary file as a full ResDwarf is
unnecessary. Supplementary files are only relevant to addr2line because they
may provide function names. Supplementary files by definition cannot
contain any information related to the layout of the binary, since they
are shared among multiple binaries. (In other words the unit_ranges
for a supplementary file is always empty).

True, but the reason it was done this way wasn't for the ranges, but so that name_entry could be reused without caring what type of file it was, whereas now we have to pass around a DebugFile parameter. The new way is fine, although it does still try parsing ranges in the supplementary file.

This allows RawDwarf to be trivially clonable

Is this used anywhere?

@khuey
Copy link
Contributor Author

khuey commented Jan 29, 2023

This allows RawDwarf to be trivially clonable

Is this used anywhere?

RawDwarf::dwo (it doesn't actually call clone, which RawDwarf doesn't actually implement, but it's more or less equivalent. I should have been more precise in my vocabulary).

@philipc
Copy link
Contributor

philipc commented Jan 31, 2023

What's the benefit of using ControlFlow for LookupResult? If we didn't do that, then we wouldn't need an extension trait for skip_all_loads.

@khuey
Copy link
Contributor Author

khuey commented Jan 31, 2023

The primary upside over defining our own enum is that ControlFlow will eventually work out of the box with ? and will probably eventually grow some sort of "run to completion" API. I don't feel very strongly about it right now.

facebook-github-bot pushed a commit to facebookexperimental/reverie that referenced this pull request Dec 4, 2023
Summary:
This release includes gimli-rs/addr2line#260 which unblocks supporting Split DWARF in reverie and cadaverdog.

Separately, ayermolo is looking into whether this gimli update fixes an issue with an iOS test that uses lionhead, and uses gimli to dump debuginfo, which is blocking the land of DWARF5.

Reviewed By: jasonwhite

Differential Revision: D51814926

fbshipit-source-id: ae0882432019250060c65792476a0f6e296daffb
@mstange
Copy link
Contributor

mstange commented Apr 13, 2024

The API added here is working well for my use case in samply-symbols / wholesym. One wrinkle I ran into was that I have to call find_frames again after each file read. I can't keep around the lookup continuation across file reads, because the file reads are asynchronous, and I'm running them in a multi-threaded executor, and Context isn't Sync so I can't keep a reference to it across await points.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants