Issue #40 -- Support DWARF-in-PE #69

Merged: 3 commits, Oct 24, 2018


gsolberg

These changes were sufficient to allow reading DWARF records on Mac, Linux, and Windows systems in my application. I found no formal documentation on how to read DWARF records from a PE file, but arrived at this solution by looking at other code and poking around in a PE file.

DWARF section names in PE files are stored indirectly through a symbol table.
Names of the form "/n", where "n" is a number, indicate that the section name is
stored in index "n" of a hidden symbol table. The hidden symbol table immediately
follows the COFF symbol table and is a sequence of null-terminated CStrings.
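
A minimal sketch of the lookup described above. It assumes the PE/COFF specification's terminology, where the "hidden" table is the COFF string table and "n" is a decimal byte offset counted from the start of that table (including its 4-byte size field). The function name `resolve_section_name` and its signature are hypothetical and are not part of object or goblin.

```rust
use std::str;

/// Resolve a raw 8-byte COFF section name (hypothetical helper).
/// `string_table` is assumed to begin at the table's 4-byte size field,
/// so offsets below 4 never point at a real name.
fn resolve_section_name<'a>(raw: &'a [u8; 8], string_table: &'a [u8]) -> Option<&'a str> {
    // Short names are stored inline and padded with NUL bytes.
    let len = raw.iter().position(|&b| b == 0).unwrap_or(raw.len());
    let inline = str::from_utf8(&raw[..len]).ok()?;

    // "/n" means the real name lives at decimal offset n in the string table.
    let digits = match inline.strip_prefix('/') {
        Some(d) if !d.is_empty() && d.bytes().all(|b| b.is_ascii_digit()) => d,
        _ => return Some(inline),
    };
    let offset: usize = digits.parse().ok()?;

    // Read the NUL-terminated string starting at that offset.
    let tail = string_table.get(offset..)?;
    let end = tail.iter().position(|&b| b == 0)?;
    str::from_utf8(&tail[..end]).ok()
}
```

With a table holding `.debug_info` right after the size field, the raw name `/4` would resolve to `.debug_info`, while a short name such as `.text` is returned unchanged.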

@coveralls

Coverage remained the same at 0.0% when pulling fcb8002 on gsolberg:DWARF-in-PE into 18ff86c on gimli-rs:master.

@philipc
Contributor

philipc commented Sep 12, 2018

Thanks for the PR! This looks good, but goblin recently added support for decoding these section names too (m4b/goblin#100), and it handles base64 too. So we probably should wait for a goblin release and use it. Or if you want to temporarily change object to use a git version of goblin then that would be fine too.
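
A minimal sketch of the base64 case mentioned above. It assumes the LLVM convention in which a section name of the form `//XXXXXX` carries the string table offset as six base64 digits (alphabet `A-Z`, `a-z`, `0-9`, `+`, `/`), used when the offset no longer fits in seven decimal digits. The helper name `decode_base64_offset` is hypothetical and is not goblin's API.

```rust
/// Decode the string table offset from a "//XXXXXX" section name.
/// Returns None if the name is not in that form or the value exceeds u32.
fn decode_base64_offset(name: &[u8]) -> Option<u32> {
    let digits = name.strip_prefix(b"//")?;
    if digits.len() != 6 {
        return None;
    }
    let mut value: u64 = 0;
    for &c in digits {
        // Map each character to its 6-bit value.
        let d = match c {
            b'A'..=b'Z' => c - b'A',
            b'a'..=b'z' => c - b'a' + 26,
            b'0'..=b'9' => c - b'0' + 52,
            b'+' => 62,
            b'/' => 63,
            _ => return None,
        };
        value = value * 64 + u64::from(d);
    }
    // Six digits can encode 36 bits, so reject anything past 32 bits.
    if value > u64::from(u32::MAX) {
        None
    } else {
        Some(value as u32)
    }
}
```

The decoded offset would then be looked up in the string table in the same way as a decimal `/n` offset.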

@gsolberg
Author

Ok, that's good news. Goblin is the right place to add this support. I have a workaround in place now, so I'm not in a big hurry for this support. I'm fine waiting for goblin to release. Shall we kill this PR once goblin is released?

@philipc
Contributor

philipc commented Sep 12, 2018

Yeah, let's leave this open until goblin is released so that it isn't lost.

@philipc
Contributor

philipc commented Oct 15, 2018

Hi @gsolberg, goblin has been updated. It looks like we just need to update has_debug_symbols now. Are you able to do that and check it works?

Greg Solberg added 2 commits October 16, 2018 16:05
@gsolberg
Author

The goblin update seems to work. The remaining changes here are to use the actual segment size instead of the allocated size and to implement has_debug_symbols.
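
A rough sketch of what a has_debug_symbols check could look like once section names are resolved. Treating the presence of a `.debug_info` section as the signal is an assumption made for illustration, and the free function below, which takes an iterator of names, is hypothetical rather than the method added in this PR.

```rust
/// Report whether any resolved section name is ".debug_info".
/// Assumption: a DWARF-carrying file always has a `.debug_info` section.
fn has_debug_symbols<'a, I>(section_names: I) -> bool
where
    I: IntoIterator<Item = &'a str>,
{
    section_names.into_iter().any(|name| name == ".debug_info")
}
```

Note that the check only works on resolved names: a raw entry such as `/4` would not match, which is why the section name decoding has to happen first.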

src/pe.rs Outdated
@@ -112,7 +112,7 @@ where
                 if name == section_name {
                     return Some(Cow::from(
                         &self.data[section.pointer_to_raw_data as usize..]
-                            [..section.size_of_raw_data as usize],
+                            [..section.virtual_size as usize],
Contributor

This change doesn't seem right to me. My understanding is that size_of_raw_data is the number of bytes in the file, and if virtual_size is larger than this then it needs zero padding. We haven't done zero padding for other file formats, but if it's something you need then we can consider it.
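
For illustration, zero padding of the kind described above might look like the following sketch. The helper name `pad_to_virtual_size` is hypothetical, and it assumes the caller has already sliced the on-disk bytes of the section out of the file.

```rust
use std::borrow::Cow;

/// Extend `raw` with zero bytes up to `virtual_size` (hypothetical helper).
/// Borrows when no padding is needed, so the common case does not allocate.
fn pad_to_virtual_size(raw: &[u8], virtual_size: usize) -> Cow<'_, [u8]> {
    if virtual_size <= raw.len() {
        return Cow::Borrowed(raw);
    }
    // The bytes past the on-disk data are defined to be zero when loaded.
    let mut padded = raw.to_vec();
    padded.resize(virtual_size, 0);
    Cow::Owned(padded)
}
```

Returning a `Cow` matches the return type already visible in the diff above, so padding would only cost an allocation for sections that actually need it.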

Author

I'll admit I don't really know what virtual_size is supposed to represent. In my case, virtual_size is always less than size_of_raw_data. My test case uses the gimli example program dwarfdump to dump a DLL I've created with gcc. If I use size_of_raw_data, I get an UnexpectedEof error from gimli as it tries to parse the units in the section. Looking at the section data, it looks to me like the data after virtual_size is just unused garbage filling out the actual section size of the file. I can see how this would also not work if virtual_size was larger than size_of_raw_data. I have only tested this on DLLs created with one version of gcc: "gcc.exe (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 8.1.0". Is there some other number in the section that will give us a count of the bytes used instead of the "virtual" (potentially uncompressed?) size? Or maybe we use min(virtual_size, size_of_raw_data)?

Contributor

Ah interesting, I'll have to dig into this a bit to understand what's going on. Are you able to share that DLL with me, or provide a dump of the headers (I'm not familiar enough with windows to know what's good for that, maybe the rdr example from goblin can help)?

Author

I'm actually creating DLLs on Windows and parsing them on OS/X. Here's my simple test case.
On Windows:

$ echo "int foo = 0;" > foo.c
$ gcc -shared -o foo.dll foo.c

On OS/X using gimli compiled with this version of object:

$ RUST_BACKTRACE=1 cargo run --example dwarfdump -- foo.dll

I get correct data using virtual_size and an UnexpectedEof error using size_of_raw_data. You can find a copy of my foo.dll here.

Contributor

I should have read the doc more carefully:

SizeOfRawData: The size of the section (for object files) or the size of the initialized data on disk (for image files). For executable images, this must be a multiple of FileAlignment from the optional header. If this is less than VirtualSize, the remainder of the section is zero-filled. Because the SizeOfRawData field is rounded but the VirtualSize field is not, it is possible for SizeOfRawData to be greater than VirtualSize as well. When a section contains only uninitialized data, this field should be zero.

So I think the correct behaviour is to use the minimum. Maybe later we can consider zero padding too.
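
A minimal sketch of the "use the minimum" behaviour for an image file such as the DLL discussed here. The `SectionHeader` struct is a hypothetical stand-in that only borrows the field names seen in the diff, and `section_bytes` is an illustrative name, not the code that was merged.

```rust
use std::cmp;

/// Hypothetical stand-in for the PE section header fields used in this thread.
struct SectionHeader {
    pointer_to_raw_data: u32,
    size_of_raw_data: u32,
    virtual_size: u32,
}

/// Return the meaningful on-disk bytes of a section, or None if the
/// header points outside the file.
fn section_bytes<'a>(file: &'a [u8], section: &SectionHeader) -> Option<&'a [u8]> {
    let start = section.pointer_to_raw_data as usize;
    // size_of_raw_data is rounded up to FileAlignment and virtual_size is not,
    // so the smaller of the two excludes the alignment filler at the end.
    let len = cmp::min(section.virtual_size, section.size_of_raw_data) as usize;
    let end = start.checked_add(len)?;
    file.get(start..end)
}
```

Using `get` rather than direct indexing means a malformed header yields `None` instead of a panic.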

Contributor

@philipc philipc left a comment

Thanks!

@philipc philipc merged commit d1279ec into gimli-rs:master Oct 24, 2018
mcbegamerxx954 pushed a commit to mcbegamerxx954/object that referenced this pull request Jun 15, 2024