Issue #40 -- Support DWARF-in-PE #69
Thanks for the PR! This looks good, but goblin recently added support for decoding these section names too (m4b/goblin#100), and it handles base64 too. So we should probably wait for a goblin release and use it. Or if you want to temporarily change object to use a git version of goblin, that would be fine too.
OK, that's good news. Goblin is the right place to add this support. I have a workaround in place now, so I'm not in a big hurry for this support. I'm fine waiting for goblin to release. Shall we kill this PR once goblin is released?
Yeah, let's leave this open until goblin is released so that it isn't lost.
Hi @gsolberg, goblin has been updated. It looks like we just need to update
DWARF section names in PE files are stored indirectly through a symbol table. A name of the form "/n", where "n" is a number, indicates that the section name is stored at index "n" of a hidden symbol table. The hidden symbol table immediately follows the COFF symbol table and is a sequence of null-terminated C strings.
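For illustration, the lookup described above could be sketched roughly as follows. This is not the goblin or object implementation. The function name `resolve_section_name` and the `strtab` parameter are hypothetical, and the sketch assumes that "n" is a decimal byte offset into the string table that follows the COFF symbol table, counted from the start of that table (including its leading 4-byte size field). It does not handle the base64 "//" form mentioned earlier in the thread.

```rust
/// Resolve a raw 8-byte COFF section name (hypothetical helper).
///
/// `strtab` is assumed to be the string table that follows the COFF
/// symbol table, including its leading 4-byte size field.
fn resolve_section_name<'a>(raw: &'a [u8; 8], strtab: &'a [u8]) -> Option<&'a str> {
    // Short names are stored inline and padded with NUL bytes.
    let len = raw.iter().position(|&b| b == 0).unwrap_or(raw.len());
    let inline = std::str::from_utf8(&raw[..len]).ok()?;

    // A name of the form "/n" refers to the string table instead.
    let digits = match inline.strip_prefix('/') {
        Some(d) if !d.is_empty() && d.bytes().all(|b| b.is_ascii_digit()) => d,
        // Anything else is treated as an ordinary inline name.
        _ => return Some(inline),
    };

    // Read the NUL-terminated string starting at offset n.
    let offset: usize = digits.parse().ok()?;
    let tail = strtab.get(offset..)?;
    let end = tail.iter().position(|&b| b == 0)?;
    std::str::from_utf8(&tail[..end]).ok()
}
```

With a string table that holds ".debug_info" right after the size field, the raw name "/4" would resolve to ".debug_info", while a short name such as ".text" is returned unchanged.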
The goblin update seems to work. The remaining changes here are to use the actual segment size instead of the allocated size and to implement
src/pe.rs
@@ -112,7 +112,7 @@ where
             if name == section_name {
                 return Some(Cow::from(
                     &self.data[section.pointer_to_raw_data as usize..]
-                        [..section.size_of_raw_data as usize],
+                        [..section.virtual_size as usize],
This change doesn't seem right to me. My understanding is that size_of_raw_data is the number of bytes in the file, and if virtual_size is larger than this then it needs zero padding. We haven't done zero padding for other file formats, but if it's something you need then we can consider it.
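To make the zero-padding idea concrete, here is a rough sketch. It is not code from object. The function `section_bytes_padded` and its parameters are hypothetical stand-ins for the section header fields, and it only pads when virtual_size exceeds size_of_raw_data, returning the on-disk bytes unchanged otherwise.

```rust
use std::borrow::Cow;

/// Return a section's bytes, zero-padded up to `virtual_size` when the
/// on-disk data is shorter (hypothetical helper, not part of object).
fn section_bytes_padded(
    data: &[u8],
    pointer_to_raw_data: u32,
    size_of_raw_data: u32,
    virtual_size: u32,
) -> Option<Cow<'_, [u8]>> {
    // Bytes actually present in the file; None if the range is out of bounds.
    let start = pointer_to_raw_data as usize;
    let on_disk = data.get(start..)?.get(..size_of_raw_data as usize)?;

    let wanted = virtual_size as usize;
    if wanted <= on_disk.len() {
        // Nothing to pad: borrow the file bytes as they are.
        return Some(Cow::Borrowed(on_disk));
    }

    // The in-memory section is larger than the file data: the remainder
    // is zero-filled, which requires an owned copy.
    let mut padded = on_disk.to_vec();
    padded.resize(wanted, 0);
    Some(Cow::Owned(padded))
}
```

Returning a Cow keeps the common case allocation-free and only copies when padding is actually needed.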
I'll admit I don't really know what virtual_size is supposed to represent. In my case I have found that virtual_size is always less than size_of_raw_data. My test case uses the gimli example program dwarfdump to dump a DLL I've created with gcc. If I use size_of_raw_data I get an UnexpectedEof error from gimli as it tries to parse the units in the section. Looking at the section data, it looks to me like the data after virtual_size is just unused garbage filling out the actual section size of the file. I can see how this would also not work if virtual_size were larger than size_of_raw_data. I have only tested this on DLLs created with one version of gcc: "gcc.exe (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 8.1.0". Is there some other number in the section that will give us a count of the bytes used instead of the "virtual" (potentially uncompressed?) size? Or maybe we use min(virtual_size, size_of_raw_data)?
Ah, interesting. I'll have to dig into this a bit to understand what's going on. Are you able to share that DLL with me, or provide a dump of the headers? (I'm not familiar enough with Windows to know what's good for that; maybe the rdr example from goblin can help.)
I'm actually creating DLLs on Windows and parsing them on OS/X. Here's my simple test case.
On Windows:
$ echo "int foo = 0;" > foo.c
$ gcc -shared -o foo.dll foo.c
On OS/X using gimli compiled with this version of object:
$ RUST_BACKTRACE=1 cargo run --example dwarfdump -- foo.dll
I get correct data using virtual_size and an UnexpectedEof error using size_of_raw_data. You can find a copy of my foo.dll here.
I should have read the doc more carefully:
SizeOfRawData: The size of the section (for object files) or the size of the initialized data on disk (for image files). For executable images, this must be a multiple of FileAlignment from the optional header. If this is less than VirtualSize, the remainder of the section is zero-filled. Because the SizeOfRawData field is rounded but the VirtualSize field is not, it is possible for SizeOfRawData to be greater than VirtualSize as well. When a section contains only uninitialized data, this field should be zero.
So I think the correct behaviour is to use the minimum. Maybe later we can consider zero padding too.
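A minimal sketch of the "use the minimum" behaviour, assuming the two sizes are read straight from the section header. The function name `section_file_bytes` is hypothetical, and this is not the code that landed in object.

```rust
/// Slice a section's bytes out of the file, clamping the length to
/// min(size_of_raw_data, virtual_size) (hypothetical helper).
fn section_file_bytes(
    data: &[u8],
    pointer_to_raw_data: u32,
    size_of_raw_data: u32,
    virtual_size: u32,
) -> Option<&[u8]> {
    let start = pointer_to_raw_data as usize;
    // size_of_raw_data is rounded up to the file alignment, so the
    // unrounded virtual_size can be the smaller, more accurate count.
    let len = size_of_raw_data.min(virtual_size) as usize;
    // Checked slicing: returns None instead of panicking on a bad header.
    data.get(start..)?.get(..len)
}
```

One caveat, stated as an assumption: the quoted documentation distinguishes object files from images, and this sketch only targets images such as the DLL in the test case. A file whose sections report a virtual_size of zero would yield empty data here and would need a separate code path.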
Thanks!
Issue gimli-rs#40 -- Support DWARF-in-PE
These changes were sufficient to allow reading DWARF records on Mac, Linux, and Windows systems in my application. I found no formal documentation on how to read DWARF records from a PE file, but arrived at this solution by looking at other code and poking around in a PE file.