Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

goblin 0.0.18 fails to parse dynsyms #111

Closed
endeav0r opened this issue Oct 21, 2018 · 11 comments
Closed

goblin 0.0.18 fails to parse dynsyms #111

endeav0r opened this issue Oct 21, 2018 · 11 comments

Comments

@endeav0r
Copy link

endeav0r commented Oct 21, 2018

I am parsing a simple binary (I have uploaded the binary here: http://reversing.io/goblin/loop-0 ).

I recently updated to goblin 0.0.18, and noticed some symbols were not being parsed. Elf.dynsyms.len() returns a length of zero. I ran the binary through bingrep, and dynsyms were parsed fine, but then noticed I was on bingrep 0.5.0. I upgraded bingrep to the latest version, and now I get the following:

                 0    WEAK       NOTYPE      _ITM_registerTMCloneTable                0x0                             0x0    
                 0    WEAK       FUNC        __cxa_finalize@@GLIBC_2.2.5              0x0                             0x0    
               530    GLOBAL     FUNC        _init                                    0x0     .init(11)               0x0    

Dyn Syms(0):
  
thread 'main' panicked at 'index out of bounds: the len is 0 but the index is 0', /Users/travis/build/rust-lang/rust/src/libcore/slice/mod.rs:2085:10
note: Run with `RUST_BACKTRACE=1` for a backtrace.

I am troubleshooting through goblin, but thought I would share the issue now.

@endeav0r
Copy link
Author

gnu_hash_len from @philipc's push here: f682b18 is returning 0 for this binary.

@philipc
Copy link
Collaborator

philipc commented Oct 21, 2018

That binary has no defined dynamic symbols, and 7 undefined symbols.

The hash table min_chain value is 1, which means that the defined dynamic symbols start at 1. That is, after the placeholder symbol at index 0. We could potentially change gnu_hash_len to return 1 instead of 0 in this case, but that's not going to gain you anything IMO. I should double check that the logic in gnu_hash_len is correct for an empty hash table though, and that it's not reading undefined values.

What we could do is find the maximum symbol index in the dyn/plt relocations if gnu_hash_len returns 0.

@m4b
Copy link
Owner

m4b commented Oct 21, 2018

I'm confused; the binary appears to have several imported symbols, which the dynamic linker would have to resolve; not finding the symbols (or returning 0, when there are 7) in this case seems like a regression, yes?

@philipc
Copy link
Collaborator

philipc commented Oct 21, 2018

Yes, it's a regression, but the question is, how do you determine how many imported symbols there are? As you know, the binary does not include a length for the dynamic symbol table. AFAIK the dynamic linker accesses the imported symbols via the index in relocations, so if we want to find them then we need to do the same.

@m4b
Copy link
Owner

m4b commented Oct 21, 2018

@philipc ah got it, so isn't the algo then simply going to be, take the relocation array (which we have already), create a count of unique symbols it references, and then SizeWith::size_with::<Sym>(ctx) * count for our dynsym len?

@m4b
Copy link
Owner

m4b commented Oct 21, 2018

I would just like to marinate for a moment that all of this tomfoolery is because the elf spec omits a simple count field :]

@endeav0r
Copy link
Author

This logic is for leaking symbols from amd64 binaries, but it may help: https://github.com/endeav0r/skua/blob/54bbddc42bb16a3a05263e4d85607955c3c90e0a/lib/leak/elf/mod.rs#L204

There's a structure in there somewhere that has the number of symbols.

@philipc
Copy link
Collaborator

philipc commented Oct 21, 2018

@m4b Yes, except you only need the maximum symbol index, not the count of unique symbols.

@endeav0r we're already doing that in hash_len, but this binary has DT_GNU_HASH, not DT_HASH.

@m4b
Copy link
Owner

m4b commented Oct 21, 2018

ok that's good, that means we can compute the size without allocating a hashmap or something; can probably just just do max_idx = std::cmp::max(max_idx, idx); ?

@m4b
Copy link
Owner

m4b commented Oct 24, 2018

@endeav0r ok new version 0.0.19 published, sorry about that :/ please let us know if you see anything else, hopefully this is resolved :)

@endeav0r
Copy link
Author

Thanks guys :) I'll give 0.0.19 a run here in the next few days.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants