uaddr(), usym(), ustack to support PIE ASLR #75

Closed
brendangregg opened this issue Sep 11, 2018 · 7 comments
Labels
bug Something isn't working

Comments

@brendangregg
Contributor

Ubuntu 18.04 Bionic (and other OSes) now build position independent executables by default, so the load address of the main executable is randomized, which breaks simple approaches to symbol resolution. From https://wiki.ubuntu.com/BionicBeaver/ReleaseNotes#Security_Improvements:

In Ubuntu 18.04 LTS, gcc is now set to default to compile applications as position independent executables (PIE) as well as with immediate binding, to make more effective use of Address Space Layout Randomization (ASLR).

The bpftrace uaddr() call needs to work on both normal executables and PIE executables (gcc -pie -fpie). Here's how to tell the difference:

# file uaddr-old uaddr-pie
uaddr-old: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=1babcdea1d0220ae6982428da4e7e4c665c587d7, not stripped
uaddr-pie: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=01419c4c8cddc834734c552097af169b83b7d77e, not stripped

From the above output, uaddr-old is an "executable", whereas uaddr-pie is a "shared object".
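
The same distinction can be read straight from the ELF header: `file` reports "executable" for `e_type == ET_EXEC` (2) and "shared object" for `e_type == ET_DYN` (3). Below is a minimal Python sketch of that check. The helper names `elf_type` and `describe` are hypothetical and not part of bpftrace or bcc, and note that ET_DYN alone does not distinguish a PIE executable from an ordinary shared library.

```python
import struct

ET_EXEC = 2  # fixed-address executable
ET_DYN = 3   # shared object; PIE executables are also marked this way


def elf_type(header: bytes) -> int:
    """Return the e_type field given at least the first 18 bytes of an ELF file."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[EI_DATA] is byte 5: 1 = little-endian, 2 = big-endian.
    endian = "<" if header[5] == 1 else ">"
    # e_type is a 16-bit field immediately after the 16-byte e_ident array.
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return e_type


def describe(path: str) -> str:
    """Hypothetical helper: classify a file on disk the way `file` does above."""
    with open(path, "rb") as f:
        e_type = elf_type(f.read(18))
    names = {ET_EXEC: "executable", ET_DYN: "shared object (PIE or library)"}
    return names.get(e_type, "other")
```

Calling `describe("/root/uaddr-pie")` on the binaries above should classify them the same way as the `file` output, assuming the files are readable.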

You can also see this in the address space of a running process:

# pmap -x `pgrep -n uaddr-old` | head
30157:   ./uaddr-old
Address           Kbytes     RSS   Dirty Mode  Mapping
0000000000400000       4       4       0 r-x-- uaddr-old
0000000000400000       0       0       0 r-x-- uaddr-old
0000000000600000       4       4       4 r---- uaddr-old
0000000000600000       0       0       0 r---- uaddr-old
0000000000601000       4       4       4 rw--- uaddr-old
0000000000601000       0       0       0 rw--- uaddr-old
00007ff992637000    1792     804       0 r-x-- libc-2.23.so
00007ff992637000       0       0       0 r-x-- libc-2.23.so

# pmap -x `pgrep -n uaddr-pie` | head
30158:   ./uaddr-pie
Address           Kbytes     RSS   Dirty Mode  Mapping
0000561202e7b000       4       4       0 r-x-- uaddr-pie
0000561202e7b000       0       0       0 r-x-- uaddr-pie
000056120307b000       4       4       4 r---- uaddr-pie
000056120307b000       0       0       0 r---- uaddr-pie
000056120307c000       4       4       4 rw--- uaddr-pie
000056120307c000       0       0       0 rw--- uaddr-pie
00007feec176e000    1792     776       0 r-x-- libc-2.23.so
00007feec176e000       0       0       0 r-x-- libc-2.23.so

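This means that static address lookups with tools like objdump no longer work for PIE binaries, since objdump reports link-time offsets rather than runtime addresses: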

# objdump -tT uaddr-old | grep my
0000000000400500 l     F .text	0000000000000000              frame_dummy
0000000000600e10 l     O .init_array	0000000000000000              __frame_dummy_init_array_entry
0000000000400526 g     F .text	0000000000000011              mysleep
0000000000601038 g     O .data	0000000000000008              mystring

# objdump -tT uaddr-pie | grep my
0000000000000740 l     F .text	0000000000000000              frame_dummy
0000000000200de0 l     O .init_array	0000000000000000              __frame_dummy_init_array_entry
0000000000000770 g     F .text	0000000000000011              mysleep
0000000000201038 g     O .data	0000000000000008              mystring

However, uprobes fire at the correct runtime address in both cases:

# bpftrace -e 'uprobe:/root/uaddr-old:mysleep { printf("hit at %llx\n", reg("ip")); }'
Attaching 1 probe...
hit at 400526
hit at 400526
^C

# bpftrace -e 'uprobe:/root/uaddr-pie:mysleep { printf("hit at %llx\n", reg("ip")); }'
Attaching 1 probe...
hit at 561202e7b770
hit at 561202e7b770
^C

uprobe already works for both!

uprobe uses bcc_resolve_symname() to get the offset. Maybe we can do the same here, since it seems to already deal with PIE.
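
The numbers above are consistent with a simple "load base plus symbol offset" calculation. The following Python sketch checks that arithmetic using the values from the pmap, objdump, and uprobe output. It assumes that the first mapping of the binary is its load base and that the objdump values are offsets from that base, which holds for this example but is not guaranteed in general. The function name `runtime_address` is hypothetical.

```python
def runtime_address(load_base: int, symbol_offset: int) -> int:
    """Runtime address of a symbol in a PIE binary, assuming objdump
    offsets are relative to the start of the first mapping."""
    return load_base + symbol_offset


pie_base = 0x0000561202E7B000  # first uaddr-pie mapping from pmap
mysleep_offset = 0x770         # mysleep from objdump -tT uaddr-pie
mystring_offset = 0x201038     # mystring from objdump -tT uaddr-pie

# Matches the "hit at 561202e7b770" uprobe output.
print(hex(runtime_address(pie_base, mysleep_offset)))   # 0x561202e7b770
# Lands inside the rw mapping that starts at 000056120307c000.
print(hex(runtime_address(pie_base, mystring_offset)))  # 0x56120307c038
```

For the non-PIE binary no adjustment is needed: objdump already reports 0x400526 for mysleep, which is the address the uprobe hit.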

@brendangregg
Contributor Author

This probably requires #59 to be fixed first. Here's why:

# gdb -p `pgrep uaddr`
[...]
(gdb) x/s (char *)mystring
0x55a801fbd704:	"abcdef123\n"

Ok, then trying that address:

# ./src/bpftrace -e 'uprobe:/root/uaddr:mysleep { printf("hi %s\n", str(*0x55a801fbd704)); }'
Attaching 1 probe...
hi
hi

doesn't work, because:

# ./src/bpftrace -d -e 'uprobe:/root/uaddr:mysleep { printf("hi %s\n", str(*0x55a801fbd704)); }'
Program
 uprobe:/root/uaddr:mysleep
  call: printf
   string: hi %s\n
   call: str
    dereference
     int: 33281796

It's truncated that 64-bit address (0x55a801fbd704) to a 32-bit number: 33281796 is 0x01fbd704, the low 32 bits of the address.
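
The truncation can be confirmed with a quick calculation. This is a small Python check of the arithmetic only, not bpftrace code, and the function name `truncate_to_u32` is hypothetical:

```python
def truncate_to_u32(value: int) -> int:
    """Keep only the low 32 bits, as happens when a 64-bit value is
    stored in a 32-bit integer."""
    return value & 0xFFFFFFFF


addr = 0x55A801FBD704
print(truncate_to_u32(addr))       # 33281796, the value in the -d output
print(hex(truncate_to_u32(addr)))  # 0x1fbd704
```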

@mmarchini I wonder if you hit this while you were debugging as well...

@mmarchini
Contributor

That makes sense. Maybe fixing #59 will fix this as well.

@brendangregg
Contributor Author

I fixed #59, and can now read the string:

# echo 'print (char *)mystring' | gdb -q -p `pgrep -nx uaddr`
Attaching to process 21092
Reading symbols from /root/uaddr...(no debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...(no debugging symbols found)...done.
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
0x00007fc07a9529a4 in nanosleep () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) $1 = 0x5576c4a28704 "abcdef123\n"
[...]

# ./src/bpftrace -de 'uprobe:/root/uaddr:mysleep { printf("%s\n", str(*0x5576c4a28704)); }'
Program
 uprobe:/root/uaddr:mysleep
  call: printf
   string: %s\n
   call: str
    dereference
     int: 93968593487620
[...]

# ./src/bpftrace -e 'uprobe:/root/uaddr:mysleep { $p = 0x5576c4a28704; printf("%llx is: %s\n", $p, str($p)); }'
Attaching 1 probe...
5576c4a28704 is: abcdef123

5576c4a28704 is: abcdef123

5576c4a28704 is: abcdef123
[...]

Note I'm not dereferencing (*) the address since gdb has given me the direct address rather than a pointer. Maybe the info address gdb command would be better:

(gdb) info address mystring
Symbol "mystring" is at 0x5576c4c29010 in a file compiled without debugging.
[...]

# ./src/bpftrace -e 'uprobe:/root/uaddr:mysleep { $p = 0x5576c4c29010; printf("%llx is: %s\n", $p, str(*$p)); }'
Attaching 1 probe...
5576c4c29010 is: abcdef123

5576c4c29010 is: abcdef123

@brendangregg
Contributor Author

As part of fixing uaddr() for PIE ASLR, since it probably involves switching to bcc_resolve_symname(), I'd also improve the error message when the symbol can't be found. It's currently:

# ./src/bpftrace -e 'uprobe:/bin/bash:readline { printf("hi %x\n", uaddr("xxxxxxx")); }'
terminate called after throwing an instance of 'std::invalid_argument'
  what():  stoull
Aborted

I'd file this as a separate ticket, but I think the code will all change anyway to support PIE.

@brendangregg changed the title from "uaddr() to support PIE ASLR" to "uaddr(), usym(), ustack to support PIE ASLR" on Sep 26, 2019
@brendangregg
Contributor Author

I believe I hit something related using ustack. The reproducing workload is:

while :; do sleep 1; done

And tracing it:

# bpftrace -e 'u:/lib/x86_64-linux-gnu/libc.so.6:*nanosleep* /comm == "sleep"/ {
    printf("%s %s\n", usym(reg("ip")), ustack); }'
Attaching 4 probes...
__GI___nanosleep 
        __GI___nanosleep+0

__GI___nanosleep 
        __GI___nanosleep+0

0x7fe67d777990 
        0x7fe67d777990

0x7fe67d777990 
        0x7fe67d777990

0x7fac73cd5990 
        0x7fac73cd5990

0x7fac73cd5990 
        0x7fac73cd5990

Note that the first symbol lookups work, but subsequent ones do not. It looks like we've cached the symbol addresses for libc in a way that doesn't account for ASLR. The lookups work for the first sleep process, but each later sleep is a new process with libc mapped at a different base address, so the cached addresses no longer match.

@mmisono
Collaborator

mmisono commented Nov 19, 2019

I hit this issue recently and investigated.

As for uaddr(), I found a workaround. The ASLR base address can be obtained by consulting vm_area_struct. For example:

% bpftrace  --include linux/sched.h -e 'uprobe:/bin/bash:readline { printf("PS1: %s\n", str(*(curtask->mm->mmap->vm_start + uaddr("ps1_prompt")))); }'
Attaching 1 probe...
PS1: \[\e]0;\u@\h: \w\a\]${debian_chroot:+($debian_chroot)}\[\033[01

This assumes that the start address of the first vm_area_struct entry (the same as the first entry of /proc/pid/maps) is the base address of the program. I think this generally holds, but it is not guaranteed, so handling it properly needs further consideration.
As mentioned earlier, another possible solution is to use bcc_resolve_symname, but according to #811, bcc_resolve_symname is not general-purpose (e.g., it cannot resolve global variables).

As for ustack() and usym(), bpftrace caches bcc's SymbolCache object using the executable name as a key. To fix this issue, bpftrace needs to stop caching when the target file is subject to PIE ASLR. Perhaps we could use the ASLR base address as the cache key, although I'm not sure how to get that address easily (by consulting /proc/pid/maps?).
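
As a rough illustration of the /proc/pid/maps idea, here is a Python sketch that takes the lowest start address among the mappings backed by a given file. It is not how bpftrace or bcc does it. The function names are hypothetical, and treating the lowest mapping as the base address carries the same assumption as the vm_area_struct workaround above.

```python
def load_base(maps_text: str, path: str) -> int:
    """Return the lowest start address among mappings backed by `path`.

    `maps_text` is the content of /proc/<pid>/maps, where each line has the
    form: start-end perms offset dev inode pathname
    """
    starts = []
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        # Anonymous mappings have no pathname field, so they are skipped.
        if len(fields) == 6 and fields[5] == path:
            start_hex = fields[0].split("-")[0]
            starts.append(int(start_hex, 16))
    if not starts:
        raise LookupError(f"no mapping found for {path}")
    return min(starts)


def load_base_for_pid(pid: int, path: str) -> int:
    """Hypothetical convenience wrapper that reads the live maps file."""
    with open(f"/proc/{pid}/maps") as f:
        return load_base(f.read(), path)
```

A value obtained this way could serve either as the offset to add to uaddr() results or as part of a cache key, but it only stays valid for the lifetime of that process.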

mmisono added a commit to mmisono/bpftrace that referenced this issue Dec 26, 2019
`resolve_usym()` caches a `bcc_symbol` object using the executable name
as a key, but on ASLR-enabled platforms, symbol addresses change with
each execution. Discard the cache in this case so that symbol names
resolve properly.

Notes and known issues:

- The cache is discarded whenever `resolve_usym()` is called, even if the
pid is the same as the previous one, because pids may be reused.
- This does not check whether a binary is PIE or not. Note that
even if a binary is not PIE, addresses of shared libraries are
randomized if ASLR is enabled.  (If a binary is not PIE and
`resolve_usym()` resolves a symbol in that binary, we can use the cache.)
- If ASLR is disabled on the first execution but enabled on the second
execution, `resolve_usym()` for the second run will use the previous
cache.
- I'm not sure how much performance impact this has. If the impact is
huge, maybe this should be an option.
- As discussed in bpftrace#246, symbolizing will fail after process termination
(this is a separate issue). For example:

```
% bpftrace -e 'u:/lib/x86_64-linux-gnu/libc.so.6:*nanosleep* /comm == "sleep"/ { @[ustack] = count(); }'
Attaching 7 probes...
^C

@[
    0x7ff1917cb990
]: 3
@no such file or directory: /proc/3557/personality
[
    0x7fea4211c990
]: 3
@no such file or directory: /proc/3554/personality
[
    0x7f32bc51a990
]: 3
```

-----

Closes bpftrace#1031 and solves the second part of bpftrace#75.
mmisono added a commit to mmisono/bpftrace that referenced this issue Jan 5, 2020
… given

`resolve_usym()` caches a `bcc_symbol` object using the executable name
as a key, but on ASLR-enabled platforms, symbol addresses change with
each execution. Discard the cache in this case so that symbol names
resolve properly.

Introduce the `BPFTRACE_CACHE_USER_SYMBOLS` env variable to force caching.
Caching is fine when tracing only a single program execution.

Notes and known issues:

- The cache is discarded whenever `resolve_usym()` is called, even if the
pid is the same as the previous one, because pids may be reused.
- This does not check whether a binary is PIE or not. Note that
even if a binary is not PIE, addresses of shared libraries are
randomized if ASLR is enabled.  (If a binary is not PIE and
`resolve_usym()` resolves a symbol in that binary, we can use the cache.)
- If ASLR is disabled on the first execution but enabled on the second
execution, `resolve_usym()` for the second run will use the previous
cache.
- As discussed in bpftrace#246, symbolizing will fail after process termination
(this is a separate issue). For example:

```
% bpftrace -e 'u:/lib/x86_64-linux-gnu/libc.so.6:*nanosleep* /comm == "sleep"/ { @[ustack] = count(); }'
Attaching 7 probes...
^C

@[
    0x7ff1917cb990
]: 3
@
[
    0x7fea4211c990
]: 3
@
[
    0x7f32bc51a990
]: 3
```

-----

Closes bpftrace#1031 and solves the second part of bpftrace#75.
mmisono added a commit to mmisono/bpftrace that referenced this issue Jan 21, 2020
mmisono added a commit to mmisono/bpftrace that referenced this issue Jan 23, 2020
fbs pushed a commit that referenced this issue Feb 12, 2020
… given

`resolve_usym()` caches a `bcc_symbol` object using the executable name
as a key, but on ASLR-enabled platforms, symbol addresses change with
each execution. Discard the cache in this case so that symbol names
resolve properly.

Introduce the `BPFTRACE_CACHE_USER_SYMBOLS` env variable to force caching.
Caching is fine when tracing only a single program execution.

Notes and known issues:

- The cache is discarded whenever `resolve_usym()` is called, even if the
pid is the same as the previous one, because pids may be reused.
- This does not check whether a binary is PIE or not. Note that
even if a binary is not PIE, addresses of shared libraries are
randomized if ASLR is enabled.  (If a binary is not PIE and
`resolve_usym()` resolves a symbol in that binary, we can use the cache.)
- If ASLR is disabled on the first execution but enabled on the second
execution, `resolve_usym()` for the second run will use the previous
cache.
- As discussed in #246, symbolizing will fail after process termination
(this is a separate issue). For example:

```
% bpftrace -e 'u:/lib/x86_64-linux-gnu/libc.so.6:*nanosleep* /comm == "sleep"/ { @[ustack] = count(); }'
Attaching 7 probes...
^C

@[
    0x7ff1917cb990
]: 3
@
[
    0x7fea4211c990
]: 3
@
[
    0x7f32bc51a990
]: 3
```

-----

Closes #1031 and solves the second part of #75.
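
The caching behaviour described in the commit message can be pictured with the following Python sketch. It is an illustration only, not the bpftrace implementation: the class name, the `load` callback, and the `force_cache` flag (standing in for the effect of `BPFTRACE_CACHE_USER_SYMBOLS`) are all hypothetical.

```python
from typing import Callable, Dict


class SymbolTableCache:
    """Per-executable symbol table cache that is discarded on every lookup
    unless caching is forced."""

    def __init__(self, load: Callable[[int, str], dict], force_cache: bool = False):
        self._load = load                # hypothetical loader: (pid, exe) -> symbol table
        self._force_cache = force_cache  # stands in for the env variable
        self._tables: Dict[str, dict] = {}

    def get(self, pid: int, exe: str) -> dict:
        if not self._force_cache:
            # Addresses may differ between executions under ASLR, and pids
            # may be reused, so never trust a previously loaded table.
            self._tables.pop(exe, None)
        if exe not in self._tables:
            self._tables[exe] = self._load(pid, exe)
        return self._tables[exe]
```

With `force_cache=False` every call reloads the table, which is the safe default described above. With `force_cache=True` the first table is reused, which is only correct when a single program execution is being traced.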
@mmarchini added the bug label Feb 17, 2022
@jordalgo removed the bcc label Dec 12, 2023
@viktormalik
Contributor

I believe that this is now fixed by #2386 (at least to the point where we can't do much more to resolve symbols).

Closing.
