BPF: Add helpers for BPF links and BTF objects, update tools/ script to work with Linux 5.10+ #152

Draft
wants to merge 2 commits into
base: main

Contributor

@qmonnet qmonnet commented Jan 29, 2022

bpf_prog.aux.member_("trampoline") is no longer available starting from kernel 5.10, so the BPF tool script fails to display the linked functions properly on newer systems. This PR fixes the issue by retrieving the link and the tracing link associated with the program in order to access the target.

Since we need to iterate over BPF links to do that, add the relevant iterator, and add one for BTF objects as well, since they're all pretty similar. Update the BPF tool script so it can use those new iterators to list links and BTF objects.

Please refer to individual commit descriptions for more details.

Owner

@osandov osandov left a comment

Thanks for the fix! I left a few minor comments.

(This is a good reminder that the BPF helpers need unit tests so that we catch version breakages, but that's for another PR.)

drgn/helpers/linux/bpf.py (outdated, resolved)
drgn/helpers/linux/bpf.py (outdated, resolved)
tools/bpf_inspect.py (outdated, resolved)
tools/bpf_inspect.py (outdated, resolved)
tools/bpf_inspect.py (outdated, resolved)
Contributor Author

qmonnet commented Feb 21, 2022

Thanks a lot for the feedback and review, and apologies for the delay. I'm still working on this PR. I fixed the issues you reported some time ago, but so far I've been unable to find the time to test the latest changes properly. I'm pushing my code anyway but marking as draft, until I get a chance to check that it works correctly.

Owner

osandov commented Jul 22, 2022

Hi @qmonnet, I'm catching up on my backlog of PRs and I'm revisiting this one. I just merged the first commit from this PR adding the bpf_link_for_each() and bpf_btf_for_each() helpers: 764a858. I also added test cases for all of the BPF helpers: 43f045a.

For the second commit fixing bpf_inspect.py, what sort of testing did you want to do? Now that the unit tests can test BPF, maybe we can add a test case there.

For the third commit, do the link and btf printing commands print anything that isn't available via bpftool? If so, it may not make sense to maintain that code in bpf_inspect.py, too, since its charter is to "list ... properties unavailable via kernel API".

Contributor Author

qmonnet commented Jul 23, 2022

> Hi @qmonnet, I'm catching up on my backlog of PRs and I'm revisiting this one. I just merged the first commit from this PR adding the bpf_link_for_each() and bpf_btf_for_each() helpers: 764a858. I also added test cases for all of the BPF helpers: 43f045a.

Thanks a lot! Sorry for failing to follow-up here. The new tests look neat!

> For the second commit fixing bpf_inspect.py, what sort of testing did you want to do? Now that the unit tests can test BPF, maybe we can add a test case there.

I wanted to make sure that printing trampolines/target programs would still work as expected on “all” kernel versions. We've got 3 cases after this patch: pre-5.5, 5.5 <= x < 5.10, and >= 5.10. I tested the last case (with 5.15) on my laptop and created a VM to try the first case (5.4), but I never found the time to create a second VM to test the remaining case. Although, given that we have not changed how the script behaves for kernels < 5.10 and that it works as expected on 5.4, I wouldn't expect too many bad surprises and maybe we're good to (rebase and) merge this change.
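To make the three cases concrete, here is a minimal sketch that classifies a kernel release string. The function name and the returned labels are made up for illustration, and the script itself may well probe for struct members rather than compare version numbers.

```python
import re


def trampoline_lookup_case(release: str) -> str:
    """Classify a kernel release string into one of the three cases.

    Hypothetical helper for illustration only; the labels are not taken
    from bpf_inspect.py.
    """
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    version = (int(match.group(1)), int(match.group(2)))
    if version < (5, 5):
        return "pre-5.5"
    if version < (5, 10):
        # Here prog->aux still has a "trampoline" member.
        return "5.5 <= x < 5.10"
    # From 5.10 on, the target has to be found through the tracing link.
    return ">= 5.10"


for release in ("5.4.0", "5.8.18-generic", "5.15.0-41-generic"):
    print(release, "->", trampoline_lookup_case(release))
```

Comparing `(major, minor)` tuples avoids the trap of comparing version strings, where "5.10" would sort before "5.5".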

It would probably be a good idea to have a test that shows programs with the helper and checks the target program when used with a trampoline. I haven't dug enough yet to check if you had this already, or to check how you are running your CI. Do you cover several kernel versions?

> For the third commit, do the link and btf printing commands print anything that isn't available via bpftool? If so, it may not make sense to maintain that code in bpf_inspect.py, too, since its charter is to "list ... properties unavailable via kernel API".

The listing from link and btf should both be available from bpftool, that's correct. Listing programs and maps also has some overlap with bpftool, so I thought it would be nice to have links and BTF in drgn too, for the sake of completeness, given that it was already able to show some of the BPF objects.

Obviously that's your call. If you prefer to leave them aside, I guess we could skip that patch.

Since Linux 5.10 and commit 3aac1ead5eb6 ("bpf: Move
prog->aux->linked_prog and trampoline into bpf_link on attach"), the
kernel "struct bpf_prog_aux" no longer has a "trampoline" attribute.
Instead, it got a "dst_trampoline" attribute, but the latter cannot be
used as a drop-in replacement, because the reference from the tracing
program to the target is removed after attaching (and this pointer is
NULL).

To retrieve the target program, we must instead loop on the existing
links, find their parent "struct bpf_tracing_link" object, and get the
target program from there. Let's adjust the script accordingly, but keep
compatibility for older kernels too.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
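
A rough sketch of the lookup described in this commit message. To keep it runnable without a kernel, the links are plain Python objects, and `is_tracing_link` and `to_tracing_link` are hypothetical callables. With drgn, the links would presumably come from bpf_link_for_each() and the second callable would be something like `container_of(link, "struct bpf_tracing_link", "link")`. The `tgt_prog` member name is an assumption about "struct bpf_tracing_link", not something stated above.

```python
from types import SimpleNamespace


def find_target_prog(prog, links, is_tracing_link, to_tracing_link):
    """Return the program that `prog` is attached to, or None.

    links: iterable of link objects, each with a .prog attribute
    is_tracing_link: callable telling whether a link is a tracing link
    to_tracing_link: callable mapping a link to its enclosing tracing link
    """
    for link in links:
        # Skip links owned by other programs and links of other types.
        if link.prog != prog or not is_tracing_link(link):
            continue
        # Assumed member name; see the note above.
        return to_tracing_link(link).tgt_prog
    return None


# Plain-Python stand-ins for the kernel objects.
target = SimpleNamespace(name="target_prog")
fentry = SimpleNamespace(name="fentry_prog")
other = SimpleNamespace(name="other_prog")
links = [
    SimpleNamespace(prog=other, type="xdp", parent=None),
    SimpleNamespace(prog=fentry, type="tracing",
                    parent=SimpleNamespace(tgt_prog=target)),
]

found = find_target_prog(
    fentry,
    links,
    is_tracing_link=lambda link: link.type == "tracing",
    to_tracing_link=lambda link: link.parent,
)
print(found.name)
```

Checking the link type before converting matters because only tracing links are embedded in a tracing link structure.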
Contributor Author

qmonnet commented Jul 23, 2022

[Rebased on current main.]

Add two commands to the BPF script, "b" and "l", to list the BTF objects
and the BPF links (respectively) loaded on the system. These commands
make use of the recently introduced helpers for iterating over those
objects.

Older kernels do not have eBPF links, so make sure we handle the lookup
exception properly.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
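
A minimal sketch of that kind of handling, assuming the failure on older kernels surfaces as a LookupError (or a subclass such as KeyError). `list_links` and the two generator functions are hypothetical stand-ins for illustration, not the code in this commit.

```python
def list_links(iter_links):
    """Return all links as a list, or an empty list if the lookup fails."""
    try:
        # Consume the iterator inside the try block: a generator only
        # raises once it is iterated, not when it is created.
        return list(iter_links())
    except LookupError:
        # Assumed failure mode on kernels without BPF links.
        return []


def links_on_old_kernel():
    # Stand-in for a link iterator on a kernel without BPF links;
    # the message is made up.
    raise LookupError("link lookup failed")
    yield  # unreachable, but makes this a generator function


def links_on_new_kernel():
    # Stand-in for a link iterator on a kernel that has BPF links.
    yield from ("link 1", "link 2")


print(list_links(links_on_old_kernel))
print(list_links(links_on_new_kernel))
```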
Owner

osandov commented Jul 24, 2022

> I wanted to make sure that printing trampolines/target programs would still work as expected on “all” kernel versions. We've got 3 cases after this patch: pre-5.5, 5.5 <= x < 5.10, and >= 5.10. I tested the last case (with 5.15) on my laptop and created a VM to try the first case (5.4), but I never found the time to create a second VM to test the remaining case. Although, given that we have not changed how the script behaves for kernels < 5.10 and that it works as expected on 5.4, I wouldn't expect too many bad surprises and maybe we're good to (rebase and) merge this change.

> It would probably be a good idea to have a test that shows programs with the helper and checks the target program when used with a trampoline. I haven't dug enough yet to check if you had this already, or to check how you are running your CI. Do you cover several kernel versions?

Yup, the CI tests a bunch of kernel versions: https://github.com/osandov/drgn/blob/main/setup.py#L134. The design is documented here: https://github.com/osandov/drgn/tree/main/vmtest.

drgn's test suite doesn't have any test cases for tools yet, so that's something I'd like to figure out. If you can give me an example of how to create some BPF trampolines (perhaps with BPF_TRACE_FENTRY programs?), I can adapt that into a test. I'm imagining something like doing some bpf(2) calls to create the programs like we do for the helper tests, then executing the tool and checking its output.
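
As a rough illustration of the "execute the tool and check its output" idea, here is a sketch using only the standard library. `FAKE_TOOL` is a hypothetical stand-in for invoking tools/bpf_inspect.py, and its output line is made up; a real test would first create the programs with bpf(2) calls.

```python
import subprocess
import sys
import unittest

# Hypothetical stand-in for the real tool: prints one made-up line.
FAKE_TOOL = [sys.executable, "-c", "print('42: fentry_demo')"]


def run_tool(argv):
    """Run a command-line tool and return its standard output."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


class TestToolOutput(unittest.TestCase):
    def test_lists_tracing_program(self):
        # Setup (creating BPF programs and links) is omitted here.
        output = run_tool(FAKE_TOOL)
        self.assertIn("fentry_demo", output)
```

Passing `check=True` makes a non-zero exit status of the tool fail the test instead of being silently ignored.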

Contributor Author

qmonnet commented Jul 30, 2022

> Yup, the CI tests a bunch of kernel versions: https://github.com/osandov/drgn/blob/main/setup.py#L134. The design is documented here: https://github.com/osandov/drgn/tree/main/vmtest.

Thanks for the pointers

> If you can give me an example of how to create some BPF trampolines (perhaps with BPF_TRACE_FENTRY programs?)

I don't have a minimal example to point you to :/ and it seems to be a bit more involved. All examples I can think of are using libbpf (usually with BPF skeletons). For testing here on my side, I profiled an eBPF program (any should do) with bpftool (bpftool prog profile id 685 cycles). This attached new programs with fentry/fexit.

I think that libbpf calls attach_trace() and, from there, bpf_program__attach_btf_id(). In that function we create a BPF link, open a raw tracepoint for the program (bpf_raw_tracepoint_open()), and attach its fd to the link. There's also some BTF involved before all that, I think: we need to pass the relevant BTF id when loading the fentry program (so before attaching).
