Resolve unknown/undefined types for tracepoints #1485

viktormalik · 2020-08-26T06:10:12Z

If a type is defined by a typedef located in a header that is not included, the clang parser fails. This change analyses error messages from the clang parser and retrieves the missing typedefs from BTF (if it is used). If some types still cannot be resolved, it gives the user a hint: either use BTF (run with --btf) or include the relevant headers.

Resolves #1154, resolves #730. It should also help with iovisor/kubectl-trace#109.

I was also wondering, if it would make sense to get rid of the --btf option completely. AFAIU, it is used for tracepoints only. Using BTF automatically whenever it is available would make most of the "unknown type name" errors disappear in default bpftrace setup.

Checklist

Language changes are updated in docs/reference_guide.md
User-visible and non-trivial changes updated in CHANGELOG.md
The new behaviour is covered by tests

mmisono · 2020-09-01T01:20:24Z

I found that if a struct has a member whose type is not in the BTF, then bpftrace hangs (though probably this does not happen in most cases.)

% ./src/bpftrace -e 'struct foo {aaa a;};  BEGIN { @ = ((struct foo*)0)->a}' -b
// hang

I was also wondering, if it would make sense to get rid of the --btf option completely.

I also think using BTF whenever it is available is reasonable. Is there any downside? possible slowdown?

viktormalik · 2020-09-02T07:34:12Z

I found that if a struct has a member whose type is not in the BTF, then bpftrace hangs (though probably this does not happen in most cases.)

Correct, thanks for noticing. Even though this shouldn't happen for tracepoints, hanging is not good. I fixed it; it now terminates and shows the corresponding error message.

I also think using BTF whenever it is available is reasonable. Is there any downside? possible slowdown?

I didn't notice any slowdown; in my local setup, the run times are almost equal with and without BTF. Also, we're using BTF for all other probe categories, where a larger number of probes can be attached at a time (thus more BTF lookups), so I don't see any reason not to use it for tracepoints, too.

mmisono

Looks good to me, though it's better to have another review. I think it is OK to remove --btf.

src/clang_parser.cpp

danobi

LGTM overall with small comments, but my only concern is the string parsing of error messages. I'm not sure whether error strings are part of the clang API, and I'd be worried if they weren't. I didn't look around too hard in clang for alternative solutions, so I'll take it on faith that this is the best way.

danobi · 2020-09-02T18:28:31Z

src/clang_parser.cpp

    bool bail_on_error)
 {
+  error_msgs.clear();


Nit: callee clear could cause confusion later and is less flexible than caller clear. I'd prefer caller clear unless there's a strong reason.
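As a rough illustration of the caller-clear preference, here is a minimal sketch. The names `parse_step` and `collect_errors` are hypothetical and do not come from the PR; the point is only that an append-only callee leaves the clearing decision visible at the call site.

```cpp
#include <string>
#include <vector>

// Hypothetical parse step (not the PR's code): appends diagnostics to
// error_msgs and never clears the vector itself.
bool parse_step(const std::string &input, std::vector<std::string> &error_msgs)
{
  if (input.empty())
  {
    error_msgs.push_back("empty input");
    return false;
  }
  return true;
}

// One possible caller: accumulates diagnostics across several inputs, which a
// callee-side clear() would make impossible.
std::vector<std::string> collect_errors(const std::vector<std::string> &inputs)
{
  std::vector<std::string> error_msgs;
  for (const auto &input : inputs)
    parse_step(input, error_msgs);
  return error_msgs;
}
```

A caller that wants a fresh list on every attempt simply calls `error_msgs.clear()` right before `parse_step`, so both behaviours remain available.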

danobi · 2020-09-02T18:36:25Z

src/clang_parser.cpp

+  // that imply an unresolved typedef of type_t. This cannot be done below in
+  // clang_visitChildren since clang does not have the unknown type names.
+  const std::string unknown_type_msg = "unknown type name \'";
+  for (auto &msg : diag_msgs)


Nit

Suggested change

-  for (auto &msg : diag_msgs)
+  for (const auto &msg : diag_msgs)

danobi · 2020-09-02T18:40:36Z

src/clang_parser.cpp

-    btf_and_input_files.emplace_back(CXUnsavedFile{
+    auto incomplete_types = get_incomplete_types(
+        input, input_files, args, bpftrace.btf_set_);
+    unsigned types_cnt = bpftrace.btf_set_.size();


More portable I suppose

Suggested change

-    unsigned types_cnt = bpftrace.btf_set_.size();
+    size_t types_cnt = bpftrace.btf_set_.size();

danobi · 2020-09-02T18:46:30Z

src/clang_parser.cpp

+      };
+      // If additional BTF types were found, we need to repeat the process since
+      // that might have introduced some new unresolved typedefs.
+      check_additional_types = types_cnt != bpftrace.btf_set_.size();


Might be more clear like this instead

Suggested change

-      check_additional_types = types_cnt != bpftrace.btf_set_.size();
+      check_additional_types = incomplete_types.size();

viktormalik · 2020-09-03

Not really. We need to check whether a new incomplete type was found (one that wasn't already in bpftrace.btf_set_); otherwise the loop might never terminate (see the comment above for an example).
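The termination argument can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `ParsePass` and `resolve_types` are hypothetical names, and the parser pass is replaced by a callback that reports unresolved type names.

```cpp
#include <cstddef>
#include <functional>
#include <set>
#include <string>

// Hypothetical stand-in for one parser pass: returns the type names that the
// pass reported as unknown or incomplete, given the names already requested.
using ParsePass =
    std::function<std::set<std::string>(const std::set<std::string> &)>;

// Repeats the pass until it stops contributing names that are not already in
// btf_set. Returns the number of passes that were run.
std::size_t resolve_types(std::set<std::string> &btf_set, const ParsePass &pass)
{
  std::size_t passes = 0;
  bool check_additional_types = true;
  while (check_additional_types)
  {
    const auto incomplete_types = pass(btf_set);
    ++passes;

    const std::size_t types_cnt = btf_set.size();
    btf_set.insert(incomplete_types.cbegin(), incomplete_types.cend());

    // Comparing set sizes detects genuinely new names. Testing
    // !incomplete_types.empty() instead would spin forever on a type that is
    // reported on every pass because BTF does not contain it.
    check_additional_types = types_cnt != btf_set.size();
  }
  return passes;
}
```

With a pass that reports the same unresolvable name every time (for instance a made-up type `aaa`), the loop runs twice and stops, because the second pass adds nothing new to the set.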

danobi · 2020-09-02T18:49:13Z

src/clang_parser.cpp

+    auto incomplete_types = get_incomplete_types(
+        input, input_files, args, bpftrace.btf_set_);
+    unsigned types_cnt = bpftrace.btf_set_.size();
+    bpftrace.btf_set_.insert(incomplete_types.cbegin(),
+                             incomplete_types.cend());


These lines should be in the if (process_btf) branch right?

viktormalik · 2020-09-03

They weren't there before this PR, but I agree: bpftrace.btf_set_ should only be used when BTF is on.

danobi · 2020-09-02T19:09:29Z

I was also wondering, if it would make sense to get rid of the --btf option completely. AFAIU, it is used for tracepoints only. Using BTF automatically whenever it is available would make most of the "unknown type name" errors disappear in default bpftrace setup.

Tentative +1. I remember I tried doing this a couple of times, but I always hit some kind of realization that it would cause an issue. I can't seem to remember it right now. It's possible that I've chipped away at internal usage of --btf enough that there's no issue anymore.

danobi · 2020-09-02T19:13:56Z

src/clang_parser.cpp

+ *   unknown type name 'type_t'
+ * return type_t.
+ */
+std::string ClangParser::ClangParser::get_unknown_type(


Returning an optional instead of empty string would be cleaner API

Suggested change

-std::string ClangParser::ClangParser::get_unknown_type(
+std::optional<std::string> ClangParser::ClangParser::get_unknown_type(
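A minimal sketch of what an optional-returning extraction could look like, assuming the diagnostic text has the form shown in the doc comment (`unknown type name 'type_t'`). The free function below is hypothetical and is not the method from the PR; it only illustrates how `std::nullopt` separates "no unknown type in this message" from an actual type name.

```cpp
#include <optional>
#include <string>

// Hypothetical helper, not the PR's actual code: extracts the type name from
// a diagnostic of the form "... unknown type name 'type_t' ...".
std::optional<std::string> get_unknown_type(const std::string &diagnostic_msg)
{
  const std::string unknown_type_msg = "unknown type name '";

  const auto marker_pos = diagnostic_msg.find(unknown_type_msg);
  if (marker_pos == std::string::npos)
    return std::nullopt; // some other diagnostic

  const auto name_begin = marker_pos + unknown_type_msg.size();
  const auto name_end = diagnostic_msg.find('\'', name_begin);
  if (name_end == std::string::npos)
    return std::nullopt; // malformed message, no closing quote

  return diagnostic_msg.substr(name_begin, name_end - name_begin);
}
```

Callers can then write `if (auto type = get_unknown_type(msg))` rather than comparing against an empty string.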

If a type is defined by a typedef located in a non-included header, clang parser will fail. This analyses error messages from the clang parser and retrieves the missing typedefs from BTF (if it is used). Add a test that requires this extension. Resolves iovisor/bpftrace#1154.

There are two options how to handle undefined types:
- running with --btf, as we are now able to parse missing type definitions from BTF
- including a correct header file

Resolves iovisor#730.

viktormalik · 2020-09-03T07:03:37Z

my only concern is the string parsing of error messages. I'm not sure whether error strings are part of the clang API, and I'd be worried if they weren't. I didn't look around too hard in clang for alternative solutions, so I'll take it on faith that this is the best way.

I don't like this either. I tried to find an API for this, but there seems to be none. The Clang parser replaces all unknown types with int, and it seems to lose the information about the original type name. The only option could be extracting the information from Clang diagnostics (which is what we're doing here anyway), but other parts of the diagnostics don't seem to be any more stable than the message itself.

viktormalik · 2020-09-03T07:16:38Z

Tentative +1. I remember I tried doing this a couple of times, but I always hit some kind of realization that it would cause an issue. I can't seem to remember it right now. It's possible that I've chipped away at internal usage of --btf enough that there's no issue anymore.

A problem arises if the user defines a type that is also parsed from BTF. For example, this program:

# bpftrace -e 'struct task_struct {int x;} i:ms:1 { printf("%d\n", curtask->x); }'

won't work, because using curtask forces struct task_struct to be parsed from BTF, so it is then defined twice. I don't see a reason for doing this, but I may be missing something.

Anyway, I have a proposed implementation ready. Once this PR is merged, I'll open a new PR where we can discuss further.

mmisono · 2020-10-16T08:54:46Z

src/clang_parser.cpp

+    // If additional BTF types were found, we need to repeat the process since
+    // that might have introduced some new unresolved typedefs.
+    check_additional_types = types_cnt != bpftrace.btf_set_.size();


@viktormalik
I found this PR causes a slowdown if an unknown struct is big. For example:

before this PR

% time sudo ./src/bpftrace --btf -e 'struct f { struct task_struct x; } BEGIN{exit();}'
Attaching 1 probe...
0.94s user 0.20s system 80% cpu 1.420 total

after

% time sudo ./src/bpftrace --btf -e 'struct f { struct task_struct x; } BEGIN{exit();}'
Attaching 1 probe...
7.76s user 0.59s system 96% cpu 8.646 total

The problem is the part I commented on: for task_struct, this loop iterates 9 times. I wonder if we can reduce the number of iterations. Actually, I'm not sure when we need to keep iterating until the condition is satisfied. At least the example above works without any iteration. Do you have an example?

viktormalik · 2020-10-16

Thanks for noticing this problem. The truth is that I don't have any example that would require multiple iterations, but it seems to me that it could, in theory, occur.

However, I now realize that we run this loop to make sure that the Clang parser succeeds. So I believe that we can stop collecting the incomplete types once the Clang parser runs without errors. I implemented a simple fix in this branch. Could you please try it and confirm that the slowdown is gone? I'll run some tests myself and then I can open a PR.

mmisono · 2020-10-16

Yes, it drastically improves speed.

% time sudo ./src/bpftrace --btf -e 'struct f { struct task_struct x; } BEGIN{exit();}'
Attaching 1 probe...
0.47s user 0.27s system 72% cpu 1.009 total

Please open a PR and let's discuss this further there.

viktormalik · 2020-10-16

Great, thanks.
It's not that easy after all, because this simple fix breaks one regression test. The problem is that the type resolution must be run at least once. I'll come up with a better fix and open a PR.
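One way to reconcile the two constraints discussed here (stop once the parser succeeds, yet run type resolution at least once) is a do/while loop. The sketch below is only an assumption about how such a loop could be shaped; it is not the fix that was eventually submitted, and `PassResult`, `ParsePass`, and `resolve_types` are hypothetical names.

```cpp
#include <cstddef>
#include <functional>
#include <set>
#include <string>

// Hypothetical result of one parser pass (names invented for this sketch).
struct PassResult
{
  bool parse_ok;                          // parser finished without errors
  std::set<std::string> incomplete_types; // names still needing definitions
};

using ParsePass = std::function<PassResult(const std::set<std::string> &)>;

// do/while guarantees one resolution pass even if the first parse is clean;
// afterwards the loop ends once the parser succeeds or no new name appears.
std::size_t resolve_types(std::set<std::string> &btf_set, const ParsePass &pass)
{
  std::size_t passes = 0;
  bool keep_going = false;
  do
  {
    const PassResult result = pass(btf_set);
    ++passes;

    const std::size_t types_cnt = btf_set.size();
    btf_set.insert(result.incomplete_types.cbegin(),
                   result.incomplete_types.cend());

    const bool found_new_types = types_cnt != btf_set.size();
    keep_going = !result.parse_ok && found_new_types;
  } while (keep_going);
  return passes;
}
```

In this shape, a program whose first parse is clean costs a single pass, while a type that can never be resolved still ends the loop because no new name is added on the second pass.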

viktormalik force-pushed the btf-unknown-typedefs branch 2 times, most recently from 28435a9 to dd8b829 on August 26, 2020 11:11

viktormalik force-pushed the btf-unknown-typedefs branch from dd8b829 to a66ef93 on September 2, 2020 07:22

mmisono reviewed Sep 2, 2020

viktormalik force-pushed the btf-unknown-typedefs branch from a66ef93 to c6843fa on September 2, 2020 11:12

danobi reviewed Sep 2, 2020

viktormalik added 2 commits September 3, 2020 08:16

Give hint when an undefined type is found

aae6a69

viktormalik force-pushed the btf-unknown-typedefs branch from c6843fa to aae6a69 on September 3, 2020 06:35

mmisono approved these changes Sep 8, 2020

fbs merged commit ec660c7 into bpftrace:master Sep 8, 2020

viktormalik deleted the btf-unknown-typedefs branch September 9, 2020 06:58

mmisono reviewed Oct 16, 2020

viktormalik mentioned this pull request Oct 23, 2020

Optimize unknown/incomplete types resolution #1571

Merged

3 tasks
