-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Print arbitrary data safely #1107
Conversation
The current behavior is (please correct me if I'm worng): If a struct is struct x {
char* a; // Type::integer
char b[4]; // Type::string
int8 c[4]; // Type::array (pointee size 1)
int d[4]; // Type::array (pointee size 4)
} then,
I think this is OK for now, but it would be nice if the above operations are supported (and would be less confusing). I think the following behavior is straightforward
Footnotes |
I like the idea of having Being able to do And like @mmisono says, being able to read from array types would be nice too. In that case we can automatically determine the amount of bytes that need to be read instead letting the user specify it. |
You might want to take a look at #1104 too. As adding a new type means you'll have to update quite a bit of code. |
6fe58c3
to
1bed0ca
Compare
I've prototyped the This has basic support for using a buffer with maps, but it's not working yet (verifier error). I'll dig into this next. @fbs if you have time, would love to get your thoughts on the changes. |
the basis looks ok to me :), some notes:
|
fcc5c02
to
b3a5a07
Compare
@fbs This is ready for another look and feels pretty good to me. I think I've addressed all of your feedback. There's more nice-to-have stuff to do, e.g. let %r work for regular strings as well, but it might be good to land this and then make adjustments based on usage feedback. |
b3a5a07
to
bbb1f71
Compare
@@ -41,6 +41,7 @@ enum class Type | |||
array, | |||
// BPF program context; needing a different access method to satisfy the verifier | |||
ctx, | |||
buffer, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment like ctx hasd?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ctx is currently the only one with a comment, so I'm not sure what's expected here. Any suggestions for what it should say?
src/ast/semantic_analyser.cpp
Outdated
if (buffer_size > max_buffer_size) | ||
buffer_size = max_buffer_size; | ||
|
||
buffer_size++; // allow room for zero byte |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need a zero byte? We already have its length?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I misunderstood what was happening here when I wrote the comment. I think we need the extra byte because otherwise arg.type.size
will be one byte too small (because I'm using a byte to embed the buffer length), and the final print buffer will be corrupted:
ad1adf9
to
c603ee5
Compare
Rebased and fixed a copy-and-paste mistake in codegen. The equality test is failing now due to the recent type fixes (e523e2c), and there's an offset bug when I try to use a buffer as the key for a histogram map. Will look into these issues soon. |
d3743dc
to
a97ccdc
Compare
@fbs Okay, I think this is ready for a final pass. I'll add support for buffers as hist and lhist map keys for a followup PR if that works for you. Seems like an uncommon use case that doesn't need to block us. |
I will take a look asap :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall structure looks great; thanks for all the work. Just some small comments
a97ccdc
to
f9f033f
Compare
@danobi Thanks! I've addressed your feedback except for the type issue noted above |
@danobi Okay, I took a shot at implementing those suggestions. I was with you until here:
Are you suggesting that I stop using a struct, and instead allocate a single buffer to store both the contents and the length? That was how my original implementation worked, but the struct was recommended in an earlier round of feedback, and imo feels cleaner as it lets us use GEP instead of calculating offsets. Maybe I misunderstood your comment, though. |
Yeah that's what I originally meant
Didn't realize you can't use GEP on non-array types. Makes sense I guess. Let's just go with the changes you have in b2f876e, then. The CI is failing on two things:
Otherwise LGTM |
b2f876e
to
9fc6bf5
Compare
I've just opened #1262 to get rid of llvm5 :) |
I think we're good to merge. Any last comments? |
Yes lets take 9fc6bf5 out of this PR. Ill take a quick look at the latest change set (woops misclicked, who puts close right next to comment) |
LGTM |
9fc6bf5
to
5971ca6
Compare
Cheers. You should join our little irc channel on oftc at some point :) |
Or matrix bridge :p |
Does
when
get error:
Does |
Fixes #659. This PR introduces a built-in function
buf
that sanitizes arbitrary data so that it's safe to print. The function is similar in implementation tostr
and shares its constraints on length. The differences mostly stem from not usingCreateProbeReadStr
(can't expect the buffer to end on a zero byte) and sorting out the necessary plumbing.I couldn't find a straightforward way to pass the buffer's length to userspace independently from the buffer itself, so I've embedded it as the first byte in the buffer. Feels a bit odd, but it's simple and works.
I'm opening this as a draft because it still needs docs and perhaps another test. It's ready for initial feedback, though.