-
-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix crashes caused by improper error handling on bpf_look/update/delete_elem #2623
Conversation
bpf_*_elem returns -errno when error occured. These functions will return -ENOENT(-2) when they failed to find a entry. It's a harmless error that could be found when doing delete and print/zero/clear operations on the same map simultaneously. It's not reasonable to stop the whole program at this moment. Signed-off-by: maokelong <chenjinglong1@huawei.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change lgtm. But could you please add a runtime test so we can avoid future regressions?
The script you provided in your PR description seems reasonable to use.
Here's some hints on runtime tests: https://github.com/iovisor/bpftrace/blob/master/tests/README.md#runtime-tests
3468590
to
194743a
Compare
@danobi Done. And here's the test report on my pc. : ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small nit but lgtm otherwise
Signed-off-by: maokelong <chenjinglong1@huawei.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks! Could you also add a CHANGELOG entry?
I'll push a changelog entry up for this |
Thanks for the fix! This did confuse me as I thought we'd already handled at least one of these cases years ago: #823 With a bit of digging it turns out that libbpf changed how they returned errors in 2021: libbpf/libbpf@7c7ba06 |
bpf_*_elem returns -errno when error occured. These functions will return -ENOENT(-2) when they failed to find a entry. It's a harmless error that could be found when doing delete and print/zero/clear operations on the same map simultaneously. It's not reasonable to stop the whole program at this moment.
A simple script could reproduce this annoying situation:
This script crashed almost immediately and reports errors likes:
But It's accepable for users that some entries are deleted asynchronously when tring to clear/print/zero a map.
Checklist
man/adoc/bpftrace.adoc
and if needed indocs/reference_guide.md
CHANGELOG.md