-
Notifications
You must be signed in to change notification settings - Fork 13.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
posix: add fuzz testing using MAVLink messages #12896
Conversation
I wonder if we could be a bit more targeted to the subset of message ids we actually implement in mavlink_receiver. Seeing this with code coverage would also be interesting. |
Anecdotal I can say that while running the tests it prints functions that it executed and I saw many of the mavlink_receiver functions pop up, so I assume all were used at some point. However, you can collect coverage info and visualize it. I'll do that and share it. |
@BazookaJoe1900 just found that I had forgotten that you require the patch below in the c_library_v2/pymavlink submodule, otherwise the address sanitizier will be triggered: |
Apparently this doesn't actually matter, but can you add the build type here? |
@julianoes Is this anything that needs to be added to test docs? http://dev.px4.io/master/en/test_and_ci/ |
@hamishwillee yes I want to add something. And maybe have it fuzz 5 minutes on a known corpus (input data) in CI. |
ok I'll add it there too.
I looked at coverage for the corpus (input data) acquired over about 8 hours and it looks like all Attached is a coverage file. It's in colors, so to see it you have to do something like |
Would it make sense to add even the simplest minimal use of this to Jenkins to prevent from breaking the configuration and other setup helpers? |
@dagar yes, I'll do that. |
This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions. |
This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions. |
@julianoes Should we let this die? |
3494d31
to
95a4c4c
Compare
@dagar do you have a good idea how we could integrate this into CI? Basically, we should run it for a certain amount of time and record the output if it does fail. If it is still running after some time we would just stop it with Alternatively, we could try to get into the google/oss-fuzz project where they seem to provide free fuzzing for open source projects. |
Any idea what kind of time is involved (imagine slower 2 core machine)?
Cool, I had this on my list of things to look into. It's not clear to me if it's worth the effort to integrate vs doing it ourselves. |
Not sure. More is better but maybe doing 5 mins on each merge to master is pretty good. From what I understand if you cache the CORPUS (some inputs) it will re-use these. |
@julianoes Given how long this PR is open I will close it soon as stale. I know we want to get this in, so let's focus on merging it in a state that is an incremental improvement, rather than keeping it open longer. Could you please make sure this is mergable and documented and then add commits on top where needed later? |
0db7927
to
dcd76eb
Compare
Memory sanitizer found something:
|
@dagar it now fuzzes 3x 10 minutes, for every PR, using 3 different sanitizers. It fuzzes from scratch every time, and does not use a corpus cache. We could alternatively run it daily on master and keep a fuzzing CORPUS in cache. Also see: https://google.github.io/clusterfuzzlite/overview/ |
|
||
if(NOT CMAKE_BUILD_TYPE STREQUAL FuzzTesting) | ||
list(APPEND cxx_flags | ||
-fno-rtti |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could maybe even be a NuttX specific flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no idea, you tell me. Ah, I guess, we just wanted to match what we have with NuttX.
return 0; | ||
} | ||
|
||
void initialize_fake_px4_once() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to throw this into an entirely separate file and have clear entry points (regular main vs fuzzing)?
The regular SITL start is already kind of convoluted and has other issues we need to address (clean shutdown, etc).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would, yes 😄. I'll give it a try.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, have a look :)
80042ac
to
7939744
Compare
7939744
to
0908125
Compare
This adds the env option PX4_FUZZ which runs the LLVM libFuzzer which throws random bytes at mavlink_receiver using MAVLink messages over UDP. The MAVLink messages that are being sent are valid, so the CRC is calculated but the payload and msgid, etc. are generally garbage, unless the fuzzing gets a msgid right by chance. As I understand it, libFuzzer watches the test coverage and will try to execute as much of the code as possible.
0908125
to
9b4da56
Compare
So instead of fuzzing each and every PR for 10minutes, we just fuzz 30mins every 24 hours, at 6am UTC which should be a time when US and Europe might be least active.
@dagar I suggest we merge this and then check if the cron job every 24h does the fuzzing. |
This pull request has been mentioned on Discussion Forum for PX4, Pixhawk, QGroundControl, MAVSDK, MAVLink. There might be relevant details there: |
This adds the env option PX4_FUZZ which runs the LLVM libFuzzer which throws random bytes at mavlink_receiver using MAVLink messages over UDP.
The MAVLink messages that are being sent are valid, so the CRC is calculated but the payload and msgid, etc. are generally garbage, unless the fuzzing gets a msgid right by chance.
As I understand it, libFuzzer watches the test coverage and will try to execute as much of the code as possible.
To run this, do:
To stop it (if it never crashes):
killall px4
, or Ctrl+C.So far I've been running this multiple days on a desktop computer but have not found anything.