posix: add fuzz testing using MAVLink messages #12896

julianoes · 2019-09-04T14:42:58Z

This adds the environment option PX4_FUZZ, which enables LLVM's libFuzzer. The fuzzer throws random bytes at mavlink_receiver, wrapped in MAVLink messages sent over UDP.

The MAVLink messages being sent are well-formed, i.e. the CRC is calculated correctly, but the payload, msgid, etc. are generally garbage unless the fuzzer gets a msgid right by chance.
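
To illustrate what "valid CRC, garbage content" means, here is a minimal Python sketch that wraps arbitrary bytes in an unsigned MAVLink 2 frame with a correct checksum. It is not the code from this PR: the function names and the way the fuzz input is split into header fields and payload are assumptions made for the example. The checksum is CRC-16/MCRF4XX over everything after the start byte, followed by a per-message CRC_EXTRA byte (for example 50 for HEARTBEAT in the common dialect).

```python
import struct

MAVLINK_V2_STX = 0xFD  # start-of-frame marker for MAVLink 2


def crc_accumulate(byte: int, crc: int) -> int:
    """Fold one byte into the running CRC-16/MCRF4XX value."""
    tmp = (byte ^ (crc & 0xFF)) & 0xFF
    tmp = (tmp ^ (tmp << 4)) & 0xFF
    return ((crc >> 8) ^ (tmp << 8) ^ (tmp << 3) ^ (tmp >> 4)) & 0xFFFF


def crc16_mcrf4xx(data: bytes) -> int:
    """CRC-16/MCRF4XX with the initial value 0xFFFF."""
    crc = 0xFFFF
    for b in data:
        crc = crc_accumulate(b, crc)
    return crc


def mavlink_crc(data: bytes, crc_extra: int) -> int:
    """Checksum over header (without STX) and payload, plus CRC_EXTRA."""
    return crc_accumulate(crc_extra, crc16_mcrf4xx(data))


def frame_fuzz_bytes(fuzz: bytes, crc_extra: int = 0) -> bytes:
    """Wrap arbitrary bytes in an unsigned MAVLink 2 frame.

    Hypothetical input layout (an assumption for this sketch): the first
    6 bytes pick seq, sysid, compid and the 24-bit msgid; the remainder,
    capped at 255 bytes, becomes the payload.
    """
    head = fuzz[:6].ljust(6, b"\x00")
    seq, sysid, compid = head[0], head[1], head[2]
    msgid = head[3:6]  # little-endian 24-bit message id
    payload = fuzz[6:6 + 255]
    # incompat_flags = 0 (no signature), compat_flags = 0
    header = bytes([len(payload), 0, 0, seq, sysid, compid]) + msgid
    body = header + payload
    return bytes([MAVLINK_V2_STX]) + body + struct.pack("<H", mavlink_crc(body, crc_extra))
```

A receiver only accepts such a frame if crc_extra matches the value it has on record for that msgid, which is why most random msgids are still dropped early.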

As I understand it, libFuzzer tracks code coverage and mutates its inputs to execute as much of the code as possible.
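
The coverage-guided idea can be sketched with a toy loop in Python. This is only an illustration of the principle, not how libFuzzer is implemented: target() is a made-up stand-in that reports which branches it hit, and the three mutations are a small subset chosen for the example.

```python
import random


def target(data: bytes) -> set:
    """Stand-in for the code under test; returns the branch ids it hit."""
    hit = {"entry"}
    if len(data) > 0 and data[0] == 0xFD:
        hit.add("magic")
        if len(data) > 1 and data[1] < 8:
            hit.add("short_len")
            if len(data) > 2 and data[2] == 0x2A:
                hit.add("deep")
    return hit


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Flip a bit, insert a byte, or erase a byte (empty input may stay unchanged)."""
    buf = bytearray(data)
    choice = rng.randrange(3)
    if choice == 0 and buf:
        pos = rng.randrange(len(buf))
        buf[pos] ^= 1 << rng.randrange(8)
    elif choice == 1:
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    elif buf:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)


def fuzz(iterations: int, seed: int = 0):
    """Keep every mutated input that reaches a branch not seen before."""
    rng = random.Random(seed)
    corpus = [b""]
    covered = set(target(b""))
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        hit = target(candidate)
        if not hit <= covered:  # new coverage, so the input joins the corpus
            covered |= hit
            corpus.append(candidate)
    return corpus, covered
```

Inputs that reach new code become the starting point for later mutations, which is also why a saved CORPUS directory lets a later run pick up where an earlier one stopped.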

To run this, do:

PX4_FUZZ=1 make px4_sitl_default-clang && (cd build/px4_sitl_default-clang/tmp/rootfs && mkdir -p CORPUS && ../../bin/px4 -detect_leaks=1 CORPUS/)

To stop it (if it never crashes): killall px4, or Ctrl+C.

So far I've been running this for multiple days on a desktop computer but have not found anything.

dagar · 2019-09-04T14:52:54Z

I wonder if we could be a bit more targeted to the subset of message ids we actually implement in mavlink_receiver. Seeing this with code coverage would also be interesting.

julianoes · 2019-09-04T15:40:47Z

> Seeing this with code coverage would also be interesting.

Anecdotally, I can say that while running the tests it prints the functions that it executed, and I saw many of the mavlink_receiver functions pop up, so I assume all were used at some point.

However, you can collect coverage info and visualize it. I'll do that and share it.

julianoes · 2019-09-04T20:47:22Z

@BazookaJoe1900 just found that I had forgotten that you need the patch below in the c_library_v2/pymavlink submodule, otherwise the address sanitizer will be triggered:
https://github.com/ArduPilot/pymavlink/pull/343/files

dagar · 2019-09-05T01:54:00Z

Apparently this doesn't actually matter, but can you add the build type here?
https://github.com/PX4/Firmware/blob/c8d13bacf2fb561cb3bdf12a4846ee52c06bbccf/CMakeLists.txt#L253

hamishwillee · 2019-09-05T03:51:55Z

@julianoes Is this anything that needs to be added to test docs? http://dev.px4.io/master/en/test_and_ci/

julianoes · 2019-09-05T05:52:10Z

@hamishwillee yes, I want to add something. And maybe have it fuzz for 5 minutes on a known corpus (input data) in CI.

julianoes · 2019-09-05T05:59:35Z

> Apparently this doesn't actually matter, but can you add the build type here?

OK, I'll add it there too.

> Seeing this with code coverage would also be interesting.

I looked at the coverage for the corpus (input data) acquired over about 8 hours, and it looks like all handle_message functions in MavlinkReceiver are covered. However, all handle_message functions in Simulator are covered as well, and I'm wondering why, because no simulator such as jMAVSim or Gazebo is running that would send that input.

Attached is a coverage file. It uses terminal colors, so to view it you have to do something like cat coverage_handle_message.txt.

coverage_handle_message.txt

dagar · 2019-09-22T19:03:16Z

Would it make sense to add even the simplest minimal use of this to Jenkins, to keep the configuration and other setup helpers from breaking?

julianoes · 2019-09-23T06:37:17Z

@dagar yes, I'll do that.

stale · 2019-12-22T07:15:42Z

This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions.

stale · 2020-08-31T11:37:26Z

This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions.

hamishwillee · 2020-08-31T22:04:50Z

@julianoes Should we let this die?

julianoes · 2020-12-07T01:53:29Z

@bkueng @dagar I'm making another effort to get this in.

julianoes · 2020-12-07T02:29:31Z

@dagar do you have a good idea how we could integrate this into CI? Basically, we should run it for a certain amount of time and record the output if it does fail. If it is still running after some time we would just stop it with killall -SIGKILL px4.
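
A minimal Python sketch of such a time-boxed run, assuming CI can invoke a small wrapper script (the function name and status strings are made up for this example). It kills only the process it started rather than using killall. libFuzzer also has a -max_total_time option that may make an external timer unnecessary.

```python
import subprocess
import sys


def run_time_boxed(cmd, seconds):
    """Run cmd for at most `seconds`; return (status, combined output)."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    try:
        out, _ = proc.communicate(timeout=seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX, limited to the process we started
        out, _ = proc.communicate()  # collect whatever was printed so far
        return "timeout", out
    return ("ok" if proc.returncode == 0 else "failed"), out


if __name__ == "__main__":
    # Demo with a harmless stand-in command instead of the px4 binary.
    print(run_time_boxed([sys.executable, "-c", "print('hello')"], seconds=5))
```

For the fuzzer, "timeout" would count as a pass (nothing crashed within the budget), while "failed" is the case where the recorded output should be kept as a CI artifact.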

Alternatively, we could try to get into the google/oss-fuzz project where they seem to provide free fuzzing for open source projects.

dagar · 2020-12-07T15:52:17Z

> @dagar do you have a good idea how we could integrate this into CI? Basically, we should run it for a certain amount of time and record the output if it does fail. If it is still running after some time we would just stop it with killall -SIGKILL px4.

Any idea what kind of time is involved (imagine a slower 2-core machine)?

> Alternatively, we could try to get into the google/oss-fuzz project where they seem to provide free fuzzing for open source projects.

Cool, I had this on my list of things to look into. It's not clear to me if it's worth the effort to integrate vs doing it ourselves.

julianoes · 2020-12-10T22:43:57Z

> Any idea what kind of time is involved (imagine slower 2 core machine)?

Not sure. More is better, but maybe doing 5 minutes on each merge to master is pretty good. From what I understand, if you cache the CORPUS (some inputs), it will re-use these.

LorenzMeier · 2021-01-31T13:49:44Z

@julianoes Given how long this PR has been open, I will close it soon as stale. I know we want to get this in, so let's focus on merging it in a state that is an incremental improvement, rather than keeping it open longer. Could you please make sure this is mergeable and documented, and then add commits on top where needed later?

julianoes · 2021-12-21T10:43:06Z

Memory sanitizer found something:

 INFO: A corpus is not provided, starting from an empty corpus
#2	INITED cov: 23 ft: 24 corp: 1/1b exec/s: 0 rss: 77Mb
#3	NEW    cov: 23 ft: 36 corp: 2/2b lim: 4096 exec/s: 0 rss: 77Mb L: 1/1 MS: 1 ChangeBit-
#6	NEW    cov: 25 ft: 51 corp: 3/124b lim: 4096 exec/s: 0 rss: 78Mb L: 122/122 MS: 3 InsertByte-CopyPart-InsertRepeatedBytes-
#10	NEW    cov: 25 ft: 53 corp: 4/129b lim: 4096 exec/s: 0 rss: 78Mb L: 5/122 MS: 4 CopyPart-ChangeBit-CMP-ChangeBit- DE: "m\000\000\000"-
#12	NEW    cov: 26 ft: 54 corp: 5/245b lim: 4096 exec/s: 0 rss: 78Mb L: 116/122 MS: 2 PersAutoDict-InsertRepeatedBytes- DE: "m\000\000\000"-
#14	NEW    cov: 26 ft: 56 corp: 6/280b lim: 4096 exec/s: 0 rss: 78Mb L: 35/122 MS: 2 CrossOver-InsertRepeatedBytes-
#18	REDUCE cov: 26 ft: 56 corp: 6/239b lim: 4096 exec/s: 0 rss: 78Mb L: 75/122 MS: 4 ChangeBinInt-InsertByte-ChangeBit-EraseBytes-
INFO  [logger] logger started (mode=all)
	NEW_FUNC[1/142]: 0x64dc30 in uORB::SubscriptionMultiArray<battery_status_s, (unsigned char)4>::SubscriptionMultiArray(ORB_ID) /src/PX4-Autopilot/platforms/common/uORB/SubscriptionMultiArray.hpp:68
	NEW_FUNC[2/142]: 0x64e080 in uORB::SubscriptionMultiArray<distance_sensor_s, (unsigned char)10>::SubscriptionMultiArray(ORB_ID) /src/PX4-Autopilot/platforms/common/uORB/SubscriptionMultiArray.hpp:68
#23	REDUCE cov: 378 ft: 204 corp: 7/335b lim: 4096 exec/s: 0 rss: 80Mb L: 96/122 MS: 5 InsertByte-CopyPart-CrossOver-InsertByte-PersAutoDict- DE: "m\000\000\000"-
#31	REDUCE cov: 378 ft: 204 corp: 7/305b lim: 4096 exec/s: 0 rss: 80Mb L: 45/122 MS: 3 CopyPart-PersAutoDict-EraseBytes- DE: "m\000\000\000"-
INFO  [mavlink] partner IP: 127.0.0.1
	NEW_FUNC[1/4]:     #0 0x66d718 in Commander::run() /src/PX4-Autopilot/src/modules/commander/Commander.cpp:2592:7
    #1 0x67eeba in ModuleBase<Commander>::run_trampoline(int, char**) /src/PX4-Autopilot/platforms/common/include/px4_platform_common/module.h:180:12
    #2 0xd905b2 in entry_adapter(void*) /src/PX4-Autopilot/platforms/posix/src/px4/common/tasks.cpp:98:2
    #3 0x7f40a103a608 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x9608)
    #4 0x7f40a0f40292 in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x122292)

DEDUP_TOKEN: Commander::run()--ModuleBase<Commander>::run_trampoline(int, char**)--entry_adapter(void*)
  Uninitialized value was created by a heap allocation
0x598d70 in uORB::SubscriptionInterval::updated() /src/PX4-Autopilot/platforms/common/uORB/SubscriptionInterval.hpp:97
	NEW_FUNC[2/4]:     #0 0x563339 in operator new(unsigned long) /src/llvm-project/compiler-rt/lib/msan/msan_new_delete.cpp:45:35
    #1 0x67ecc5 in Commander::instantiate(int, char**) /src/PX4-Autopilot/src/modules/commander/Commander.cpp:3385:24
    #2 0x67ecc5 in ModuleBase<Commander>::run_trampoline(int, char**) /src/PX4-Autopilot/platforms/common/include/px4_platform_common/module.h:176:15
    #3 0xd905b2 in entry_adapter(void*) /src/PX4-Autopilot/platforms/posix/src/px4/common/tasks.cpp:98:2
    #4 0x7f40a103a608 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x9608)

DEDUP_TOKEN: operator new(unsigned long)--Commander::instantiate(int, char**)--ModuleBase<Commander>::run_trampoline(int, char**)
SUMMARY: MemorySanitizer: use-of-uninitialized-value /src/PX4-Autopilot/src/modules/commander/Commander.cpp:2592:7 in Commander::run()
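
For recording such output in CI, a small Python sketch that extracts the libFuzzer status lines and the sanitizer SUMMARY line from a log like the one above. The regular expressions are inferred from this log only, so they are an assumption rather than a documented format.

```python
import re

# "#23  REDUCE cov: 378 ft: 204 corp: 7/335b ..." (fields as seen in the log)
STATUS_RE = re.compile(
    r"^#(?P<execs>\d+)\s+(?P<event>[A-Z]+)\s+cov: (?P<cov>\d+) ft: (?P<ft>\d+) "
    r"corp: (?P<inputs>\d+)/(?P<size>\w+)"
)
# "SUMMARY: MemorySanitizer: use-of-uninitialized-value ..."
SUMMARY_RE = re.compile(r"^SUMMARY: (?P<tool>\w+): (?P<kind>\S+)")


def parse_status(line):
    """Return the counters of a libFuzzer status line, or None for other lines."""
    m = STATUS_RE.match(line)
    if m is None:
        return None
    return {
        "execs": int(m.group("execs")),
        "event": m.group("event"),
        "cov": int(m.group("cov")),
        "ft": int(m.group("ft")),
        "inputs": int(m.group("inputs")),
        "size": m.group("size"),
    }


def parse_summary(line):
    """Return (sanitizer, error kind) for a SUMMARY line, or None."""
    m = SUMMARY_RE.match(line)
    return (m.group("tool"), m.group("kind")) if m else None
```

Lines printed by PX4 itself, such as the INFO [logger] line, simply return None and can be skipped.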

julianoes · 2021-12-21T13:33:07Z

@dagar it now fuzzes for 3x 10 minutes on every PR, using 3 different sanitizers. It fuzzes from scratch every time and does not use a corpus cache. Alternatively, we could run it daily on master and keep a fuzzing CORPUS in the cache.
What would you prefer?

Also see: https://google.github.io/clusterfuzzlite/overview/

dagar · 2021-12-21T14:31:25Z

cmake/px4_add_common_flags.cmake

+
+	if(NOT CMAKE_BUILD_TYPE STREQUAL FuzzTesting)
+		list(APPEND cxx_flags
+			-fno-rtti


This could maybe even be a NuttX specific flag.

julianoes · 2021-12-21

I have no idea, you tell me. Ah, I guess we just wanted to match what we have with NuttX.

dagar · 2021-12-21T14:39:29Z

platforms/posix/src/px4/common/main.cpp

+	return 0;
+}
+
+void initialize_fake_px4_once()


Would it make sense to throw this into an entirely separate file and have clear entry points (regular main vs fuzzing)?

The regular SITL start is already kind of convoluted and has other issues we need to address (clean shutdown, etc).

julianoes · 2021-12-21

It would, yes 😄. I'll give it a try.

julianoes · 2022-01-03

Done, have a look :)

So instead of fuzzing each and every PR for 10 minutes, we just fuzz for 30 minutes every 24 hours, at 6am UTC, which should be a time when the US and Europe are least active.

julianoes · 2022-01-06T07:58:09Z

@dagar I suggest we merge this and then check if the cron job every 24h does the fuzzing.

DronecodeBot · 2023-12-03T19:07:26Z

This pull request has been mentioned on Discussion Forum for PX4, Pixhawk, QGroundControl, MAVSDK, MAVLink. There might be relevant details there:

https://discuss.px4.io/t/fuzz-testing-for-px4/35623/2

julianoes requested review from dagar and bkueng September 4, 2019 14:42

julianoes mentioned this pull request Sep 4, 2019

Param init undefined behaviour #12897

Open

BazookaJoe1900 mentioned this pull request Sep 4, 2019

mavlink - reading to undefined memory #12900

Closed

weekly-digest bot mentioned this pull request Sep 8, 2019

Weekly Digest (1 September, 2019 - 8 September, 2019) #12924

Closed

weekly-digest bot mentioned this pull request Sep 29, 2019

Weekly Digest (22 September, 2019 - 29 September, 2019) #13050

Closed

stale bot added the stale label Dec 22, 2019

weekly-digest bot mentioned this pull request Dec 29, 2019

Weekly Digest (22 December, 2019 - 29 December, 2019) #13804

Closed

julianoes marked this pull request as draft June 2, 2020 11:29

stale bot removed the stale label Jun 2, 2020

stale bot added the stale label Aug 31, 2020

stale bot removed the stale label Aug 31, 2020

julianoes force-pushed the pr-fuzz-testing branch from 3494d31 to 95a4c4c Compare December 7, 2020 01:52

julianoes marked this pull request as ready for review December 7, 2020 02:29

julianoes marked this pull request as draft May 28, 2021 11:40

julianoes force-pushed the pr-fuzz-testing branch 9 times, most recently from 0db7927 to dcd76eb Compare December 21, 2021 09:18

dagar reviewed Dec 21, 2021

julianoes force-pushed the pr-fuzz-testing branch from 80042ac to 7939744 Compare January 3, 2022 08:23

julianoes marked this pull request as ready for review January 3, 2022 08:24

julianoes force-pushed the pr-fuzz-testing branch from 7939744 to 0908125 Compare January 5, 2022 07:30

julianoes added 2 commits January 6, 2022 07:11

Add clusterfuzzlite to fuzz in CI

9b4da56

julianoes force-pushed the pr-fuzz-testing branch from 0908125 to 9b4da56 Compare January 6, 2022 06:12

workflows: Set up batch fuzzing every 24 hours

d6b268a

dagar approved these changes Jan 7, 2022

dagar merged commit be0a5b4 into master Jan 7, 2022

dagar deleted the pr-fuzz-testing branch January 7, 2022 15:17
