
integrated fuzz testing #20702

Closed · andrewrk opened this issue Jul 21, 2024 · 20 comments · Fixed by #20773
Labels: enhancement, standard library, zig build system
Milestone: 0.14.0

Comments

@andrewrk (Member)

andrewrk commented Jul 21, 2024

Make it so that unit tests can ask for fuzz input:

test "foo" {
    const input_bytes = std.testing.fuzzInput(.{});
    try std.testing.expect(!std.mem.eql(u8, "canyoufindme", input_bytes));
}

Introduce flags to the compiler: -ffuzz and -fno-fuzz. These end up passing -fsanitize=fuzzer-no-link to Clang for C/C++ files. Introduce an equivalent build system API.

However, neither the CLI interface nor the build system interface is needed in order to enable fuzzing. The only thing that is needed is to ask for fuzz input in unit tests, as in the above example.

When the build runner interacts with the test runner, it learns which tests, if any, are fuzz tests. Then when unit tests pass, it moves on to fuzz testing, by providing our own implementation of the genetic algorithms that drive the input bytes (similar to libFuzzer or AFL), and re-compiling the unit test binary with -ffuzz enabled.

Fuzz testing is open-ended, so we will need some CLI options to control how long it runs. For example, zig build --fuzz might start fuzzing indefinitely, while zig build --fuzz=300s declares success after fuzzing for five minutes. When fuzz testing is not requested, it defaults to a small number of iterations just to smoke test that it's all working.

Some sort of UI would be nice. For starters this could just be std.Progress. In the future perhaps there could be a live-updating HTML page to visualize progress and code coverage in realtime. How cool would it be to watch source code turn from red to green live as the fuzzer finds new branches?

I think there's value in being able to fuzz test a mix of Zig and C/C++ source code, so let's start with evaluating LLVM's instrumentation and perhaps being compatible with it, or at least supporting it. First step is to implement the support library in Zig.

-ffuzz will be made available as a comptime flag in @import("builtin") so that it can be used, for example, to choose the naive implementation of std.mem.eql which helps the fuzzer to find interesting branches.
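
For illustration, here is a minimal sketch of what that comptime switch could look like, assuming the builtin.fuzz flag described in the commit referenced later in this thread (the function names are illustrative, not the actual std implementation):

const std = @import("std");
const builtin = @import("builtin");

// Byte-by-byte comparison: with coverage instrumentation, the fuzzer can
// observe progress through each matching byte instead of one opaque branch.
fn eqlNaive(a: []const u8, b: []const u8) bool {
    if (a.len != b.len) return false;
    for (a, b) |x, y| {
        if (x != y) return false;
    }
    return true;
}

// Sketch only: pick the fuzzer-friendly implementation at compile time.
pub fn eqlBytes(a: []const u8, b: []const u8) bool {
    if (builtin.fuzz) return eqlNaive(a, b);
    return std.mem.eql(u8, a, b);
}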

Comments are welcome. Note this is an enhancement not a proposal. The question is not "whether?" but "how?".


andrewrk added the enhancement, standard library, and zig build system labels Jul 21, 2024
andrewrk added this to the 0.14.0 milestone Jul 21, 2024
@squeek502 (Collaborator)

squeek502 commented Jul 21, 2024

Note that fuzz testing benefits a lot from starting with an input corpus of short, unique, and relevant inputs:

https://github.com/AFLplusplus/AFLplusplus/blob/stable/docs/fuzzing_in_depth.md#2-preparing-the-fuzzing-campaign

So Zig will likely want to have a way to provide the build/test runner with seed inputs as well. Imaginary syntax:

test "foo" {
    std.testing.fuzzCorpus(&.{
        @embedFile("inputs/input01"),
        @embedFile("inputs/input02"),
    });
    const input_bytes = std.testing.fuzzInput();
    try std.testing.expect(!std.mem.eql(u8, "canyoufindme", input_bytes));
}

But this may not be ideal since it's part of the test code itself. Separating things out into some sort of separate "setup the fuzzer" + "provide a function to repeatedly call" might be worthwhile.

(side note: dictionaries can also be helpful and would similarly ideally be provided in some sort of setup phase)

FWIW here's what Go's integrated fuzz testing looks like:

func FuzzReverse(f *testing.F) {
    testcases := []string{"Hello, world", " ", "!12345"}
    for _, tc := range testcases {
        f.Add(tc)  // Use f.Add to provide a seed corpus
    }
    f.Fuzz(func(t *testing.T, orig string) {
        // ...
    })
}
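
A rough Zig rendering of that same setup-plus-callback shape, purely for comparison; SketchFuzzer and its methods are invented for this sketch and are not a real or proposed std.testing API:

const std = @import("std");

// Invented for this sketch; not a real or proposed std.testing API.
const SketchFuzzer = struct {
    corpus: std.ArrayList([]const u8),

    fn add(self: *SketchFuzzer, input: []const u8) !void {
        try self.corpus.append(input);
    }

    // In plain test mode, "fuzzing" just means running the body once per seed input.
    fn run(self: *SketchFuzzer, comptime body: fn ([]const u8) anyerror!void) !void {
        for (self.corpus.items) |input| try body(input);
    }
};

test "seed corpus then fuzz (sketch)" {
    var f = SketchFuzzer{ .corpus = std.ArrayList([]const u8).init(std.testing.allocator) };
    defer f.corpus.deinit();
    try f.add("Hello, world");
    try f.add("!12345");
    try f.run(struct {
        fn body(input: []const u8) anyerror!void {
            try std.testing.expect(!std.mem.eql(u8, "canyoufindme", input));
        }
    }.body);
}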

@mlugg (Member)

mlugg commented Jul 21, 2024

> When the build runner interacts with the test runner, it learns which tests, if any, are fuzz tests.

How does this mechanism work? If you've not thought about this yet, as a random (possibly bad) idea: perhaps std.testing.fuzzInput returns error{NeedFuzz}![]const u8 or similar, and if -ffuzz is not provided, it just returns error.NeedFuzz, which the caller propagates with try and the test runner can then report to the build runner.
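
To make the idea concrete, a sketch with stand-in names (nothing here is a real std.testing API); running it as-is would report error.NeedFuzz, which is exactly the signal the test runner would consume:

const std = @import("std");

// Stand-ins only. With fuzzing disabled, asking for input yields
// error.NeedFuzz, which the test propagates via `try`, so the test runner
// can flag this test as a fuzz test and report that to the build runner.
var fuzz_enabled = false; // would come from the build configuration

fn fuzzInputOrSignal() error{NeedFuzz}![]const u8 {
    if (!fuzz_enabled) return error.NeedFuzz;
    return "";
}

test "foo (sketch)" {
    const input_bytes = try fuzzInputOrSignal();
    try std.testing.expect(!std.mem.eql(u8, "canyoufindme", input_bytes));
}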

@nektro (Contributor)

nektro commented Jul 21, 2024

An example of a repo that does this today (with AFL integration, and with separate zig build test and zig build fuzz steps):
https://github.com/nektro/zig-json

@andrewrk (Member, Author)

andrewrk commented Jul 21, 2024

> How does this mechanism work?

The build runner already runs the test runner as a child process with the test runner protocol over stdio, so that it can keep running unit tests when one of them crashes the process, and so that it can check that a unit test triggered a safety panic as expected (#1356). It also lets the parent process know which test was being executed if a unit test crashes.

Doing this over stdio is super handy because it even works in strange environments such as via QEMU, wine, or wasmtime.

The function can set a flag indicating that a fuzz test was encountered, then return random bytes (smoke test). Before the test runner sends EOF to the parent process it will send a message indicating metadata about the fuzz tests in the compilation. The build runner then has all the information it needs to enter Fuzz Mode after the main build pipeline is done.
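
A rough sketch of that flag-plus-random-bytes behavior; the names are illustrative and the real logic would live in the test runner rather than user code:

const std = @import("std");

var fuzz_test_seen = false; // reported to the build runner before EOF, per the above
var smoke_buf: [64]u8 = undefined;

// Smoke-test mode: note that a fuzz test was encountered, then hand back a
// small random slice so the test body still executes once.
fn fuzzInputSmoke() []const u8 {
    fuzz_test_seen = true;
    const len = std.crypto.random.uintLessThan(usize, smoke_buf.len + 1);
    std.crypto.random.bytes(smoke_buf[0..len]);
    return smoke_buf[0..len];
}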

@mlugg (Member)

mlugg commented Jul 21, 2024

That makes sense -- nicely designed.

Here's a tangentially related question. Like other parts of the compiler, our testing infrastructure is moving towards a strong bias to running via the build system. Is there, perhaps, an argument to be made for renaming zig test to zig build-test and maybe even eliminating the non-compiler-protocol test runner functionality? The standalone command provides a worse UX, but its name can kind of indicate to people that it's "the way" to test their code; this often leads to people doing incorrect things like trying to zig test individual files within a project (when the correct thing is to test their entire project with a test filter set).

This fuzzing stuff is another example of very tight integration between the build system and compiler, where directly running zig test would at the very least provide a worse UX. (I don't quite understand what the -ffuzz option is intended to do to Zig code, if anything, so don't have a solid grasp of whether it would work at all; does the test runner or the build runner provide the fuzzing inputs?)

Perhaps this is a silly idea; but if you think it has some merit, I'll spin it off into a separate proposal.

@andrewrk (Member, Author)

andrewrk commented Jul 21, 2024

This would also apply to build-exe, build-lib, build-obj, translate-c, and objcopy. I think there is value in supporting both workflows; the simplicity of using the lower-level commands is quite handy when troubleshooting. I think it's fine if people use zig test to test a single file, as long as it works, but of course the build system is there for managing more complex invocations as well as multiplexing.

The fuzz tests in zig test mode would still run but would only do 1 iteration each, with (probably useless) random input. Perfect for writing the fuzz test before you actually want to give it a spin with zig build --fuzz, and for noticing when you broke it.

To answer your question about -ffuzz, it enables instrumentation in the generated code so that the fuzzer gets feedback on the branches that were taken based on its generated input. This helps it search the state space much more efficiently. The idea here is that there would be two builds of the unit tests - one without this instrumentation for unit tests, and one with the instrumentation that also links in the support library code, for doing fuzz testing.

Edit: now that I think about it, I don't think it would be that hard to make -ffuzz work in combination with zig test as well, although my driving motivation is still the all-powerful zig build integration.

@squeek502 (Collaborator)

> The fuzz tests in zig test mode would still run but would only do 1 iteration each, with (probably useless) random input. Perfect for writing the fuzz test before you actually want to give it a spin with zig build --fuzz, and for noticing when you broke it.

IMO the ideal would be that in zig test mode it would run the test once with each input in the provided corpus. For a well-constructed corpus, this would actually test many different code paths (while being finite and quick).

However, I can't really think of a way to make defining an input corpus work with Zig's current test syntax, so a proof-of-concept that always fuzzes starting with an empty input is probably the way to go.

@matklad (Contributor)

matklad commented Jul 21, 2024

> Some sort of UI would be nice. For starters this could just be std.Progress. In the future perhaps there could be a live-updating HTML page to visualize progress and code coverage in realtime. How cool would it be to watch source code turn from red to green live as the fuzzer finds new branches?

Fuzzing is often done in a distributed manner: ten machines simultaneously running the fuzzer. To enable these kinds of use cases, it would be useful to access the results from the build system. E.g., the fuzz step could produce a report in a JSON file, which you could then use as an input to a “CreateGitHubIssueStep” or some such.

@kristoff-it (Member)

kristoff-it commented Jul 21, 2024

Here's a half-baked idea that maybe somebody could turn into something workable: have a mechanism to ensure that fuzzing hits a certain line of code, and report a failure otherwise.

andrewrk added a commit that referenced this issue Jul 22, 2024
* Add the `-ffuzz` and `-fno-fuzz` CLI arguments.
* Detect fuzz testing flags from zig cc.
* Set the correct clang flags when fuzz testing is requested. It can be
  combined with TSAN and UBSAN.
* Compilation: build fuzzer library when needed which is currently an
  empty zig file.
* Add optforfuzzing to every function in the llvm backend for modules
  that have requested fuzzing.
* In ZigLLVMTargetMachineEmitToFile, add the optimization passes for
  sanitizer coverage.
* std.mem.eql uses a naive implementation optimized for fuzzing when
  builtin.fuzz is true.

Tracked by #20702
@andrewrk (Member, Author)

> Here's a half-baked idea that maybe somebody could turn into something workable: have a mechanism to ensure that fuzzing hits a certain line of code, and report a failure otherwise.

Sounds related to sometimes assertions.

@dweiller (Contributor)

dweiller commented Jul 22, 2024

> > The fuzz tests in zig test mode would still run but would only do 1 iteration each, with (probably useless) random input. Perfect for writing the fuzz test before you actually want to give it a spin with zig build --fuzz, and for noticing when you broke it.
>
> IMO the ideal would be that in zig test mode it would run the test once with each input in the provided corpus. For a well-constructed corpus, this would actually test many different code paths (while being finite and quick).
>
> However, I can't really think of a way to make defining an input corpus work with Zig's current test syntax, so a proof-of-concept that always fuzzes starting with an empty input is probably the way to go.

Instead of specifying a corpus in Zig code, what about providing it to the test/build runner on the CLI? Could we have --fuzz take an optional argument specifying a corpus directory? Since different tests will presumably want different corpora, we'd need some mechanism for associating input files with tests - maybe something simple like sub-directories named by the fully qualified name of a test would work.

When fuzzing with AFLPlusPlus, I have often updated my corpus with new seed files from a previous fuzzing run so that the next run doesn't have to re-explore the same search space from scratch. For this reason, I think it would make more sense for the input corpus not to be specified in the code. With a CLI flag, the build runner could even be made to automatically update the corpus with new seeds if desired.

@squeek502 (Collaborator)

squeek502 commented Jul 22, 2024

> Instead of specifying a corpus in Zig code, what about providing it to the test/build runner on the CLI?

Depends what the intended use cases are. From the OP, it sounds like running multiple fuzz tests (for a finite amount of time each) is an intended use case, so specifying a corpus for each fuzz test via the CLI might be a bit tricky. Reading from some particular location based on the fully qualified test name would work but would make renaming/moving tests around a chore (and a potential footgun-of-sorts if you don't realize there's a mismatch in the corpus/test FQN).

@The-King-of-Toasters (Contributor)

Existing languages have a lot of magic re: how fuzzing targets are defined. For example, Go requires targets to:

  • Be contained in a _test.go file.
  • Have its function name start with Fuzz.
  • Be defined as a void function with a *testing.F as its only parameter.
  • Use only a small list of built-in types for tests.

Fuzzing in Rust via cargo-fuzz is better in that:

  • Targets are defined using the fuzz_target macro.
  • Custom types can be created via the arbitrary crate.

This is how the glob-match crate is fuzzed, something I missed when I made a Zig port of it.


I believe we could get the best of both worlds by having a fuzz block, similar to the existing test blocks, along with a separate std.fuzz namespace for e.g. adding data to a corpus or creating arbitrary types.

@dweiller (Contributor)

dweiller commented Jul 22, 2024

> > Instead of specifying a corpus in Zig code, what about providing it to the test/build runner on the CLI?
>
> Depends what the intended use cases are. From the OP, it sounds like running multiple fuzz tests (for a finite amount of time each) is an intended use case, so specifying a corpus for each fuzz test via the CLI might be a bit tricky. Reading from some particular location based on the fully qualified test name would work but would make renaming/moving tests around a chore (and a potential footgun-of-sorts if you don't realize there's a mismatch in the corpus/test FQN).

With the plan of a two-pass system where the first pass detects which tests are fuzz tests, perhaps we could have a std.testing.fuzzCorpusDir("path/to/corpus/directory") call that is used in the first pass to register a corpus directory for that test, with that information relayed back to the build runner for use when compiling in fuzz mode. std.testing.fuzzCorpusDir would be a no-op when compiled with fuzzing active.
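
A sketch of how that could look at the call site; fuzzCorpusDir here is a local stand-in for the proposed (not yet existing) API, and the test name and path are made up:

// Local stand-in for the proposed call; not a real std.testing API.
// First pass: record the directory and relay it over the test runner protocol.
// Under -ffuzz: this would compile down to a no-op.
fn fuzzCorpusDir(path: []const u8) void {
    _ = path; // this sketch just ignores it
}

test "tokenize fuzz (sketch)" {
    fuzzCorpusDir("test/corpus/tokenize");
    // const input = std.testing.fuzzInput(.{});
    // ...feed `input` to the code under test...
}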

@Arya-Elfren (Contributor)

For those who know more about fuzzers and instrumentation: how hard would it be to make this generic enough to allow integrations with different instrumentation/fuzzing libraries? That would let you plug in fuzzing engines - or, for example, if I were making a Zig library for a different language that used a specific fuzzer, I could fuzz the calls into Zig using the same system (getting coverage etc.). Almost like having "custom fuzz runners + integration" the same way we can have custom build and test runners?

@jamii

jamii commented Jul 24, 2024

> The fuzz tests in zig test mode would still run but would only do 1 iteration each, with (probably useless) random input.

Non-deterministic CI failures ahoy!

After fuzzing in a lot of different projects, I like the interface in Go: in test mode, just run the provided inputs, and in fuzz mode, use those inputs to seed the corpus.

Many fuzzing tools also have a corpus minimization option which produces the minimum set of inputs that obtain the same coverage as the full corpus. I like to copy those back into the fuzz test to get good coverage in test mode.

@andrewrk (Member, Author)

> Non-deterministic CI failures ahoy!

Not so fast!

@jamii

jamii commented Jul 24, 2024

Some minor quality-of-life options from other tools:

  • Set a timeout after which the fuzz test will be killed and the process restarted.
  • Choose whether timing out is considered a fail or a pass (eg timing out when fuzzing an interpreter is expected).
  • Try to detect unique failures by recording basic program state (eg a honggfuzz failure looks like SIGSEGV.PC.5555556f9ec4.STACK.19f217c3bb.CODE.1.ADDR.7fffff7fed70.INSTR.mov____%r8d,-0x320(%rbp).fuzz - any failures with matching values will be reported as duplicates, and by default only the first failure and the smallest failure will be reported). This is invaluable if you have some unfixable bugs but still want to fuzz for new bugs.
  • As @matklad mentioned above, have some way to export and merge coverage reports. Useful for monitoring (it's quite easy to accidentally break fuzzer coverage and not notice) and for additional tools like 'sometimes asserts'.

@AlekSi

AlekSi commented Jul 24, 2024

>   • Set a timeout after which the fuzz test will be killed and the process restarted.
>   • Choose whether timing out is considered a fail or a pass (eg timing out when fuzzing an interpreter is expected).

As someone with a lot of experience fuzzing in Go, I can't stress enough how important these are. Without them, continuous fuzzing is essentially broken in Go: golang/go#48157, golang/go#56238, golang/go#52569

@matklad (Contributor)

matklad commented Jul 24, 2024

> Non-deterministic CI failures ahoy!

We do something along these lines at TigerBeetle. I am not sure if what we do is brilliant or cursed: we use the commit sha as the seed for the "run fuzz tests once on CI" check (a rough seed-derivation sketch follows the list):

https://github.com/tigerbeetle/tigerbeetle/blob/e35fc23877aef917850c9af9b5591c8f0fb8da87/.github/workflows/ci.yml#L54

  • this gives you deterministic results, where you don't have to fiddle with CI logs to fish out the seed, because knowing the commit hash is enough
  • but it still avoids the pitfall of using a single fixed seed and then, e.g., always going down one branch of swarm testing
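
A minimal sketch of deriving such a seed, assuming the commit hash arrives via an environment variable (the variable name and hash function are illustrative, and std.posix.getenv makes this POSIX-only):

const std = @import("std");

// Hash the commit SHA into a 64-bit fuzzing seed so a CI run is reproducible
// from the commit alone, without fishing a seed out of logs.
pub fn main() !void {
    const sha = std.posix.getenv("GITHUB_SHA") orelse "0000000000000000";
    const seed = std.hash.Wyhash.hash(0, sha);
    std.debug.print("fuzz seed: 0x{x}\n", .{seed});
}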
