bzip2: Add bzip2 to oss-fuzz #1887
Conversation
We approve of adding this project. Thanks!
Really awesome that it has found a bug already.
Do you have jseward's approval for this?
I left some comments. The only blockers are the printing and the question I had about the safety of the hardcoded sizes given to malloc. Everything else is optional to fix/improve (note: that is what "nit" means).
{
  int r;
  unsigned int nZ, nOut;
  char *zbuf = malloc(size + 600 + (size / 100));
What are these hardcoded values? Are they guaranteed to always work?
They have been borrowed from here. I am not sure it is always guaranteed to work, but this code is present in a test harness used to test the robustness of (de)compression.
@julian-seward1 what do you think?
I think it's OK. They just reflect the fact that the worst case output size is 101% of the input size + 600 bytes (I assume -- this is now nearly 20 years old). Since the buffer is in mallocville, presumably asan will complain if it gets overrun. I doubt that will happen though.
Thank you, I've sent a new PR #1894 in which I use the same buffer sizes (1 MB) as the original code. I think the size is more of an issue during decompression, which I have now fixed.
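For context, here is a minimal sketch of a compression target using that sizing rule from the bzip2 docs (output can be up to 1% larger than the input plus 600 bytes); this is an illustration, not the exact code in this PR:

#include <stdint.h>
#include <stdlib.h>
#include <bzlib.h>

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  /* Documented worst case: compressed output <= input + 1% + 600 bytes. */
  unsigned int nZ = size + (size / 100) + 600;
  char *zbuf = malloc(nZ);
  if (zbuf == NULL) return 0;
  int r = BZ2_bzBuffToBuffCompress(zbuf, &nZ, (char *)data, size, 9, 0, 30);
  /* With a bound-sized buffer, BZ_OUTBUFF_FULL should not be returned. */
  (void)r;
  free(zbuf);
  return 0;
}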
r = BZ2_bzBuffToBuffCompress(zbuf, &nZ, (char *)data, size, 9, 0, 30);
if (r != BZ_OK) {
  fprintf(stdout, "Compression error: %d\n", r);
We really prefer not to have logging messages output by the fuzzer. Can you put this printf (and the others) behind an env var?
ack, WIP.
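One possible way to gate the messages, assuming a made-up BZIP2_FUZZ_VERBOSE environment variable (the variable name and helper are illustrative, not part of this PR):

#include <stdio.h>
#include <stdlib.h>

/* Print only when the (hypothetical) BZIP2_FUZZ_VERBOSE env var is set. */
static void log_error(const char *what, int code) {
  static int verbose = -1;
  if (verbose == -1)
    verbose = (getenv("BZIP2_FUZZ_VERBOSE") != NULL);
  if (verbose)
    fprintf(stderr, "%s: %d\n", what, code);
}

/* Usage in the target: log_error("Compression error", r); */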
FROM gcr.io/oss-fuzz-base/base-builder
MAINTAINER bshas3@gmail.com
RUN apt-get update && apt-get install -y make autoconf automake libtool wget
RUN wget ftp://sources.redhat.com/pub/bzip2/v102/bzip2-1.0.2.tar.gz
".tar.gz"
This is kind of funny.
Does this mean you're testing 1.0.2 here? That is out of date. You should be testing 1.0.6.
What is the official hosting repo for this version? Could you please link a URL here? Thanks.
r = BZ2_bzBuffToBuffCompress(zbuf, &nZ, (char *)data, size, 9, 0, 30);
if (r != BZ_OK) {
  fprintf(stdout, "Compression error: %d\n", r);
  goto cleanup;
nit: goto seems like overkill in this case, but it's fine to keep IMO.
ack, WIP
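If the goto were dropped, the early-error path could simply release the buffer and return, e.g. (illustrative only):

if (r != BZ_OK) {
  /* Nothing left to test on a compression failure: free and bail out
     instead of jumping to a shared cleanup label. */
  free(zbuf);
  return 0;
}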
}
else {
  assert(nOut == size);
  assert(memcmp(data, outbuf, size) == 0);
Great to test for correctness. I suspect it won't turn up anything but it would be fantastic if it did.
Thanks.
unsigned int nZ, nOut;
char *zbuf = malloc(size + 600 + (size / 100));

r = BZ2_bzBuffToBuffCompress(zbuf, &nZ, (char *)data, size, 9, 0, 30);
nit: this doesn't have to come in this PR, but could varying blockSize100k and workFactor increase coverage/bugs?
You can do that by dedicating input bytes (as discussed previously), or by reusing input bytes (such as by hashing them and using the hash to seed a random number generator).
I'm actually unsure which way is better. Dedicating bytes seems more correct, but I don't think the fuzzer is smart enough to take advantage of it, and it has the unfortunate side effect of making the corpus not actual bz2 data but X bytes + bz2 (I would probably reuse bytes, but this is just my best guess).
Yeah, this is a good point. I will think about it :-)
Thinking out loud here: I think the following results in fairly uniformly random workFactors:

int hash = hash32(data, size, salt);
workFactor = hash % 251; // max workFactor = 250

because
- hash may be assumed to be uniformly random
- a uniformly random value modulo a prime X stays (roughly) uniformly random?

We got lucky with 251, though. When the natural modulus is not prime, e.g. for block size where the max is 9 (so we would choose mod 11 instead of mod 10 to keep the modulus prime), we end up with a 1/11 chance of the block size being 10. That should be okay, though, since bzip2 handles such misconfigurations gracefully?
IMO this kind of stuff is more art than science, so I could be wrong about this:
I wouldn't worry about the options having a uniformly random distribution. I think as long as an option value isn't super unlikely, the odds are that the fuzzer will create a lot of testcases with that value.
This means we may have (1/11) chance of block size being 10, but this is okay since bzip2 handles such misconfigurations gracefully?
I would probably just do mod 10 even if some values are more likely to be selected than others.
Thanks :-)
Will incorporate this in upcoming PR.
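A possible shape for the byte-reuse approach, with hash32() written out as a simple FNV-1a stand-in (both the helper and the exact mapping are illustrative, not part of this PR):

#include <stdint.h>
#include <stddef.h>

/* Cheap stand-in hash (FNV-1a) so parameters are derived from the input
   itself rather than from dedicated bytes. */
static uint32_t hash32(const uint8_t *data, size_t size) {
  uint32_t h = 2166136261u;
  for (size_t i = 0; i < size; i++) {
    h ^= data[i];
    h *= 16777619u;
  }
  return h;
}

/* In the target:
   uint32_t h = hash32(data, size);
   int blockSize100k = 1 + (h % 9);        // valid range 1..9
   int workFactor    = (h >> 8) % 251;     // valid range 0..250 (0 = default)
   r = BZ2_bzBuffToBuffCompress(zbuf, &nZ, (char *)data, size,
                                blockSize100k, 0, workFactor);
*/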
nOut = size*2;
char *outbuf = malloc(nOut);
r = BZ2_bzBuffToBuffDecompress(outbuf, &nOut, zbuf, nZ, 0, 0);
nit: Maybe in a later PR we could vary the small param? (It might result in fewer bugs, since I read that the algorithm it selects is slower, but I suspect introducing a slow path, unless it is really bad, could increase the bugs found.)
ack. Will address in future PR.
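If the same derived hash were reused, the small flag could be toggled like this (illustrative; h is the hash value sketched above):

/* small selects bzip2's low-memory (and slower) decompression algorithm. */
int small = (h >> 16) & 1;   /* 0 = normal algorithm, 1 = small/slow path */
r = BZ2_bzBuffToBuffDecompress(outbuf, &nOut, zbuf, nZ, small, 0);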
nOut = size*2;
char *outbuf = malloc(nOut);
r = BZ2_bzBuffToBuffDecompress(outbuf, &nOut, (char *)data, size, 0, 0);
nit: same comment about the small param as in the other file.
ack, same as above.
char *outbuf = malloc(nOut);
r = BZ2_bzBuffToBuffDecompress(outbuf, &nOut, (char *)data, size, 0, 0);
if (r != BZ_OK) {
  fprintf(stdout, "Decompression error: %d\n", r);
Same comment about the printf as in the other file. Please don't do it.
ack, WIP.
Good, good. Please let me know what you find.
Left some comments on bzip2_decompress_target.c; the same applies to the compress fuzz target.
unsigned int nZ, nOut;

nOut = size*2;
char *outbuf = malloc(nOut);
I'd highly recommend using a static buffer for this with some fixed size (e.g. 256 KB; see the zlib example: https://github.com/google/oss-fuzz/blob/master/projects/zlib/zlib_uncompress_fuzzer.cc#L11). It should give a very significant speed-up.
Maybe not. A static buffer might be faster, but then we may lose some ASan buffer-overflow checking on it.
Also, a question about nOut = size*2: does this mean that the decompression is often expected to return early due to end-of-output?
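For comparison, a sketch of the static-buffer variant suggested above (256 KB is simply the zlib example's choice; with a fixed buffer, an oversized output is reported as BZ_OUTBUFF_FULL via the return code rather than caught by ASan):

#include <stdint.h>
#include <stddef.h>
#include <bzlib.h>

static char outbuf[256 * 1024];   /* fixed-size output, as in the zlib fuzzer */

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  unsigned int nOut = sizeof(outbuf);
  /* Feed the raw fuzz input straight to the decompressor; malformed streams
     and too-large outputs are signalled through the return code. */
  int r = BZ2_bzBuffToBuffDecompress(outbuf, &nOut, (char *)data, size, 0, 0);
  (void)r;
  return 0;
}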
  return 0;
}

assert(nOut == size);
This and the next lines are wrong; why would we expect the decompressed data to be the same as its compressed representation?
Aren't we comparing the original fuzzed data with the compressed and subsequently decompressed data? I'd expect decompressed(compressed(data)) == data. Same with size.
Oops, sorry. I thought we were talking about the compression target. I'll look into this and ping back.
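To spell out the distinction the thread lands on: round-trip assertions make sense in the compression target, where the original input is available to compare against, but not in the decompression target, whose input is an arbitrary fuzzed stream. A rough sketch of the compression-side check (fragment; assumes the surrounding target's data, size, zbuf, nZ, and r):

/* Compression target only: compress the fuzz input, then decompress the
   result and check that the original bytes come back. */
unsigned int nOut = size;
char *outbuf = malloc(size + 1);   /* +1 avoids malloc(0) for empty inputs */
r = BZ2_bzBuffToBuffDecompress(outbuf, &nOut, zbuf, nZ, 0, 0);
if (r == BZ_OK) {
  assert(nOut == size);                     /* round trip preserves length */
  assert(memcmp(outbuf, data, size) == 0);  /* and content */
}
free(outbuf);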
The bzip2 target bzip2_compress_target is having a startup crash. Please get the crash fixed ASAP upstream, or disable the ubsan config in project.yaml; this is causing exceptions because there is no archived build for the ubsan config. Step #13: BAD BUILD: /workspace/out/undefined/bzip2_compress_target seems to have either a startup crash or exit:
Hi @inferno-chromium, #1894 fixes this.
* bzip2: Bug fixes; added citations
* bzip2: Bump bzip2 version to 1.0.6
* bzip2: Variable blockSize100k, workFactor, and small
* bzip2: Remove hardcoded buffer sizes; fix nZ (compressor) to point to real buffer size
* bzip2: Remove ubsan from project.yaml because of start-up crash #1887
This PR integrates bzip2 into oss-fuzz (CC #402).
The decompressor target flags a wild read within seconds.