-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add libfuzzer compatible fuzz harness #7512
Conversation
I wasn't sure on how best to manage the Cmake build files for fuzz tests, so I'm certainly open to suggestions on some of the changes here :) |
Because of the special toolchain requirements, I think this test should be moved to
The
The guiding principle here is "ask for what you want", here the |
Thanks for the quick response, I've made the suggested changes and also added some of the precursors for multi-fuzzer engine builds for oss-fuzz :) |
Looks like I've got at least a couple of formatting things to fix. On a different note, I've got a corresponding draft PR to integrate with OSS-fuzz. A couple of things I'll need for that integration to go through;
If you are happy for me to do so, I can pull the email addresses from the git log, it's worth noting however that these emails will be listed in plaintext in the oss-fuzz repo. |
No, but when it inevitably fails somewhere, having comments and/or messages to help explain would be nice |
As discussed in halide#4088 the fuzz_simplify harness is an easy candidate to begin support for coverage driven feedback that is required for high performance fuzz testing.
Ok so I think there is a bit of a bug with cmakes |
Co-authored-by: Alex Reinking <alex_reinking@berkeley.edu>
I think I've covered most things here. Is this build failure related to my changes? The error is;
|
LGTM, waiting on @alexreinking approval |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! Thanks for the contribution!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@steven-johnson would you mind double checking these? I'll have a PR up shortly with these changes.
Type fuzz_types[] = {UInt(1), UInt(8), UInt(16), UInt(32), Int(8), Int(16), Int(32)}; | ||
const int fuzz_type_count = sizeof(fuzz_types) / sizeof(fuzz_types[0]); | ||
|
||
std::string fuzz_var(int i) { | ||
return std::string(1, 'a' + i); | ||
} | ||
|
||
Expr random_var() { | ||
int fuzz_count = rng() % fuzz_var_count; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: change as per #7546
std::vector<int> divisors = {t.lanes()}; | ||
for (int dd = 2; dd < t.lanes(); dd++) { | ||
if (t.lanes() % dd == 0) { | ||
divisors.push_back(dd); | ||
} | ||
} | ||
|
||
return divisors[rng() % divisors.size()]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine as per #7546
if (T.is_int() && T.bits() == 32) { | ||
overflow_undef = true; | ||
} | ||
if (T.is_scalar()) { | ||
int var = rng() % fuzz_var_count + 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be fdp.ConsumeIntegralInRange<int>(0, fuzz_var_count)
as per #7546 NOTE: drop the + 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The semantics of fuzz_var_count aren't clear to me -- in various for loops we do for (int i = 0; i < fuzz_var_count; i++)
which implies that we need to limit the range to max=fuzz_var_count-1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah looking at this again the var
variable seems to only get used once on line 56, so I could probably just get rid of it and replace the condition on line 56 with fdp.ConsumeBool
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing it was +1 in the beginning so that the condition on line 56 was sometimes true/false. In any case it seems like a really roundabout way of doing it. I'm just going to replace it with the consumebool method.
return cast(T, v1); | ||
} else { | ||
if (overflow_undef) { | ||
// For Int(32), we don't care about correctness during | ||
// overflow, so just use numbers that are unlikely to | ||
// overflow. | ||
return cast(T, (int)(rng() % 256 - 128)); | ||
return cast(T, fdp.ConsumeIntegralInRange<int>(-128, 128)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be fdp.ConsumeIntegralInRange<int>(-128, 127)
as per #7546
} else { | ||
return cast(T, (int)(rng() - RAND_MAX / 2)); | ||
return cast(T, fdp.ConsumeIntegralInRange<int>(-RAND_MAX / 2, RAND_MAX / 2)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be fdp.ConsumeIntegralInRange<int>(-RAND_MAX / 2, RAND_MAX / 2 -1)
as per #7546
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...except, RAND_MAX applies to the rand()
call in stdlib; does libfuzzer also make the same promise about range?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well the allowable range is going to be std::numeric_limits<int>::min()
-> std::numeric_limits<int>::min()
which is what ConsumeIntegral<int>
will do but then if you want to constrain it further you can use the ConsumeIntegralInRange
so no there aren't any promises other than the given range.
The FuzzedDataProvider class is just taking a stream of bytes from the fuzzer and turning them into other types,in this case an integral. The FuzzedDataProvider class just makes managing this easier as it essentially moves a cursor along the byte stream so that you aren't consuming the the same data twice.
int op = rng() % op_count; | ||
Expr a = random_expr(fdp, T, depth); | ||
Expr b = random_expr(fdp, T, depth); | ||
int op = fdp.ConsumeIntegralInRange<int>(0, op_count); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be fdp.ConsumeIntegralInRange<int>(0, op_count-1)
as per #7546
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or better yet, use PickValueInArray()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True that would be better
} | ||
|
||
const int bin_op_count = sizeof(make_bin_op) / sizeof(make_bin_op[0]); | ||
const int bool_bin_op_count = sizeof(make_bool_bin_op) / sizeof(make_bool_bin_op[0]); | ||
const int op_count = bin_op_count + bool_bin_op_count + 5; | ||
|
||
int op = rng() % op_count; | ||
int op = fdp.ConsumeIntegralInRange<int>(0, op_count); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be fdp.ConsumeIntegralInRange<int>(0, op_count-1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.. but looking at this now, the logic is a little fragile; the switch
statement has implicit assumptions about the count and ordering of the two op arrays. What happens if someone inserts a new one? Would be nice to make this a bit more robust, e.g.:
- Use custom enum values to associate the op functions with the switch statements?
- Or maybe, rework
make_bin_op
andmake_bool_bin_op
arrays so that each entry is apair<>
with the existing op + a lambda function that contains the code from the relevant switch case, so that the switch goes away entirely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I realize the fragility was pre-existing before your changes, but improving the resilience here seems easy and worthwhile while we are here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, looking at this again a second time I'm struggling to understand what is going on here. Where did this 5
come from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh nevermind I see what is going on here... that's a bit wack. I'll refactor as per your suggestions.
Type random_type(int width) { | ||
Type T = fuzz_types[rng() % fuzz_type_count]; | ||
Type random_type(FuzzedDataProvider &fdp, int width) { | ||
Type T = fdp.PickValueInArray(fuzz_types); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hadn't noticed this call existed -- perhaps we should aggressively use it in other cases too where possible (e.g. line 47, line 102)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It won't work on line 47 as its a vector and even though it would have been trivial to implement it doesn't work on vectors. But 102 could definetely use that.
As discussed in #4088 the fuzz_simplify harness is an easy candidate to begin support for coverage-driven feedback required for high-performance fuzz testing.