-
-
Notifications
You must be signed in to change notification settings - Fork 6.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Oss-Fuzz] Adding fuzzer-stl_operations #2160
Conversation
I may not understand the issue completely, but that change does not really increase the coverage of the parsers, but only adds some more constructors. In the end, any JSON input will be processed by the same parser - regardless whether it was read from a file, a string, a vector, etc. How did you come up with the coverage numbers? Do you have insights into fuzzing? I never digged too deep into this. |
I do think that these targets bring a tiny bit of value because the parser is technically a different block of code in each of them. @nlohmann is correct that they will never catch programmer errors within the library's codebase that will not be caught by the existing fuzz tests. However, it's theoretically possible to run into a compiler bug that would exhibit itself through the interaction of a complex-ish iterator implementation and the parser. Whether this is something worth throwing compute at is debatable considering how unlikely such a scenario is. Regardless of all that, these coverage numbers are "slightly" misleading since the tests causes additional template instantiations that are never-ever going to be seen in the wild. The ordered containers: Edit: wait, I'm suddenly confused. From the title of this PR, I thought this was fuzzing the parser, but it makes no sense since parsing non-contiguous containers is not merged into the This is fuzzing the |
So far, all value from the fuzzers were to detect issues during parsing. I do not see value in extending the fuzzing to the input adapters, because they can be unit tested 100%. Though, technically, we would just wasting Google's resources here, I'd rather improve the coverage of the existing fuzz targets if it is not 100% yet. |
… when it has large data
@nlohmann and @FrancoisChabot thank you very much for your comments. I apologize for the wrong and confusing name of the fuzz target and lack of explanation. I have updated this pull request with more explanation about the functions that this fuzz target tests. I have also added how did I come up with the coverage stats. I am an intern at Google and writing/improving fuzz target for this repo is a part of my internship project. I learned about fuzzing only a few days back. I got the coverage stats for each of the functions in this fuzz target, and I found out that the improvement in coverage because of parsing or direct conversion from STL containers was very little, like you mentioned. The main improvement came because of loops in JSON map & JSON vectors, and because of methods like push_back and get. I request you to please go through the pull request again and decide if it is of value. |
@tanuj208 That is cool to hear! I really appreciate the effort Google already put in checking this project with OSS-Fuzz. I have a question: How exactly is the project integrated into OSS Fuzz? Do you just execute |
I'm going to reiterate (and hopefully better reword) my concern that I seriously suspect this new fuzzer does not fuzz what you think it fuzzes: If I compile and run the following: #include "nlohmann/json.hpp"
#include <iostream>
int main() {
std::string data = "[1, 3, 4]";
std::vector<uint8_t> vec(data.begin(), data.end());
// parsing from STL containers
nlohmann::json j_vector(vec);
std::cout << j_vector << "\n";
} The result is:
That's because initializing a Changing the contents of the vector (and the same goes for any stl container passed this way) cannot change which code paths are executed, and that makes it a bad candidate for fuzzing. Now. if you did: |
I agree with @FrancoisChabot - however, exactly this check (parsing from a vector of bytes from OSS Fuzz) is already performed by |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fuzzing makes no sense like this.
std::unordered_multiset<uint8_t> umultiset(data, data + size); | ||
|
||
// parsing from STL containers | ||
json j_vector(vec); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a constructor call, not a parse call. Instead of interpreting the bytes in the containers to create a JSON value, the containers are interpreted as JSON array. As such, no real library code is tested, but just copy/move constructors from STL containers.
@nlohmann and @FrancoisChabot, I understand what you were trying to explain and why the fuzz target I wrote cannot change code paths with different inputs. I will close this pull request for now. I am very thankful for all the quick replies. @nlohmann regarding your question about OSS Fuzz, @oliverchang has more context, I will let him respond. |
If there is any way to improve the current fuzzing, e.g. with a grammar or so, please let us know! |
A huge +1 to adding a grammar to the fuzzer as per https://github.com/google/AFL#9-fuzzer-dictionaries, especially considering afl appears to come pre-packaged with a json dictionary. That being said, this is a property of how the fuzzer is invoked rather than how it is built, which seems to be external to this repo. |
Json is integrated into OSS-Fuzz here. We do essentially just execute |
After adding the JSON dictionary, the coverage did not increase since fuzzer was running for a long time and after trying millions of random inputs, it picked up the JSON format by itself. But since it is good to have a dictionary, I added that in oss-fuzz repository. It should get merged soon. |
I went through the codebase and added a new fuzz target - fuzzer-stl_operations to improve the coverage of the existing fuzz targets.
This fuzz target tests the following functionality -
By doing so, the coverage goes up by
Line coverage: 284 lines
Function coverage: 31 functions
Region coverage: 89 regions
Coverage report
To get this coverage report, I did the following:
Let me know if there is any incorrect usage of the API