hypercall: dump RANGE_SUBMIT filters in workdir #10
to be parsed by kAFL fuzzer on shutdown
Maybe do this here, on first snapshot where PT is also setup for the rest of the campaign? kafl.qemu/nyx/hypercall/hypercall.c Line 111 in 1f535cd
Then it will work independently of how the range is configured (I think RANGE_SUBMIT will override previous host-side settings, but I'm not sure). Also, this could be made a separate function, "dump_pt_settings" or so.
@schumilo any opinion on this?
This sounds like a very front-end-specific task. Is there really a good reason to implement this in the backend instead of utilizing the existing feature set? Why not simply dump the guest-provided ranges via DUMP_FILE while also executing RANGE_SUBMIT (we could just put everything into a kAFL-guest specific helper function for that)? From there on, we can implement everything else in the front end (e.g., |
There are at least two ways to configure this, and only QEMU knows the true value at the time of setting up PT. I remember some discussion about whether RANGE_SUBMIT takes precedence over host-side settings or only overrides unused filter regions, etc. And what happens if you call it twice? Doing this in the individual agents means that we cannot rely on this info being there and being accurate. We can look at it, and maybe it should take precedence over host-side config settings, or maybe not? It just continues the confusion. We could report it back via the aux buffer and then the frontend could write it to disk, but simply dumping the truth as seen by QEMU seems like the more reliable and useful feature.
https://github.com/IntelLabs/kafl.qemu/blob/kafl_stable/nyx/hypercall/hypercall.c#L292 The hypercall checks whether the requested IP filter is already configured, and simply returns if that's the case.
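To make the behaviour described above concrete, here is a minimal Python model of a filter slot that ignores a second submission. This is only a sketch of the semantics stated in the comment, not the actual QEMU code; the class `PtFilterState` and its method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class PtFilterState:
    """Toy model of per-VM IP filter slots (hypothetical, not the QEMU struct)."""
    ranges: Dict[int, Tuple[int, int]] = field(default_factory=dict)

    def range_submit(self, slot: int, start: int, end: int) -> bool:
        """Store the range and return True, or return False if the slot is taken."""
        if slot in self.ranges:
            # Slot already configured (e.g. by a host-side setting):
            # the request is ignored and the existing range is kept.
            return False
        self.ranges[slot] = (start, end)
        return True


state = PtFilterState()
state.range_submit(0, 0x1000, 0x2000)  # first configuration of slot 0 is stored
state.range_submit(0, 0x3000, 0x4000)  # second request for slot 0 is ignored
assert state.ranges[0] == (0x1000, 0x2000)
```

Under this model the first configuration of a slot always wins, which is why a guest-side dump alone may not reflect the range that is actually in effect.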
Exactly! So, isn't it easier to implement this feature in the agent & front end (considering the extra effort to maintain, test and write a test only for this particular feature in the first place)? In case the range was configured manually, the front end needs to create a file with the information for that in the workdir. If the RANGE_SUBMIT values are set by the guest and copied over to the host via DUMP_FILE, we will also get this information. And at that point, it's pretty straightforward for the front end to figure out which config file should be used. And in case neither the manual conf file nor the auto conf file exists, the front end reports an error. I won't argue if you still think this feature would be better implemented in the backend. But since this information is also stored in the snapshot, we could also try another approach and parse the actual snapshot files: https://github.com/nyx-fuzz/QEMU-Nyx/blob/qemu-nyx-4.2.0/nyx/state/snapshot_state.h#L19 This is probably a much better approach than creating front-end-specific files based on the execution of specific hypercalls. However, currently, the snapshot is only created if we enable this feature (for instance, sometimes it simply makes no sense to dump and store the entire snapshot on the hard drive, though that feature is always enabled in multiprocessing mode), but this is something we could quickly fix by always saving the metadata but not necessarily the entire snapshot data (like the RAM dump and block device diff). If parsing a binary file in Python is tricky, we could also use a custom format like JSON to store the information in a separate file.
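A rough sketch of the front-end lookup described in this comment could look as follows. Everything specific here is an assumption made for illustration: the file names `pt_ranges_manual.json` and `pt_ranges_auto.json`, the JSON layout, and the choice to prefer the manual file (based on the hypercall ignoring ranges that are already configured). It is not kAFL's actual implementation.

```python
import json
import tempfile
from pathlib import Path
from typing import List, Tuple

MANUAL_CONF = "pt_ranges_manual.json"  # hypothetical: written by the front end
AUTO_CONF = "pt_ranges_auto.json"      # hypothetical: dumped by the guest via DUMP_FILE


def load_pt_ranges(workdir: Path) -> List[Tuple[int, int]]:
    """Return the PT ranges from the first config file found in the workdir."""
    # Assumption: the manual (host-side) file takes precedence over the guest dump.
    for name in (MANUAL_CONF, AUTO_CONF):
        path = workdir / name
        if path.exists():
            data = json.loads(path.read_text())
            # Ranges are stored as pairs of hex strings, e.g. ["0x1000", "0x2000"].
            return [(int(start, 16), int(end, 16)) for start, end in data["ranges"]]
    raise FileNotFoundError(f"no PT range configuration found in {workdir}")


with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    # Simulate a guest-provided dump and read it back.
    (workdir / AUTO_CONF).write_text(json.dumps({"ranges": [["0x1000", "0x2000"]]}))
    assert load_pt_ranges(workdir) == [(0x1000, 0x2000)]
```

If neither file exists, `load_pt_ranges` raises `FileNotFoundError`, which corresponds to the front end reporting an error.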
BTW: If the goal of this PR is that |
kafl cov can already work based on the snapshot. The older code is still there (and overall a big mess now) but loading the snapshot is much more reliable especially since the pre-snapshot part is rarely used now. But if you then want to tell Directly creating coverage is also there, the fuzzer can save the binary pt dumps away during the fuzzing run and we decode them later. The only overhead is for qemu to actually write out the trace buffer and for the frontend to move (and zip it). But again, the later ptdump decoding needs to know the pt ranges. So yes, we have the info in the frontend and we can write a file if done by the agent, it just seems to me that the ground truth is in qemu and that's where I would dump it from to have it always available and consistent. But hey, I'm not writing the patch... :-) |
Okay, so how about getting the trace region information from the snapshot files? If this sounds reasonable to you, I could start with the PR next week. With that, I will fix the current limitation and, if required, extend the serializer to additionally store the metadata (which includes information such as the configured trace regions) in a JSON-like format. My goal here is that both QEMU-Nyx repos remain aligned to some degree, and the current patch in particular is something I would rather not merge into the main repo at a later point.
How about the following PR instead? nyx-fuzz/QEMU-Nyx#53
Thanks for the discussion. Since the information is already present in the root snapshot, let's reuse that. What about the limitation of memory snapshots? You said that it could be lifted quite easily.
It seems that this limitation does not affect kAFL at all, simply because it does not use the "skip_serialization" option: However, all other libnyx-based fuzzers (such as AFL++) will require a libnyx update, but since we ship all components as submodules, we can fix that simply by updating all commit IDs accordingly. The PR is now also ready:
I have also already tested the proposed patch, but maybe I have missed something important. However, as soon as you have confirmed that everything works as expected with kAFL, I will merge this PR and also another one into the main branch (
@schumilo I confirm your PR works fine with kAFL. LGTM!
Okay, thanks for the update :-)
Closing this PR as nyx-fuzz/QEMU-Nyx#53 has been merged in upstream repo and integrated into latest |
This PR aims to dump the PT filters set by the guest through RANGE_SUBMIT, so that the fuzzer can parse them on shutdown and update its own WORKDIR/config.yaml config dump.
The final goal is to enable users to launch kafl cov silently, without any further IP filter configuration required.
Related: IntelLabs/kafl.fuzzer#68