[Processors] Add Binary File Parsing Processor #24195
Conversation
Pinging @elastic/security-external-integrations (Team:Security-External Integrations)
This pull request is now in conflicts. Could you fix it? 🙏
What does this PR do?
This is a very large PR. It adds support for parsing data out of `pe`, `macho`, `elf`, and `lnk` files and dumping it to Elasticsearch. It has undergone some minor fuzzing, but it does panic due to oversized memory allocations with some malformed ELF files (some examples of which are in `libbeat/formats/fixtures/elf/crashes`) that break Go's ELF parsing library. I'll look into a workaround or see if I can commit an upstream patch at some point.

Files are parsed by leveraging a new `add_file_data` processor created to parse files specified at a given path. One oddity with how this works is that, because the supported formats are not currently finalized in ECS, any beat/module that uses this will need to add the extended fields this processor emits to the module's `fields.yml`. Ideally this would eventually be replaced with either official ECS support or more modular field definitions through packages.

It builds on the work by @peasead in elastic/ecs#1097, elastic/ecs#1071, and elastic/ecs#1077, with some minor changes and the addition of an `lnk` file format.
For now, it adds the corresponding field mappings and templated processor settings to the auditbeat `file_integrity` module.

The last major thing is that, internally, the `telfhash` calculations use the Capstone disassembly framework to disassemble and hash non-exported call sites. Capstone is written in C, so if we want to keep the telfhash code around, I'll have to look into compiling it into libbeat (unless someone has other ideas).

The processor itself takes a number of configuration arguments to reduce the impact of running it for every single file on the system, instead parsing only files of specific interest. The configuration options used for filtering and for changing failure modes are:
Additionally, like all processors, you can filter out even more noise with processor conditions by adding a `when` block.

Checklist

- Added an entry in `CHANGELOG.next.asciidoc` or `CHANGELOG-developer.next.asciidoc`.

Use cases

File forensics.
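For a forensics-style deployment, a `when` condition can scope parsing to interesting paths. A hedged sketch: only the `add_file_data` processor name and Beats' standard `when` condition syntax come from this PR; the field and pattern below are illustrative assumptions.

```yaml
processors:
  - add_file_data:
      # Illustrative condition: only parse files under common binary paths.
      when:
        regexp:
          file.path: '^/usr/(bin|sbin)/'
```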
Logs
Enabling with the following in `auditbeat.yml`:

executing the following:
and then querying:
gives me the following:
CC: @ebeahan, @dcode, @peasead