Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File.macho create #1097

Closed
wants to merge 39 commits into from
Closed

File.macho create #1097

wants to merge 39 commits into from

Conversation

peasead
Copy link
Contributor

@peasead peasead commented Nov 5, 2020

RFC Preview

Resolves #1096

  • Have you signed the contributor license agreement? Yes
  • Have you followed the contributor guidelines? Yes
  • For proposing substantial changes or additions to the schema, have you reviewed the RFC process? Yes
  • If submitting code/script changes, have you verified all tests pass locally using make test? NA at this stage
  • If submitting schema/fields updates, have you generated new artifacts by running make and committed those changes? NA at this stage
  • Is your pull request against master? Unless there is a good reason otherwise, we prefer pull requests against master and will backport as needed. Yes
  • Have you added an entry to the CHANGELOG.next.md? NA at this stage

peasead and others added 5 commits October 20, 2020 14:37
Prepare link to Logs docs changing with the 7.10 release in "products…
Prepare link to Logs docs changing with the 7.10 release in "getting-…
@peasead peasead self-assigned this Nov 5, 2020
@peasead peasead added RFC and removed RFC:candidate labels Nov 5, 2020
@andrewstucki
Copy link
Contributor

Not that I'm super excited about it, but--how do you envision working with fat object files with multiple architectures? Just wondering because I would think you'd almost need to structure the entire macho field sets as nested if you want to handle them... @webmat what are your thoughts on making something that could be as large as a macho field set entirely nested?

rfcs/text/0000-create-file-mach-o.md Outdated Show resolved Hide resolved
rfcs/text/0000-create-file-mach-o.md Outdated Show resolved Hide resolved
rfcs/text/0000-create-file-mach-o.md Outdated Show resolved Hide resolved
@webmat webmat added the 1.9.0 label Nov 18, 2020
rfcs/text/0000-create-file-mach-o.md Outdated Show resolved Hide resolved
rfcs/text/0000-create-file-mach-o.md Outdated Show resolved Hide resolved
rfcs/text/0000-create-file-mach-o.md Outdated Show resolved Hide resolved
rfcs/text/0000-create-file-mach-o.md Outdated Show resolved Hide resolved
rfcs/text/0000-create-file-mach-o.md Outdated Show resolved Hide resolved
rfcs/text/0000-create-file-mach-o.md Outdated Show resolved Hide resolved
rfcs/text/0000-create-file-mach-o.md Outdated Show resolved Hide resolved
rfcs/text/0000-create-file-mach-o.md Outdated Show resolved Hide resolved
rfcs/text/0000-create-file-mach-o.md Outdated Show resolved Hide resolved
rfcs/text/0000-create-file-mach-o.md Outdated Show resolved Hide resolved
rfcs/text/0000-create-file-mach-o.md Outdated Show resolved Hide resolved
rfcs/text/0000-create-file-mach-o.md Outdated Show resolved Hide resolved
rfcs/text/mach-o/mach-o.yml Outdated Show resolved Hide resolved
peasead and others added 10 commits December 23, 2020 14:07
Co-authored-by: Andrew Stucki <andrew.stucki@gmail.com>
Co-authored-by: Andrew Stucki <andrew.stucki@gmail.com>
Co-authored-by: Andrew Stucki <andrew.stucki@gmail.com>
Co-authored-by: Andrew Stucki <andrew.stucki@gmail.com>
Co-authored-by: Andrew Stucki <andrew.stucki@gmail.com>
Co-authored-by: Andrew Stucki <andrew.stucki@gmail.com>
Co-authored-by: Eric Beahan <ebeahan@gmail.com>
@peasead
Copy link
Contributor Author

peasead commented Feb 9, 2021

@ebeahan @dcode has made the call on the nested field type for macho.headers.

Are there any hanging questions for this PR stage?

@ebeahan ebeahan removed the 1.9.0 label Feb 10, 2021
Copy link
Member

@ebeahan ebeahan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a pass on the macho YAML field definition.

The only major change I made was removing the sub-fields from macho.segments.sections, due to using type: flattened. I also touch on this in another comment below.

rfcs/text/0000-create-file-macho.md Outdated Show resolved Hide resolved
rfcs/text/0000-create-file-macho.md Outdated Show resolved Hide resolved
rfcs/text/0000-create-file-macho.md Outdated Show resolved Hide resolved
rfcs/text/0000-create-file-macho.md Outdated Show resolved Hide resolved
rfcs/text/0000-create-file-macho.md Show resolved Hide resolved
peasead and others added 2 commits February 16, 2021 14:57
Co-authored-by: Eric Beahan <ebeahan@gmail.com>
Co-authored-by: Eric Beahan <ebeahan@gmail.com>
devonakerr
devonakerr previously approved these changes Feb 17, 2021
@ebeahan
Copy link
Member

ebeahan commented Mar 5, 2021

I think we're nearly there with this one. Here's a summary of what's already completed and what's outstanding for advancing:

  • Identified sponsor. @devonakerr is sponsoring and has already reviewed ✅
  • High-level description of examples of usage
  • High-level description of example sources of data
  • Identified potential concerns and implementation challenges/complexity
  • Subject matter experts identified and weighed in
  • Outlined initial field definitions.*

*I believe the thread here is the last open discussion point, but I'll make another pass at the current field definitions in the proposal.

| Name | Type | Description |
|--------------------------------------------|------------|-----------------------------------------------------------------------------|
| macho.cpu | object | CPU information for the file. |
| macho.cpu.architecture | keyword | CPU architecture target for the file. |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrewstucki I did some cross-checking between the proposed field list here and your work in elastic/beats#24195. I pulled those mappings for macho into a gist for easier reference.

100% alignment isn't necessary at all at this point. However, there are enough differences between that approach and what's captured here I think it's worth raising.

@peasead @dcode @devonakerr

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have since updated this PR with a proposed layout that aligns sections and segments with ELF and PE (which doesn't have segments).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @dcode.

There's still some large differences between the implementation in elastic/beats#24195 and what's proposed here.

Again, not something we need to completely resolve now. At a minimum, let's capture the difference in approaches as a Concern to make sure we revisit the topic in the next stage's PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrewstucki the strategy I'm suggesting here is splitting out fat binary components that can be unified by the same file metadata (namely hashes or other source unique id's). That will allow analyzing all binaries by their sections, segments, symbols, etc in a seamless way, regardless of platform or if their multi-platform. Do you think that can work for your implementation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just pining to see if there was a resolution here.

Copy link
Contributor

@andrewstucki andrewstucki Mar 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dcode sorry, just caught up on this. To clarify, you're talking about essentially emitting an event per cpu architecture? I'm not completely against that, but it does seem like a pretty artificial constraint to model something that is, by its nature, a single entity (i.e. a multi-arch mach-o file).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just as a use-case example: let's say we're triggering an alert on a malicious binary that has been compiled for both the new m1 as well as older, intel-based chipsets and we want to add in object-level data about the file. Would we emit two separate alerts, one for each architecture? Seems kind of odd.

Co-authored-by: Derek Ditch <dcode@users.noreply.github.com>
Co-authored-by: Derek Ditch <dcode@users.noreply.github.com>
@peasead
Copy link
Contributor Author

peasead commented Apr 7, 2021

Moved to new PR #1346

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create file.mach-o
6 participants