File.macho create #1097

Closed
wants to merge 39 commits into from
Changes from 6 commits
7080358
Merge pull request #1 from elastic/master
peasead Oct 20, 2020
314f9ab
Merge pull request #2 from elastic/master
peasead Nov 3, 2020
1448cd6
Merge pull request #3 from elastic/master
peasead Nov 4, 2020
16aae5f
Merge pull request #4 from elastic/master
peasead Nov 5, 2020
ef7bd12
initial commit
peasead Nov 5, 2020
0107542
added PR#
peasead Nov 5, 2020
de73a01
removed field present in code_signature
peasead Nov 10, 2020
714c859
removed field present in code_signature
peasead Nov 10, 2020
07c011d
updated work in signature
peasead Nov 20, 2020
aeadc6b
move executable fields to segments.
peasead Nov 20, 2020
29ecf43
removed signature fields
peasead Dec 23, 2020
6d77439
removed file. from field names
peasead Dec 23, 2020
16ad2bc
Update rfcs/text/0000-create-file-mach-o.md
peasead Dec 23, 2020
6969054
Update rfcs/text/0000-create-file-mach-o.md
peasead Dec 23, 2020
692cc5a
Update rfcs/text/0000-create-file-mach-o.md
peasead Dec 23, 2020
f64a08d
Update rfcs/text/0000-create-file-mach-o.md
peasead Dec 23, 2020
805f6c5
Update rfcs/text/0000-create-file-mach-o.md
peasead Dec 23, 2020
cd6a5e0
Update rfcs/text/0000-create-file-mach-o.md
peasead Dec 23, 2020
cdd9766
Update rfcs/text/0000-create-file-mach-o.md
peasead Dec 23, 2020
b8c02ce
renamed mach-o to macho
peasead Dec 23, 2020
6e8e729
Merge branch 'file.macho-create' of github.com:peasead/ecs into file.…
peasead Dec 23, 2020
72ee845
removed plurality from "header"
peasead Dec 23, 2020
c6f20b2
created usage doc
peasead Dec 23, 2020
bbd1afd
removed header plurality, sections to flattened
peasead Jan 13, 2021
e0e5a1a
changed macho.segments to nested
peasead Feb 1, 2021
689fa39
typo in segments.size
peasead Feb 1, 2021
ccf1b88
corrected segments.sections fieldtype
peasead Feb 1, 2021
84bdb2e
added cdhash to RFC doc.
peasead Feb 1, 2021
c049773
Fixed segments.offset fieldtype
peasead Feb 1, 2021
3fd0931
typo on rfc doc for segments.flags
peasead Feb 1, 2021
a53a52b
back to headers from header
peasead Feb 3, 2021
276acfe
Update 0000-create-file-macho.md
peasead Feb 9, 2021
d996d6f
Update macho.yml
peasead Feb 9, 2021
fc30c23
ecs housekeeping edits
ebeahan Feb 10, 2021
442d212
Update rfcs/text/0000-create-file-macho.md
peasead Feb 16, 2021
a7ff6ae
Update rfcs/text/0000-create-file-macho.md
peasead Feb 16, 2021
a96fd55
Update rfcs/text/0000-create-file-macho.md
peasead Feb 16, 2021
5177e2e
Update rfcs/text/0000-create-file-macho.md
peasead Mar 11, 2021
1347c0d
Update rfcs/text/0000-create-file-macho.md
peasead Mar 11, 2021
155 changes: 155 additions & 0 deletions rfcs/text/0000-create-file-mach-o.md
@@ -0,0 +1,155 @@
# 0000: Create the Mach-O sub-field of the File fieldset

- Stage: **0 (strawperson)**
- Date: **TBD**

Create the Mach Object (Mach-O) sub-field of the `file` top-level fieldset. Mach-O metadata can be used for malware research, as well as for coding and other application development efforts.

## Fields

**Stage 0**

This RFC proposes creating the Mach-O sub-field within the `file.` fieldset. It will include 35 sub-fields.

| Name | Type | Description |
|--------------------------------------------|------------|-----------------------------------------------------------------------------|
| file.mach-o.cpu | object | CPU information for the file. |
| file.mach-o.cpu.architecture | keyword | CPU architecture target for the file. |
| file.mach-o.cpu.byte_order | keyword | CPU byte order for the file. |
| file.mach-o.cpu.subtype | keyword | CPU subtype for the file. |
| file.mach-o.cpu.type | keyword | CPU type for the file. |
| file.mach-o.headers | object | Header information for the file. |
| file.mach-o.headers.commands | object | Header load commands for the file. |
| file.mach-o.headers.commands.number | long | Number of load commands for the Mach-O header. |
| file.mach-o.headers.commands.size | long | Size of load commands of the Mach-O header. |
| file.mach-o.headers.commands.type | keyword | Type of the load commands for the Mach-O header. |
| file.mach-o.headers.magic | keyword | Magic field of the Mach-O header. |
| file.mach-o.headers.flags | keyword | Flags set in the Mach-O header. |
| file.mach-o.segments | object | Segment information for the file. |
| file.mach-o.segments.vmaddr | keyword | Memory address of this segment. |
| file.mach-o.segments.name | keyword | Name of this segment. |
| file.mach-o.segments.vmsize | keyword | Memory size of this segment. |
| file.mach-o.segments.fileoff | keyword | File offset of this segment. |
| file.mach-o.segments.filesize | keyword | Amount of memory to map from the file. |
| file.mach-o.segments.sections | object | Section information for the segment of the file. |
| file.mach-o.segments.sections.flags | keyword | Section flags for the segment of the file. |
| file.mach-o.segments.sections.name | keyword | Section name for the segment of the file. |
| file.mach-o.segments.sections.type | keyword | Section type for the segment of the file. |
| file.mach-o.signature | object | Signature information for the file. |
| file.mach-o.signature.candidate_cd_hash | keyword | Code Digest (CD) SHA256 hash of the first 20-bytes of the file. |
| file.mach-o.signature.team_identifier | keyword | Team identifier of the code signing certificate. |
| file.mach-o.signature.sealed_resources | long | Version of the resource envelope for the code signing certificate. |
| file.mach-o.signature.cms_digest | keyword | Cryptographic Message Syntax (CMS) hash of the code signing certificate. |
| file.mach-o.signature.cms_digest_type | keyword | Cryptographic Message Syntax (CMS) type of the code signing certificate. |
| file.mach-o.signature.status | keyword | Verification information for the code signing certificate. |
| file.mach-o.signature.fingerprint | keyword | MD5 digest of the der-encoded certificate information. |
| file.mach-o.executable | object | Information about the executable segment for the file. |
| file.mach-o.executable.segment_base | keyword | Executable segment base size. |
| file.mach-o.executable.segment_limit | keyword | Executable segment limit size. |
| file.mach-o.executable.segment_flags | keyword | Executable segment flags. |
| file.mach-o.page_size | long | Page size of the file. |
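
As a rough illustration of where the `cpu.*` and `headers.*` values in the table come from, the sketch below reads the fixed-size header at the start of a thin (non-universal) Mach-O file with Python's `struct` module and maps it onto the proposed field names. The header layout follows `mach_header` in Apple's `<mach-o/loader.h>`. The function name `parse_macho_header`, the lookup tables, and the shape of the returned dictionary are assumptions made for this example and are not part of the RFC; the CPU type table is deliberately incomplete.

```python
import struct

# Magic values as seen when the first four bytes are read little-endian.
MAGICS = {
    0xFEEDFACE: ("32-bit", "Little endian"),
    0xFEEDFACF: ("64-bit", "Little endian"),
    0xCEFAEDFE: ("32-bit", "Big endian"),
    0xCFFAEDFE: ("64-bit", "Big endian"),
}

# A few CPU type constants from <mach/machine.h>; not exhaustive.
CPU_TYPES = {
    7: "x86",
    0x01000007: "x86 64-bit",
    12: "ARM",
    0x0100000C: "ARM 64-bit",
    18: "PowerPC",
}


def parse_macho_header(data: bytes) -> dict:
    """Map the leading Mach-O header fields onto the proposed field names."""
    (raw_magic,) = struct.unpack_from("<I", data, 0)
    if raw_magic not in MAGICS:
        raise ValueError("not a thin Mach-O file")
    architecture, byte_order = MAGICS[raw_magic]
    endian = "<" if byte_order == "Little endian" else ">"

    # magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags
    magic, cputype, cpusubtype, _filetype, ncmds, sizeofcmds, flags = (
        struct.unpack_from(endian + "7I", data, 0)
    )
    return {
        "cpu": {
            "architecture": architecture,
            "byte_order": byte_order,
            "type": CPU_TYPES.get(cputype, hex(cputype)),
            "subtype": hex(cpusubtype),
        },
        "headers": {
            "magic": hex(magic),
            "flags": hex(flags),
            "commands": {"number": ncmds, "size": sizeofcmds},
        },
    }
```

Universal ("fat") binaries start with a different magic value and wrap several such headers, so this sketch rejects them rather than guessing which slice to report.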


**Stage 1**

[New `mach-o.yml` candidate](mach-o/mach-o.yml)

<!--
Stage 3: Add or update all remaining field definitions. The list should now be exhaustive. The goal here is to validate the technical details of all remaining fields and to provide a basis for releasing these field definitions as beta in the schema. Use GitHub code blocks with yml syntax formatting.
-->

## Usage

**Stage 1**

In file analysis, and malware research in particular, similarities between files can be used to chain together malware samples and families, identify campaigns, and possibly support attribution. Tracking how malware components are re-used also helps interpret malware telemetry, especially when measuring the impact of newly introduced defensive countermeasures.

As an example, if an XDR vendor deploys a new malware model to defeat a specific type of ransomware and we then observe a change in, or relationship between, the headers, import tables, libraries, etc. of that malware family, we can reasonably assume that the new model is having an impact on the malware family.

As another example, tracking file metadata for specific families is useful in predicting new campaigns when similar file metadata appears in new samples. One [example](https://www.bleepingcomputer.com/news/security/maze-ransomware-is-shutting-down-its-cybercrime-operation/) is the Maze ransomware family shutting down and re-purposing as Egregor (this is Windows malware, but the concept is the same).
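
One simple way to quantify the "file similarities" described above is set overlap between the metadata of two samples. The sketch below computes a Jaccard similarity over segment names, section names, and header flags. It is only an illustration of the idea: the helper names, the choice of features, and both sample documents are hypothetical and are not drawn from real malware or from this RFC.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: |A & B| / |A | B| (1.0 when both sets are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def macho_features(doc: dict) -> set:
    """Flatten a few of the proposed fields into (field, value) pairs."""
    features = set()
    for segment in doc.get("segments", []):
        features.add(("segments.name", segment.get("name")))
        for section in segment.get("sections", []):
            features.add(("segments.sections.name", section.get("name")))
    for flag in doc.get("headers", {}).get("flags", []):
        features.add(("headers.flags", flag))
    return features


# Hypothetical sample metadata, for illustration only.
sample_a = {
    "headers": {"flags": ["TWOLEVEL"]},
    "segments": [
        {"name": "__TEXT", "sections": [{"name": "__text"}, {"name": "__stub_helper"}]},
        {"name": "__DATA", "sections": []},
    ],
}
sample_b = {
    "headers": {"flags": ["TWOLEVEL", "PIE"]},
    "segments": [
        {"name": "__TEXT", "sections": [{"name": "__text"}]},
        {"name": "__DATA", "sections": []},
    ],
}

score = jaccard(macho_features(sample_a), macho_features(sample_b))
print(f"similarity: {score:.2f}")
```

In practice the feature set and any clustering threshold would need tuning against known families; this only shows that the proposed fields are structured enough to compare directly.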

## Source data

**Stage 1**

This type of data can be provided by logs from VirusTotal, Reversing Labs, Lockheed Martin's LAIKABOSS, Emerson's File Scanning Framework, Target's Strelka, or other file/malware analysis platforms.

* [VirusTotal API](https://developers.virustotal.com/v3.0/reference)
* [Emerson FSF](https://github.com/EmersonElectricCo/fsf)
* [Target Strelka](https://github.com/target/strelka)
* [Lockheed Martin LAIKABOSS](https://github.com/lmco/laikaboss)

<!--
Stage 1: Provide a high-level description of example sources of data. This does not yet need to be a concrete example of a source document, but instead can simply describe a potential source (e.g. nginx access log). This will ultimately be fleshed out to include literal source examples in a future stage. The goal here is to identify practical sources for these fields in the real world. ~1-3 sentences or unordered list.
-->

<!--
Stage 2: Include a real world example source document. Ideally this example comes from the source(s) identified in stage 1. If not, it should replace them. The goal here is to validate the utility of these field changes in the context of a real world example. Format with the source name as a ### header and the example document in a GitHub code block with json formatting.
-->

<!--
Stage 3: Add more real world example source documents so we have at least 2 total, but ideally 3. Format as described in stage 2.
-->

## Scope of impact

**Stage 2**

There should be no breaking changes, deprecation strategies, or significant refactoring, as this creates a sub-field for the existing `file.` fieldset.

This is likely not a large-scale ECS project, but documentation updates would be needed to explain the new fields.

<!--
Stage 2: Identifies scope of impact of changes. Are breaking changes required? Should deprecation strategies be adopted? Will significant refactoring be involved? Break the impact down into:
* Ingestion mechanisms (e.g. beats/logstash)
* Usage mechanisms (e.g. Kibana applications, detections)
* ECS project (e.g. docs, tooling)
The goal here is to research and understand the impact of these changes on users in the community and development teams across Elastic. 2-5 sentences each.
-->

## Concerns

<!--
Stage 1: Identify potential concerns, implementation challenges, or complexity. Spend some time on this. Play devil's advocate. Try to identify the sort of non-obvious challenges that tend to surface later. The goal here is to surface risks early, allow everyone the time to work through them, and ultimately document resolution for posterity's sake.
-->

<!--
Stage 2: Document new concerns or resolutions to previously listed concerns. It's not critical that all concerns have resolutions at this point, but it would be helpful if resolutions were taking shape for the most significant concerns.
-->

<!--
Stage 3: Document resolutions for all existing concerns. Any new concerns should be documented along with their resolution. The goal here is to eliminate the risk of churn and instability by resolving outstanding concerns.
-->

<!--
Stage 4: Document any new concerns and their resolution. The goal here is to eliminate risk of churn and instability by ensuring all concerns have been addressed.
-->

## Real-world implementations

<!--
Stage 4: Identify at least one real-world, production-ready implementation that uses these updated field definitions. An example of this might be a GA feature in an Elastic application in Kibana.
-->

## People

The following are the people who consulted on the contents of this RFC.

* @peasead | author
* @devonakerr | sponsor
* @dcode, @peasead | subject matter expert

## References

<!-- Insert any links appropriate to this RFC in this section. -->

### RFC Pull Requests

<!-- An RFC should link to the PRs for each of its stage advancements. -->

* Stage 0: https://github.com/elastic/ecs/pull/1097

<!--
* Stage 1: https://github.com/elastic/ecs/pull/NNN
...
-->
219 changes: 219 additions & 0 deletions rfcs/text/mach-o/mach-o.yml
@@ -0,0 +1,219 @@
---
- name: file.mach-o
  title: Mach-O file information.
  group: 2
  description: >
    These fields contain macOS Mach Object (Mach-O) metadata.
  type: group
  reusable:
    top_level: false
    expected:
      - file
      - process
  fields:
    - name: cpu
      level: extended
      description: CPU information for the file.
      type: object
      fields:
        - name: architecture
          description: CPU architecture target for the file.
          type: keyword
          level: extended
          example: 64-bit

        - name: byte_order
          description: CPU byte order for the file.
          type: keyword
          level: extended
          example: Little endian

        - name: subtype
          description: CPU subtype for the file.
          type: keyword
          level: extended
          example: ARM (all) 64-bit

        - name: type
          description: CPU type for the file.
          type: keyword
          level: extended
          example: ARM 64-bit

    - name: headers
      level: extended
      description: Header information for the file.
      type: object
      fields:
        - name: commands
          level: extended
          description: Header load commands information for the file.
          type: object
          fields:
            - name: number
              description: Number of load commands for the Mach-O header.
              type: long
              level: extended
              example: 23

            - name: size
              description: Size of load commands of the Mach-O header.
              type: long
              level: extended
              format: bytes
              example: 3888

            - name: type
              description: Type of the load commands for the Mach-O header.
              type: keyword
              level: extended
              example: LC_SYMTAB, 0x2c

        - name: magic
          description: Magic field of the Mach-O header.
          type: keyword
          level: extended
          example: 0xfeedfacf

        - name: flags
          description: Flags set in the Mach-O header.
          type: keyword
          level: extended
          example: TWOLEVEL, 0x4000000

    - name: segments
      level: extended
      description: Segment information for the file.
      type: object
      fields:
        - name: vmaddr
          description: Memory address of this segment.
          type: keyword
          level: extended
          example: 0x0

        - name: name
          description: Name of this segment.
          type: keyword
          level: extended
          example: __TEXT, __DATA

        - name: vmsize
          description: Memory size of this segment.
          type: keyword
          level: extended
          example: 0x4c000

        - name: fileoff
          description: File offset of this segment.
          type: keyword
          level: extended
          example: 0x0

        - name: filesize
          description: Amount of memory to map from the file.
          type: keyword
          level: extended
          example: 0x4c000

        - name: sections
          level: extended
          description: Section information for the segment of the file.
          type: object
          fields:
            - name: flags
              description: Section flags for the segment of the file.
              type: keyword
              level: extended
              example: SECTION_ATTRIBUTES_USR, S_8BYTE_LITERALS

            - name: name
              description: Section name for the segment of the file.
              type: keyword
              level: extended
              example: __objc_classname, __stub_helper

            - name: type
              description: Section type for the segment of the file.
              type: keyword
              level: extended
              example: S_REGULAR, S_CSTRING_LITERALS

    - name: signature
      level: extended
      description: Signature information for the file.
      type: object
      fields:
        - name: candidate_cd_hash
          description: Code Digest (CD) SHA256 hash of the first 20-bytes of the file.
          type: keyword
          level: extended
          example: 2035094a7065b29421e7a51f51db9bd61807c3628f210b1f8e667235777dc592

        - name: team_identifier
          description: Team identifier of the code signing certificate.
          type: keyword
          level: extended
          example: 11A1A1AAAA

        - name: sealed_resources
          description: Version of the resource envelope for the code signing certificate.
          type: long
          level: extended
          example: 2

        - name: cms_digest
          description: Cryptographic Message Syntax (CMS) hash of the code signing certificate.
          type: keyword
          level: extended
          example: 3ae1b10f231bee84ca17ab4295c0faaf6cbd535f3cc8010474ec6a67909e1980

        - name: cms_digest_type
          description: Cryptographic Message Syntax (CMS) type of the code signing certificate.
          type: keyword
          level: extended
          example: 2

        - name: status
          description: Verification information for the code signing certificate.
          type: keyword
          level: extended
          example: Valid

        - name: fingerprint
          description: MD5 digest of the der-encoded certificate information.
          type: keyword
          level: extended
          example: 611E5B662C593A08FF58D14AE22452D198DF6C60

    - name: executable
      level: extended
      description: Information about the executable segment for the file.
      type: object
      fields:
        - name: segment_base
          description: Executable segment base size.
          type: long
          format: bytes
          level: extended
          example: 0

        - name: segment_limit
          description: Executable segment limit size.
          type: long
          format: bytes
          level: extended
          example: 123456

        - name: segment_flags
          description: Executable segment flags.
          type: keyword
          level: extended
          example: 0x0

    - name: page_size
      description: Page size of the file.
      type: long
      format: bytes
      level: extended
      example: 4096
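
To make the `segments` and `segments.sections` definitions above more concrete, the sketch below decodes a single `LC_SEGMENT_64` load command and its trailing section headers into a dictionary using the proposed field names. The struct layouts follow `segment_command_64` and `section_64` in Apple's `<mach-o/loader.h>`. Everything else is an assumption for illustration: the function name `parse_segment_64`, the hex-string formatting, the partial section type table, and the restriction to little-endian 64-bit input.

```python
import struct

LC_SEGMENT_64 = 0x19

# cmd, cmdsize, segname, vmaddr, vmsize, fileoff, filesize,
# maxprot, initprot, nsects, flags  (72 bytes)
SEGMENT_64 = struct.Struct("<II16sQQQQiiII")

# sectname, segname, addr, size, offset, align, reloff, nreloc,
# flags, reserved1, reserved2, reserved3  (80 bytes)
SECTION_64 = struct.Struct("<16s16sQQIIIIIIII")

# Low byte of the section flags; only a few values are listed here.
SECTION_TYPES = {
    0x0: "S_REGULAR",
    0x1: "S_ZEROFILL",
    0x2: "S_CSTRING_LITERALS",
    0x3: "S_4BYTE_LITERALS",
    0x4: "S_8BYTE_LITERALS",
}


def _name(raw: bytes) -> str:
    """Decode a fixed-width, NUL-padded name field."""
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")


def parse_segment_64(data: bytes, offset: int = 0) -> dict:
    """Decode one LC_SEGMENT_64 command starting at `offset`."""
    (cmd, _cmdsize, segname, vmaddr, vmsize, fileoff, filesize,
     _maxprot, _initprot, nsects, _flags) = SEGMENT_64.unpack_from(data, offset)
    if cmd != LC_SEGMENT_64:
        raise ValueError("not an LC_SEGMENT_64 load command")

    sections = []
    cursor = offset + SEGMENT_64.size
    for _ in range(nsects):
        fields = SECTION_64.unpack_from(data, cursor)
        sectname, flags = fields[0], fields[8]
        sections.append({
            "name": _name(sectname),
            # The low 8 bits carry the type, the upper 24 bits the attributes.
            "type": SECTION_TYPES.get(flags & 0xFF, hex(flags & 0xFF)),
            "flags": hex(flags & 0xFFFFFF00),
        })
        cursor += SECTION_64.size

    return {
        "name": _name(segname),
        "vmaddr": hex(vmaddr),
        "vmsize": hex(vmsize),
        "fileoff": hex(fileoff),
        "filesize": hex(filesize),
        "sections": sections,
    }
```

Emitting addresses and sizes as hex strings mirrors the `keyword` type and the `0x...` examples used for `vmaddr`, `vmsize`, `fileoff`, and `filesize` in the candidate schema; a producer could equally emit integers if those fields were typed as `long`.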