From 9e30a887420350fcc9bd5b603eee276b3df63c5d Mon Sep 17 00:00:00 2001
From: Andrew Pease <7442091+peasead@users.noreply.github.com>
Date: Wed, 10 Feb 2021 18:00:32 -0600
Subject: [PATCH] File.elf create (#1077)

Co-authored-by: Eric Beahan
---
 rfcs/text/0015-create-file-elf.md      | 162 ++++++++++++++++++++
 rfcs/text/0015/docs/usage/elf.asciidoc |  19 +++
 rfcs/text/0015/elf.yml                 | 198 +++++++++++++++++++++++++
 3 files changed, 379 insertions(+)
 create mode 100644 rfcs/text/0015-create-file-elf.md
 create mode 100644 rfcs/text/0015/docs/usage/elf.asciidoc
 create mode 100644 rfcs/text/0015/elf.yml

diff --git a/rfcs/text/0015-create-file-elf.md b/rfcs/text/0015-create-file-elf.md
new file mode 100644
index 0000000000..a67d860bcc
--- /dev/null
+++ b/rfcs/text/0015-create-file-elf.md
@@ -0,0 +1,162 @@
+# 0015: Create the ELF sub-field of the File fieldset
+
+- Stage: **1 (draft)**
+- Date: **2021-02-10**
+
+Create the Executable and Linkable Format (ELF) sub-field of the `file` top-level fieldset. This metadata can be used for malware research, as well as coding and other application development efforts.
+
+## Fields
+
+**Stage 0**
+
+This RFC is to create the ELF sub-field within the `file.` fieldset. This will include 25 sub-fields.
+
+| Name | Type | Description |
+| ---- | ---- | ----------- |
+| elf.creation_date | date | Extracted when possible from the file's metadata. Indicates when it was built or compiled. It can also be faked by malware creators. |
+| elf.exports | flattened | List of exported element names and types. |
+| elf.exports.name | keyword | Name of exported symbol. |
+| elf.exports.type | keyword | Type of exported symbol. |
+| elf.segments | nested | ELF object segment list. |
+| elf.segments.type | keyword | ELF object segment type. |
+| elf.segments.sections | keyword | ELF object segment sections. |
+| elf.header | group | Header information of the ELF file. |
+| elf.header.class | keyword | Header class of the ELF file. |
+| elf.header.data | keyword | Data table of the ELF header. |
+| elf.header.machine | keyword | Machine architecture of the ELF header. |
+| elf.header.os_abi | keyword | Application Binary Interface (ABI) of the Linux OS. |
+| elf.header.type | keyword | Header type of the ELF file. |
+| elf.header.version | keyword | Version of the ELF header. |
+| elf.header.abi_version | keyword | Version of the ELF Application Binary Interface (ABI). |
+| elf.header.entrypoint | long | Header entrypoint of the ELF file. |
+| elf.header.object_version | keyword | "0x1" for original ELF files. |
+| elf.imports | flattened | List of imported element names and types. |
+| elf.imports.name | keyword | Name of imported symbol. |
+| elf.imports.type | keyword | Type of imported symbol. |
+| elf.sections | nested | Section information of the ELF file. |
+| elf.sections.flags | keyword | ELF Section List flags. |
+| elf.sections.name | keyword | ELF Section List name. |
+| elf.sections.physical_offset | keyword | ELF Section List offset. |
+| elf.sections.type | keyword | ELF Section List type. |
+| elf.sections.physical_size | long | ELF Section List physical size. |
+| elf.sections.virtual_address | long | ELF Section List virtual address. |
+| elf.sections.virtual_size | long | ELF Section List virtual size. |
+| elf.sections.entropy | long | Shannon entropy calculation from the section. |
+| elf.sections.chi2 | long | Chi-square probability distribution of the section. |
+| elf.shared_libraries | keyword | List of shared libraries used by this ELF object. |
+| elf.telfhash | keyword | telfhash hash for ELF files. |
+| elf.architecture | keyword | Machine architecture of the ELF file. |
+| elf.byte_order | keyword | Byte sequence of ELF file. |
+| elf.cpu_type | keyword | CPU type of the ELF file. |
+
+
+**Stage 1**
+
+[New `elf.yml` candidate](../schemas/elf.yml)
+
+
+
+## Usage
+
+
+In file analysis, specifically for malware research, file similarities can be used to chain together malware samples and families to identify campaigns and possibly attribution. Additionally, understanding how malware components are re-used is useful in understanding malware telemetry, especially in understanding the impact being made through the introduction of defensive countermeasures.
+
+As an example, if an XDR vendor deploys a new malware model to defeat a specific type of ransomware and we start observing a change and/or relationship to the headers, import tables, libraries, etc. of that malware family, we can reasonably assume that the changes to the malware model are having an impact on the malware family.
+
+As another example, tracking file metadata for specific families is useful in predicting new campaigns if we see similar file metadata being used for new samples. One [example](https://www.bleepingcomputer.com/news/security/maze-ransomware-is-shutting-down-its-cybercrime-operation/) is the Maze ransomware family shutting down and re-purposing as Egregor (this is for Windows malware, but the concept is the same).
+
+## Source data
+
+**Stage 1**
+
+This type of data can be provided by logs from VirusTotal, Reversing Labs, Lockheed Martin's LAIKABOSS, Emerson's File Scanning Framework, Target's Strelka, or other file/malware analysis platforms.
+
+* [VirusTotal Filebeat module PR](https://github.com/elastic/beats/pull/21815)
+* [VirusTotal API](https://developers.virustotal.com/v3.0/reference)
+* [Emerson FSF](https://github.com/EmersonElectricCo/fsf)
+* [Target Strelka](https://github.com/target/strelka)
+* [Lockheed Martin LAIKABOSS](https://github.com/lmco/laikaboss)
+
+
+
+**Stage 2**
+
+### Real world examples
+
+
+
+
+
+## Scope of impact
+
+**Stage 2**
+
+There should be no breaking changes, deprecation strategies, or significant refactoring, as this is creating a sub-field for the existing `file.` fieldset.
+
+While likely not a large-scale ECS project, there would be documentation updates needed to explain the new fields.
+
+
+
+## Concerns
+
+
+
+
+
+
+
+
+
+**ELF Imports**
+
+The `flattened` type won't allow explicit field mappings to be defined, so I don't think it's necessary to list them explicitly here. However, is there still intent to describe how data sources should "shape" the data for these flattened fields? There are no `type: flattened` fields in ECS today, so how best to provide that type of guidance will need to be hashed out.
+
+* Field: `elf.imports`
+* Comment: https://github.com/elastic/ecs/pull/1077#discussion_r572274291
+
+
+## People
+
+The following are the people that consulted on the contents of this RFC.
+
+* @peasead | author
+* @devonakerr | sponsor
+* @dcode, @peasead | subject matter expert
+
+## References
+
+
+
+### RFC Pull Requests
+
+
+
+* Stage 1: https://github.com/elastic/ecs/pull/1077
+
+
diff --git a/rfcs/text/0015/docs/usage/elf.asciidoc b/rfcs/text/0015/docs/usage/elf.asciidoc
new file mode 100644
index 0000000000..872ae936cb
--- /dev/null
+++ b/rfcs/text/0015/docs/usage/elf.asciidoc
@@ -0,0 +1,19 @@
+[[ecs-elf-ussage]]
+=== ELF Usage
+
+--Description--
+
+[discrete]
+=== ELF Field Details
+| Field | Description | Level |
+| ---- | ---- | ----------- |
+| elf.creation_date | Extracted when possible from the file's metadata. Indicates when it was built or compiled. It can also be faked by malware creators. | extended |
+| ... | ... | ... |
+| ... | ... | ... |
+| ... | ... | ... |
+
+[discrete]
+=== Field Reuse
+The `elf` fields are expected to be nested at: `dll.elf`, `file.elf`, `process.elf`.
+
+Note also that the `elf` fields are not expected to be used directly at the root of the events.
diff --git a/rfcs/text/0015/elf.yml b/rfcs/text/0015/elf.yml
new file mode 100644
index 0000000000..bf6d576407
--- /dev/null
+++ b/rfcs/text/0015/elf.yml
@@ -0,0 +1,198 @@
+---
+- name: elf
+  title: ELF Header
+  group: 2
+  description: >
+    These fields contain Linux Executable and Linkable Format (ELF) metadata.
+  type: group
+  reusable:
+    top_level: false
+    expected:
+      - file
+      - process
+  fields:
+    - name: creation_date
+      short: Build or compile date.
+      description: >
+        Extracted when possible from the file's metadata. Indicates when it was
+        built or compiled. It can also be faked by malware creators.
+      type: date
+      level: extended
+
+    - name: architecture
+      description: >
+        Machine architecture of the ELF file.
+      type: keyword
+      level: extended
+      example: ARM, x86-64, etc
+
+    - name: byte_order
+      description: >
+        Byte sequence of ELF file.
+      type: keyword
+      level: extended
+      example: Little Endian, Big Endian
+
+    - name: cpu_type
+      description: >
+        CPU type of the ELF file.
+      type: keyword
+      level: extended
+      example: Intel, PowerPC, RISC, etc.
+
+    - name: header.class
+      description: >
+        Header class of the ELF file.
+      type: keyword
+      level: extended
+
+    - name: header.data
+      description: >
+        Data table of the ELF header.
+      type: keyword
+      level: extended
+
+    - name: header.os_abi
+      description: >
+        Application Binary Interface (ABI) of the Linux OS.
+      type: keyword
+      level: extended
+
+    - name: header.type
+      description: >
+        Header type of the ELF file.
+      type: keyword
+      level: extended
+
+    - name: header.version
+      description: >
+        Version of the ELF header.
+      type: keyword
+      level: extended
+
+    - name: header.abi_version
+      type: keyword
+      level: extended
+      description: >
+        Version of the ELF Application Binary Interface (ABI).
+
+    - name: header.entrypoint
+      format: string
+      level: extended
+      type: long
+      description: >
+        Header entrypoint of the ELF file.
+
+    - name: header.object_version
+      type: keyword
+      level: extended
+      description: >
+        "0x1" for original ELF files.
+
+    - name: sections
+      description: >
+        Section information of the ELF file.
+      type: nested
+      level: extended
+
+    - name: sections.flags
+      description: >
+        ELF Section List flags.
+      type: keyword
+      level: extended
+
+    - name: sections.name
+      description: >
+        ELF Section List name.
+      type: keyword
+      level: extended
+
+    - name: sections.physical_offset
+      description: >
+        ELF Section List offset.
+      type: keyword
+      level: extended
+
+    - name: sections.type
+      description: >
+        ELF Section List type.
+      type: keyword
+      level: extended
+
+    - name: sections.physical_size
+      description: >
+        ELF Section List physical size.
+      format: bytes
+      type: long
+      level: extended
+
+    - name: sections.virtual_address
+      description: >
+        ELF Section List virtual address.
+      format: string
+      type: long
+      level: extended
+
+    - name: sections.virtual_size
+      description: >
+        ELF Section List virtual size.
+      format: string
+      type: long
+      level: extended
+
+    - name: sections.entropy
+      description: >
+        Shannon entropy calculation from the section.
+      format: number
+      type: long
+      level: extended
+
+    - name: sections.chi2
+      description: >
+        Chi-square probability distribution of the section.
+      format: number
+      type: long
+      level: extended
+
+    - name: exports
+      description: >
+        List of exported element names and types.
+      level: extended
+      type: flattened
+
+    - name: imports
+      description: >
+        List of imported element names and types.
+      type: flattened
+      level: extended
+
+    - name: shared_libraries
+      description: >
+        List of shared libraries used by this ELF object.
+      type: keyword
+      level: extended
+      normalize:
+        - array
+
+    - name: telfhash
+      short: telfhash hash for ELF files
+      description: >
+        telfhash is a symbol hash for ELF files, just like imphash is an imports hash for PE files. Learn more at https://github.com/trendmicro/telfhash.
+      type: keyword
+      level: extended
+
+    - name: segments
+      description: >
+        ELF object segment list.
+      type: nested
+      level: extended
+
+    - name: segments.type
+      description: ELF object segment type.
+      type: keyword
+      level: extended
+
+    - name: segments.sections
+      description: ELF object segment sections.
+      type: keyword
+      level: extended
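
Note on populating the `header.*` fields: the following is a minimal sketch, not part of the patch, of how a data source might fill the `elf.header.*` fields from the first bytes of a file using only the Python standard library. The function name `parse_elf_header`, the lookup tables, and the string values chosen for each keyword field are assumptions made for illustration; the RFC does not prescribe value formats. The mapping of `header.version` to `EI_VERSION` and `header.object_version` to `e_version` is also an assumption.

```python
import struct

# Small lookup tables covering only a few common values (illustrative, not
# exhaustive); anything unknown falls back to a hex string.
EI_CLASS = {1: "ELF32", 2: "ELF64"}
EI_DATA = {1: "little endian", 2: "big endian"}
E_TYPE = {1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}
E_MACHINE = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def parse_elf_header(data: bytes) -> dict:
    """Map the leading ELF header bytes onto a hypothetical elf.header.* shape."""
    if len(data) < 16 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")

    # Bytes 4..8 of e_ident: class, data encoding, ident version, OS ABI, ABI version.
    ei_class, ei_data, ei_version, ei_osabi, ei_abiversion = data[4:9]
    endian = "<" if ei_data == 1 else ">"

    # e_type (u16), e_machine (u16), and e_version (u32) follow the 16-byte e_ident.
    e_type, e_machine, e_version = struct.unpack_from(endian + "HHI", data, 16)

    # e_entry is 8 bytes wide in ELF64 and 4 bytes wide in ELF32.
    entry_fmt = "Q" if ei_class == 2 else "I"
    (e_entry,) = struct.unpack_from(endian + entry_fmt, data, 24)

    return {
        "elf": {
            "header": {
                "class": EI_CLASS.get(ei_class, hex(ei_class)),
                "data": EI_DATA.get(ei_data, hex(ei_data)),
                "version": hex(ei_version),            # assumed: EI_VERSION
                "os_abi": hex(ei_osabi),
                "abi_version": hex(ei_abiversion),
                "type": E_TYPE.get(e_type, hex(e_type)),
                "machine": E_MACHINE.get(e_machine, hex(e_machine)),
                "object_version": hex(e_version),      # assumed: e_version
                "entrypoint": e_entry,                 # numeric, matching type: long
            }
        }
    }
```

Reading the first 64 bytes of a binary and passing them to `parse_elf_header` is enough for both ELF32 and ELF64 files. The result would be nested under `file.elf` or `process.elf`, in line with the `reusable.expected` list in `elf.yml`.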