[1.x] Stage 2 changes for RFC 0009 - data_stream fields (#1215) (#1222)
ebeahan authored Jan 13, 2021
1 parent d1e08be commit 4ab85fa
Showing 8 changed files with 270 additions and 1 deletion.
52 changes: 52 additions & 0 deletions experimental/generated/beats/fields.ecs.yml
@@ -564,6 +564,58 @@
ignore_above: 1024
description: Runtime managing this container.
example: docker
- name: data_stream
title: Data Stream
group: 2
description: 'The data_stream fields take part in defining the new data stream
naming scheme.
In the new data stream naming scheme the value of the data stream fields combine
to the name of the actual data stream in the following manner `{data_stream.type}-{data_stream.dataset}-{data_stream.namespace}`.
This means the fields can only contain characters that are valid as part of
names of data streams. More details about this can be found in this https://www.elastic.co/blog/an-introduction-to-the-elastic-data-stream-naming-scheme[blog
post].
An Elasticsearch data stream consists of one or more backing indices, and a
data stream name forms part of the backing indices names. Due to this convention,
data streams must also follow index naming restrictions. For example, data stream
names cannot include \, /, *, ?, ", <, >, |, ` `. Please see the Elasticsearch
reference for additional https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html#indices-create-api-path-params[restrictions].'
type: group
fields:
- name: dataset
level: extended
type: constant_keyword
description: "The field can contain anything that makes sense to signify the\
\ source of the data.\nExamples include `nginx.access`, `prometheus`, `endpoint`\
\ etc. For data streams that otherwise fit, but that do not have dataset set\
\ we use the value \"generic\" for the dataset value. `event.dataset` should\
\ have the same value as `data_stream.dataset`.\nBeyond the Elasticsearch\
\ data stream naming criteria noted above, the `dataset` value has additional\
\ restrictions:\n * Must not contain `-`\n * No longer than 100 characters"
example: nginx.access
default_field: false
- name: namespace
level: extended
type: constant_keyword
description: "A user defined namespace. Namespaces are useful to allow grouping\
\ of data.\nMany users already organize their indices this way, and the data\
\ stream naming scheme now provides this best practice as a default. Many\
\ users will populate this field with `default`. If no value is used, it falls\
\ back to `default`.\nBeyond the Elasticsearch index naming criteria noted\
\ above, `namespace` value has the additional restrictions:\n * Must not\
\ contain `-`\n * No longer than 100 characters"
example: production
default_field: false
- name: type
level: extended
type: constant_keyword
description: 'An overarching type for the data stream.
Currently allowed values are "logs" and "metrics". We expect to also add "traces"
and "synthetics" in the near future.'
example: logs
default_field: false
- name: destination
title: Destination
group: 2
3 changes: 3 additions & 0 deletions experimental/generated/csv/fields.csv
@@ -60,6 +60,9 @@ ECS_Version,Indexed,Field_Set,Field,Type,Level,Normalization,Example,Description
1.9.0-dev+exp,true,container,container.labels,object,extended,,,Image labels.
1.9.0-dev+exp,true,container,container.name,keyword,extended,,,Container name.
1.9.0-dev+exp,true,container,container.runtime,keyword,extended,,docker,Runtime managing this container.
1.9.0-dev+exp,true,data_stream,data_stream.dataset,constant_keyword,extended,,nginx.access,The field can contain anything that makes sense to signify the source of the data.
1.9.0-dev+exp,true,data_stream,data_stream.namespace,constant_keyword,extended,,production,A user defined namespace. Namespaces are useful to allow grouping of data.
1.9.0-dev+exp,true,data_stream,data_stream.type,constant_keyword,extended,,logs,An overarching type for the data stream.
1.9.0-dev+exp,true,destination,destination.address,keyword,extended,,,Destination network address.
1.9.0-dev+exp,true,destination,destination.as.number,long,extended,,15169,Unique number allocated to the autonomous system.
1.9.0-dev+exp,true,destination,destination.as.organization.name,wildcard,extended,,Google LLC,Organization name.
46 changes: 46 additions & 0 deletions experimental/generated/ecs/ecs_flat.yml
@@ -705,6 +705,52 @@ container.runtime:
normalize: []
short: Runtime managing this container.
type: keyword
data_stream.dataset:
dashed_name: data-stream-dataset
description: "The field can contain anything that makes sense to signify the source\
\ of the data.\nExamples include `nginx.access`, `prometheus`, `endpoint` etc.\
\ For data streams that otherwise fit, but that do not have dataset set we use\
\ the value \"generic\" for the dataset value. `event.dataset` should have the\
\ same value as `data_stream.dataset`.\nBeyond the Elasticsearch data stream naming\
\ criteria noted above, the `dataset` value has additional restrictions:\n *\
\ Must not contain `-`\n * No longer than 100 characters"
example: nginx.access
flat_name: data_stream.dataset
level: extended
name: dataset
normalize: []
short: The field can contain anything that makes sense to signify the source of
the data.
type: constant_keyword
data_stream.namespace:
dashed_name: data-stream-namespace
description: "A user defined namespace. Namespaces are useful to allow grouping\
\ of data.\nMany users already organize their indices this way, and the data stream\
\ naming scheme now provides this best practice as a default. Many users will\
\ populate this field with `default`. If no value is used, it falls back to `default`.\n\
Beyond the Elasticsearch index naming criteria noted above, `namespace` value\
\ has the additional restrictions:\n * Must not contain `-`\n * No longer than\
\ 100 characters"
example: production
flat_name: data_stream.namespace
level: extended
name: namespace
normalize: []
short: A user defined namespace. Namespaces are useful to allow grouping of data.
type: constant_keyword
data_stream.type:
dashed_name: data-stream-type
description: 'An overarching type for the data stream.
Currently allowed values are "logs" and "metrics". We expect to also add "traces"
and "synthetics" in the near future.'
example: logs
flat_name: data_stream.type
level: extended
name: type
normalize: []
short: An overarching type for the data stream.
type: constant_keyword
destination.address:
dashed_name: destination-address
description: 'Some event destination addresses are defined ambiguously. The event
69 changes: 69 additions & 0 deletions experimental/generated/ecs/ecs_nested.yml
@@ -983,6 +983,75 @@ container:
short: Fields describing the container that generated this event.
title: Container
type: group
data_stream:
description: 'The data_stream fields take part in defining the new data stream naming
scheme.
In the new data stream naming scheme the value of the data stream fields combine
to the name of the actual data stream in the following manner `{data_stream.type}-{data_stream.dataset}-{data_stream.namespace}`.
This means the fields can only contain characters that are valid as part of names
of data streams. More details about this can be found in this https://www.elastic.co/blog/an-introduction-to-the-elastic-data-stream-naming-scheme[blog
post].
An Elasticsearch data stream consists of one or more backing indices, and a data
stream name forms part of the backing indices names. Due to this convention, data
streams must also follow index naming restrictions. For example, data stream names
cannot include \, /, *, ?, ", <, >, |, ` `. Please see the Elasticsearch reference
for additional https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html#indices-create-api-path-params[restrictions].'
fields:
data_stream.dataset:
dashed_name: data-stream-dataset
description: "The field can contain anything that makes sense to signify the\
\ source of the data.\nExamples include `nginx.access`, `prometheus`, `endpoint`\
\ etc. For data streams that otherwise fit, but that do not have dataset set\
\ we use the value \"generic\" for the dataset value. `event.dataset` should\
\ have the same value as `data_stream.dataset`.\nBeyond the Elasticsearch\
\ data stream naming criteria noted above, the `dataset` value has additional\
\ restrictions:\n * Must not contain `-`\n * No longer than 100 characters"
example: nginx.access
flat_name: data_stream.dataset
level: extended
name: dataset
normalize: []
short: The field can contain anything that makes sense to signify the source
of the data.
type: constant_keyword
data_stream.namespace:
dashed_name: data-stream-namespace
description: "A user defined namespace. Namespaces are useful to allow grouping\
\ of data.\nMany users already organize their indices this way, and the data\
\ stream naming scheme now provides this best practice as a default. Many\
\ users will populate this field with `default`. If no value is used, it falls\
\ back to `default`.\nBeyond the Elasticsearch index naming criteria noted\
\ above, `namespace` value has the additional restrictions:\n * Must not\
\ contain `-`\n * No longer than 100 characters"
example: production
flat_name: data_stream.namespace
level: extended
name: namespace
normalize: []
short: A user defined namespace. Namespaces are useful to allow grouping of
data.
type: constant_keyword
data_stream.type:
dashed_name: data-stream-type
description: 'An overarching type for the data stream.
Currently allowed values are "logs" and "metrics". We expect to also add "traces"
and "synthetics" in the near future.'
example: logs
flat_name: data_stream.type
level: extended
name: type
normalize: []
short: An overarching type for the data stream.
type: constant_keyword
group: 2
name: data_stream
prefix: data_stream.
short: The data_stream fields take part in defining the new data stream naming scheme.
title: Data Stream
type: group
destination:
description: 'Destination fields capture details about the receiver of a network
exchange/packet. These fields are populated from a network event, packet, or other
13 changes: 13 additions & 0 deletions experimental/generated/elasticsearch/7/template.json
@@ -303,6 +303,19 @@
}
}
},
"data_stream": {
"properties": {
"dataset": {
"type": "constant_keyword"
},
"namespace": {
"type": "constant_keyword"
},
"type": {
"type": "constant_keyword"
}
}
},
"destination": {
"properties": {
"address": {
25 changes: 25 additions & 0 deletions experimental/generated/elasticsearch/component/data_stream.json
@@ -0,0 +1,25 @@
{
  "_meta": {
    "documentation": "https://www.elastic.co/guide/en/ecs/current/ecs-data_stream.html",
    "ecs_version": "1.9.0-dev+exp"
  },
  "template": {
    "mappings": {
      "properties": {
        "data_stream": {
          "properties": {
            "dataset": {
              "type": "constant_keyword"
            },
            "namespace": {
              "type": "constant_keyword"
            },
            "type": {
              "type": "constant_keyword"
            }
          }
        }
      }
    }
  }
}
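
The nested `properties` structure in this component template mirrors the dotted field names (`data_stream.dataset`, `data_stream.namespace`, `data_stream.type`). As a rough sketch, and not the actual ECS generator code, the Python below shows one way flat field names could be folded into such a mapping. The helper name `nest_mappings` and the input dictionary are assumptions made for illustration.

```python
# Hypothetical helper, for illustration only; not taken from the ECS tooling.

def nest_mappings(flat_fields):
    """Turn {"a.b": "type"} pairs into nested Elasticsearch mapping properties."""
    properties = {}
    for flat_name, es_type in flat_fields.items():
        *parents, leaf = flat_name.split(".")
        node = properties
        for parent in parents:
            # Each parent object gets its own "properties" level.
            node = node.setdefault(parent, {}).setdefault("properties", {})
        node[leaf] = {"type": es_type}
    return properties


fields = {
    "data_stream.dataset": "constant_keyword",
    "data_stream.namespace": "constant_keyword",
    "data_stream.type": "constant_keyword",
}
mappings = nest_mappings(fields)
print(mappings)
```

The resulting dictionary has the same shape as the `properties` object of the component template shown here.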
3 changes: 2 additions & 1 deletion experimental/generated/elasticsearch/template.json
@@ -37,7 +37,8 @@
"ecs_1.9.0-dev-exp_url",
"ecs_1.9.0-dev-exp_user",
"ecs_1.9.0-dev-exp_user_agent",
"ecs_1.9.0-dev-exp_vulnerability"
"ecs_1.9.0-dev-exp_vulnerability",
"ecs_1.9.0-dev-exp_data_stream"
],
"index_patterns": [
"try-ecs-*"
60 changes: 60 additions & 0 deletions experimental/schemas/data_stream.yml
@@ -0,0 +1,60 @@
---
- name: data_stream
title: Data Stream
short: The data_stream fields take part in defining the new data stream naming scheme.
description: >
The data_stream fields take part in defining the new data stream naming scheme.
In the new data stream naming scheme the value of the data stream fields combine to the name of the actual data
stream in the following manner `{data_stream.type}-{data_stream.dataset}-{data_stream.namespace}`. This means the fields
can only contain characters that are valid as part of names of data streams. More details about this can be found in
this https://www.elastic.co/blog/an-introduction-to-the-elastic-data-stream-naming-scheme[blog post].
An Elasticsearch data stream consists of one or more backing indices, and a data stream name forms part of the backing indices names.
Due to this convention, data streams must also follow index naming restrictions. For example, data stream names cannot include \, /, *, ?, ", <, >, |, ` `.
Please see the Elasticsearch reference for additional https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html#indices-create-api-path-params[restrictions].
fields:

- name: type
level: extended
type: constant_keyword
example: logs
# Any future values for `data_stream.type` should also adhere to the following restrictions (these are derived from the Elasticsearch index restrictions):
# * Must not contain `-`
# * Must not start with `+` or `_`
description: >
An overarching type for the data stream.
Currently allowed values are "logs" and "metrics". We expect to also add "traces" and "synthetics" in the near future.
short: An overarching type for the data stream.

- name: dataset
level: extended
type: constant_keyword
example: nginx.access
description: >
The field can contain anything that makes sense to signify the source of the data.
Examples include `nginx.access`, `prometheus`, `endpoint` etc. For data streams that otherwise fit, but that
do not have dataset set we use the value "generic" for the dataset value. `event.dataset` should have the
same value as `data_stream.dataset`.
Beyond the Elasticsearch data stream naming criteria noted above, the `dataset` value has additional restrictions:
* Must not contain `-`
* No longer than 100 characters
short: The field can contain anything that makes sense to signify the source of the data.

- name: namespace
level: extended
type: constant_keyword
example: production
description: >
A user defined namespace. Namespaces are useful to allow grouping of data.
Many users already organize their indices this way, and the data stream naming scheme now provides this
best practice as a default. Many users will populate this field with `default`. If no value is used, it falls back to `default`.
Beyond the Elasticsearch index naming criteria noted above, `namespace` value has the additional restrictions:
* Must not contain `-`
* No longer than 100 characters
short: A user defined namespace. Namespaces are useful to allow grouping of data.
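
The naming rules described in this schema can be sketched in Python. This is a minimal illustration and not part of the commit. The function names `validate_part` and `data_stream_name` are hypothetical. The forbidden-character set copies only the characters listed in the `data_stream` description (the Elasticsearch reference lists further restrictions that are not checked here). The fallbacks of `generic` for a missing dataset and `default` for a missing namespace follow the field descriptions.

```python
# Hypothetical helpers illustrating the data stream naming rules; not part of ECS.

# Characters the data_stream description lists as invalid: \ / * ? " < > | and space.
FORBIDDEN_CHARS = set('\\/*?"<>| ')
# Values the `data_stream.type` description currently allows.
ALLOWED_TYPES = {"logs", "metrics"}
MAX_LENGTH = 100


def validate_part(field, value):
    """Check a dataset or namespace value against the documented restrictions."""
    if "-" in value:
        raise ValueError(f"{field} must not contain '-': {value!r}")
    if len(value) > MAX_LENGTH:
        raise ValueError(f"{field} must be no longer than {MAX_LENGTH} characters")
    bad = sorted(FORBIDDEN_CHARS.intersection(value))
    if bad:
        raise ValueError(f"{field} contains invalid characters: {bad}")
    return value


def data_stream_name(type_, dataset=None, namespace=None):
    """Combine the fields as {type}-{dataset}-{namespace}."""
    if type_ not in ALLOWED_TYPES:
        raise ValueError(f"unsupported data_stream.type: {type_!r}")
    dataset = validate_part("data_stream.dataset", dataset or "generic")
    namespace = validate_part("data_stream.namespace", namespace or "default")
    return f"{type_}-{dataset}-{namespace}"


print(data_stream_name("logs", "nginx.access", "production"))  # logs-nginx.access-production
print(data_stream_name("metrics"))  # metrics-generic-default
```

Since `-` separates the three parts of the name, excluding it from `dataset` and `namespace` presumably keeps the combined name unambiguous to split back into its fields.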
