From 0c70118637dbb33ae2e4effe202f3198fa41b03c Mon Sep 17 00:00:00 2001 From: Greg Kalapos Date: Wed, 29 May 2024 17:46:04 +0200 Subject: [PATCH] Elasticsearch: adapt span name and use `db.namespace` --- .chloggen/align_es_spec.yaml | 5 +++-- docs/database/elasticsearch.md | 33 ++++++++++++++----------------- model/registry/db.yaml | 6 ------ model/registry/deprecated/db.yaml | 7 +++++++ model/trace/database.yaml | 5 +++-- 5 files changed, 28 insertions(+), 28 deletions(-) diff --git a/.chloggen/align_es_spec.yaml b/.chloggen/align_es_spec.yaml index f6692b5018..6391a924cc 100755 --- a/.chloggen/align_es_spec.yaml +++ b/.chloggen/align_es_spec.yaml @@ -4,14 +4,15 @@ # your pull request title with [chore] or use the "Skip Changelog" label. # One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix' -change_type: enhancement +change_type: breaking # The name of the area of concern in the attributes-registry, (e.g. http, cloud, db) component: db # A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`). note: > - Align Elasticsearch span name to the general database span name guidelines. + Align Elasticsearch span name to the general database span name guidelines. + Deprecates `db.elasticsearch.cluster.name` in favor of `db.namespace`. # Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists. # The values here must be integers. diff --git a/docs/database/elasticsearch.md b/docs/database/elasticsearch.md index 8ed83d3e7c..90be173cf4 100644 --- a/docs/database/elasticsearch.md +++ b/docs/database/elasticsearch.md @@ -14,13 +14,7 @@ described on this page. ## Span Name -The **span name** SHOULD be of the format `{db.operation.name} {db.collection.name}`. 
- -The elasticsearch endpoint identifier stored in `db.operation.name` is used instead of the url path in order to reduce the cardinality of the span -name, as the path could contain dynamic values. The endpoint id is the `name` field in the -[elasticsearch schema](https://raw.githubusercontent.com/elastic/elasticsearch-specification/main/output/schema/schema.json). -If `db.collection.name` is not available, the span name should be `{db.operation.name}`. -If `db.operation.name` is not available, the span name SHOULD be the `{db.system}`. +The **span name** follows the [general database span name guidelines](database-spans.md#name), with the endpoint identifier stored in `db.operation.name` and the index stored in `db.collection.name`. The endpoint identifier is used instead of the URL path in order to reduce the cardinality of the span name. ## Attributes @@ -37,14 +31,13 @@ If `db.operation.name` is not available, the span name SHOULD be the `{db.system | [`http.request.method`](/docs/attributes-registry/http.md) | string | HTTP request method. [2] | `GET`; `POST`; `HEAD` | `Required` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`url.full`](/docs/attributes-registry/url.md) | string | Absolute URL describing a network resource according to [RFC3986](https://www.rfc-editor.org/rfc/rfc3986) [3] | `https://localhost:9200/index/_search?q=user.id:kimchy` | `Required` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`db.elasticsearch.path_parts.<key>`](/docs/attributes-registry/db.md) | string | A dynamic value in the url path. [4] | `db.elasticsearch.path_parts.index=test-index`; `db.elasticsearch.path_parts.doc_id=123` | `Conditionally Required` when the url has dynamic values | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. 
[5] | `80`; `8080`; `443` | `Conditionally Required` [6] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.collection.name`](/docs/attributes-registry/db.md) | string | The index or data stream against which the query is executed. [7] | `my_index`; `index1, index2` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`db.elasticsearch.cluster.name`](/docs/attributes-registry/db.md) | string | Represents the identifier of an Elasticsearch cluster. | `e9106fc68e3044f0b1475b04bf4ffd5f` | `Recommended` [8] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [5] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [6] | `80`; `8080`; `443` | `Conditionally Required` [7] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.collection.name`](/docs/attributes-registry/db.md) | string | The index or data stream against which the query is executed. [8] | `my_index`; `index1, index2` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`db.elasticsearch.node.name`](/docs/attributes-registry/db.md) | string | Represents the human-readable identifier of the node/instance to which a request was routed. | `instance-0000000001` | `Recommended` [9] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`db.query.text`](/docs/attributes-registry/db.md) | string | The request body for a [search-type query](https://www.elastic.co/guide/en/elasticsearch/reference/current/search.html), as a json string. 
| `"{\"query\":{\"term\":{\"user.id\":\"kimchy\"}}}"` | `Recommended` [10] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`network.peer.address`](/docs/attributes-registry/network.md) | string | Peer address of the database node where the operation was performed. [11] | `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`network.peer.port`](/docs/attributes-registry/network.md) | int | Peer port number of the network connection. | `65123` | `Recommended` if and only if `network.peer.address` is set. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [12] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.namespace`](/docs/attributes-registry/db.md) | string | The name of the database, fully qualified within the server address and port. [10] | `customers`; `test.users` | `Recommended` [11] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.text`](/docs/attributes-registry/db.md) | string | The request body for a [search-type query](https://www.elastic.co/guide/en/elasticsearch/reference/current/search.html), as a json string. | `"{\"query\":{\"term\":{\"user.id\":\"kimchy\"}}}"` | `Recommended` [12] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [13] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | **[1]:** This SHOULD be the endpoint identifier for the request. @@ -75,15 +68,19 @@ Tracing instrumentations that do so, MUST also set `http.request.method_original **[7]:** If using a port other than the default port for this DBMS and if `server.address` is set. 
-**[8]:** When communicating with an Elastic Cloud deployment, this should be collected from the "X-Found-Handling-Cluster" HTTP response header. +**[8]:** If the query targets multiple indices or data streams, then the name of those should be added as a comma separated list. If the query doesn't target a specific index, this field MUST NOT be set. **[9]:** When communicating with an Elastic Cloud deployment, this should be collected from the "X-Found-Handling-Instance" HTTP response header. -**[10]:** Should be collected by default for search-type queries and only if there is sanitization that excludes sensitive information. +**[10]:** If a database system has multiple namespace components, they SHOULD be concatenated (potentially using database system specific conventions) from most general to most specific namespace component, and more specific namespaces SHOULD NOT be captured without the more general namespaces, to ensure that "startswith" queries for the more general namespaces will be valid. +Semantic conventions for individual database systems SHOULD document what `db.namespace` means in the context of that system. +It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. + +**[11]:** The name of the Elasticsearch cluster which the client connects to. When communicating with an Elastic Cloud deployment, this should be collected from the "X-Found-Handling-Cluster" HTTP response header. -**[11]:** If a database operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used. +**[12]:** Should be collected by default for search-type queries and only if there is sanitization that excludes sensitive information. -**[12]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. 
+**[13]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. diff --git a/model/registry/db.yaml b/model/registry/db.yaml index 18d12f7367..0ab451cd08 100644 --- a/model/registry/db.yaml +++ b/model/registry/db.yaml @@ -474,12 +474,6 @@ groups: brief: > This group defines attributes for Elasticsearch. attributes: - - id: elasticsearch.cluster.name - type: string - stability: experimental - brief: > - Represents the identifier of an Elasticsearch cluster. - examples: ["e9106fc68e3044f0b1475b04bf4ffd5f"] - id: elasticsearch.node.name type: string stability: experimental diff --git a/model/registry/deprecated/db.yaml b/model/registry/deprecated/db.yaml index 8e24430d20..054063c9ca 100644 --- a/model/registry/deprecated/db.yaml +++ b/model/registry/deprecated/db.yaml @@ -84,6 +84,13 @@ groups: brief: 'Deprecated, no general replacement at this time. For Elasticsearch, use `db.elasticsearch.node.name` instead.' deprecated: 'Deprecated, no general replacement at this time. For Elasticsearch, use `db.elasticsearch.node.name` instead.' examples: 'mysql-e26b99z.example.com' + - id: elasticsearch.cluster.name + type: string + stability: experimental + deprecated: Use `db.namespace` instead. + brief: > + Represents the identifier of an Elasticsearch cluster. + examples: ["e9106fc68e3044f0b1475b04bf4ffd5f"] - id: registry.db.metrics.deprecated type: attribute_group diff --git a/model/trace/database.yaml b/model/trace/database.yaml index c01f042a9b..dee3240f8c 100644 --- a/model/trace/database.yaml +++ b/model/trace/database.yaml @@ -246,14 +246,15 @@ groups: brief: The index or data stream against which the query is executed. note: > If the query targets multiple indices or data streams, then the name of those should be added as a comma separated list. 
- If the query doesn't target a specific index, this field should remain empty. + If the query doesn't target a specific index, this field MUST NOT be set. examples: [ 'my_index', 'index1, index2' ] tag: tech-specific - ref: server.address - ref: server.port - - ref: db.elasticsearch.cluster.name + - ref: db.namespace requirement_level: recommended: > + The name of the Elasticsearch cluster which the client connects to. When communicating with an Elastic Cloud deployment, this should be collected from the "X-Found-Handling-Cluster" HTTP response header. - ref: db.elasticsearch.node.name requirement_level: