diff --git a/CHANGELOG.md b/CHANGELOG.md
index 9c9576b4f9..9a8cf30653 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -107,6 +107,8 @@ Note: This is the first release of Semantic Conventions separate from the Specif
   ([#23](https://github.com/open-telemetry/semantic-conventions/pull/23))
 - Add YAML definitions for log semantic conventions and define requirement levels
   ([#133](https://github.com/open-telemetry/semantic-conventions/pull/133))
+- Add markdown file for url semantic conventions
+  ([#174](https://github.com/open-telemetry/semantic-conventions/pull/174))
 
 ## v1.20.0 (2023-04-07)
 
diff --git a/docs/database/elasticsearch.md b/docs/database/elasticsearch.md
index 5f1734dbfd..3a5c6bcaf5 100644
--- a/docs/database/elasticsearch.md
+++ b/docs/database/elasticsearch.md
@@ -38,7 +38,7 @@ in order to map the path part values to their names.
 | `http.request.method` | string | HTTP request method. [3] | `GET`; `POST`; `HEAD` | Required |
 | [`server.address`](../general/general-attributes.md) | string | Logical server hostname, matches server FQDN if available, and IP or socket address if FQDN is not known. | `example.com` | See below |
 | [`server.port`](../general/general-attributes.md) | int | Logical server port number | `80`; `8080`; `443` | Recommended |
-| `url.full` | string | Absolute URL describing a network resource according to [RFC3986](https://www.rfc-editor.org/rfc/rfc3986) [4] | `https://localhost:9200/index/_search?q=user.id:kimchy` | Required |
+| [`url.full`](../url/url.md) | string | Absolute URL describing a network resource according to [RFC3986](https://www.rfc-editor.org/rfc/rfc3986) [4] | `https://localhost:9200/index/_search?q=user.id:kimchy` | Required |
 
 **[1]:** When setting this to an SQL keyword, it is not recommended to attempt any client-side parsing of `db.statement` just to get this property, but it should be set if the operation name is provided by the library being instrumented.
 If the SQL statement has an ambiguous operation, or performs more than one operation, this value may be omitted.
diff --git a/docs/http/http-metrics.md b/docs/http/http-metrics.md
index 43b37ace2b..c4b7231803 100644
--- a/docs/http/http-metrics.md
+++ b/docs/http/http-metrics.md
@@ -81,7 +81,7 @@ of `[ 0, 0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5,
 | [`network.protocol.version`](../general/general-attributes.md) | string | Version of the application layer protocol used. See note below. [3] | `3.1.1` | Recommended |
 | [`server.address`](../general/general-attributes.md) | string | Name of the local HTTP server that received the request. [4] | `example.com` | Opt-In |
 | [`server.port`](../general/general-attributes.md) | int | Port of the local HTTP server that received the request. [5] | `80`; `8080`; `443` | Opt-In |
-| `url.scheme` | string | The [URI scheme](https://www.rfc-editor.org/rfc/rfc3986#section-3.1) component identifying the used protocol. | `http`; `https` | Required |
+| [`url.scheme`](../url/url.md) | string | The [URI scheme](https://www.rfc-editor.org/rfc/rfc3986#section-3.1) component identifying the used protocol. | `http`; `https` | Required |
 
 **[1]:** MUST NOT be populated when this is not supported by the HTTP server framework as the route attribute should have low-cardinality and the URI path can NOT substitute it.
 SHOULD include the [application root](/docs/http/http-spans.md#http-server-definitions) if there is one.
@@ -155,7 +155,7 @@ This metric is optional.
 | `http.request.method` | string | HTTP request method. [1] | `GET`; `POST`; `HEAD` | Required |
 | [`server.address`](../general/general-attributes.md) | string | Name of the local HTTP server that received the request. [2] | `example.com` | Opt-In |
 | [`server.port`](../general/general-attributes.md) | int | Port of the local HTTP server that received the request. [3] | `80`; `8080`; `443` | Opt-In |
-| `url.scheme` | string | The [URI scheme](https://www.rfc-editor.org/rfc/rfc3986#section-3.1) component identifying the used protocol. | `http`; `https` | Required |
+| [`url.scheme`](../url/url.md) | string | The [URI scheme](https://www.rfc-editor.org/rfc/rfc3986#section-3.1) component identifying the used protocol. | `http`; `https` | Required |
 
 **[1]:** HTTP request method value SHOULD be "known" to the instrumentation.
 By default, this convention defines "known" methods as the ones listed in [RFC9110](https://www.rfc-editor.org/rfc/rfc9110.html#name-methods)
@@ -228,7 +228,7 @@ This metric is optional.
 | [`network.protocol.version`](../general/general-attributes.md) | string | Version of the application layer protocol used. See note below. [3] | `3.1.1` | Recommended |
 | [`server.address`](../general/general-attributes.md) | string | Name of the local HTTP server that received the request. [4] | `example.com` | Opt-In |
 | [`server.port`](../general/general-attributes.md) | int | Port of the local HTTP server that received the request. [5] | `80`; `8080`; `443` | Opt-In |
-| `url.scheme` | string | The [URI scheme](https://www.rfc-editor.org/rfc/rfc3986#section-3.1) component identifying the used protocol. | `http`; `https` | Required |
+| [`url.scheme`](../url/url.md) | string | The [URI scheme](https://www.rfc-editor.org/rfc/rfc3986#section-3.1) component identifying the used protocol. | `http`; `https` | Required |
 
 **[1]:** MUST NOT be populated when this is not supported by the HTTP server framework as the route attribute should have low-cardinality and the URI path can NOT substitute it.
 SHOULD include the [application root](/docs/http/http-spans.md#http-server-definitions) if there is one.
@@ -306,7 +306,7 @@ This metric is optional.
 | [`network.protocol.version`](../general/general-attributes.md) | string | Version of the application layer protocol used. See note below. [3] | `3.1.1` | Recommended |
 | [`server.address`](../general/general-attributes.md) | string | Name of the local HTTP server that received the request. [4] | `example.com` | Opt-In |
 | [`server.port`](../general/general-attributes.md) | int | Port of the local HTTP server that received the request. [5] | `80`; `8080`; `443` | Opt-In |
-| `url.scheme` | string | The [URI scheme](https://www.rfc-editor.org/rfc/rfc3986#section-3.1) component identifying the used protocol. | `http`; `https` | Required |
+| [`url.scheme`](../url/url.md) | string | The [URI scheme](https://www.rfc-editor.org/rfc/rfc3986#section-3.1) component identifying the used protocol. | `http`; `https` | Required |
 
 **[1]:** MUST NOT be populated when this is not supported by the HTTP server framework as the route attribute should have low-cardinality and the URI path can NOT substitute it.
 SHOULD include the [application root](/docs/http/http-spans.md#http-server-definitions) if there is one.
diff --git a/docs/http/http-spans.md b/docs/http/http-spans.md
index 553fdbf938..0be977dba8 100644
--- a/docs/http/http-spans.md
+++ b/docs/http/http-spans.md
@@ -198,7 +198,7 @@ For an HTTP client span, `SpanKind` MUST be `Client`.
 | [`server.socket.address`](../general/general-attributes.md) | string | Physical server IP address or Unix socket address. If set from the client, should simply use the socket's peer address, and not attempt to find any actual server IP (i.e., if set from client, this may represent some proxy server instead of the logical server). | `10.5.3.2` | Recommended: If different than `server.address`. |
 | [`server.socket.domain`](../general/general-attributes.md) | string | The domain name of an immediate peer. [5] | `proxy.example.com` | Recommended: If different than `server.address`. |
 | [`server.socket.port`](../general/general-attributes.md) | int | Physical server port. | `16456` | Recommended: If different than `server.port`. |
-| `url.full` | string | Absolute URL describing a network resource according to [RFC3986](https://www.rfc-editor.org/rfc/rfc3986) [6] | `https://www.foo.bar/search?q=OpenTelemetry#SemConv`; `//localhost` | Required |
+| [`url.full`](../url/url.md) | string | Absolute URL describing a network resource according to [RFC3986](https://www.rfc-editor.org/rfc/rfc3986) [6] | `https://www.foo.bar/search?q=OpenTelemetry#SemConv`; `//localhost` | Required |
 
 **[1]:** The resend count SHOULD be updated each time an HTTP request gets resent by the client, regardless of what was the cause of the resending (e.g. redirection, authorization failure, 503 Server Unavailable, network issues, or any other).
 
@@ -225,7 +225,7 @@ Following attributes MUST be provided **at span creation time** (when provided a
 
 * [`server.address`](../general/general-attributes.md)
 * [`server.port`](../general/general-attributes.md)
-* `url.full`
+* [`url.full`](../url/url.md)
 
 Note that in some cases host and port identifiers in the `Host` header might be different from the `server.address` and `server.port`, in this case instrumentation MAY populate `Host` header on `http.request.header.host` attribute even if it's not enabled by user.
 
@@ -331,9 +331,9 @@ If the route cannot be determined, the `name` attribute MUST be set as defined i
 | [`server.port`](../general/general-attributes.md) | int | Port of the local HTTP server that received the request. [5] | `80`; `8080`; `443` | Recommended: [6] |
 | [`server.socket.address`](../general/general-attributes.md) | string | Local socket address. Useful in case of a multi-IP host. | `10.5.3.2` | Opt-In |
 | [`server.socket.port`](../general/general-attributes.md) | int | Local socket port. Useful in case of a multi-port host. | `16456` | Opt-In |
-| `url.path` | string | The [URI path](https://www.rfc-editor.org/rfc/rfc3986#section-3.3) component [7] | `/search` | Required |
-| `url.query` | string | The [URI query](https://www.rfc-editor.org/rfc/rfc3986#section-3.4) component [8] | `q=OpenTelemetry` | Conditionally Required: If and only if one was received/sent. |
-| `url.scheme` | string | The [URI scheme](https://www.rfc-editor.org/rfc/rfc3986#section-3.1) component identifying the used protocol. | `http`; `https` | Required |
+| [`url.path`](../url/url.md) | string | The [URI path](https://www.rfc-editor.org/rfc/rfc3986#section-3.3) component [7] | `/search` | Required |
+| [`url.query`](../url/url.md) | string | The [URI query](https://www.rfc-editor.org/rfc/rfc3986#section-3.4) component [8] | `q=OpenTelemetry` | Conditionally Required: If and only if one was received/sent. |
+| [`url.scheme`](../url/url.md) | string | The [URI scheme](https://www.rfc-editor.org/rfc/rfc3986#section-3.1) component identifying the used protocol. | `http`; `https` | Required |
 
 **[1]:** MUST NOT be populated when this is not supported by the HTTP server framework as the route attribute should have low-cardinality and the URI path can NOT substitute it.
 SHOULD include the [application root](/docs/http/http-spans.md#http-server-definitions) if there is one.
@@ -369,9 +369,9 @@ Following attributes MUST be provided **at span creation time** (when provided a
 
 * [`server.address`](../general/general-attributes.md)
 * [`server.port`](../general/general-attributes.md)
-* `url.path`
-* `url.query`
-* `url.scheme`
+* [`url.path`](../url/url.md)
+* [`url.query`](../url/url.md)
+* [`url.scheme`](../url/url.md)
 
 `http.route` MUST be provided at span creation time if and only if it's already available. If it becomes available after span starts, instrumentation MUST populate it anytime before span ends.
 
diff --git a/docs/url/README.md b/docs/url/README.md
new file mode 100644
index 0000000000..f3e3dc7c2c
--- /dev/null
+++ b/docs/url/README.md
@@ -0,0 +1,11 @@
+# URL semantic conventions
+
+**Status**: [Experimental][DocumentStatus]
+
+This document defines semantic conventions for URLs.
+
+URL semantic conventions are defined for the following:
+
+* [URL](url.md): For describing URL and its components.
+
+[DocumentStatus]: https://github.com/open-telemetry/opentelemetry-specification/blob/v1.21.0/specification/document-status.md
diff --git a/docs/url/url.md b/docs/url/url.md
new file mode 100644
index 0000000000..99ba2af01c
--- /dev/null
+++ b/docs/url/url.md
@@ -0,0 +1,47 @@
+# Semantic conventions for URL
+
+**Status**: [Experimental][DocumentStatus]
+
+This document defines semantic conventions that describe URL and its components.
+
+
+Table of Contents
+
+
+
+- [Attributes](#attributes)
+- [Sensitive information](#sensitive-information)
+
+
+
+
+
+## Attributes
+
+
+| Attribute | Type | Description | Examples | Requirement Level |
+|---|---|---|---|---|
+| `url.scheme` | string | The [URI scheme](https://www.rfc-editor.org/rfc/rfc3986#section-3.1) component identifying the used protocol. | `https`; `ftp`; `telnet` | Recommended |
+| `url.full` | string | Absolute URL describing a network resource according to [RFC3986](https://www.rfc-editor.org/rfc/rfc3986) [1] | `https://www.foo.bar/search?q=OpenTelemetry#SemConv`; `//localhost` | Recommended |
+| `url.path` | string | The [URI path](https://www.rfc-editor.org/rfc/rfc3986#section-3.3) component [2] | `/search` | Recommended |
+| `url.query` | string | The [URI query](https://www.rfc-editor.org/rfc/rfc3986#section-3.4) component [3] | `q=OpenTelemetry` | Recommended |
+| `url.fragment` | string | The [URI fragment](https://www.rfc-editor.org/rfc/rfc3986#section-3.5) component | `SemConv` | Recommended |
+
+**[1]:** For network calls, URL usually has `scheme://host[:port][path][?query][#fragment]` format, where the fragment is not transmitted over HTTP, but if it is known, it should be included nevertheless.
+`url.full` MUST NOT contain credentials passed via URL in form of `https://username:password@www.example.com/`. In such case username and password should be redacted and attribute's value should be `https://REDACTED:REDACTED@www.example.com/`.
+`url.full` SHOULD capture the absolute URL when it is available (or can be reconstructed) and SHOULD NOT be validated or modified except for sanitizing purposes.
+
+**[2]:** When missing, the value is assumed to be `/`
+
+**[3]:** Sensitive content provided in query string SHOULD be scrubbed when instrumentations can identify it.
+
+
+## Sensitive information
+
+Capturing URL and its components MAY impose security risk. User and password information, when they are provided in [User Information](https://datatracker.ietf.org/doc/html/rfc3986#section-3.2.1) subcomponent, MUST NOT be recorded.
+
+Instrumentations that are aware of specific sensitive query string parameters MUST scrub their values before capturing `url.query` attribute. For example, native instrumentation of a client library that passes credentials or user location in URL, must scrub corresponding properties.
+
+_Note: Applications and telemetry consumers should scrub sensitive information from URL attributes on collected telemetry. In systems unable to identify sensitive information, certain attribute values may be redacted entirely._
+
+[DocumentStatus]: https://github.com/open-telemetry/opentelemetry-specification/blob/v1.21.0/specification/document-status.md
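
Reviewer note (illustrative, not part of the patch): the redaction and scrubbing rules in the new `docs/url/url.md` could be applied by an instrumentation roughly as sketched below. This is a minimal sketch using only Python's standard `urllib.parse`. The helper names (`redact_userinfo`, `scrub_query`, `url_attributes`) and the `SENSITIVE_PARAMS` deny-list are hypothetical; the conventions do not name specific parameters or prescribe an implementation. Replacing a password-less username with a single `REDACTED` is also an assumption, since the text only shows the `username:password` form.

```python
from urllib.parse import unquote_plus, urlsplit, urlunsplit

# Hypothetical deny-list: the conventions leave it to each instrumentation
# to know which query parameters are sensitive.
SENSITIVE_PARAMS = {"password", "token", "api_key"}


def redact_userinfo(netloc: str) -> str:
    """Replace the userinfo subcomponent (user[:password]@) with REDACTED."""
    userinfo, sep, hostport = netloc.rpartition("@")
    if not sep:
        return netloc  # no credentials in the authority
    # Assumption: a username without a password becomes a single REDACTED.
    placeholder = "REDACTED:REDACTED" if ":" in userinfo else "REDACTED"
    return placeholder + "@" + hostport


def scrub_query(query: str) -> str:
    """Replace values of known-sensitive parameters, leaving the rest as-is."""
    scrubbed = []
    for pair in query.split("&"):
        name, sep, _value = pair.partition("=")
        if sep and unquote_plus(name).lower() in SENSITIVE_PARAMS:
            scrubbed.append(name + "=REDACTED")
        else:
            scrubbed.append(pair)  # kept verbatim, no re-encoding
    return "&".join(scrubbed)


def url_attributes(url: str) -> dict:
    """Build the url.* attributes from the table for one URL."""
    parts = urlsplit(url)
    netloc = redact_userinfo(parts.netloc)
    query = scrub_query(parts.query)
    attrs = {
        "url.full": urlunsplit(
            (parts.scheme, netloc, parts.path, query, parts.fragment)
        ),
        # Note [2]: a missing path is assumed to be "/".
        "url.path": parts.path or "/",
    }
    if parts.scheme:
        attrs["url.scheme"] = parts.scheme
    if query:
        attrs["url.query"] = query
    if parts.fragment:
        attrs["url.fragment"] = parts.fragment
    return attrs
```

The query string is split by hand rather than round-tripped through `parse_qsl`/`urlencode`, so non-sensitive values are not re-encoded; this follows the "SHOULD NOT be validated or modified except for sanitizing purposes" wording. Rebuilding `url.full` with `urlunsplit` can still normalize edge cases such as a trailing empty `?`, so an instrumentation that must preserve the URL byte-for-byte would need a different approach.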