-
Notifications
You must be signed in to change notification settings - Fork 272
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarify and potentially remove InstrumentationLibraryMetrics #148
Comments
tigrannajaryan
pushed a commit
to tigrannajaryan/opentelemetry-proto
that referenced
this issue
May 8, 2020
Resolves open-telemetry#149 Resolves open-telemetry#148 # Problem ## Traces The traces proto currently contains InstrumentationLibrarySpans which does not have clearly defined semantics. InstrumentationLibrarySpans may contain a number of spans all associated with an InstrumentationLibrary. The nature of this association is not clear. The InstrumentationLibrary has a name and a version. It is not clear if these fields are part of Resource identity or are attributes of a Span. Presumably they should be interpreted as attributes of the Span. I am not aware of any other trace protocols or backends that have the equivalent of InstrumentationLibrary concept. However ultimately all span data produced by OpenTelemetry libraries will end up in a backend and the InstrumentationLibrary concept must be mapped to an existing concept. Span attributes seem to be the only concept that fit the bill. Using attributes from the start of the collection pipeline removes the need to deal with InstrumentationLibrary by all codebases that need to make a mapping decision (Collector, backend ingest points, etc). To illustrate the data structure that was needed before this commit, here is an example: ```yaml resource_spans: resource: ... instrumentation_library_spans: - instrumentation_library: name: io.opentelemetry.redis spans: - name: request start_time: 123 - instrumentation_library: name: io.opentelemetry.apache.httpd spans: - name: request start_time: 456 ``` See below what the data structure becomes after implementing the proposed solution. ## Metrics The metrics proto currently includes InstrumentationLibraryMetrics which does not have clearly defined semantics. InstrumentationLibraryMetrics may contain a number of metrics all associated with one InstrumentationLibrary. The nature of this association is not clear. The InstrumentationLibrary has a name and a version. It is not clear if these fields are part of metric identity. For example if I have 2 different InstrumentationLibrarys each having a different name and both containing a Metric that have the same MetricDescriptor.name are these 2 different timeseries or the same one? To illustrate the data structure that was needed before this commit, here is an example: ```yaml resource_metrics: resource: ... instrumentation_library_metrics: - instrumentation_library: name: io.opentelemetry.redis metrics: - metric_descriptor: name: request.count int64_data_points: - value: 10 - instrumentation_library: name: io.opentelemetry.apache.httpd metrics: - metric_descriptor: name: request.count int64_data_points: - value: 200 ``` See below what the data structure becomes after implementing the proposed solution. # Solution ## Traces This commit removes `InstrumentationLibrarySpans` message type from the protocol. We will add semantic conventions for recording instrumentation library in Span attributes. The benefits of this approach over using `InstrumentationLibrarySpans` are the following: - There is not need for a new concept and new message type at the protocol level. This adds unnecessary complexity to all codebases that need to read and write traces but don't care about instrumentation library concept (likely the majory of codebases). - It uses the general concept of attributes that already exists and is well understood and by doing so makes the semantics of instrumentation library name clear. After removing `InstrumentationLibrarySpans` concept we have this data structure: ```yaml resource_spans: resource: ... spans: - name: request start_time: 123 attributes: - key: instrumentation.library.name value: io.opentelemetry.redis - name: request start_time: 456 attributes: - key: instrumentation.library.name value: io.opentelemetry.apache.httpd ``` Once this commit is merged language SDKs will need to make a corresponding change and add "instrumentation.library.name" (or whatever name is accepted in semantic conventions) to Span attributes automatically. ## Metrics This commit removes `InstrumentationLibraryMetrics` message type from the protocol. We will add semantic conventions for recording instrumentation library in Span attributes. Semantically the name of `InstrumentationLibrary` is equivalent to a metric label so we will use metric labels to record the library name and version. The benefits of this approach over using `InstrumentationLibraryMetrics` are the following: - There is not need for a new concept and new message type at the protocol level. This adds unnecessary complexity to all codebases that need to read and write metrics but don't care about instrumentation library concept (likely the majority of codebases). - It uses the general concept of metric labels that already exists and is well understood and by doing so makes the semantics of instrumentation library name clear. The instrumentation library name is one of the labels that form timeseries identifier. - It makes mapping to other metric protocols and backend clearer. I am not aware of any other metric protocol or backend that have the equivalent of `InstrumentationLibrary` concept. However ultimately all metric data produced by OpenTelemetry libraries will end up in a backend and the `InstrumentationLibrary` concept must be mapped to an existing concept. Labels seem to be the only concept that fit the bill. Using labels from the start of the collection pipeline removes the need to deal with `InstrumentationLibrary` by all codebases that need to make a mapping or translation decision (Collector, backend ingest points, etc). After removing `InstrumentationLibraryMetrics` concept we have this data structure: ```yaml resource_metrics: resource: ... metrics: - metric_descriptor: name: "request.count" int64_data_points: - value: 10 labels: - key: instrumentation.library.name value: io.opentelemetry.redis - metric_descriptor: name: "request.count" int64_data_points: - value: 200 labels: - key: instrumentation.library.name value: io.opentelemetry.apache.httpd ```
tigrannajaryan
pushed a commit
to tigrannajaryan/opentelemetry-proto
that referenced
this issue
May 8, 2020
Resolves open-telemetry#149 Resolves open-telemetry#148 # Problem ## Traces The traces proto currently contains InstrumentationLibrarySpans which does not have clearly defined semantics. InstrumentationLibrarySpans may contain a number of spans all associated with an InstrumentationLibrary. The nature of this association is not clear. The InstrumentationLibrary has a name and a version. It is not clear if these fields are part of Resource identity or are attributes of a Span. Presumably they should be interpreted as attributes of the Span. I am not aware of any other trace protocols or backends that have the equivalent of InstrumentationLibrary concept. However ultimately all span data produced by OpenTelemetry libraries will end up in a backend and the InstrumentationLibrary concept must be mapped to an existing concept. Span attributes seem to be the only concept that fit the bill. Using attributes from the start of the collection pipeline removes the need to deal with InstrumentationLibrary by all codebases that need to make a mapping decision (Collector, backend ingest points, etc). To illustrate the data structure that was needed before this commit, here is an example: ```yaml resource_spans: resource: ... instrumentation_library_spans: - instrumentation_library: name: io.opentelemetry.redis spans: - name: request start_time: 123 - instrumentation_library: name: io.opentelemetry.apache.httpd spans: - name: request start_time: 456 ``` See below what the data structure becomes after implementing the proposed solution. ## Metrics The metrics proto currently includes InstrumentationLibraryMetrics which does not have clearly defined semantics. InstrumentationLibraryMetrics may contain a number of metrics all associated with one InstrumentationLibrary. The nature of this association is not clear. The InstrumentationLibrary has a name and a version. It is not clear if these fields are part of metric identity. For example if I have 2 different InstrumentationLibrarys each having a different name and both containing a Metric that have the same MetricDescriptor.name are these 2 different timeseries or the same one? To illustrate the data structure that was needed before this commit, here is an example: ```yaml resource_metrics: resource: ... instrumentation_library_metrics: - instrumentation_library: name: io.opentelemetry.redis metrics: - metric_descriptor: name: request.count int64_data_points: - value: 10 - instrumentation_library: name: io.opentelemetry.apache.httpd metrics: - metric_descriptor: name: request.count int64_data_points: - value: 200 ``` See below what the data structure becomes after implementing the proposed solution. # Solution ## Traces This commit removes `InstrumentationLibrarySpans` message type from the protocol. We will add semantic conventions for recording instrumentation library in Span attributes. The benefits of this approach over using `InstrumentationLibrarySpans` are the following: - There is not need for a new concept and new message type at the protocol level. This adds unnecessary complexity to all codebases that need to read and write traces but don't care about instrumentation library concept (likely the majory of codebases). - It uses the general concept of attributes that already exists and is well understood and by doing so makes the semantics of instrumentation library name clear. After removing `InstrumentationLibrarySpans` concept we have this data structure: ```yaml resource_spans: resource: ... spans: - name: request start_time: 123 attributes: - key: instrumentation.library.name value: io.opentelemetry.redis - name: request start_time: 456 attributes: - key: instrumentation.library.name value: io.opentelemetry.apache.httpd ``` Once this commit is merged language SDKs will need to make a corresponding change and add "instrumentation.library.name" (or whatever name is accepted in semantic conventions) to Span attributes automatically. ## Metrics This commit removes `InstrumentationLibraryMetrics` message type from the protocol. We will add semantic conventions for recording instrumentation library in Span attributes. Semantically the name of `InstrumentationLibrary` is equivalent to a metric label so we will use metric labels to record the library name and version. The benefits of this approach over using `InstrumentationLibraryMetrics` are the following: - There is not need for a new concept and new message type at the protocol level. This adds unnecessary complexity to all codebases that need to read and write metrics but don't care about instrumentation library concept (likely the majority of codebases). - It uses the general concept of metric labels that already exists and is well understood and by doing so makes the semantics of instrumentation library name clear. The instrumentation library name is one of the labels that form timeseries identifier. - It makes mapping to other metric protocols and backend clearer. I am not aware of any other metric protocol or backend that have the equivalent of `InstrumentationLibrary` concept. However ultimately all metric data produced by OpenTelemetry libraries will end up in a backend and the `InstrumentationLibrary` concept must be mapped to an existing concept. Labels seem to be the only concept that fit the bill. Using labels from the start of the collection pipeline removes the need to deal with `InstrumentationLibrary` by all codebases that need to make a mapping or translation decision (Collector, backend ingest points, etc). After removing `InstrumentationLibraryMetrics` concept we have this data structure: ```yaml resource_metrics: resource: ... metrics: - metric_descriptor: name: "request.count" int64_data_points: - value: 10 labels: - key: instrumentation.library.name value: io.opentelemetry.redis - metric_descriptor: name: "request.count" int64_data_points: - value: 200 labels: - key: instrumentation.library.name value: io.opentelemetry.apache.httpd ```
tigrannajaryan
pushed a commit
to tigrannajaryan/opentelemetry-proto
that referenced
this issue
Jun 22, 2020
Resolves open-telemetry#149 Resolves open-telemetry#148 # Problem ## Traces The traces proto currently contains InstrumentationLibrarySpans which does not have clearly defined semantics. InstrumentationLibrarySpans may contain a number of spans all associated with an InstrumentationLibrary. The nature of this association is not clear. The InstrumentationLibrary has a name and a version. It is not clear if these fields are part of Resource identity or are attributes of a Span. Presumably they should be interpreted as attributes of the Span. I am not aware of any other trace protocols or backends that have the equivalent of InstrumentationLibrary concept. However ultimately all span data produced by OpenTelemetry libraries will end up in a backend and the InstrumentationLibrary concept must be mapped to an existing concept. Span attributes seem to be the only concept that fit the bill. Using attributes from the start of the collection pipeline removes the need to deal with InstrumentationLibrary by all codebases that need to make a mapping decision (Collector, backend ingest points, etc). To illustrate the data structure that was needed before this commit, here is an example: ```yaml resource_spans: resource: ... instrumentation_library_spans: - instrumentation_library: name: io.opentelemetry.redis spans: - name: request start_time: 123 - instrumentation_library: name: io.opentelemetry.apache.httpd spans: - name: request start_time: 456 ``` See below what the data structure becomes after implementing the proposed solution. ## Metrics The metrics proto currently includes InstrumentationLibraryMetrics which does not have clearly defined semantics. InstrumentationLibraryMetrics may contain a number of metrics all associated with one InstrumentationLibrary. The nature of this association is not clear. The InstrumentationLibrary has a name and a version. It is not clear if these fields are part of metric identity. For example if I have 2 different InstrumentationLibrarys each having a different name and both containing a Metric that have the same MetricDescriptor.name are these 2 different timeseries or the same one? To illustrate the data structure that was needed before this commit, here is an example: ```yaml resource_metrics: resource: ... instrumentation_library_metrics: - instrumentation_library: name: io.opentelemetry.redis metrics: - metric_descriptor: name: request.count int64_data_points: - value: 10 - instrumentation_library: name: io.opentelemetry.apache.httpd metrics: - metric_descriptor: name: request.count int64_data_points: - value: 200 ``` See below what the data structure becomes after implementing the proposed solution. # Solution ## Traces This commit removes `InstrumentationLibrarySpans` message type from the protocol. We will add semantic conventions for recording instrumentation library in Span attributes. The benefits of this approach over using `InstrumentationLibrarySpans` are the following: - There is not need for a new concept and new message type at the protocol level. This adds unnecessary complexity to all codebases that need to read and write traces but don't care about instrumentation library concept (likely the majory of codebases). - It uses the general concept of attributes that already exists and is well understood and by doing so makes the semantics of instrumentation library name clear. After removing `InstrumentationLibrarySpans` concept we have this data structure: ```yaml resource_spans: resource: ... spans: - name: request start_time: 123 attributes: - key: instrumentation.library.name value: io.opentelemetry.redis - name: request start_time: 456 attributes: - key: instrumentation.library.name value: io.opentelemetry.apache.httpd ``` Once this commit is merged language SDKs will need to make a corresponding change and add "instrumentation.library.name" (or whatever name is accepted in semantic conventions) to Span attributes automatically. ## Metrics This commit removes `InstrumentationLibraryMetrics` message type from the protocol. We will add semantic conventions for recording instrumentation library in Span attributes. Semantically the name of `InstrumentationLibrary` is equivalent to a metric label so we will use metric labels to record the library name and version. The benefits of this approach over using `InstrumentationLibraryMetrics` are the following: - There is not need for a new concept and new message type at the protocol level. This adds unnecessary complexity to all codebases that need to read and write metrics but don't care about instrumentation library concept (likely the majority of codebases). - It uses the general concept of metric labels that already exists and is well understood and by doing so makes the semantics of instrumentation library name clear. The instrumentation library name is one of the labels that form timeseries identifier. - It makes mapping to other metric protocols and backend clearer. I am not aware of any other metric protocol or backend that have the equivalent of `InstrumentationLibrary` concept. However ultimately all metric data produced by OpenTelemetry libraries will end up in a backend and the `InstrumentationLibrary` concept must be mapped to an existing concept. Labels seem to be the only concept that fit the bill. Using labels from the start of the collection pipeline removes the need to deal with `InstrumentationLibrary` by all codebases that need to make a mapping or translation decision (Collector, backend ingest points, etc). After removing `InstrumentationLibraryMetrics` concept we have this data structure: ```yaml resource_metrics: resource: ... metrics: - metric_descriptor: name: "request.count" int64_data_points: - value: 10 labels: - key: instrumentation.library.name value: io.opentelemetry.redis - metric_descriptor: name: "request.count" int64_data_points: - value: 200 labels: - key: instrumentation.library.name value: io.opentelemetry.apache.httpd ```
As discussed in #150 the decision is to keep InstrumentationLibraryMetrics. Closing this as "won't do". |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
The metrics proto currently includes
InstrumentationLibraryMetrics
which doesnot have clearly defined semantics.
InstrumentationLibraryMetrics
may containa number of metrics all associated with one
InstrumentationLibrary
. The natureof this association is not clear.
The
InstrumentationLibrary
has aname
and aversion
. It is not clear ifthese fields are part of metric identity. For example if I have 2 different
InstrumentationLibrary
s each having a differentname
and both containing aMetric
that have the sameMetricDescriptor.name
are these 2 differenttimeseries or the same one?
Let's have a look at an example (pardon yaml syntax):
Presumably this data is about 2 different timeseries: one is the number of
database requests issued to Redis, the other is number of requests served by
Apache web server. They certainly need to be separate timeseries.
Semantically the name of
InstrumentationLibrary
appears to be equivalent to ametric label.
If this is true then we have the same result by simply recording the name of the
instrumentation library as a regular label on the metric, e.g.:
Alternatively if grouping by metric name is difficult for metric producer then
this data can produced:
Both of these approaches describe the data correctly using the generic labels
concept and still make it possible for interested parties to find out the
sending library by comparing the label key to "instrumentation.library.name"
(which should be added to our semantic conventions).
The benefits of this approach over using dedicated
InstrumentationLibrary
concept are the following:
There is not need for a new concept and new message type at the protocol
level. This adds unnecessary complexity to all codebases that need to read and
write metrics but don't care about instrumentation library concept (likely the
majority of codebases).
It uses the general concept of metric labels that already exists and is well
understood and by doing so makes the semantics of instrumentation library name
clear. The instrumentation library name is one of the labels that form
timeseries identifier.
It makes mapping to other metric protocols and backend clearer. I am not aware
of any other metric protocol or backend that have the equivalent of
InstrumentationLibrary
concept. However ultimately all metric data producedby OpenTelemetry libraries will end up in a backend and the
InstrumentationLibrary
concept must be mapped to an existing concept. Labelsseem to be the only concept that fit the bill. Using labels from the start of
the collection pipeline removes the need to deal with
InstrumentationLibrary
by all codebases that need to make a mapping or translation decision
(Collector, backend ingest points, etc).
I suggest to remove
InstrumentationLibrary
message type from the protocol andadd semantic conventions for recording instrumentation library in metric labels.
There is potentially a minor downside: using
InstrumentationLibrary
messagemay be slightly more efficient to encode/decode especially if there are multiple
metrics from the same instrumentation library, but the difference is small and I
think is not worth the complication.
If we want to retain the efficiency and deduplicate common labels we can put
common labels in the Metric. This was previously the case, we had labels in the
Metric and
they were removed
due to lack of clearly specified semantics. Assuming that we clearly require
that labels in the Metric don't overlap with labels in data points then we can
bring back common labels in the Metric.
The text was updated successfully, but these errors were encountered: