Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more details to SDK MeterProvider and View #1730

Merged
merged 29 commits into from
Jul 27, 2021
Merged
Changes from 14 commits
Commits
Show all changes
29 commits
Select commit Hold shift + click to select a range
4b866a8
Add View API
reyang May 28, 2021
b3259fc
fix link
reyang May 28, 2021
be9beb5
address comments
reyang Jun 24, 2021
c5c73aa
provide one example
reyang Jun 24, 2021
f82475e
update wording
reyang Jun 24, 2021
7359e9a
polish wording
reyang Jun 24, 2021
2df446b
update wording
reyang Jun 24, 2021
09501aa
mention histogram bucket in the view aggregation
reyang Jun 24, 2021
87e3bfe
resolve comments based on discussion during the SIG meeting
reyang Jun 27, 2021
0c32806
fix typo
reyang Jun 27, 2021
142a345
update the spec based on discussion during the SIG meeting
reyang Jun 29, 2021
8b9a7f0
add more diagrams
reyang Jul 10, 2021
d14d005
update the spec based on the discussion during Metrics SIG meeting
reyang Jul 10, 2021
bbe5b40
minor tweak diagram
reyang Jul 11, 2021
49bf1b1
address review comment
reyang Jul 13, 2021
7d8336d
update the default histogram buckets
reyang Jul 13, 2021
083c51c
remove pipeline from this PR
reyang Jul 20, 2021
4f00ac7
address comments regarding error handling
reyang Jul 20, 2021
3340239
Merge branch 'main' into reyang/metrics-sdk-view
reyang Jul 21, 2021
2d5a888
Update specification/metrics/sdk.md
reyang Jul 21, 2021
f1621e7
adjust wording
reyang Jul 22, 2021
e606fe5
Merge branch 'main' into reyang/metrics-sdk-view
reyang Jul 22, 2021
4d0913a
add instrument type and clarify the selection criteria
reyang Jul 26, 2021
6261507
update the view selection logic
reyang Jul 26, 2021
954102f
update the view selection logic
reyang Jul 26, 2021
5fc0500
fix broken link
reyang Jul 26, 2021
971e770
Merge branch 'main' into reyang/metrics-sdk-view
reyang Jul 26, 2021
4ab3465
update the TODO part by removing the details
reyang Jul 27, 2021
f9f47ed
Merge branch 'main' into reyang/metrics-sdk-view
reyang Jul 27, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
245 changes: 234 additions & 11 deletions specification/metrics/sdk.md
Original file line number Diff line number Diff line change
Expand Up @@ -33,12 +33,235 @@ Table of Contents

## MeterProvider

TODO:
### Meter Creation

* Allow multiple pipelines (processors, exporters).
* Configure "Views".
* Configure timing (related to [issue
1432](https://github.com/open-telemetry/opentelemetry-specification/issues/1432)).
New `Meter` instances are always created through a `MeterProvider` (see
[API](./api.md#meterprovider)). The `name`, `version` (optional), and
`schema_url` (optional) arguments supplied to the `MeterProvider` MUST be used
to create an
[`InstrumentationLibrary`](https://github.com/open-telemetry/oteps/blob/main/text/0083-component.md)
instance which is stored on the created `Meter`.

Configuration (i.e., [MeasurementProcessors](#measurementprocessor),
[MetricProcessors](#metricprocessor), [MetricExporters](#metricexporter) and
[`Views`](#view)) MUST be managed solely by the `MeterProvider` and it MUST
reyang marked this conversation as resolved.
Show resolved Hide resolved
provide some way to configure all of them that are implemented in the SDK, at
least when creating or initializing it.
reyang marked this conversation as resolved.
Show resolved Hide resolved

The `MeterProvider` MAY provide methods to update the configuration. If
cijothomas marked this conversation as resolved.
Show resolved Hide resolved
configuration is updated (e.g., adding a `MetricProcessor`), the updated
configuration MUST also apply to all already returned `Meters` (i.e. it MUST NOT
matter whether a `Meter` was obtained from the `MeterProvider` before or after
the configuration change). Note: Implementation-wise, this could mean that
`Meter` instances have a reference to their `MeterProvider` and access
configuration only via this reference.

### Shutdown

TODO

### ForceFlush
reyang marked this conversation as resolved.
Show resolved Hide resolved

TODO

### Pipeline

A pipeline is a logical group of:

* An ordered list of [View](#view)s
* A [MeasurementProcessor](#measurementprocessor)
* An ordered list of [MetricProcessor](#metricprocessor)s
* A [MetricExporter](#metricexporter)

```text
+----------------------+ +-----------------+
| | | |
Measurements --> MeasurementProcessor +--> In-memory state +--+
reyang marked this conversation as resolved.
Show resolved Hide resolved
| | | | |
+----------------------+ +-----------------+ |
|
+-----------------------------------------------------------+
|
| +-------------------+ +-------------------+
| | | | |
+--> MetricProcessor 1 +--> ... ... -> MetricProcessor N +--+
| | | | |
+-------------------+ +-------------------+ |
|
+-----------------------------------------------------------+
|
| +----------------+
| | |
+--> MetricExporter +--> Another process
| |
+----------------+
```

`MeterProvider` MUST support multiple pipelines:
bogdandrutu marked this conversation as resolved.
Show resolved Hide resolved

```text
+------------------+
| MeterProvider |
| Meter A +-----> Pipeline 1
| Counter X |
| Histogram Y +-----> ...
| Meter B | ...
| Gauge Z |
| ... +-----> Pipeline N
| ... |
+------------------+
```

Each pipeline has an ordered list of [View](#view)s, which provides
configuration for:

* Whether the Measurements from a certain Instrument should go through the
pipeline or not.
* How should the Measurements turn into Metrics.

### View
reyang marked this conversation as resolved.
Show resolved Hide resolved

`View` gives the SDK users flexibility to customize the metrics they want. Here
are some examples when `View` is needed:
reyang marked this conversation as resolved.
Show resolved Hide resolved

* Customize which [Instrument](./api.md#instrument) to be processed/ignored. For
reyang marked this conversation as resolved.
Show resolved Hide resolved
example, an [instrumented library](../glossary.md#instrumented-library) can
provide both temperature and humidity, but the application developer only
wants temperature information.
reyang marked this conversation as resolved.
Show resolved Hide resolved
* Customize the aggregation - if the default aggregation associated with the
Instrument does not meet the expectation. For example, an HTTP client library
reyang marked this conversation as resolved.
Show resolved Hide resolved
might expose HTTP client request duration as [Histogram](./api.md#histogram)
by default, but the application developer only wants the total count of
reyang marked this conversation as resolved.
Show resolved Hide resolved
outgoing requests.
* Customize which attribute(s) to be reported as metrics dimension(s). For
reyang marked this conversation as resolved.
Show resolved Hide resolved
example, an HTTP server library might expose HTTP verb (e.g. GET, POST) and
HTTP status code (e.g. 200, 301, 404). The application developer might only
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this belongs, but adjusting the sampling of Exemplars is likely an aggregator problem. I can write up my thoughts here but I investigated what Prometheus has been up to in this space and I think likely exemplar sampling should be tied to aggregator. If not, we should call it out as a thing users may want to change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsuereth Agree. One simple aggregator for exemplars would use an equal-probability reservoir sampler such as Algorithm R. Then, we can apply an adjusted_count attribute to each exemplar, and a downstream system can combine exemplars into a data set, which can be used to estimate histograms of the full-cardinality event stream.

care about HTTP status code (e.g. reporting the total count of HTTP requests
for each HTTP status code). There are also extreme scenario that the
application developer does not need any dimension (e.g. just get the total
reyang marked this conversation as resolved.
Show resolved Hide resolved
count of all incoming requests).
* Add additional dimension(s) from the [Context](../context/context.md). For
example, a [Baggage](../baggage/api.md) value might be available indicating
whether an HTTP request is coming from a bot/crawler or not. The application
developer might want this to be converted to a dimension for HTTP server
metrics (e.g. the request/second from bots vs. real users).
jmacd marked this conversation as resolved.
Show resolved Hide resolved

The SDK MUST provide the means to register Views with a [MeterProvider
Pipeline](#pipeline). Here are the inputs:

* The Instrument selection criteria (required), which covers:
* The `name` of the Instrument(s), with wildcard support (required).
reyang marked this conversation as resolved.
Show resolved Hide resolved
* The `name` of the Meter (optional).
* The `version` of the Meter (optional).
* The `schema_url` of the Meter (optional).
* Individual language client MAY choose to support more criteria. For example,
a strong typed language MAY support point type (e.g. allow the users to
select Instruments based on whether the underlying type is integer or
double).
* The `name` of the View (optional). If not provided, the Instrument `name`
reyang marked this conversation as resolved.
Show resolved Hide resolved
would be used by default. This will be used as the name of the [metrics
stream](./datamodel.md#events--data-stream--timeseries).
* The configuration for the resulting [metrics
stream](./datamodel.md#events--data-stream--timeseries):
* The `description`. If not provided, the Instrument `description` would be
jmacd marked this conversation as resolved.
Show resolved Hide resolved
used by default.
* The `attribute keys` (optional). If not provided, all the attribute keys
reyang marked this conversation as resolved.
Show resolved Hide resolved
will be used by default (TODO: once the Hint API is available, the default
behavior should respect the Hint if it is available).
* The `extra dimensions` which come from Baggage/Context (optional). If not
provided, no extra dimension will be used. Please note that this only
applies to [synchronous Instruments](./api.md#synchronous-instrument), any
`extra dimensions` configured on [asynchronous
Instruments](./api.md#asynchronous-instrument) will be considered as error.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI - regarding attribute key usage in the JVM prototype, I did the following: https://github.com/jsuereth/opentelemetry-java/blob/wip-metrics-api/sdk/metrics/src/main/java/io/opentelemetry/sdk/metrics/view/AttributesProcessors.java

There's an AttributesProcessor interface that's locked down and provides basic primitives and "join" operations to construct your behavior around altering attributes. Would this meet the specification?

Also what does "considered an error" mean in terms of how to handle this scenario? May want to link to error-handling-in-the-sdk section.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, the metrics SDK spec would want to give some room to the language clients so they can shoot for higher performance. @reyang to follow up with @jsuereth to change the wording in the spec to give that room.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filter attributes from the Measurements.
Take extra attributes from Baggage/Context.
Add extra attributes (most likely to be static per view, e.g. "aggregation=MAX").

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error handling part is covered in the next section, I've removed the "... will be considered as error" part here to avoid duplication.

* The `aggregation` (optional) to be used. If not provided, a default
jmacd marked this conversation as resolved.
Show resolved Hide resolved
aggregation will be applied by the SDK. The default aggregation is a TODO
(e.g. first see if there is an explicitly specified aggregation from the
View, if not, see if there is a Hint, if not, fallback to the default
aggregation that is associated with the Instrument type), and for histogram,
the default buckets would be [0, 5.0], (5.0, 10.0], (10.0, 25.0], (25.0,
reyang marked this conversation as resolved.
Show resolved Hide resolved
50.0], (50.0, 75.0], (75.0, 100.0], (100.0, 250.0], (250.0, 500.0], (500.0,
1000.0], (1000.0, +∞).

The SDK SHOULD use the following logic to determine how to process an
Instrument:
reyang marked this conversation as resolved.
Show resolved Hide resolved

* Determine the MeterProvider which "owns" the Instrument.
* If the MeterProvider has no [Pipeline](#pipeline) registered, go to the END.
* For each Pipeline:
* If the Pipeline has no `View` registered, take the Instrument and apply the
default configuration.
* If the Pipeline has one or more `View`(s) registered, for each View:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, I was sharing views across all pipelines. In this instance, it implies that storage may need to be isolated per-pipeline, as the interaction between View (and specifically Aggregator) with storage might be an implementation detail.

A few questions:

  • Do all pipelines share the same MeasurementProcessor?
  • Do you expect different internal storage of "Metrics" or "Accumlulations" per-pipeline given this architecture?
  • How does this logic line up with the "late-binding" notion of instruments allowed in the API? e.g. for cases where instrumentation grabs an instrument PRIOR to SDK being fully set up, and then trying to backfill the instrument to flow to the configured SDK. Note: That's an "optional" choice for API/SDKs, but I think it has poor interaction here.
  • Do pipelines have to be registered all at once, or can they be added later?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, I think pipelines should be 100% independent of each other (at least, if the user wants to share a processor instance, or whatever across pipelines, that's their decision, not enforced by the SDK).
And, I think all pipelines should be fully configured & registered at MeterProvider creation time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are three layers in a pipeline, Measurement processing -> Metric processing -> Exporting, it seems that the question is whether we can configure multiple pipelines that share common prefixes, for efficiency. As far as the mechanism goes, we just need to support "Multi-" interfaces for each: Multi-measurement processor, multi-metric processor, and multi-exporter. These components would pass individual metric events, accumulations, and aggregations to multiple of the downstream components.

Now, the question seems to be how complicated a configuration language you want to recommend across all OTel SDKs, for users to set up optimized pipelines. I'd like to keep it simple. It would be nice if the user could configure 100% independent pipelines and, if you care to be fancy, the SDK can optimize that configuration into an efficient implementation by sharing pipeline prefixes. Let's not specify how that's done, in other words.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Do all pipelines share the same MeasurementProcessor?

Conceptually no. The SDK can be smart and optimize things if the optimization yields the same outcome from the user perspective.

  • Do you expect different internal storage of "Metrics" or "Accumlulations" per-pipeline given this architecture?

No - from the spec perspective. Whether it is shared or isolated, is an implementation detail.
Yes - from the implementation perspective, I expect some language clients will choose to combine/share the in-memory state for sake of memory efficiency and perf.

  • How does this logic line up with the "late-binding" notion of instruments allowed in the API?> * Do pipelines have to be registered all at once, or can they be added later?

For most of the language clients I would expect what @jkwatson mentioned (#1730 (comment)) - Pipelines are finalized at MeterProvider creation time. However individual implementation can support adding pipelines dynamically if they want, and the spec should allow that. We already covered it in this PR The MeterProvider MAY provide methods to update the configuration, and it is aligned with the tracing sdk spec as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on board with this. Will have to adjust the Java implementation appropriately.

* If the Instrument could match the instrument selection criteria:
* Try to apply the View configuration. If there is an error (e.g. the View
asks for extra dimensions from the Baggage, but the Instrument is
[asynchronous](./api.md#asynchronous-instrument) which doesn't have
Context) or a conflict (e.g. the View requires to export the metrics
using a certain name, but the name is already used by another View),
provide a way to let the user know (e.g. expose [self-diagnostics
logs](../error-handling.md#self-diagnostics)).
* Stop processing the remaining Views (proceed to the next Pipeline).
jmacd marked this conversation as resolved.
Show resolved Hide resolved
* END.

Here are some examples:

```python
# Python
'''
+------------------+
| MeterProvider |
| Meter A |
| Counter X |
| Histogram Y |
| Meter B |
| Gauge Z |
+------------------+
'''

meter_provider.start_pipeline(
# all the metrics will be exported using the default configuration
pipeline: pipeline
.set_exporter(ConsoleExporter())
).start_pipeline(
# metrics from X and Y (reported as Foo and Bar) will be exported to Prometheus upon scraping
pipeline: pipeline
.add_view("X")
.add_view("Foo", instrument_name="Y")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is "Foo"? Here it says that the instrument name is required.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the name of the view, replacing the name of instrument "Y" in the resulting metrics.

.add_view(
"Bar",
instrument_name="Y",
aggregation=HistogramAggregation(buckets=[5.0, 10.0, 25.0, 50.0, 100.0]))
.set_exporter(PrometheusExporter())
)
```

```python
meter_provider.start_pipeline(
# all the metrics will be exported using the default configuration
pipeline: pipeline
.add_view("*") # a wildcard view that matches everything
.set_exporter(ConsoleExporter())
)
```

```python
meter_provider.start_pipeline(
# Counter X will be exported as cumulative sum
pipeline: pipeline
.add_view("X", aggregation=SumAggregation(CUMULATIVE))
reyang marked this conversation as resolved.
Show resolved Hide resolved
.set_exporter(ConsoleExporter())
)
```

```python
meter_provider.start_pipeline(
# Counter X will be exported as delta sum
# Histogram Y and Gauge Z will be exported with 2 dimensions (a and b)
pipeline: pipeline
.add_view("X", aggregation=SumAggregation(DELTA))
.add_view("*", attribute_keys=["a", "b"])
.set_exporter(ConsoleExporter())
)
```

## MeasurementProcessor

Expand Down Expand Up @@ -69,13 +292,13 @@ active span](../trace/api.md#context-interaction)).
+------------------+
| MeterProvider | +----------------------+ +-----------------+
| Meter A | Measurements... | | Metrics... | |
| Instrument X |-----------------> MeasurementProcessor +------------> In-memory state |
| Instrument Y + | | | |
| Instrument X +-----------------> MeasurementProcessor +------------> In-memory state |
| Instrument Y | | | | |
| Meter B | +----------------------+ +-----------------+
| Instrument Z |
| ... | +----------------------+ +-----------------+
| ... | Measurements... | | Metrics... | |
| ... |-----------------> MeasurementProcessor +------------> In-memory state |
| ... +-----------------> MeasurementProcessor +------------> In-memory state |
| ... | | | | |
| ... | +----------------------+ +-----------------+
+------------------+
Expand All @@ -95,20 +318,20 @@ in the SDK:
```text
+-----------------+ +-----------------+ +-----------------------+
| | Metrics... | | Metrics... | |
> In-memory state | -----------> MetricProcessor | -----------> MetricExporter (push) |--> Another process
| In-memory state +------------> MetricProcessor +------------> MetricExporter (push) +--> Another process
| | | | | |
+-----------------+ +-----------------+ +-----------------------+

+-----------------+ +-----------------+ +-----------------------+
| | Metrics... | | Metrics... | |
> In-memory state |------------> MetricProcessor |------------> MetricExporter (pull) |--> Another process (scraper)
| In-memory state +------------> MetricProcessor +------------> MetricExporter (pull) +--> Another process (scraper)
| | | | | |
+-----------------+ +-----------------+ +-----------------------+
```

## MetricExporter

`MetricExporter` defines the interface that protocol-specific exporters must
`MetricExporter` defines the interface that protocol-specific exporters MUST
implement so that they can be plugged into OpenTelemetry SDK and support sending
of telemetry data.

Expand Down