Add the metrics API Hint #1753

reyang · 2021-06-10T00:04:27Z

Related to #1730.

Changes

Added Hint to the metrics API spec.

The SDK PR #1730 is a bit stuck because Instrument, View and Hint are entangled. As discussed in the 06/03/2021 Metric API/SDK SIG Meeting we've decided to make progress on the Hint API spec, so we can reference it in the SDK spec PR.

In this PR I try to focus on why (why do we need Hint at all) and what (what do we need from Hint). The actual "how" part will be addressed in the SDK spec (most likely I will need to update #1730).

Related oteps OTEP146.

yurishkuro · 2021-06-10T00:36:42Z

specification/metrics/api.md

+  side request duration. Without `Hint`, the users of this library will have to
+  figure out how to configure the histogram buckets, which is a non-trivial


How would the user of the library be able to override these buckets? The histogram object (a) would be already initialized with the hint, and (b) most likely would be private to the HTTP library.

From what I understand, from the API side, the instrumentation library can't specify the full configuration of the specifics of the aggregation of the Histogram. They can only provide these hints, and the SDK will use the hints, along with SDK configuration by the operator/application owner to implement the specifics of the aggregation for the instrument.

The SDK will provide View API for the application owners to override the behavior. This will be covered by #1730.

The SDK will be providing the histogram instrument and can hijack its creation. I can show you a demo of this in Java. I think it should work for most other languages too.

yurishkuro · 2021-06-10T18:37:20Z

specification/metrics/api.md

@@ -946,6 +953,69 @@ Asynchronous UpDownCounter is only intended for an asynchronous scenario. The
 only operation is provided by the `callback`, which is registered during the
 [Asynchronous UpDownCounter creation](#asynchronous-updowncounter-creation).

+## Hint


I think "defaults" would be more accurate term than "hint". A hint usually refers to some optimization suggestion that may or may not be used. A default cannot just be ignored, it can only be overriden by explicit view definition.

I think the official OpenTelemetry SDK will try to respect the Hint as much as possible.
3rd party implementation can choose to ignore the Hint completely or partially, and this is why Hint is preferred than Defaults.

I am open to see more feedback/suggestion here.

I'd even suggest the idea behind the hint API is that we can ignore it in the SDK too.

If you think of the hint API as "instrumentation author" suggestions, and SDK as "what will actually happen", I think it works.

One scenario to think through:

today we offer histogram instrument with only explicit bucketing

ideally, instrumentation authors can suggest explicit bucket sizes

in the future, we could offer a new histogram sketch algorithm with better performance and no need for explicit buckets. Can a user select this without having to rework/rewrite Instrumentation?

yurishkuro · 2021-06-10T18:49:13Z

specification/metrics/api.md

+
+[`Meter`](#meter) MUST provide a way to specify `Hint` during the `Instrument`
+creation. The actual `Hint` allowed for each instrument type could be different
+(e.g. buckets would only make sense for `Histogram`).


Why bundle different properties into a single "hint" if they may not be applicable to each type of instrument? Since we're already adding parameters to the API, we could either pass them individually, e.g. defaultBuckets for histogram, or as an optional but strongly typed struct like HistogramDefaults / CounterDefaults / ...

But an even better approach would be to reuse the same View API that the user can use to customize these same parameters, and keep the surface smaller.

I think strong typed languages can have derived types, for example:

BaseHint CounterHint HistogramHint

Having multiple optional parameters could be concerning in the long run:

We might end up adding too many parameters.

We will struggle with the order of these parameters.

Some hints might be entangled (e.g. if parameters A is provided, parameter B must be provided).

But an even better approach would be to reuse the same View API that the user can use to customize these same parameters, and keep the surface smaller.

I agree, and this is one of the goal (View will provide something that Hint doesn't have, but View should be a superset).

I have a similar initial impression as @yurishkuro that we may be creating an unnecessary distinction in categorizing these properties as hints. For example we've talked about using the View API to override the name, the description or the unit. If the defining characteristic of a 'hint' is that it is subject to being overridden (or ignored) then I would say that every existing property on instrument is also a hint.

I do think there could be value in stratifying commonly set properties from less commonly set ones, or stratifying properties that are common to all instruments from properties that are instrument type specific. For example Prometheus.NET separated configuration that was specific to one instrument type like this:

private static readonly Histogram OrderValueHistogram = Metrics .CreateHistogram("myapp_order_value_usd", "Histogram of received order values (in USD).", new HistogramConfiguration { // We divide measurements in 10 buckets of $100 each, up to $1000. Buckets = Histogram.LinearBuckets(start: 100, width: 100, count: 10) });

The principle we pick would affect how we name it and how we decide which properties go where. Personally I find HistogramConfiguration a more intuitive name than HistogramHint

If the defining characteristic of a 'hint' is that it is subject to being overridden (or ignored) then I would say that every existing property on instrument is also a hint.

I think there is a difference - whether a property/hint should be respected by the SDK, or the SDK has freedom to ignore/alter it.

For example, if the library author suggested that by default the histogram buckets should be (start: 10, width: 10, count: 1000), and the user didn't specify any extra configuration/view, the SDK could generate 1000 buckets (respect the hint) OR decide to only use 256 buckets due to the limitation from backend (partially respect the hint or completely ignore the hint).

The principle we pick would affect how we name it and how we decide which properties go where. Personally I find HistogramConfiguration a more intuitive name than HistogramHint.

I agree. Hint/View reminded me of SQL databases. It'll be great if we could consolidate to a single configuration story.
Let me explore this path.

OR decide to only use 256 buckets due to the limitation from backend (partially respect the hint or completely ignore the hint).

If I specified an instrument name that wasn't valid in the backend wouldn't the SDK also be free to alter it? There probably is some line that could be drawn based on exactly which conditions would cause the SDK to override/ignore a particular piece of data but I'd guess users wouldn't find it a meaningful distinction. My estimate of the user mental model is:

Instrument creation - Some data was provided when the instrument is defined.

SDK configuration - Some additional data may be provided when the app author configures the SDK collection behavior.

noahfalk · 2021-06-11T08:33:03Z

specification/metrics/api.md

+
+[`Meter`](#meter) MUST provide a way to specify `Hint` during the `Instrument`
+creation. The actual `Hint` allowed for each instrument type could be different
+(e.g. buckets would only make sense for `Histogram`).


I have a similar initial impression as @yurishkuro that we may be creating an unnecessary distinction in categorizing these properties as hints. For example we've talked about using the View API to override the name, the description or the unit. If the defining characteristic of a 'hint' is that it is subject to being overridden (or ignored) then I would say that every existing property on instrument is also a hint.

I do think there could be value in stratifying commonly set properties from less commonly set ones, or stratifying properties that are common to all instruments from properties that are instrument type specific. For example Prometheus.NET separated configuration that was specific to one instrument type like this:

private static readonly Histogram OrderValueHistogram = Metrics .CreateHistogram("myapp_order_value_usd", "Histogram of received order values (in USD).", new HistogramConfiguration { // We divide measurements in 10 buckets of $100 each, up to $1000. Buckets = Histogram.LinearBuckets(start: 100, width: 100, count: 10) });

The principle we pick would affect how we name it and how we decide which properties go where. Personally I find HistogramConfiguration a more intuitive name than HistogramHint

noahfalk · 2021-06-11T08:47:00Z

specification/metrics/api.md

+
+The following metadata SHOULD be supported by the `Hint`:
+
+* The `attribute keys` (optional).


Do we have well defined behavior for what data is collected when the set of attribute keys differs from the complete set of tags specified at the callsite? To give a very concrete example, do we know what result this code has:

ObservableGauge g = meter.CreateObservableGauge("MyGauge", ... , Keys = { "Color" }, () => GetGaugeValues()); Measurement[] GetGaugeValues() { return new Measurement[] { new Measurement(1, KeyValuePair.Create("Color", "red"), KeyValuePair.Create("Size", "6")); new Measurement(3, KeyValuePair.Create("Color", "red"), KeyValuePair.Create("Size", "9")); new Measurement(10, KeyValuePair.Create("Color", "blue"), KeyValuePair.Create("Size", "2")); }; }

I think this would be covered by View spec : https://github.com/open-telemetry/opentelemetry-specification/pull/1730/files#diff-32d5f289beed96900fa1e6e69a3ee9cc0b6503ce8883672011c44bc772c0fa2aR101-R103

In this example, the SDK will only pick the key "Color", as Hint exists and it says to pick "Color". If Hint did not exist, then both Color and Size would be picked by the SDK.
(Again, a View can override all these)

I see that the view spec defines attribute_keys, but I didn't notice anything in the spec that says how those keys should be interpreted. I assume the intent is that we do spatial aggregation (I didn't notice it written anywhere but maybe I missed it?). However that would only work if spatial aggregation is well defined for a given instrument and I don't know that ObservableGauge has ever defined that?

In this example, the SDK will only pick the key "Color"

Agreed there, I think the hard part is determining what value gets reported. MyGauge,Color="Red" in this reporting interval has value ____ ?

noahfalk · 2021-06-11T08:55:03Z

specification/metrics/api.md

+
+The following metadata SHOULD be supported by the `Hint`:
+
+* The `attribute keys` (optional).


What do folks think of a name like "default dimensions" or "default attributes"? To me "attribute keys" sounds like it accurately describes the type of the data but the intent may be unclear. If others think the intent is clear from the current name I'm happy to chalk it up as personal opinion, I don't feel too strongly about this one.

I picked "attributes" instead of "dimensions" for the following reason:

"dimensions" could come from attributes or somewhere else (e.g. Baggage).

I think the API user (normally library owners) can recommend which attributes to take since they have a better control (they write the code which generates the attributes), but they don't have good control on Context/Baggage/Resource as these could come from other components.

jonatan-ivanov · 2021-06-11T20:33:53Z

specification/metrics/api.md

+    unit="milliseconds",
+    value_type=float,
+    hint={
+      "attribute_keys": ["http.method", "http.status_code"],


I'm not sure I'm getting the use-case of this.
During the instrumentation, the attributes (key+value) will be attached. What extra feature does this provide?
Either the instrumentation attaches an attribute or not. If users want to filter attributes out, they can use the View API in the SDK, right?

My understanding is that it would allow libraries to provide additional attributes which are not enabled by default. (I don't have a good sense whether this is a compelling feature for us to offer)

EDIT: My explanation was confusing, refer to Cijo's explanation instead which is what I intended.

The hint is a "recommendation" given by the library author (who instruments the library), to the final application owner that "I produce a lot of dimensions. You may not want all of them. I recommend you keep these ones. You may ignore my recommendation."

My understanding is that it would allow libraries to provide additional attributes which are not enabled by default.

Mmm. I have a diff. understanding.. As per my understanding, view doesn't have anything to do with whether the library produce additional attributes or not. It always produces dimensions, say (d1,d2,d3,d4). Hint can say "d1,d2". This doesn't mean library only produced "d1,d2". It always produced "d1,d2,d3,d4", but the SDK by default ignores everything other than "d1,d2", SDK's by default obeys the hint. It may be overridden by application owner with explicit View definition.

Perhaps I didn't describe it well, I think my intent matched yours @cijothomas : ) I agree with your description.

My understanding is that it would allow libraries to provide additional attributes which are not enabled by default. (I don't have a good sense whether this is a compelling feature for us to offer)

This does not make too much sense to me since if I as an instrumenter don't add those attributes during the instrumentation, the values won't be there since the user won't be able to get the values.

I produce a lot of dimensions. You may not want all of them. I recommend you keep these ones. You may ignore my recommendation.

This makes much more sense to me. If my understanding is correct, the hint is some kind of static metadata that can be used later on the SDK side to configure things/filter out tags, etc. (as static as the name or the description of the instrument).

If this is correct, I guess the Hint representation will be type-safe for languages that are type-safe and it will not be just a Map/Dictionary of Strings, right?

Yes. The C# example below has Hint, with AttributeKeys (string array) and Buckets (double arrar) as fields.

This does not make too much sense to me

My explanation was intended to be the same as Cijo's, but clearly he did a better job conveying the idea. Pretend I said what he said : )

If you look at the semantic conventions now, there's a lot of optional parameters. Additionally for metrics, cardinality is still a major concern in many cases.

As an advanced use case, an instrumentation author could provide every conceivable label (so users get everything) while providing hints to allow for a "good default" behavior regarding cardinality.

We could go with a simpler approach of "one size fits all" labels and force all otel instrumentation to use a minimal set of labels for more consistent experience.

I understand why the hints are here, but I'd like confirmation from instrumentation authors that this is a real problem and something (an advanced) instrumentation author would want.

Note: we do not expect most end users to provide hints, afaik and we should not have this in "getting started" examples. End users will configure the SDK, this should just be for instrumentation authors.

yurishkuro · 2021-06-15T16:07:52Z

specification/metrics/api.md

+## Hint
+
+`Hint` allows extra metadata to be attached to an [Instrument](#instrument),
+which makes it easier to configure the [SDK](./sdk.md).


I asked already, but it was inline with other discussion. Why does Hint need to be different from the View API?

View != OpenCensus view. In otel, every instrument has a default aggregation that this hint would adjust. IIUC the current thinking for views is that an instrumentation author would not use them, but an end user would.

It's still not clear to me how a views API will interact with default aggregation, nor we've ironed out the rough edges. This comment is a great example. I'd ask first if you buy into the premise of the two different use cases

jsuereth

TL;DR from me:

if we assume that this target instrumentation authors and these are different than end users, the I think this API works
I'd like to hear from instrumentation authors (not API owners) in this thread to see if this is solving a known (and problematic) use case
we probably need to flesh out views SDK first so folks can see the interaction between the two.

jsuereth · 2021-06-19T15:37:17Z

specification/metrics/api.md

@@ -946,6 +953,69 @@ Asynchronous UpDownCounter is only intended for an asynchronous scenario. The
 only operation is provided by the `callback`, which is registered during the
 [Asynchronous UpDownCounter creation](#asynchronous-updowncounter-creation).

+## Hint


I'd even suggest the idea behind the hint API is that we can ignore it in the SDK too.

If you think of the hint API as "instrumentation author" suggestions, and SDK as "what will actually happen", I think it works.

One scenario to think through:

today we offer histogram instrument with only explicit bucketing

ideally, instrumentation authors can suggest explicit bucket sizes

in the future, we could offer a new histogram sketch algorithm with better performance and no need for explicit buckets. Can a user select this without having to rework/rewrite Instrumentation?

jsuereth · 2021-06-19T15:50:10Z

specification/metrics/api.md

+## Hint
+
+`Hint` allows extra metadata to be attached to an [Instrument](#instrument),
+which makes it easier to configure the [SDK](./sdk.md).


View != OpenCensus view. In otel, every instrument has a default aggregation that this hint would adjust. IIUC the current thinking for views is that an instrumentation author would not use them, but an end user would.

It's still not clear to me how a views API will interact with default aggregation, nor we've ironed out the rough edges. This comment is a great example. I'd ask first if you buy into the premise of the two different use cases

jsuereth · 2021-06-19T15:51:25Z

specification/metrics/api.md

+  side request duration. Without `Hint`, the users of this library will have to
+  figure out how to configure the histogram buckets, which is a non-trivial


The SDK will be providing the histogram instrument and can hijack its creation. I can show you a demo of this in Java. I think it should work for most other languages too.

jsuereth · 2021-06-19T15:58:57Z

specification/metrics/api.md

+    unit="milliseconds",
+    value_type=float,
+    hint={
+      "attribute_keys": ["http.method", "http.status_code"],


If you look at the semantic conventions now, there's a lot of optional parameters. Additionally for metrics, cardinality is still a major concern in many cases.

As an advanced use case, an instrumentation author could provide every conceivable label (so users get everything) while providing hints to allow for a "good default" behavior regarding cardinality.

We could go with a simpler approach of "one size fits all" labels and force all otel instrumentation to use a minimal set of labels for more consistent experience.

I understand why the hints are here, but I'd like confirmation from instrumentation authors that this is a real problem and something (an advanced) instrumentation author would want.

Note: we do not expect most end users to provide hints, afaik and we should not have this in "getting started" examples. End users will configure the SDK, this should just be for instrumentation authors.

jkwatson · 2021-06-19T17:13:16Z

I'd like to hear from instrumentation authors (not API owners) in this thread to see if this is solving a known (and problematic) use case

Summoning @iNikem @trask @anuraaga @mateuszrzeszutek

anuraaga · 2021-06-21T04:51:29Z

Is the hint conceptually similar to Micrometer histogram flavor?

https://github.com/micrometer-metrics/micrometer/blob/67c41db552cac463e44cd45545e5a7b8f3513f1b/implementations/micrometer-registry-prometheus/src/main/java/io/micrometer/prometheus/PrometheusDistributionSummary.java#L51

I've worked extensively on one HTTP framework before, Armeria, and we just use micrometer's default config which for prometheus users uses that histogram flavor. Haven't heard of any issues from users following that default.

But the objective of the hint maybe just to define that default similar to what Micrometer does. In that case, I don't think it should be in the API, or at least not like this. Default attributes, and default configuration like buckets, seem like semantic conventions. HTTP durations (or possibly even all durations) have reasonable defaults for buckets, it doesn't matter what framework (Spring, Armeria, gorilla, etc) is used, we'd expect the same defaults. This seems like the scope of the conventions. The conventions define metric names - so SDKs should be able to implement the mapping from metric name to those default values defined in the conventions without adding to the API I believe.

github-actions · 2021-07-01T03:18:24Z

This PR was marked stale due to lack of activity. It will be closed in 7 days.

github-actions · 2021-07-09T03:17:22Z

Closed as inactive. Feel free to reopen if this PR is still being worked on.

anuragagarwal561994 · 2022-12-28T01:20:54Z

@anuraaga it seems a good point that there should be a semantic convention for views like we have for attributes.

So a proposal that I have is like
SemanticView.HTTP_SERVER_RESPONSE_SIZE

This can then be clubbed with a meter for something like a hint API, something like

histogramBuilder.name("http.server.response_size").defaultView(SemanticView.HTTP_SERVER_RESPONSE_SIZE)

Now Views can have a toBuilder() method for a user to customise the semcov views and create their own which they can then assign to an Instrument and we can either choose the default view configured or supplied by the user.

Rest of the specification will follow without any changes and this will also allow users to include additional attributes in their aggregations as they would simply extend the semcov views.

asafm · 2022-12-29T07:33:56Z

@anuraaga @anuragagarwal561994 I see the two ideas suggested as perpendicular to each other:

I think we should have a way to define the histogram aggregation type (explicit buckets, exponential, summary, or custom type) at the site of the creation of that histogram. This definition would also include configuring it: explicit buckets need a bucket list, the summary needs a quantiles list, etc. Here's a code example using Java SDK:

meter.histogramBuilder()
	.setDescription()
	.setUnit()
	.ofLongs()
	// New addition of hints API
	.aggregationType(SummaryAgggregationType.builder()
					   	.setQuantiles(0.5, 0.75, 1.0)
					   	.build())
	.build()

We can have semantic conventions for certain types of metrics, as you said, like HTTP Response size, which would be like:

HTTP_RESPONSE_SIZE_AGGREGATION_TYPE = 
    SummaryAgggregationType.builder()
          .setQuantiles(0.5, 0.75, 1.0)
          .build()

I wouldn't use views in the API:

It's a complicated term. OTel is complicated as it is, and adding yet another terminology to learn on the API level is one step too far IMO for complexity.
View encompasses so much more - You can dictate default aggregation also per a selector (all instruments named HTTP.*) and change the name. Once you define the instrument, you give the name, description, and units, so the only thing left to set is the aggregation type. I would focus on that and not mix in something more extensive in scope, like the view to accomplish that.

There is an issue with this specific suggestion here

add the metrics API hint

4afd4d9

reyang requested review from a team June 10, 2021 00:04

reyang added area:api Cross language API specification issue spec:metrics Related to the specification/metrics directory labels Jun 10, 2021

github-actions bot assigned arminru Jun 10, 2021

reyang added this to the Metrics API/SDK Feature Freeze milestone Jun 10, 2021

update changelog

b12b2c3

yurishkuro reviewed Jun 10, 2021

View reviewed changes

noahfalk reviewed Jun 11, 2021

View reviewed changes

jonatan-ivanov reviewed Jun 11, 2021

View reviewed changes

cijothomas approved these changes Jun 15, 2021

View reviewed changes

yurishkuro reviewed Jun 15, 2021

View reviewed changes

jsuereth reviewed Jun 19, 2021

View reviewed changes

github-actions bot added the Stale label Jul 1, 2021

github-actions bot closed this Jul 9, 2021

reyang mentioned this pull request Jul 27, 2021

Move metrics API to Feature-freeze #1833

Merged

reyang deleted the reyang/metrics-api-hint branch October 4, 2021 17:33

reyang mentioned this pull request Jan 11, 2022

Configuring histograms in the API (hints?) #2229

Closed

reyang mentioned this pull request Mar 23, 2022

Drop histogram aggregation, default to explicit bucket histogram #2429

Merged

lalitb mentioned this pull request Jul 24, 2022

Add configuration options for Aggregation creation open-telemetry/opentelemetry-cpp#1513

Merged

3 tasks

jmacd mentioned this pull request Aug 1, 2022

Add Summary to the Aggregation in the SDK specifications #2704

Closed

anuragagarwal561994 mentioned this pull request Dec 28, 2022

Specify default metric view attributes in the API (hints?) #3061

Closed

jonatan-ivanov mentioned this pull request Jun 15, 2023

REQUEST: New membership for jonatan-ivanov open-telemetry/community#1548

Closed

6 tasks

noahfalk mentioned this pull request Oct 2, 2023

REQUEST: New membership for noahfalk open-telemetry/community#1706

Closed

6 tasks

		side request duration. Without `Hint`, the users of this library will have to
		figure out how to configure the histogram buckets, which is a non-trivial


		The following metadata SHOULD be supported by the `Hint`:

		* The `attribute keys` (optional).

Add the metrics API Hint #1753

Add the metrics API Hint #1753

Conversation

reyang commented Jun 10, 2021

Changes

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

noahfalk Jun 11, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

noahfalk Jun 11, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jsuereth left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jkwatson commented Jun 19, 2021

anuraaga commented Jun 21, 2021

github-actions bot commented Jul 1, 2021

github-actions bot commented Jul 9, 2021

anuragagarwal561994 commented Dec 28, 2022

asafm commented Dec 29, 2022 • edited Loading

noahfalk Jun 11, 2021 •

edited

Loading

noahfalk Jun 11, 2021 •

edited

Loading

asafm commented Dec 29, 2022 •

edited

Loading