Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the metrics API Hint #1753

Closed
wants to merge 2 commits into from

Conversation

reyang
Copy link
Member

@reyang reyang commented Jun 10, 2021

Related to #1730.

Changes

  • Added Hint to the metrics API spec.

The SDK PR #1730 is a bit stuck because Instrument, View and Hint are entangled. As discussed in the 06/03/2021 Metric API/SDK SIG Meeting we've decided to make progress on the Hint API spec, so we can reference it in the SDK spec PR.

In this PR I try to focus on why (why do we need Hint at all) and what (what do we need from Hint). The actual "how" part will be addressed in the SDK spec (most likely I will need to update #1730).

Related oteps OTEP146.

@reyang reyang requested review from a team June 10, 2021 00:04
@reyang reyang added area:api Cross language API specification issue spec:metrics Related to the specification/metrics directory labels Jun 10, 2021
@reyang reyang added this to the Metrics API/SDK Feature Freeze milestone Jun 10, 2021
Comment on lines +964 to +965
side request duration. Without `Hint`, the users of this library will have to
figure out how to configure the histogram buckets, which is a non-trivial
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would the user of the library be able to override these buckets? The histogram object (a) would be already initialized with the hint, and (b) most likely would be private to the HTTP library.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I understand, from the API side, the instrumentation library can't specify the full configuration of the specifics of the aggregation of the Histogram. They can only provide these hints, and the SDK will use the hints, along with SDK configuration by the operator/application owner to implement the specifics of the aggregation for the instrument.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SDK will provide View API for the application owners to override the behavior. This will be covered by #1730.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SDK will be providing the histogram instrument and can hijack its creation. I can show you a demo of this in Java. I think it should work for most other languages too.

@@ -946,6 +953,69 @@ Asynchronous UpDownCounter is only intended for an asynchronous scenario. The
only operation is provided by the `callback`, which is registered during the
[Asynchronous UpDownCounter creation](#asynchronous-updowncounter-creation).

## Hint
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "defaults" would be more accurate term than "hint". A hint usually refers to some optimization suggestion that may or may not be used. A default cannot just be ignored, it can only be overriden by explicit view definition.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the official OpenTelemetry SDK will try to respect the Hint as much as possible.
3rd party implementation can choose to ignore the Hint completely or partially, and this is why Hint is preferred than Defaults.

I am open to see more feedback/suggestion here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd even suggest the idea behind the hint API is that we can ignore it in the SDK too.

If you think of the hint API as "instrumentation author" suggestions, and SDK as "what will actually happen", I think it works.

One scenario to think through:

  • today we offer histogram instrument with only explicit bucketing
  • ideally, instrumentation authors can suggest explicit bucket sizes
  • in the future, we could offer a new histogram sketch algorithm with better performance and no need for explicit buckets. Can a user select this without having to rework/rewrite Instrumentation?


[`Meter`](#meter) MUST provide a way to specify `Hint` during the `Instrument`
creation. The actual `Hint` allowed for each instrument type could be different
(e.g. buckets would only make sense for `Histogram`).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why bundle different properties into a single "hint" if they may not be applicable to each type of instrument? Since we're already adding parameters to the API, we could either pass them individually, e.g. defaultBuckets for histogram, or as an optional but strongly typed struct like HistogramDefaults / CounterDefaults / ...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But an even better approach would be to reuse the same View API that the user can use to customize these same parameters, and keep the surface smaller.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think strong typed languages can have derived types, for example:

BaseHint
    CounterHint
    HistogramHint

Having multiple optional parameters could be concerning in the long run:

  1. We might end up adding too many parameters.
  2. We will struggle with the order of these parameters.
  3. Some hints might be entangled (e.g. if parameters A is provided, parameter B must be provided).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But an even better approach would be to reuse the same View API that the user can use to customize these same parameters, and keep the surface smaller.

I agree, and this is one of the goal (View will provide something that Hint doesn't have, but View should be a superset).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a similar initial impression as @yurishkuro that we may be creating an unnecessary distinction in categorizing these properties as hints. For example we've talked about using the View API to override the name, the description or the unit. If the defining characteristic of a 'hint' is that it is subject to being overridden (or ignored) then I would say that every existing property on instrument is also a hint.

I do think there could be value in stratifying commonly set properties from less commonly set ones, or stratifying properties that are common to all instruments from properties that are instrument type specific. For example Prometheus.NET separated configuration that was specific to one instrument type like this:

private static readonly Histogram OrderValueHistogram = Metrics
    .CreateHistogram("myapp_order_value_usd", "Histogram of received order values (in USD).",
        new HistogramConfiguration
        {
            // We divide measurements in 10 buckets of $100 each, up to $1000.
            Buckets = Histogram.LinearBuckets(start: 100, width: 100, count: 10)
        });

The principle we pick would affect how we name it and how we decide which properties go where. Personally I find HistogramConfiguration a more intuitive name than HistogramHint

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the defining characteristic of a 'hint' is that it is subject to being overridden (or ignored) then I would say that every existing property on instrument is also a hint.

I think there is a difference - whether a property/hint should be respected by the SDK, or the SDK has freedom to ignore/alter it.

For example, if the library author suggested that by default the histogram buckets should be (start: 10, width: 10, count: 1000), and the user didn't specify any extra configuration/view, the SDK could generate 1000 buckets (respect the hint) OR decide to only use 256 buckets due to the limitation from backend (partially respect the hint or completely ignore the hint).

The principle we pick would affect how we name it and how we decide which properties go where. Personally I find HistogramConfiguration a more intuitive name than HistogramHint.

I agree. Hint/View reminded me of SQL databases. It'll be great if we could consolidate to a single configuration story.
Let me explore this path.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OR decide to only use 256 buckets due to the limitation from backend (partially respect the hint or completely ignore the hint).

If I specified an instrument name that wasn't valid in the backend wouldn't the SDK also be free to alter it? There probably is some line that could be drawn based on exactly which conditions would cause the SDK to override/ignore a particular piece of data but I'd guess users wouldn't find it a meaningful distinction. My estimate of the user mental model is:

  1. Instrument creation - Some data was provided when the instrument is defined.
  2. SDK configuration - Some additional data may be provided when the app author configures the SDK collection behavior.


[`Meter`](#meter) MUST provide a way to specify `Hint` during the `Instrument`
creation. The actual `Hint` allowed for each instrument type could be different
(e.g. buckets would only make sense for `Histogram`).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a similar initial impression as @yurishkuro that we may be creating an unnecessary distinction in categorizing these properties as hints. For example we've talked about using the View API to override the name, the description or the unit. If the defining characteristic of a 'hint' is that it is subject to being overridden (or ignored) then I would say that every existing property on instrument is also a hint.

I do think there could be value in stratifying commonly set properties from less commonly set ones, or stratifying properties that are common to all instruments from properties that are instrument type specific. For example Prometheus.NET separated configuration that was specific to one instrument type like this:

private static readonly Histogram OrderValueHistogram = Metrics
    .CreateHistogram("myapp_order_value_usd", "Histogram of received order values (in USD).",
        new HistogramConfiguration
        {
            // We divide measurements in 10 buckets of $100 each, up to $1000.
            Buckets = Histogram.LinearBuckets(start: 100, width: 100, count: 10)
        });

The principle we pick would affect how we name it and how we decide which properties go where. Personally I find HistogramConfiguration a more intuitive name than HistogramHint


The following metadata SHOULD be supported by the `Hint`:

* The `attribute keys` (optional).
Copy link

@noahfalk noahfalk Jun 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have well defined behavior for what data is collected when the set of attribute keys differs from the complete set of tags specified at the callsite? To give a very concrete example, do we know what result this code has:

ObservableGauge g = meter.CreateObservableGauge("MyGauge", ... , Keys = { "Color" }, () => GetGaugeValues());

Measurement[] GetGaugeValues()
{
   return new Measurement[] 
   {
        new Measurement(1, KeyValuePair.Create("Color", "red"), KeyValuePair.Create("Size", "6"));
        new Measurement(3, KeyValuePair.Create("Color", "red"), KeyValuePair.Create("Size", "9"));
        new Measurement(10, KeyValuePair.Create("Color", "blue"), KeyValuePair.Create("Size", "2"));
   };
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be covered by View spec : https://github.com/open-telemetry/opentelemetry-specification/pull/1730/files#diff-32d5f289beed96900fa1e6e69a3ee9cc0b6503ce8883672011c44bc772c0fa2aR101-R103

In this example, the SDK will only pick the key "Color", as Hint exists and it says to pick "Color". If Hint did not exist, then both Color and Size would be picked by the SDK.
(Again, a View can override all these)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that the view spec defines attribute_keys, but I didn't notice anything in the spec that says how those keys should be interpreted. I assume the intent is that we do spatial aggregation (I didn't notice it written anywhere but maybe I missed it?). However that would only work if spatial aggregation is well defined for a given instrument and I don't know that ObservableGauge has ever defined that?

In this example, the SDK will only pick the key "Color"

Agreed there, I think the hard part is determining what value gets reported. MyGauge,Color="Red" in this reporting interval has value ____ ?


The following metadata SHOULD be supported by the `Hint`:

* The `attribute keys` (optional).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do folks think of a name like "default dimensions" or "default attributes"? To me "attribute keys" sounds like it accurately describes the type of the data but the intent may be unclear. If others think the intent is clear from the current name I'm happy to chalk it up as personal opinion, I don't feel too strongly about this one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I picked "attributes" instead of "dimensions" for the following reason:

  1. "dimensions" could come from attributes or somewhere else (e.g. Baggage).
  2. I think the API user (normally library owners) can recommend which attributes to take since they have a better control (they write the code which generates the attributes), but they don't have good control on Context/Baggage/Resource as these could come from other components.

unit="milliseconds",
value_type=float,
hint={
"attribute_keys": ["http.method", "http.status_code"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I'm getting the use-case of this.
During the instrumentation, the attributes (key+value) will be attached. What extra feature does this provide?
Either the instrumentation attaches an attribute or not. If users want to filter attributes out, they can use the View API in the SDK, right?

Copy link

@noahfalk noahfalk Jun 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that it would allow libraries to provide additional attributes which are not enabled by default. (I don't have a good sense whether this is a compelling feature for us to offer)

EDIT: My explanation was confusing, refer to Cijo's explanation instead which is what I intended.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hint is a "recommendation" given by the library author (who instruments the library), to the final application owner that "I produce a lot of dimensions. You may not want all of them. I recommend you keep these ones. You may ignore my recommendation."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that it would allow libraries to provide additional attributes which are not enabled by default.

Mmm. I have a diff. understanding.. As per my understanding, view doesn't have anything to do with whether the library produce additional attributes or not. It always produces dimensions, say (d1,d2,d3,d4). Hint can say "d1,d2". This doesn't mean library only produced "d1,d2". It always produced "d1,d2,d3,d4", but the SDK by default ignores everything other than "d1,d2", SDK's by default obeys the hint. It may be overridden by application owner with explicit View definition.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps I didn't describe it well, I think my intent matched yours @cijothomas : ) I agree with your description.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that it would allow libraries to provide additional attributes which are not enabled by default. (I don't have a good sense whether this is a compelling feature for us to offer)

This does not make too much sense to me since if I as an instrumenter don't add those attributes during the instrumentation, the values won't be there since the user won't be able to get the values.

I produce a lot of dimensions. You may not want all of them. I recommend you keep these ones. You may ignore my recommendation.

This makes much more sense to me. If my understanding is correct, the hint is some kind of static metadata that can be used later on the SDK side to configure things/filter out tags, etc. (as static as the name or the description of the instrument).

If this is correct, I guess the Hint representation will be type-safe for languages that are type-safe and it will not be just a Map/Dictionary of Strings, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The C# example below has Hint, with AttributeKeys (string array) and Buckets (double arrar) as fields.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not make too much sense to me

My explanation was intended to be the same as Cijo's, but clearly he did a better job conveying the idea. Pretend I said what he said : )

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at the semantic conventions now, there's a lot of optional parameters. Additionally for metrics, cardinality is still a major concern in many cases.

As an advanced use case, an instrumentation author could provide every conceivable label (so users get everything) while providing hints to allow for a "good default" behavior regarding cardinality.

We could go with a simpler approach of "one size fits all" labels and force all otel instrumentation to use a minimal set of labels for more consistent experience.

I understand why the hints are here, but I'd like confirmation from instrumentation authors that this is a real problem and something (an advanced) instrumentation author would want.

Note: we do not expect most end users to provide hints, afaik and we should not have this in "getting started" examples. End users will configure the SDK, this should just be for instrumentation authors.

## Hint

`Hint` allows extra metadata to be attached to an [Instrument](#instrument),
which makes it easier to configure the [SDK](./sdk.md).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I asked already, but it was inline with other discussion. Why does Hint need to be different from the View API?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

View != OpenCensus view. In otel, every instrument has a default aggregation that this hint would adjust. IIUC the current thinking for views is that an instrumentation author would not use them, but an end user would.

It's still not clear to me how a views API will interact with default aggregation, nor we've ironed out the rough edges. This comment is a great example. I'd ask first if you buy into the premise of the two different use cases

Copy link
Contributor

@jsuereth jsuereth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TL;DR from me:

  • if we assume that this target instrumentation authors and these are different than end users, the I think this API works
  • I'd like to hear from instrumentation authors (not API owners) in this thread to see if this is solving a known (and problematic) use case
  • we probably need to flesh out views SDK first so folks can see the interaction between the two.

@@ -946,6 +953,69 @@ Asynchronous UpDownCounter is only intended for an asynchronous scenario. The
only operation is provided by the `callback`, which is registered during the
[Asynchronous UpDownCounter creation](#asynchronous-updowncounter-creation).

## Hint
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd even suggest the idea behind the hint API is that we can ignore it in the SDK too.

If you think of the hint API as "instrumentation author" suggestions, and SDK as "what will actually happen", I think it works.

One scenario to think through:

  • today we offer histogram instrument with only explicit bucketing
  • ideally, instrumentation authors can suggest explicit bucket sizes
  • in the future, we could offer a new histogram sketch algorithm with better performance and no need for explicit buckets. Can a user select this without having to rework/rewrite Instrumentation?

## Hint

`Hint` allows extra metadata to be attached to an [Instrument](#instrument),
which makes it easier to configure the [SDK](./sdk.md).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

View != OpenCensus view. In otel, every instrument has a default aggregation that this hint would adjust. IIUC the current thinking for views is that an instrumentation author would not use them, but an end user would.

It's still not clear to me how a views API will interact with default aggregation, nor we've ironed out the rough edges. This comment is a great example. I'd ask first if you buy into the premise of the two different use cases

Comment on lines +964 to +965
side request duration. Without `Hint`, the users of this library will have to
figure out how to configure the histogram buckets, which is a non-trivial
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SDK will be providing the histogram instrument and can hijack its creation. I can show you a demo of this in Java. I think it should work for most other languages too.

unit="milliseconds",
value_type=float,
hint={
"attribute_keys": ["http.method", "http.status_code"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at the semantic conventions now, there's a lot of optional parameters. Additionally for metrics, cardinality is still a major concern in many cases.

As an advanced use case, an instrumentation author could provide every conceivable label (so users get everything) while providing hints to allow for a "good default" behavior regarding cardinality.

We could go with a simpler approach of "one size fits all" labels and force all otel instrumentation to use a minimal set of labels for more consistent experience.

I understand why the hints are here, but I'd like confirmation from instrumentation authors that this is a real problem and something (an advanced) instrumentation author would want.

Note: we do not expect most end users to provide hints, afaik and we should not have this in "getting started" examples. End users will configure the SDK, this should just be for instrumentation authors.

@jkwatson
Copy link
Contributor

  • I'd like to hear from instrumentation authors (not API owners) in this thread to see if this is solving a known (and problematic) use case

Summoning @iNikem @trask @anuraaga @mateuszrzeszutek

@anuraaga
Copy link
Contributor

Is the hint conceptually similar to Micrometer histogram flavor?

https://github.com/micrometer-metrics/micrometer/blob/67c41db552cac463e44cd45545e5a7b8f3513f1b/implementations/micrometer-registry-prometheus/src/main/java/io/micrometer/prometheus/PrometheusDistributionSummary.java#L51

I've worked extensively on one HTTP framework before, Armeria, and we just use micrometer's default config which for prometheus users uses that histogram flavor. Haven't heard of any issues from users following that default.

But the objective of the hint maybe just to define that default similar to what Micrometer does. In that case, I don't think it should be in the API, or at least not like this. Default attributes, and default configuration like buckets, seem like semantic conventions. HTTP durations (or possibly even all durations) have reasonable defaults for buckets, it doesn't matter what framework (Spring, Armeria, gorilla, etc) is used, we'd expect the same defaults. This seems like the scope of the conventions. The conventions define metric names - so SDKs should be able to implement the mapping from metric name to those default values defined in the conventions without adding to the API I believe.

@github-actions
Copy link

github-actions bot commented Jul 1, 2021

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jul 1, 2021
@github-actions
Copy link

github-actions bot commented Jul 9, 2021

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@anuragagarwal561994
Copy link

@anuraaga it seems a good point that there should be a semantic convention for views like we have for attributes.

So a proposal that I have is like
SemanticView.HTTP_SERVER_RESPONSE_SIZE

This can then be clubbed with a meter for something like a hint API, something like

histogramBuilder.name("http.server.response_size").defaultView(SemanticView.HTTP_SERVER_RESPONSE_SIZE)

Now Views can have a toBuilder() method for a user to customise the semcov views and create their own which they can then assign to an Instrument and we can either choose the default view configured or supplied by the user.

Rest of the specification will follow without any changes and this will also allow users to include additional attributes in their aggregations as they would simply extend the semcov views.

@asafm
Copy link
Contributor

asafm commented Dec 29, 2022

@anuraaga @anuragagarwal561994 I see the two ideas suggested as perpendicular to each other:

  1. I think we should have a way to define the histogram aggregation type (explicit buckets, exponential, summary, or custom type) at the site of the creation of that histogram. This definition would also include configuring it: explicit buckets need a bucket list, the summary needs a quantiles list, etc. Here's a code example using Java SDK:
meter.histogramBuilder()
	.setDescription()
	.setUnit()
	.ofLongs()
	// New addition of hints API
	.aggregationType(SummaryAgggregationType.builder()
					   	.setQuantiles(0.5, 0.75, 1.0)
					   	.build())
	.build()				   	
  1. We can have semantic conventions for certain types of metrics, as you said, like HTTP Response size, which would be like:
HTTP_RESPONSE_SIZE_AGGREGATION_TYPE = 
    SummaryAgggregationType.builder()
          .setQuantiles(0.5, 0.75, 1.0)
          .build()

I wouldn't use views in the API:

  1. It's a complicated term. OTel is complicated as it is, and adding yet another terminology to learn on the API level is one step too far IMO for complexity.
  2. View encompasses so much more - You can dictate default aggregation also per a selector (all instruments named HTTP.*) and change the name. Once you define the instrument, you give the name, description, and units, so the only thing left to set is the aggregation type. I would focus on that and not mix in something more extensive in scope, like the view to accomplish that.

There is an issue with this specific suggestion here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:api Cross language API specification issue spec:metrics Related to the specification/metrics directory Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.