Add min and max values to the histogram data points #266
Comments
@rakyll I have some questions about the behavior of
So I think what you usually need is to always have "delta" min/max, so that the backend can calculate rollups. So I think this is harder to support, and to make useful, than it may sound.
When we report histogram data points, the min/max temporality should respect the histogram's temporality. I don't know of any backend that allows reporting min/max with a different temporality, e.g. delta min/max for a cumulative histogram. For cumulative histograms, this is a bit less useful because min/max will mostly stay the same, but it will still help the user query the global min/max for a time series with full precision. For deltas, it enables most of the other use cases I mentioned above and the "99%tile of the last X seconds" you mentioned.
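To make the temporality point concrete, here is a minimal Python sketch. `HistogramPoint`, `merge_deltas` and `cumulative_to_delta` are hypothetical names for illustration, not the OTLP message or any SDK API. It assumes both points share the same bucket boundaries. It shows that delta min/max roll up trivially, while differencing two cumulative points recovers counts and sums but not the min/max of the interval.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HistogramPoint:
    """Hypothetical, simplified histogram data point (not the OTLP message)."""
    bucket_counts: List[int]
    count: int
    total: float
    min: Optional[float] = None
    max: Optional[float] = None


def merge_deltas(a: HistogramPoint, b: HistogramPoint) -> HistogramPoint:
    """Roll up two delta points: counts add, min is min of mins, max is max of maxes."""
    mins = [m for m in (a.min, b.min) if m is not None]
    maxs = [m for m in (a.max, b.max) if m is not None]
    return HistogramPoint(
        bucket_counts=[x + y for x, y in zip(a.bucket_counts, b.bucket_counts)],
        count=a.count + b.count,
        total=a.total + b.total,
        min=min(mins) if mins else None,
        max=max(maxs) if maxs else None,
    )


def cumulative_to_delta(prev: HistogramPoint, curr: HistogramPoint) -> HistogramPoint:
    """Difference two cumulative points taken from the same series.

    Counts and sums subtract cleanly. In general the interval's own min/max
    cannot be derived from two lifetime values, so this sketch leaves them unset.
    """
    return HistogramPoint(
        bucket_counts=[c - p for p, c in zip(prev.bucket_counts, curr.bucket_counts)],
        count=curr.count - prev.count,
        total=curr.total - prev.total,
        min=None,
        max=None,
    )
```

For example, merging a delta with min/max of 1.0/7.0 and one with 4.0/15.0 gives 1.0/15.0 for the combined interval. Treating the first point and the merged point as two cumulative reports and differencing them gives back the second interval's counts, but its 4.0 minimum is gone.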
I think this is not that useful, but just because I don't find it useful does not mean it isn't. I would like to hear others' opinions on this.
One thing that I maybe did not clearly specify here is that the majority of backends (all the ones I know do this) store histograms as deltas internally, independent of the temporality in which they are received. By adding min/max to the cumulative histogram we gain zero value, because that data will be dropped: it cannot be transformed into a delta like all the other fields in the Histogram. So the 1M$ question is:
I like the idea of separating min/max into different aggregations. Users can report min/max more frequently if it is decoupled from the histogram. The cons are:
Pros for including min/max:
Cons:
In my opinion, the pros outweigh the cons. Therefore, the DynaHist histogram implementation includes min/max.
I'm in favor of encouraging "bundling" min/max with a histogram to make sure they are reported and we can do interesting things with the data. Misshapen histogram buckets are pretty annoying in practice, and this could help deal with that issue. I think this change is good overall, assuming two things:
Following the discussion on Tuesday, we still have technical details to resolve before we can add min/max to the protocol. The discussion has circled around these points:
Looking toward Prometheus, there's at least one more concern. Data pulled from a /metrics endpoint has no (stateless) way to encode a delta, but it is equally undesirable to report the lifetime min and max of a series. Therefore, Prometheus Summary data types expose the min and max value (i.e., φ = 0 and φ = 1) over a fixed window of time. I believe the best outcome for a Prometheus user will be to emulate the behavior of the Summary data type's min and max fields for OTLP Histograms. This is practically the only option when OTLP's aggregation temporality is cumulative. For OTLP delta aggregation temporality, we have two options. The first option is for delta to behave exactly as cumulative temporality would, which allows a change of temporality to be considered a true no-op for histograms. The second option is for delta to output min and max precisely, meaning it reports the actual min/max values over short time windows. I have a slight preference here, but I think both behaviors should be considered valid.
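As a rough illustration of min/max "over a fixed window of time", here is a small Python sketch. `WindowedMinMax` is a hypothetical class written for this thread, not how any Prometheus client implements Summary. It assumes timestamps are passed in explicitly and never go backwards, and it scans the retained samples on each read rather than using a more efficient structure.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class WindowedMinMax:
    """Hypothetical tracker for min/max over the trailing `window_seconds`."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._samples: Deque[Tuple[float, float]] = deque()  # (timestamp, value)

    def record(self, value: float, now: float) -> None:
        """Store a measurement taken at time `now` (assumed non-decreasing)."""
        self._samples.append((now, value))
        self._evict(now)

    def _evict(self, now: float) -> None:
        # Drop samples that are at least one full window old.
        cutoff = now - self.window_seconds
        while self._samples and self._samples[0][0] <= cutoff:
            self._samples.popleft()

    def snapshot(self, now: float) -> Tuple[Optional[float], Optional[float]]:
        """Return (min, max) over the window ending at `now`, or (None, None)."""
        self._evict(now)
        if not self._samples:
            return None, None
        values = [v for _, v in self._samples]
        return min(values), max(values)
```

With a 10 second window, a spike recorded at t=1 shows up in the max until it ages out, after which the reported max falls back to the recent values. Neither a lifetime max nor a per-collection delta would give that behavior.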
Please review #279.
Today, OpenTelemetry doesn't report the min and max values collected for a histogram data point, which breaks the ability to query absolute min and max values from a time series. We propose adding min and max values to the histogram data types. When users set boundaries explicitly, the lack of precise min and max values prevents them from seeing whether their bucket boundaries are well chosen.
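A small Python sketch of why bucket counts alone cannot answer this. `min_max_bounds` is a hypothetical helper, and it assumes explicit boundaries with buckets of the form (lower, upper] plus open-ended first and last buckets. All it can return is the range in which the true min and max must lie.

```python
import math
from typing import List, Optional, Tuple

Range = Tuple[float, float]


def min_max_bounds(
    boundaries: List[float], bucket_counts: List[int]
) -> Optional[Tuple[Range, Range]]:
    """Return the (lower, upper] ranges that must contain the true min and max.

    Assumes len(bucket_counts) == len(boundaries) + 1, with the first bucket
    unbounded below and the last bucket unbounded above.
    """
    edges = [-math.inf] + list(boundaries) + [math.inf]
    occupied = [i for i, c in enumerate(bucket_counts) if c > 0]
    if not occupied:
        return None  # no measurements recorded
    first, last = occupied[0], occupied[-1]
    return (edges[first], edges[first + 1]), (edges[last], edges[last + 1])


# Boundaries 10, 50, 100 with some values landing in the overflow bucket.
print(min_max_bounds([10, 50, 100], [0, 4, 3, 2]))
```

Here the min is only known to lie in (10, 50] and the max somewhere above 100. Whether the largest value was 101 or 1,000,000 is invisible, which is exactly the signal a user needs to decide whether to add higher boundaries.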
Here are some use cases where being able to query absolute min and max values from a series is critical: