Histogram support multiple boundaries types vs different types of histogram aggregation #259

bogdandrutu · 2021-02-12T04:53:19Z

During the meeting on Thu, Feb 11, there was a discussion about our data model supporting new aggregation types, and the answer was clearly we do support that, but that comment triggered me a new idea.

Instead of deprecating the current explicit boundaries from our current definition of Histogram, what if we just add a new type ExponentialHistogram, and that combined with #256 and #257 will keep the number of types at a reasonable number.

Advantages of this solution is that #226 can be a simple addition to the current data model, and also can improve the way how counts are stored for large number of buckets (like using int64 instead of fixed64, or even using other encoding that does not require to have a value for 0 buckets, etc.).

Another advantage is that we don't have to deal with situation where points in the same histogram metric have different boundaries types.

I would like to propose this for discussion and make a decision about this. I am not 100% convinced on one or the other, but I try to propose this possibility.

/cc @jmacd @open-telemetry/specs-metrics-approvers

jmacd · 2021-02-24T07:16:35Z

I've started thinking about one more alternative to your suggestion, also motivated by the comments in the tail of #226, in particular
#226 (comment).

The idea is to combine the Discretization and Compaction fields into a single type, a oneof various Buckets. In the existing histogram we have

message ExplicitCountBuckets {
  repeated double explicit_bounds = 1;
  repeated fixed64 bucket_counts = 2;
}

These two fields determine both buckets (Discretization) and counts (Compaction). I've been thinking about one other equally explicit form (mentioned in today's Data Model meeting):

message ExplicitSampleBuckets {
  repeated double explicit_bounds = 1;
  repeated double bucket_counts = 2;
}

This type is identical to ExplicitCountBuckets, but counts are floating-point values. This has two uses:

Statsd sampled histograms can be conveyed correctly (for arbitrary sampling rates) using this encoding
Prometheus summaries can be expressed in this way (independent of differences in temporality)

In the first example, a statsd h-type metric with @0.4 sample rate (i.e., 40% sampling), a single observation should output a count of 2.5 in the histogram. (Note: I also need what is presently fixed64 count = 4; to become double count = 4.)

In the second example, a Prometheus summary for quantiles [0.25, 0.5, 0.75, 0.9, 0.95, 0.99] with total count C, total sum S.

Let the calculated quantile values equal the calculated values P25, P50, P75, P90, P95, P99. The output histogram is as follows:

count: C
sum: S
explicit_bounds: [P25, P50, P75, P90, P95, P99]
bucket_counts: [
  (0.25-0)*C,
  (0.5-0.25)*C,
  (0.75-0.5)*C,
  (0.9-0.75)*C,
  (0.95-0.9)*C,
  (0.99-0.95)*C,
  (1.0-0.99)*C, 
]

Note that the final count bucket is redundant, since it represents (P99,+Inf), equal to the total count minus the sum of all bucket counts. (By this logic we could shorten bucket_counts by one, if we pleased, in the present-day explicit-boundary histogram.)

The suggestion in @oertl's comment, bullets 1 and 2, could be summarized as follows:
(a) Each bucketing scheme should be able to decide whether it can merge with another of the same type but potentially different parameterization. As @postwait says, this should only be done when the schemes support non-lossy merging.
(b) Fancy schemes support auto-combining buckets with compatible parameters (e.g., UDDSketch) can be added later.

Anyway, please also consider this approach.

I'm looking forward to @yzhuge's continued work on this, #226 (comment), and want to thank @githomin for this reminder:

I think we should keep in mind the goal of providing histograms with relative accuracy guarantees where users don't have to set any parameters

yzhuge · 2021-03-03T23:52:13Z

I like the idea of each histogram types having its own count encoding. I made the bounds share one common count array in #226 only because I wanted to minimize changes. Since we haven't actually released the protocol, minimizing changes or backward compatibility is not a big concern. Thus we should be more forward looking than backward looking. To be clear, future version could look like:

oneof buckets {
ExplicitCountBuckets explicitCountBuckets;
ExplicitSampleBuckets explicitSampleBuckets;
ExponentialBuckets exponentialBuckets;
}

Each "Buckets" message will have its own bounds and counts.

jmacd · 2021-03-04T00:18:46Z

@yzhuge I am currently working on a PR that does this basic factoring. I am going to propose that we ignore the difference between ExplicitCountBuckets and ExplicitSampleBuckets at this point and let all counts be doubles, but I will do that in a separate PR. Also I will never again mention this idea about Summary metrics. 😀

Fixes: open-telemetry#259 Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>

Fixes: #259 Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>

bogdandrutu changed the title ~~Histogram support multiple boundaries types vs different types of histogram aggregatio~~ Histogram support multiple boundaries types vs different types of histogram aggregation Feb 12, 2021

bogdandrutu added area:data-model spec:metrics labels Feb 12, 2021

bogdandrutu mentioned this issue Feb 12, 2021

Deprecate old explicit bounds, add message for explicit bounds #255

Closed

jmacd mentioned this issue Feb 24, 2021

Add more histogram types #226

Closed

This was referenced Mar 4, 2021

Create a oneof for Histogram buckets; add ExplicitBuckets; deprecate DoubleHistogram #272

Closed

Decouple Number Type from Point Kind in OTLP protocol #264

Closed

bogdandrutu added a commit to bogdandrutu/opentelemetry-proto that referenced this issue Apr 1, 2021

Fix histogram dapa point comment about supported boundaries

48809a5

Fixes: open-telemetry#259 Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>

bogdandrutu mentioned this issue Apr 1, 2021

Fix histogram data point comment about supported boundaries #291

Merged

bogdandrutu added a commit to bogdandrutu/opentelemetry-proto that referenced this issue Apr 2, 2021

Fix histogram dapa point comment about supported boundaries

f68570d

Fixes: open-telemetry#259 Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>

bogdandrutu closed this as completed in #291 Apr 12, 2021

bogdandrutu added a commit that referenced this issue Apr 12, 2021

Fix histogram dapa point comment about supported boundaries (#291)

493f9f8

Fixes: #259 Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Histogram support multiple boundaries types vs different types of histogram aggregation #259

Histogram support multiple boundaries types vs different types of histogram aggregation #259

bogdandrutu commented Feb 12, 2021 •

edited

Loading

jmacd commented Feb 24, 2021

yzhuge commented Mar 3, 2021

jmacd commented Mar 4, 2021

Histogram support multiple boundaries types vs different types of histogram aggregation #259

Histogram support multiple boundaries types vs different types of histogram aggregation #259

Comments

bogdandrutu commented Feb 12, 2021 • edited Loading

jmacd commented Feb 24, 2021

yzhuge commented Mar 3, 2021

jmacd commented Mar 4, 2021

bogdandrutu commented Feb 12, 2021 •

edited

Loading