Update default exemplar reservoir size for base-2 exponential histogram buckets #3674

Closed
Tracked by #3756
MrAlias opened this issue Aug 24, 2023 · 2 comments · Fixed by #3760
Assignees
Labels
spec:metrics Related to the specification/metrics directory triaged-accepted The issue is triaged and accepted by the OTel community, one can proceed with creating a PR proposal

Comments

@MrAlias
Contributor

MrAlias commented Aug 24, 2023

Originally posted by @jsuereth in #3670 (comment)

I'd suggest limiting to a range of 5-20 exemplars (or some division of the exponential histogram's max size) purely from an overhead standpoint on tracking these points. I don't have hard numbers to back this up, just some anecdotes.

Here's some VERY lazy math with naive assumptions:

Explicit Bucket histograms

  • Default 15 boundaries, 16 buckets
  • 1:1 exemplar to bucket
  • Likely storage per bucket count = 8 bytes
  • Max, min, sum storage = 8*3 = 24 bytes
  • Memory overhead without exemplars: 16*8 + 8*3 = 152 bytes
  • Likely storage per exemplar (best-case pointer to shared attributes, pointer to shared context + measurement + timestamp) = 32 bytes
  • Memory overhead just for exemplars = 16*32 = 512 bytes
  • Total overhead / histogram = 664 bytes

Exponential Bucket Histogram

  • Default 320 buckets + overflow (Let's assume negative measurements don't exist)
  • Likely storage per bucket count = ~2 bytes (assuming we're using "adapting integer" algorithm where we scale arrays as counts grow, and we're in a sweet spot of measurement-counts)
  • max, min, sum storage = 8*3 bytes
  • Memory overhead without exemplars - ~640 + 24 = 664 bytes

Given the key size differential, I don't think we should be spending quite as much memory overhead on exemplars, particularly because exponential buckets are intended to give us very high accuracy percentiles. I think we're better off attempting to grab a lower number of higher-entropy exemplars.
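
The lazy math above can be reproduced with a few lines of Python. This is only a sketch of the comment's arithmetic under its own stated assumptions (8-byte counters, 32-byte exemplars, ~2 bytes per exponential bucket count, overflow bucket ignored as in the 640-byte figure); the function and constant names are made up for illustration and do not come from any SDK.

```python
# Assumed sizes, taken from the estimates in the comment above.
BYTES_PER_COUNTER = 8    # one explicit-bucket count
BYTES_PER_EXEMPLAR = 32  # best-case exemplar (shared attributes/context)
SUMMARY_BYTES = 3 * 8    # max, min, sum


def explicit_histogram_bytes(buckets=16, exemplars=16):
    """Estimated bytes for an explicit bucket histogram."""
    counts = buckets * BYTES_PER_COUNTER
    return counts + SUMMARY_BYTES + exemplars * BYTES_PER_EXEMPLAR


def exponential_histogram_bytes(buckets=320, bytes_per_count=2, exemplars=0):
    """Estimated bytes for an exponential histogram (overflow bucket ignored)."""
    counts = buckets * bytes_per_count
    return counts + SUMMARY_BYTES + exemplars * BYTES_PER_EXEMPLAR


if __name__ == "__main__":
    print(explicit_histogram_bytes(exemplars=0))        # no exemplars
    print(explicit_histogram_bytes())                   # 1:1 exemplars
    print(exponential_histogram_bytes())                # no exemplars
    for n in (5, 20, 320):
        print(n, exponential_histogram_bytes(exemplars=n))
```

Under these assumptions, a 1:1 exemplar-to-bucket reservoir on the exponential histogram would add 320 * 32 = 10240 bytes to a 664-byte structure, while 5 to 20 exemplars would add 160 to 640 bytes.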

@reyang
Member

reyang commented Aug 24, 2023

Another thing for us to consider: maybe most folks would care more about exemplars for the lowest buckets (including the zero count for base-2 exponential bucket histograms) and the highest buckets?
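
One possible reading of this suggestion, sketched in Python: a reservoir that only retains the exemplars for the smallest and largest measurements seen. The `Exemplar` type, its `trace_id` field, and the `ExtremesReservoir` class are hypothetical names used for illustration; nothing here is specified by the issue or the OpenTelemetry specification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Exemplar:
    value: float
    trace_id: str  # hypothetical field, standing in for the trace context


class ExtremesReservoir:
    """Keeps at most two exemplars: the lowest and the highest value seen."""

    def __init__(self) -> None:
        self.lowest: Optional[Exemplar] = None
        self.highest: Optional[Exemplar] = None

    def offer(self, exemplar: Exemplar) -> None:
        if self.lowest is None or exemplar.value < self.lowest.value:
            self.lowest = exemplar
        if self.highest is None or exemplar.value > self.highest.value:
            self.highest = exemplar

    def collect(self) -> List[Exemplar]:
        # Avoid reporting the same exemplar twice when only one was offered.
        out: List[Exemplar] = []
        for e in (self.lowest, self.highest):
            if e is not None and e not in out:
                out.append(e)
        return out
```

A real reservoir would likely combine this with some sampling of the middle of the distribution; this sketch only shows the "extremes" part.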

@MrAlias
Contributor Author

MrAlias commented Aug 31, 2023

I think we could recommend a weighted sampling algorithm for the exponential histogram exemplar reservoirs.

One of the only things we will still need to nail down is the weighting distribution.
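
As a rough illustration of what a weighted sampling reservoir could look like, here is a minimal Python sketch using the Efraimidis-Spirakis A-Res scheme (each item gets a key `u ** (1 / weight)` and the `size` items with the largest keys are kept). This is one well-known weighted reservoir algorithm chosen for the example, not something the issue settles on. The class name, the `offer`/`collect` signatures, and the idea that the caller supplies the weight are all assumptions; the weighting distribution itself is exactly the open question noted above.

```python
import heapq
import random


class WeightedReservoir:
    """Fixed-size weighted reservoir (A-Res); higher weight = more likely kept."""

    def __init__(self, size, rng=None):
        self._size = size
        self._rng = rng or random.Random()
        self._heap = []  # min-heap of (key, sequence, item)
        self._seq = 0    # tie-breaker so items are never compared directly

    def offer(self, item, weight):
        if weight <= 0:
            return  # non-positive weights are never sampled
        u = 1.0 - self._rng.random()  # uniform in (0, 1]
        key = u ** (1.0 / weight)
        self._seq += 1
        entry = (key, self._seq, item)
        if len(self._heap) < self._size:
            heapq.heappush(self._heap, entry)  # initial reservoir fill
        elif key > self._heap[0][0]:
            heapq.heapreplace(self._heap, entry)  # evict the smallest key

    def collect(self):
        return [item for _, _, item in self._heap]
```

For example, a caller could pass a larger weight for measurements that land in rarely hit buckets, which would bias the reservoir toward higher-entropy exemplars while keeping memory bounded by `size`.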

@tigrannajaryan tigrannajaryan added the spec:metrics Related to the specification/metrics directory label Nov 1, 2023
@tigrannajaryan tigrannajaryan added the triaged-accepted The issue is triaged and accepted by the OTel community, one can proceed with creating a PR proposal label Nov 1, 2023
jsuereth added a commit to jsuereth/opentelemetry-specification that referenced this issue Nov 10, 2023
- Update fixed-size defaults to account for memory
  contention/optimisation.
- Fix open-telemetry#3674: Set a default for exponential histograms.
jmacd added a commit that referenced this issue Dec 1, 2023
Fixes #2205
Fixes #3674 
Fixes #3669
Partially fixes #2421

## Changes

- Update example exemplar algorithm to account for initial reservoir
fill
- Update fixed-size defaults to account for memory contention /
optimization in Java impl
- Set a default for exponential histogram aggregation
- Clarify that ExemplarFilter should be configured on MeterProvider
- Make it clear that ONE reservoir is created PER timeseries datapoint
(not one reservoir per view or metric name).
- Allow flexibility in Reservoir `offer` definition based on feedback
from Go impl.

* Related issues #3756

---------

Co-authored-by: David Ashpole <dashpole@google.com>
Co-authored-by: Joshua MacDonald <jmacd@users.noreply.github.com>