Reduce Histogram Cardinality #595

Merged
merged 7 commits into from
Jan 8, 2023


leoparente
Contributor

No description provided.

@leoparente leoparente self-assigned this Jan 6, 2023
@leoparente
Contributor Author

leoparente commented Jan 6, 2023

PR Explanation:

Problem: The current Histogram implementation has high cardinality because of the way its datapoints are calculated. Moreover, because the datapoints differ in almost every bucket, the histogram cannot be plotted properly over time.

Solutions:

First attempt: Use the OpenTelemetry Exponential Histogram. After investigating it, I noticed that there is no way to map an exponential histogram to the Prometheus structure. It is not supported by Prometheus, and it would not be good to write an implementation exclusively for OpenTelemetry (without support for Prometheus and JSON).

This PR's attempt: High cardinality in Prometheus histograms seems to be a common issue in the community. One custom implementation that addresses it is provided by VictoriaMetrics:

The solution — VictoriaMetrics histogram
We at VictoriaMetrics decided fixing these issues, went to drawing board, designed human-friendly easy-to-use Histogram and added it to the lightweight client library for exposing Prometheus-compatible metrics — github.com/VictoriaMetrics/metrics. The Histogram just works:
1- There is no need in thinking about bucket ranges and the number of buckets per histogram, since buckets are created on demand.
2- There is no need in worrying about high cardinality, since only buckets with non-zero values are exposed to Prometheus. Usually real-world values are located on quite small range, so they are covered by small number of histogram buckets.
3- There is no need in re-configuring buckets over time, since bucket configuration is static. This allows performing cross-histogram calculations.

The quoted text comes from this article: https://valyala.medium.com/improving-histogram-usability-for-prometheus-and-grafana-bc7e5df0e350

I've adapted the solution for our datasketches approach.
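
For readers unfamiliar with the idea, here is a minimal Python sketch of the three properties quoted above: a static bucket layout, buckets created on demand, and only non-empty buckets exposed. It is not the code in this PR and not the VictoriaMetrics implementation. The class name `SparseHistogram`, its methods, the label format, and the choice of 18 log-scale buckets per decade are all assumptions made for illustration.

```python
import math
from collections import defaultdict

# Assumed layout for illustration: fixed log-scale buckets, 18 per decade.
BUCKETS_PER_DECADE = 18


class SparseHistogram:
    """Hypothetical histogram with a static layout and on-demand buckets."""

    def __init__(self):
        # Only buckets that received at least one value are ever stored.
        self._counts = defaultdict(int)

    @staticmethod
    def bucket_index(value):
        # The index depends only on the value, never on earlier data,
        # so every instance shares the same bucket boundaries.
        return math.floor(math.log10(value) * BUCKETS_PER_DECADE)

    @staticmethod
    def bucket_range(index):
        # Boundaries of bucket `index` under the assumed layout.
        lower = 10 ** (index / BUCKETS_PER_DECADE)
        upper = 10 ** ((index + 1) / BUCKETS_PER_DECADE)
        return lower, upper

    def update(self, value):
        if value <= 0:
            raise ValueError("this sketch only handles positive values")
        self._counts[self.bucket_index(value)] += 1

    def merge(self, other):
        # Identical boundaries make cross-histogram sums a per-bucket add.
        for index, count in other._counts.items():
            self._counts[index] += count

    def exposed(self):
        """Return {range_label: count} for non-empty buckets only."""
        out = {}
        for index in sorted(self._counts):
            lower, upper = self.bucket_range(index)
            out[f"{lower:.3e}...{upper:.3e}"] = self._counts[index]
        return out


if __name__ == "__main__":
    h = SparseHistogram()
    for v in (0.012, 0.013, 0.5, 0.5, 120.0):
        h.update(v)
    for label, count in h.exposed().items():
        print(label, count)
```

Because the boundaries never depend on the observed data, the set of series an exporter would emit grows only with the spread of the values, and two histograms can be merged bucket by bucket.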

@weyrick
Contributor

weyrick commented Jan 6, 2023

Interesting - seems like a good goal, my only concern is whether we know that existing data/graphs won't change in a significant way if this rolls out? I see you had to change the test to remove some answers, for example

@leoparente
Contributor Author

Interesting - seems like a good goal, my only concern is whether we know that existing data/graphs won't change in a significant way if this rolls out? I see you had to change the test to remove some answers, for example

So far, the Histogram metric is used only by the Netprobe handler, and both features (Histogram and Netprobe) were released in 4.2.0 as BETA:

New Features in BETA (interfaces may still change)
• Flow Support (SFLOW/Netflow/IPFIX). How To Configure and Policies Advanced.
• Netprobe support. Docs.
• Histogram Metric #526
• DNS Handler Version 2.0 - focus on dns transactions (docs soon)

@leoparente leoparente merged commit 288fd71 into develop Jan 8, 2023
@leoparente leoparente deleted the feature/reduce-histogram-cardinality branch January 8, 2023 17:23
2 participants