[metricbeat] independent events based on le for prometheus histograms #12446
Comments
Wouldn't this explode the number of events we have to store? Basically meaning we have 1 event per entry? What does @odacremolbap Could you share some of the queries you want to run on the data?
It would definitely create more events (for the histogram metric type only). We already made this change in the Prometheus collector; this change would align the helper with it.
This change should allow to perform
getting each bucket expanded increases storage requirements and also CPU/mem/time when processing. before measuring it and posting here for consideration, I am trying to come up with an alternative, no luck so far:
that structure above is not searchable as is because of how (non nested) arrays work, but I'm wondering if a visualization internally retrieves the doc by timestamp and then parses the content, in which case a solution would be close at hand: just expanding inside one event vs sending one event per expanded value at the histogram/summary.
I'm ingesting at elasticsearch using this template:
At the visualization the terms aggregation by there are also some glitches with a 0 I'll try changing types for all
at the kubernetes apiserver metricset
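The point above about (non nested) arrays can be illustrated with a small simulation. Elasticsearch stores an array of objects under a plain object mapping as flattened multi-value fields, so the pairing between the values inside each object is lost. The sketch below imitates that flattening in plain Python; the bucket layout, the field names, and both helper functions are hypothetical and are not part of Elasticsearch or metricbeat.

```python
from typing import Any, Dict, List


def flatten_object_array(field: str, objects: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Imitate how a non-nested array of objects ends up as multi-value fields."""
    flat: Dict[str, List[Any]] = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(f"{field}.{key}", []).append(value)
    return flat


def matches(flat_doc: Dict[str, List[Any]], conditions: Dict[str, Any]) -> bool:
    """True when every condition value appears somewhere in its flattened field."""
    return all(value in flat_doc.get(name, []) for name, value in conditions.items())


# Hypothetical bucket layout: one object per `le` boundary.
buckets = [{"le": 0.1, "count": 3}, {"le": 0.5, "count": 8}]
flat = flatten_object_array("bucket", buckets)
# flat == {"bucket.le": [0.1, 0.5], "bucket.count": [3, 8]}

# No single bucket has le=0.1 together with count=8, yet the document matches,
# because the association between `le` and `count` was lost when flattening.
cross_match = matches(flat, {"bucket.le": 0.1, "bucket.count": 8})
```

Sending one event per `le` value avoids this cross-matching, since each document then carries exactly one boundary and one count.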
hey @odacremolbap could you add some more detail? what do you mean by document? I guess this is the index size?
that's the size of this file using both layouts
We should definitely compare the size in Elasticsearch (index size after refresh etc.).
if this was merged, is there a way in kibana to manage histograms? we would need to:
Is that possible at all?
Size wise, these are the results of 5 minutes of monitoring apiserver, 10s freq:
3.6 x number of documents indexed
The metricset contains 2 metrics
Thank you for doing the numbers. Something that caught my attention is the number of documents that a single fetch creates: even with the standard layout, it sounds like it creates ~450 docs. I'm wondering what the cause is (I think I remember the api server is quite verbose, as it provides detailed info per client & path). From the other metricsets you are working on, is this amount of data that common?
i think the reason is the number of labels and the cardinality of those labels. As the usage increases in a production environment, the number of metrics might go up.
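To make the cardinality argument concrete, here is a rough upper-bound estimate of how many documents one fetch could produce for a single histogram metric under each layout. The label names, cardinalities, and bucket count are made-up illustration values, not measurements from this thread, and the function name is hypothetical. The product is an upper bound because not every label combination necessarily occurs.

```python
from math import prod
from typing import Dict


def histogram_event_counts(label_cardinalities: Dict[str, int], buckets_per_series: int) -> Dict[str, int]:
    """Upper bound on documents per fetch for one histogram metric."""
    # Each distinct label combination is one series.
    series = prod(label_cardinalities.values())
    return {
        "single_event_layout": series,                   # one document per series
        "per_le_layout": series * buckets_per_series,    # one document per bucket
    }


# Hypothetical label cardinalities for an API-server-like histogram.
estimate = histogram_event_counts({"verb": 4, "resource": 10, "code": 3}, buckets_per_series=5)
```

With these illustrative inputs there are 120 series, so the per-le layout would produce up to 600 documents per fetch instead of 120. Adding one more label value multiplies both numbers.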
Yeah, I'm guessing this is not the general case, what about the other histograms you saw?
kubeproxy and kubescheduler are kept on the low side
I understand that by lines you mean documents. Yeah, I can see how apiserver/kubecontroller can become a problem. Even with the standard layout, 2MB per 5 mins sounds like a lot of data. We should decide if it's worth it and, if so, probably make them optional (disabled by default?)
prometheus metrics lines for histograms we will generate an event for each 8 lines when all
closing, we are selecting a reduced set of histogram buckets at visualizations as a workaround
Describe the enhancement:
metricbeat prometheus helper is gathering bucket information in a single event, using a structure similar to:
The values under bucket are hard to work with at visualizations. We mostly rely on count and sum to calculate averages, dismissing all other data.
Describe a specific use case for the enhancement or feature:
Expanding the data above into multiple events, each one containing the le key and value as provided by the prometheus metric, would make it way more flexible to visualize, at the cost of storage...
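A minimal sketch of the expansion described here, written in Python for brevity rather than in metricbeat's own code. The input layout (name, labels, count, sum, and a bucket map keyed by le) is an assumption made for illustration, since the exact structure produced by the helper is not reproduced in this text. The function name and the bucket_count field are hypothetical as well.

```python
from typing import Any, Dict, List


def expand_histogram_event(event: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Split one histogram event into one event per `le` bucket."""
    # Everything except the bucket map is repeated on every expanded event.
    shared = {key: value for key, value in event.items() if key != "bucket"}
    expanded = []
    for le, cumulative_count in event["bucket"].items():
        doc = dict(shared)
        doc["le"] = le                           # bucket upper bound, as given by Prometheus
        doc["bucket_count"] = cumulative_count   # cumulative count for that bound
        expanded.append(doc)
    return expanded


# Hypothetical single-event layout with all buckets grouped together.
single_event = {
    "name": "http_request_duration_seconds",
    "labels": {"verb": "GET"},
    "count": 10,
    "sum": 4.2,
    "bucket": {"0.1": 3, "0.5": 8, "+Inf": 10},
}

events = expand_histogram_event(single_event)
average = single_event["sum"] / single_event["count"]  # the sum/count average mentioned above
```

Each resulting event can be filtered or aggregated by le directly. The repeated name, labels, count, and sum on every event are where the extra storage comes from; dropping count and sum from the expanded events would be one way to reduce it.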