Unwanted _created variables #438
Comments
Ah. And this is in fact the explanation for why some of my timeseries weren't working anymore; the variable itself has now got a _total suffix. My mixed deployment now ends up with different variable names and I'm missing data as a result. Sad to see such a backward-incompatible change. :-\ |
This is part of the OpenMetrics data model, and was explicitly called out in the 0.4.0 release notes. |
It's not in any Debian changelog, at least, and that's about the only thing ~end users would read. :-/ But yeah, @SuperQ gave me some detail via IM as well, thank you.
I personally really do not see the value of a _created variable. It doubles the amount of data exported and ingested (though I guess compression makes the latter pretty meaningless). And it's not really any better than a daemon uptime variable: even if variables don't get created at exactly the same time during initialisation, they likely do start getting incremented around the same time, when the thing starts serving.
Also, the only use for it that I can think of is calculating long-term mean values, which I didn't think anyone was doing when instant or at most short-term rates are the only interesting thing to monitor?
(Anyway, I understand that I'm likely just rehashing a stale debate.) |
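To make the "long-term mean" use concrete, here is a minimal sketch, assuming a counter value and its _created timestamp (Unix seconds) have already been scraped. `mean_rate_since_creation` is a hypothetical helper for illustration, not part of any client library, and the numbers are made up.

```python
def mean_rate_since_creation(total: float, created_ts: float, now_ts: float) -> float:
    """Average increase per second between the creation time and the scrape time."""
    elapsed = now_ts - created_ts
    if elapsed <= 0:
        raise ValueError("scrape time must be later than the creation time")
    return total / elapsed


# Hypothetical numbers: 7200 requests counted in the two hours since creation.
rate = mean_rate_since_creation(
    total=7200.0,
    created_ts=1_600_000_000.0,
    now_ts=1_600_007_200.0,
)
print(rate)  # 1.0 (requests per second)
```

A process-wide uptime value would give the same result only if every counter were created at process start, which is the assumption being debated in this thread.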
My team has a Prometheus instance that ingests tens of thousands of metrics, and that instance uses a ton of memory. I found that a lot of the metrics are _created metrics from this library. Since we don't use them, I wanted to try disabling them to see if it helps with resource usage, but from looking through the code for this library I can't find a way to easily toggle them off.
I tried to find where this format is specified in the OpenMetrics data model but couldn't find it. I can find the Prometheus format, which has optional timestamps after the metric value instead of a completely separate metric with a _created suffix.
What's the use of the timestamp? It seems like it's doubling our metrics for no good reason, but maybe there's a whole class of PromQL queries I'm not imagining that are made possible by it. Also, if it's the same value repeated, would removing it even help with resource usage? Or would Prometheus' tsdb compress those in some way so that they're not nearly as expensive as other metrics whose values constantly change?
I was going to try to use metricRelabelings to match these by regex and drop them, but then found this issue and thought I would ask here first to learn more. I don't understand why these are here and would like to know more. |
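The drop-by-regex idea can be sketched outside Prometheus as well. The snippet below is only an illustration, assuming a hypothetical exposition payload (`http_requests_total` and its _created companion are made-up names). It mimics a fully anchored `.+_created` match on the metric name; it is not how Prometheus itself applies relabeling.

```python
import re

# Prometheus relabel regexes are fully anchored, so fullmatch is the closest analogue.
CREATED_NAME = re.compile(r".+_created")


def drop_created(exposition: str) -> str:
    """Return the payload without samples or metadata lines for *_created metrics."""
    kept = []
    for line in exposition.splitlines():
        if line.startswith("#"):
            # Metadata lines look like "# HELP <name> ..." or "# TYPE <name> ...".
            parts = line.split()
            name = parts[2] if len(parts) > 2 else ""
        else:
            # Sample lines start with the metric name, optionally followed by {labels}.
            name = re.split(r"[{\s]", line, maxsplit=1)[0]
        if not CREATED_NAME.fullmatch(name):
            kept.append(line)
    return "\n".join(kept)


# Hypothetical payload in the Prometheus text format.
SAMPLE = """\
# HELP http_requests_total Requests served.
# TYPE http_requests_total counter
http_requests_total{code="200"} 42.0
# HELP http_requests_created Requests served.
# TYPE http_requests_created gauge
http_requests_created{code="200"} 1.6e+09
"""

print(drop_created(SAMPLE))
```

Filtering at ingestion like this would only help on the server side; the producer still serialises and sends the extra lines on every scrape.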
That's a small Prometheus.
Some (other) monitoring systems have a use for it. The samples will compress very well, as they're constant. |
I just stumbled over the same thing. Imho, there should be a possibility to disable it for users that don't need it. Maybe via a kwarg to the Counter constructor? Willing to open a PR for this! Also, some small tests of mine produced the following output with the Multiprocess collector:
The I think it is because client_python/prometheus_client/metrics.py Lines 55 to 59 in a8f5c80
is false, and hence client_python/prometheus_client/metrics.py Lines 235 to 238 in a8f5c80
will never be called for the labeled counter. |
I understand that Prometheus storage is barely affected thanks to
compression; to me it's mostly namespace pollution + bandwidth waste (I
do scrapes over a few low-bandwidth links).
That, and I think the variables are misleading. Example: A web server
exports a per-response code counter variable. It takes an hour before the
first 418 response is served. As a result, the creation timestamp for
that one will be 1h after the one for 200.
But do I care about the number of 418 responses served since the first
one was served? No, I care about the number of responses of any kind
that were served since the server (re)started...
(I understand that my issue here is with OpenMetrics and not with
Prometheus and especially the Python client library... And that, had
this idea come up during the one OpenMetrics meeting I've attended, I
hope I would've pointed this out.)
|
There are no plans to allow this. This is currently the reference library for OpenMetrics, so we're going to support as much of the format as possible. It's up to the consumer of the metrics whether this is useful or not, not the producer.
That sounds like incorrect usage of the multiprocess mode, but this would be better tracked in its own bug. |
I don't think there's anything to do here; we're the reference OpenMetrics client, so we need to support the spec as fully as possible. |
Given that all other prometheus exporters don't emit For example, assuming a system like ours which has many different processes and using a variety of clients, the
That is regarding the server, not nodes generating the metrics and associated traffic to the server. |
In addition, while the metrics may not take up much space, they greatly impact indexing as they add to the overall metrics load of the server. In a recent OpenMetrics discussion, I calculated that a 50% increase in metrics capacity (head block memory) would be needed in order to handle the new |
I happen to have PRs out to massively reduce non-head block memory, so this may have much less impact than you'd think. |
This is the top search result for these
The change was introduced in #973 and is available in version |
Even better, the next release of Prometheus (after 2.54.x) will support parsing and handling The new handling will not ingest the |
I'm not sure why I didn't see this earlier but #14738 is where we use the functionality implemented in #14356. The latter PR doesn't enable CT ingestion in the scrapeloop |
I've recently upgraded to the python-prometheus-client that comes with Debian Buster and just noticed a lot of variables I don't use in a small Python-based exporter I wrote, mostly the _created suffix ones drawing my attention. This seems to be a timestamp of when a variable (and in case of mapped/label variables, every individual label under it) got created.
Looking at the source, there seems to be no proper way to disable this near-2× explosion of data that I have no need for at all?