Unwanted _created variables #438

Closed
Wilm0r opened this issue Jul 10, 2019 · 15 comments

Comments

@Wilm0r

Wilm0r commented Jul 10, 2019

I've recently upgraded to the python-prometheus-client that comes with Debian Buster and just noticed a lot of variables I don't use in a small Python-based exporter I wrote, mostly the _created suffix ones drawing my attention. This seems to be a timestamp of when a variable (and in case of mapped/label variables, every individual label under it) got created.

Looking at the source, there seems to be no proper way to disable this near-2× explosion of data that I have no need for at all?

@Wilm0r
Author

Wilm0r commented Jul 10, 2019

Ah. And this is in fact the explanation for why some of my timeseries weren't working anymore; the variable itself has now got a _total suffix. My mixed deployment now ends up with different variable names and I'm missing data as a result.

Sad to see such a backward-incompatible change. :-\

@brian-brazil
Contributor

This is part of the OpenMetrics data model, and was explicitly called out in the 0.4.0 release notes.

@Wilm0r
Author

Wilm0r commented Jul 10, 2019

It's not in any Debian changelogs at least and that's the kind of stuff that ~end users would read at best.. :-/ But yeah, @SuperQ gave me some detail via IM as well yes, thank you.

I personally really do not see the value of a _created variable at least. It doubles the amount of data exported and ingested (though I guess compression makes the latter pretty meaningless). And it's not really any better than a daemon uptime variable since even though maybe variables don't get created at the exact same times during initialisation, they likely do start getting incremented around the same time, when the thing starts serving.

Also, the only use for it that I can think of is calculating long-term mean values which I didn't think anyone is doing when instant or at most short-term rates are the only interesting thing to monitor?

(Anyway, I understand that I'm likely just rehashing a stale debate..)
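
For reference, the long-term mean mentioned above is plain arithmetic: divide the counter value by the time elapsed since its _created timestamp. A minimal sketch follows; the function name and the sample numbers are made up for illustration and do not come from the library.

```python
def mean_rate_since_creation(total: float, created_ts: float, now_ts: float) -> float:
    """Average per-second increase of a counter since its creation timestamp."""
    elapsed = now_ts - created_ts
    if elapsed <= 0:
        raise ValueError("scrape time must be later than the creation time")
    return total / elapsed


# Hypothetical sample: a counter at 446 one hour after it was created.
created = 1_565_115_150.0
print(mean_rate_since_creation(446.0, created, created + 3600.0))
```

An uptime gauge would give nearly the same result whenever all counters are created at process start, which is the point being made in the comment.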

@bdrupieski

My team has a Prometheus instance that ingests tens of thousands of metrics. That instance uses a ton of memory. I found a lot of the metrics are _created metrics from this library. Since we don't use them I wanted to try to disable them to see if it helps with resource usage, but from looking through the code for this library I can't find a way to easily toggle them off.

I tried to find where this format is specified in the OpenMetrics data model but couldn't find it. I can find the Prometheus format which has optional timestamps after the metric value instead of as a completely separate metric with a _created suffix, like http_requests_total{method="post",code="200"} 1027 1395066363000.

What's the use of the timestamp? It seems like it's doubling our metrics for no good reason, but maybe there's a whole class of PromQL queries I'm not imagining that are made possible by it.

Also if it's the same value repeated, would removing it even help with resource usage? Or would Prometheus' tsdb compress those in some way so that they're not nearly as expensive as other metrics whose values constantly change?

I was going to try to use metricRelabelings to match these by regex and drop them, but then found this issue and thought I would ask this here to try to learn more first. I don't understand why these are here and would like to know more.
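
As a rough illustration of what such a drop rule does, here is a small Python sketch that filters _created samples out of text-format exposition output. It only mimics the effect of a rule that drops on a metric name matching `.+_created`; it is not Prometheus code, and `drop_created_samples` is a made-up helper name.

```python
import re

# Fully anchored, the way Prometheus anchors relabel regexes.
CREATED_NAME = re.compile(r"^.+_created$")


def drop_created_samples(exposition: str) -> str:
    """Return the exposition text without any *_created samples or metadata."""
    kept = []
    for line in exposition.splitlines():
        if line.startswith("#"):
            # "# TYPE name type" / "# HELP name text": the name is the third token.
            parts = line.split(maxsplit=3)
            name = parts[2] if len(parts) >= 3 else ""
        else:
            # Sample line: the name ends at the first "{" or space.
            name = re.split(r"[{ ]", line, maxsplit=1)[0]
        if not CREATED_NAME.match(name):
            kept.append(line)
    return "\n".join(kept)


sample = """# TYPE count_total counter
count_total{process="main"} 1.0
# TYPE count_created gauge
count_created{process="main"} 1.5651151507526343e+09"""

print(drop_created_samples(sample))
```

Note that a blanket rule like this would also drop any genuine metric whose name happens to end in _created.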

@brian-brazil
Contributor

> tens of thousands of metrics

That's a small Prometheus.

> What's the use of the timestamp?

Some (other) monitoring systems have use for it.

The samples will compress very well, as they're constant.

@martinitus

martinitus commented Aug 6, 2019

> Some (other) monitoring systems have use for it.

I just stumbled over the same thing. Imho, there should be a possibility to disable it for users that don't need it. Maybe via a kwarg to the Counter constructor? Willing to open a PR for this!

Also, some small tests of mine produced the following output with the Multiprocess collector:

# HELP count_total Multiprocess metric
# TYPE count_total counter
count_total{process="p1"} 446.0
count_total{process="p2"} 446.0
count_total{process="main"} 1.0
# HELP other_count_total Multiprocess metric
# TYPE other_count_total counter
other_count_total{snr="A280004100",worker="p1"} 446.0
other_count_total{snr="A280004100",worker="p2"} 446.0
# HELP count_total Description
# TYPE count_total counter
count_total{process="main"} 1.0
# TYPE count_created gauge
count_created{process="main"} 1.5651151507526343e+09

The count_created metric only exists for the main label. Is that intentional? I would expect similar count_created metrics also for the other labels.

I think it is because

def _is_observable(self):
    # Whether this metric is observable, i.e.
    # * a metric without label names and values, or
    # * the child of a labelled metric.
    return not self._labelnames or (self._labelnames and self._labelvalues)

is false, and hence

def _metric_init(self):
    self._value = values.ValueClass(self._type, self._name, self._name + '_total', self._labelnames,
                                    self._labelvalues)
    self._created = time.time()

will never be called for the labeled counter.
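
To make the reasoning concrete, here is a standalone sketch of that check. `is_observable` is a free-function stand-in for the method quoted above, not the library's code, and the label name is taken from the sample output.

```python
def is_observable(labelnames: tuple, labelvalues: tuple) -> bool:
    """Mirror of the quoted check: unlabelled metrics and labelled children are observable."""
    return not labelnames or bool(labelnames and labelvalues)


# Parent of a labelled counter: label names declared, no values bound yet.
print(is_observable(("process",), ()))         # False
# Child with a value bound to the label, e.g. obtained via .labels("main").
print(is_observable(("process",), ("main",)))  # True
# Counter declared without any labels.
print(is_observable((), ()))                   # True
```

If this check is what gates `_metric_init`, as suggested above, then only the children created in a given process would carry a `_created` timestamp there.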

@Wilm0r
Author

Wilm0r commented Aug 6, 2019 via email

@brian-brazil
Contributor

> Imho, there should be a possibility to disable it for users that don't need it.

There are no plans to allow this. This is currently the reference library for OpenMetrics, so we're going to support as much of the format as possible. It's up to the consumer of the metrics if this is useful or not, not the producer.

> Also, some small tests of mine produced the following output with the Multiprocess collector:

That sounds like incorrect usage of the multiprocess mode, but this would be better tracked in its own bug.

@brian-brazil
Contributor

I don't think there's anything to do here; we're the reference OpenMetrics client, so we need to support the spec as fully as possible.

@belm0

belm0 commented Dec 10, 2019

Given that no other Prometheus exporters emit *_created, it seems reasonable to make it a library option.

For example, in a system like ours, which has many different processes using a variety of clients, the *_created metrics are of no use (since they aren't generated consistently and hence no backend could rely on them), and are merely taking up critical bandwidth between client and server.

> The samples will compress very well, as they're constant.

That is regarding the server, not nodes generating the metrics and associated traffic to the server.

@SuperQ
Member

SuperQ commented Dec 11, 2019

In addition, while the metrics may not take up much space, they greatly impact indexing as they add to the overall metrics load of the server. In a recent OpenMetrics discussion, I calculated that a 50% increase in metrics capacity (head block memory) would be needed to handle the new _created data.

@brian-brazil
Contributor

I happen to have PRs out to massively reduce non-head block memory, so this may have much less impact than you'd think.

@estheruary

estheruary commented Aug 22, 2024

This is the top search result for these _created metrics so good news everyone, this has been solved.

def disable_created_metrics():

The change was introduced in #973 and is available in version v0.20.0 and above.
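
A minimal usage sketch, assuming prometheus_client 0.20.0 or newer is installed. The wrapper function `exposition_without_created`, the metric name, and the help text are invented for the example; the call order shown is one plausible arrangement rather than a requirement stated in this thread.

```python
def exposition_without_created() -> str:
    """Render one counter in the text format with _created series switched off."""
    # Imported inside the function so the version requirement sits next to its use:
    # disable_created_metrics needs prometheus_client 0.20.0 or newer.
    from prometheus_client import (
        CollectorRegistry,
        Counter,
        disable_created_metrics,
        generate_latest,
    )

    disable_created_metrics()  # process-wide switch

    registry = CollectorRegistry()  # private registry keeps the example isolated
    requests = Counter("requests", "Handled requests", registry=registry)
    requests.inc()
    return generate_latest(registry).decode("utf-8")
```

Calling `print(exposition_without_created())` should list `requests_total` and no `requests_created` line. As far as I know, the library's README also documents a `PROMETHEUS_DISABLE_CREATED_SERIES` environment variable for the same purpose; check the README of the version you have installed.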

@SuperQ
Member

SuperQ commented Aug 23, 2024

Even better, the next release of Prometheus (after 2.54.x) will support parsing and handling _created OpenMetrics timestamp values from Python. So we can finally get the value that these metrics were intended to provide.

The new handling will not ingest the _created metrics as new series. It will insert a zero value, as originally intended, into the series. This solves several edge cases for the original Prometheus format data model.

prometheus/prometheus#14356
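
Conceptually, the zero injection described above looks something like the sketch below. This illustrates the idea only and is not Prometheus's implementation; `inject_created_zero` and the sample data are invented for the example.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def inject_created_zero(samples: List[Sample], created_ts: float) -> List[Sample]:
    """Prepend a zero sample at the creation time if it precedes the first scrape."""
    if samples and created_ts >= samples[0][0]:
        return list(samples)  # creation time is not earlier than existing data
    return [(created_ts, 0.0)] + list(samples)


# A counter created at t=100 and first scraped at t=115, with 5 increments already counted.
scraped = [(115.0, 5.0), (130.0, 7.0)]
series = inject_created_zero(scraped, created_ts=100.0)

# Increase over the whole series: last value minus first value.
print(scraped[-1][1] - scraped[0][1])  # without the zero: 2.0
print(series[-1][1] - series[0][1])    # with the zero: 7.0
```

The increments that happened before the first scrape are one example of the edge cases a zero sample can recover, without storing _created as a separate series.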

@Maniktherana

Maniktherana commented Sep 18, 2024

> Even better, the next release of Prometheus (after 2.54.x) will support parsing and handling _created OpenMetrics timestamp values from Python. So we can finally get the value that these metrics were intended to provide.
>
> The new handling will not ingest the _created metrics as new series. It will insert a zero value, as originally intended, into the series. This solves several edge cases for the original Prometheus format data model.
>
> prometheus/prometheus#14356

I'm not sure why I didn't see this earlier, but #14738 is where we use the functionality implemented in #14356. The latter PR doesn't enable created-timestamp (CT) ingestion in the scrape loop.
