
Prometheus remote write exporter is failing with "out of order samples" under load #2315

Closed
rakyll opened this issue Dec 21, 2020 · 6 comments · Fixed by #2941
rakyll (Contributor) commented Dec 21, 2020

The Prometheus remote write exporter is failing with the following errors under load:

  • out of order samples
  • duplicate sample for timestamp

These issues are potentially caused by the concurrent writes at https://github.com/open-telemetry/opentelemetry-collector/blob/a16c599d26b6db7a95489568ca890b06380463cc/exporter/prometheusremotewriteexporter/exporter.go, and/or by samples produced for the same metric type by the same target replica without distinguishing pod labels.

I'm consistently able to reproduce the problem in a Kubernetes cluster. To reproduce, run a single replica of the collector, scrape 50-60 targets or more, and produce 40-50 metrics at each target.

@rakyll rakyll added the bug label Dec 21, 2020
@rakyll rakyll changed the title prometheusremotewriter is failing with "out of order samples" under load Prometheus remote write exporter is failing with "out of order samples" under load Dec 21, 2020
bogdandrutu (Member) commented Jan 4, 2021

Could this be a backend error? Does Prometheus remote write require samples to arrive in a specific order?

rakyll (Contributor, Author) commented Jan 4, 2021

This is a backend error, but we need to make sure our fan-out logic in the exporter isn't what's causing it. Prometheus is an append-only system, so you can't ingest older data points into a series after appending newer ones.

The second error is potentially caused by a failure to label data points correctly. Assume you have N replicas of the same program exporting the same metric. If the scraper fails to label the data points with each source program's unique identifier, we might be sending duplicate samples for the same metric at the same timestamp.
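
To make the two failure modes concrete, here is a minimal sketch (not the exporter's actual code) using the `github.com/prometheus/prometheus/prompb` types; the metric name, values, and helper function are made up for illustration:

```go
package main

import (
	"fmt"

	"github.com/prometheus/prometheus/prompb"
)

// series builds a single-sample time series for the given labels.
func series(ts int64, value float64, labels ...prompb.Label) prompb.TimeSeries {
	return prompb.TimeSeries{
		Labels:  labels,
		Samples: []prompb.Sample{{Value: value, Timestamp: ts}},
	}
}

func main() {
	name := prompb.Label{Name: "__name__", Value: "test_counter0_total"}

	// "out of order samples": the same series is written with a timestamp
	// older than one the backend has already accepted.
	first := prompb.WriteRequest{Timeseries: []prompb.TimeSeries{series(1618297785608, 10, name)}}
	second := prompb.WriteRequest{Timeseries: []prompb.TimeSeries{series(1618297785602, 9, name)}} // 6 ms older: rejected

	// "duplicate sample for timestamp": two replicas export the same metric,
	// but without a replica-identifying label the backend sees one series
	// with two samples at the same timestamp.
	dup := prompb.WriteRequest{Timeseries: []prompb.TimeSeries{
		series(1618297785608, 10, name), // from replica A
		series(1618297785608, 12, name), // from replica B, label set identical
	}}

	fmt.Println(first, second, dup)
}
```

Sending `second` after `first`, or sending `dup` at all, is the kind of write the backend rejects with the errors above.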

andrewhsu (Member) commented

Talked about this at the collector SIG meeting today. @rakyll, can you provide more info on what you expect to see?

jmacd (Contributor) commented Jan 28, 2021

There's something here about how metrics are meant to be "single source". It should not be possible to get out-of-order samples in a correctly configured system, since each metric should come from a single source of truth. I wrote about how "single-homed" metrics are always easier in this document: https://docs.google.com/document/d/1hed5J_Lk_l1qz5rei4V7kDSzcIlr4b0qKG4m7VfbS0I/edit?usp=sharing

rakyll (Contributor, Author) commented Apr 13, 2021

We produce remote write requests out of chronological order:

first req ============>
timeseries:<labels:<name:"name" value:"test_histogram0_count" > samples:<timestamp:1618297785608 > >
timeseries:<labels:<name:"name" value:"test_histogram0_bucket" > labels:<name:"le" value:"0.1" > samples:<timestamp:1618297785608 > >
timeseries:<labels:<name:"name" value:"test_summary0_count" > samples:<timestamp:1618297785608 > >
timeseries:<labels:<name:"name" value:"test_summary0" > labels:<name:"quantile" value:"0.1" > samples:<value:0.4774478099834713 timestamp:1618297785608 > >
timeseries:<labels:<name:"name" value:"test_summary0" > labels:<name:"quantile" value:"0.5" > samples:<value:0.8713202045000787 timestamp:1618297785608 > >
timeseries:<labels:<name:"name" value:"test_gauge0" > samples:<value:5.723645357596498 timestamp:1618297785608 > >
timeseries:<labels:<name:"name" value:"test_histogram0_sum" > samples:<timestamp:1618297785608 > >
timeseries:<labels:<name:"name" value:"test_histogram0_bucket" > labels:<name:"le" value:"0.005" > samples:<timestamp:1618297785608 > >
timeseries:<labels:<name:"name" value:"test_histogram0_bucket" > labels:<name:"le" value:"1" > samples:<timestamp:1618297785608 > >
timeseries:<labels:<name:"name" value:"test_histogram0_bucket" > labels:<name:"le" value:"+Inf" > samples:<timestamp:1618297785608 > >
timeseries:<labels:<name:"name" value:"test_summary0_sum" > samples:<timestamp:1618297785608 > >
timeseries:<labels:<name:"name" value:"test_summary0" > labels:<name:"quantile" value:"0.99" > samples:<value:0.9977474185974964 timestamp:1618297785608 > >
timeseries:<labels:<name:"name" value:"test_counter0_total" > samples:<timestamp:1618297785608 > >

second req ============>
timeseries:<labels:<name:"name" value:"test_histogram0_bucket" > labels:<name:"le" value:"0.005" > samples:<timestamp:1618297785602 > >
timeseries:<labels:<name:"name" value:"test_histogram0_bucket" > labels:<name:"le" value:"1" > samples:<value:17 timestamp:1618297785602 > >
timeseries:<labels:<name:"name" value:"test_histogram0_bucket" > labels:<name:"le" value:"+Inf" > samples:<value:17 timestamp:1618297785602 > >
timeseries:<labels:<name:"name" value:"test_summary0_sum" > samples:<value:16.256622262869485 timestamp:1618297785602 > >
timeseries:<labels:<name:"name" value:"test_summary0" > labels:<name:"quantile" value:"0.5" > samples:<value:0.920758717698257 timestamp:1618297785602 > >
timeseries:<labels:<name:"name" value:"test_gauge0" > samples:<value:5.41446628663609 timestamp:1618297785602 > >
timeseries:<labels:<name:"name" value:"test_histogram0_sum" > samples:<value:16.256622262869485 timestamp:1618297785602 > >
timeseries:<labels:<name:"name" value:"test_histogram0_bucket" > labels:<name:"le" value:"0.1" > samples:<timestamp:1618297785602 > >
timeseries:<labels:<name:"name" value:"test_summary0_count" > samples:<value:17 timestamp:1618297785602 > >
timeseries:<labels:<name:"name" value:"test_summary0" > labels:<name:"quantile" value:"0.1" > samples:<value:0.5881942475189074 timestamp:1618297785602 > >
timeseries:<labels:<name:"name" value:"test_summary0" > labels:<name:"quantile" value:"0.99" > samples:<value:0.9984683987641793 timestamp:1618297785602 > >
timeseries:<labels:<name:"name" value:"test_counter0_total" > samples:<value:0.06368703615839877 timestamp:1618297785602 > >
timeseries:<labels:<name:"name" value:"test_histogram0_count" > samples:<value:17 timestamp:1618297785602 > >

2021-04-13T07:09:45.676Z error exporterhelper/queued_retry.go:279 Exporting failed. The error is not retryable. Dropping data. {"kind": "exporter", "name": "awsprometheusremotewrite", "error": "Permanent error: server returned HTTP status 400 Bad Request: user=516699956539_ws-ce1f837b-8384-42dc-8c55-4e3834673ad5: err: out of order sample. timestamp=2021-04-13T07:09:45.602Z, series={name="test_summary0_sum"} ", "dropped_items": 4}
go.opentelemetry.io/collector/exporter/exporterhelper.(*retrySender).send
/Users/jbd/opentelemetry-collector/exporter/exporterhelper/queued_retry.go:279
go.opentelemetry.io/collector/exporter/exporterhelper.(*metricsSenderWithObservability).send
/Users/jbd/opentelemetry-collector/exporter/exporterhelper/metricshelper.go:120
go.opentelemetry.io/collector/exporter/exporterhelper.(*queuedRetrySender).start.func1
/Users/jbd/opentelemetry-collector/exporter/exporterhelper/queued_retry.go:152
github.com/jaegertracing/jaeger/pkg/queue.ConsumerFunc.Consume
/Users/jbd/go/pkg/mod/github.com/jaegertracing/jaeger@v1.22.0/pkg/queue/bounded_queue.go:104
github.com/jaegertracing/jaeger/pkg/queue.(*BoundedQueue).StartConsumersWithFactory.func1
/Users/jbd/go/pkg/mod/github.com/jaegertracing/jaeger@v1.22.0/pkg/queue/bounded_queue.go:83

The second request's proto has an earlier timestamp than the first one: its samples are at 1618297785602 ms versus 1618297785608 ms, so the second write is 6 ms older than data the backend has already accepted, and Prometheus rejects it as out of order.

rakyll (Contributor, Author) commented Apr 13, 2021

In the exporter, the internal type we use for the incoming metric data, map[string]*prompb.TimeSeries, is a map, and Go's map iteration shuffles the order of the data. We need to preserve the order.
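
For illustration, a minimal sketch of the direction the eventual fix takes (the function name and shape are made up, not the exporter's actual code): sort each series' samples chronologically before building the prompb.WriteRequest, so that map iteration order cannot put an older sample behind a newer one:

```go
package main

import (
	"sort"

	"github.com/prometheus/prometheus/prompb"
)

// buildWriteRequest converts the exporter's internal map into a
// prompb.WriteRequest. Because map iteration order is random in Go, the
// samples inside each series are sorted by timestamp before the request is
// built; otherwise an older sample can end up behind a newer one and the
// backend rejects it as out of order.
func buildWriteRequest(tsMap map[string]*prompb.TimeSeries) *prompb.WriteRequest {
	req := &prompb.WriteRequest{
		Timeseries: make([]prompb.TimeSeries, 0, len(tsMap)),
	}
	for _, ts := range tsMap {
		sort.Slice(ts.Samples, func(i, j int) bool {
			return ts.Samples[i].Timestamp < ts.Samples[j].Timestamp
		})
		req.Timeseries = append(req.Timeseries, *ts)
	}
	return req
}
```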

bogdandrutu pushed a commit that referenced this issue Apr 15, 2021
…t of order errors (#2941)

Ensure that, before a prompb.WriteRequest is created, the TimeSeries it
contains have their Sample values sorted chronologically, to avoid the
out-of-order errors reported by Prometheus.

Thanks to @rakyll for a reproducer and for diagnosing the problem, which
helped distill the issue from a very complex setup that required expensive
Kubernetes clusters with many replicas; essentially, the problem became
more apparent as the number of TimeSeries grew.

It is important to note that the presence of such a bug means that when a
large number of replicas are being scraped, scraping stalls and takes a
long time, so targets scraped in round-robin fashion experience staleness
when there are many of them. This might be one more reason for setups to
adopt a push model rather than scraping endpoints.

Fixes #2315
Fixes open-telemetry/prometheus-interoperability-spec#10