
[receiver/Prometheus] inconsistent timestamps on metric points error #32186

Open

venkateshsredhat opened this issue Apr 5, 2024 · 15 comments

Assignees: dashpole
Labels: bug (Something isn't working), receiver/prometheus (Prometheus receiver), Stale

@venkateshsredhat

Component(s)

receiver/prometheus

What happened?

Description

The OpenTelemetry Collector pod shows the error below:

2024-04-03T06:55:00.063Z warn internal/transaction.go:149 failed to add datapoint {"kind": "receiver", "name": "prometheus", "data_type": "metrics", "error": "inconsistent timestamps on metric points for metric prober_probe_total", "metric_name": "prober_probe_total", "labels": "{__name__="prober_probe_total", _source="", env="int", instance="", job="prometheus-self"}"}

OTel collector config: see the full configuration under "OpenTelemetry Collector configuration" below.

Is this a known bug? The warning appears all over the log even though the metric gets exported fine, but we would like to resolve it.

Expected Result

No warnings or errors in the logs.

Actual Result

internal/transaction.go:149 failed to add datapoint {"kind": "receiver", "name": "prometheus", "data_type": "metrics", "error": "inconsistent timestamps on metric points for metric prober_probe_total", "metric_name": "prober_probe_total", "labels": "{__name__="prober_probe_total", _source="", env="int", instance="", job="prometheus-self"}"}

Collector version

v0.93.0

Environment information

Environment

OS: (e.g., "Ubuntu 20.04")
Compiler (if manually compiled): (e.g., "go 14.2")

OpenTelemetry Collector configuration

receivers:
        prometheus:
          config:
            global:
              external_labels:
                _source: "**example**"
            scrape_configs:
              - job_name: prometheus-self
                scrape_interval: 1m
                scrape_timeout: 10s
                metrics_path: /federate
                scheme: http
                honor_labels: false
                enable_http2: true
                kubernetes_sd_configs:
                  - role: service
                    namespaces:
                      own_namespace: false
                      names:
                        - **example**
                    selectors:
                      - role: service
                        label: "***example***"
                params:
                  'match[]':
                    - '{__name__="prober_probe_total"}'
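A note on this configuration: the /federate endpoint exposes each sample with the timestamp at which the upstream Prometheus originally scraped it, so different series of prober_probe_total can arrive with different explicit timestamps within a single federation scrape, which appears to be what the receiver is warning about. One possible mitigation, shown only as a minimal sketch and not confirmed anywhere in this thread, is the standard Prometheus honor_timestamps scrape option, which stamps every sample with the scrape time instead (note that this rewrites the original federation timestamps):

receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: prometheus-self
          metrics_path: /federate
          # Hypothetical mitigation (not from this thread): ignore the explicit
          # timestamps exposed by /federate and use the scrape time instead.
          honor_timestamps: false
          params:
            'match[]':
              - '{__name__="prober_probe_total"}'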

Log output

No response

Additional context

No response

@venkateshsredhat venkateshsredhat added bug Something isn't working needs triage New item requiring triage labels Apr 5, 2024
@crobert-1 crobert-1 added the receiver/prometheus Prometheus receiver label Apr 5, 2024
Contributor

github-actions bot commented Apr 5, 2024

Pinging code owners for receiver/prometheus: @Aneurysm9 @dashpole. See Adding Labels via Comments if you do not have permissions to add labels yourself.

@crobert-1
Member

Potentially related: #22096

Contributor

github-actions bot commented Jun 5, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Jun 5, 2024
@dashpole dashpole removed the Stale label Jun 5, 2024
@dashpole
Contributor

dashpole commented Jun 5, 2024

I haven't seen it before. Can you share more about your setup? Is prober_probe_total from the kubelet? Are you scraping another Prometheus server?

@dashpole dashpole removed the needs triage New item requiring triage label Jun 5, 2024
@dashpole dashpole self-assigned this Jun 5, 2024
@dashpole
Contributor

dashpole commented Jun 5, 2024

Is the issue happening for any other metrics?

@imrajdas

imrajdas commented Jun 5, 2024

Hi team, I am also facing the same issue:

2024-06-05T14:39:06.885Z	warn	internal/transaction.go:149	failed to add datapoint	{"kind": "receiver", "name": "prometheus", "data_type": "metrics", "error": "inconsistent timestamps on metric points for metric container_cpu_usage_seconds_total", "metric_name": "container_cpu_usage_seconds_total", "labels": "{__name__=\"container_cpu_usage_seconds_total\", cpu=\"total\", env=\"production\", id=\"/system.slice/containerd.service\", instance=\"iro-agent:8080\", job=\"opentelemetry-collector\"}"}
2024-06-05T14:39:06.885Z	warn	internal/transaction.go:149	failed to add datapoint	{"kind": "receiver", "name": "prometheus", "data_type": "metrics", "error": "inconsistent timestamps on metric points for metric container_cpu_usage_seconds_total", "metric_name": "container_cpu_usage_seconds_total", "labels": "{__name__=\"container_cpu_usage_seconds_total\", cpu=\"total\", env=\"production\", id=\"/system.slice/kubelet.service\", instance=\"iro-agent:8080\", job=\"opentelemetry-collector\"}"}
2024-06-05T14:39:06.886Z	warn	internal/transaction.go:149	failed to add datapoint	{"kind": "receiver", "name": "prometheus", "data_type": "metrics", "error": "inconsistent timestamps on metric points for metric container_memory_failures_total", "metric_name": "container_memory_failures_total", "labels": "{__name__=\"container_memory_failures_total\", env=\"production\", failure_type=\"pgfault\", id=\"/\", instance=\"iro-agent:8080\", job=\"opentelemetry-collector\", scope=\"container\"}"}
2024-06-05T14:39:06.886Z	warn	internal/transaction.go:149	failed to add datapoint	{"kind": "receiver", "name": "prometheus", "data_type": "metrics", "error": "inconsistent timestamps on metric points for metric container_memory_failures_total", "metric_name": "container_memory_failures_total", "labels": "{__name__=\"container_memory_failures_total\", env=\"production\", failure_type=\"pgfault\", id=\"/\", instance=\"iro-agent:8080\", job=\"opentelemetry-collector\", scope=\"hierarchy\"}"}

Note: I am fetching the cAdvisor metrics from the kubelet API.

@dashpole
Contributor

dashpole commented Jun 5, 2024

What collector versions are you using?

@dashpole
Contributor

dashpole commented Jun 5, 2024

@imrajdas can you share your prometheus receiver config?

@dashpole
Contributor

dashpole commented Jun 5, 2024

This might be an issue with metrics that set an explicit timestamp in the exposition.
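To make that concrete: in the Prometheus text exposition format a sample line may end with an optional millisecond timestamp, and both /federate output and cAdvisor metrics typically include one. The snippet below is a hypothetical illustration (series, values, and timestamps invented, not taken from this thread) of two series of the same metric arriving in one scrape with different explicit timestamps, which is the situation the "inconsistent timestamps on metric points" warning appears to describe:

# TYPE prober_probe_total counter
prober_probe_total{probe_type="Liveness",result="successful"} 1027 1717570740000
prober_probe_total{probe_type="Readiness",result="successful"} 4311 1717570745123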

@imrajdas

imrajdas commented Jun 5, 2024

Collector version: otel/opentelemetry-collector-contrib:0.96.0 (I have also tried the latest one, 0.102.0, and still see the same issue).

exporters:
  prometheusremotewrite:
    external_labels:
      tenant: 78566289-1450-4a0c-b9ca-26a419058d86
    endpoint: <THANOS_RECEIVER>
extensions:
  health_check:
    endpoint: ${env:MY_POD_IP}:13133
receivers:
  prometheus:
    config:
      scrape_configs:
      - job_name: opentelemetry-collector
        scrape_interval: 10s
        static_configs:
        - targets:
          - 'agent:8080'
      - job_name: 'kube-api-blackbox'
        metrics_path: /probe
        params:
          module: [http_2xx]
        static_configs:
        - targets:
            - https://www.google.com
            - http://www.example.com
            - https://prometheus.io
        - targets:
            - http://paymentservice.bofa.svc.cluster.local:50051
            - http://frontend.bofa.svc.cluster.local:80
            - http://test.bofa.svc.cluster.local:80
          labels:
            service: 'internal-services'
        relabel_configs:
        - source_labels: [__address__]
          target_label: __param_target
        - source_labels: [__param_target]
          target_label: instance
        - target_label: __address__
          replacement: blackbox-exporter.imrajdaschi.svc.cluster.local:9115

service:
  extensions:
  - health_check
  pipelines:
    metrics:
      exporters:
      - prometheusremotewrite
      receivers:
      - prometheus

@imrajdas

imrajdas commented Jun 5, 2024

I have built a custom exporter that calls the kubelet API to fetch the cAdvisor metrics and expose them.

Contributor

github-actions bot commented Aug 5, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Aug 5, 2024
@github-actions github-actions bot removed the Stale label Sep 10, 2024
@flenoir

flenoir commented Oct 1, 2024

Hi, I'm having the same issue. Has anyone solved this and gotten rid of the error message?

@flenoir

flenoir commented Oct 4, 2024

After a discussion in Slack, I was pointed in the right direction. As explained in that message, a "container_id" label was being dropped. In my case, I also found that an "id" label was being dropped. Once I removed this labeldrop, the issue disappeared.
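For anyone hitting the same thing, the kind of rule being described looks roughly like the sketch below (hypothetical, reconstructed from the comment above rather than copied from an actual config). Dropping a distinguishing label such as id leaves several cAdvisor series identical apart from their explicit timestamps, which the receiver then reports as inconsistent; removing the rule keeps the series distinct:

receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: cadvisor   # hypothetical job name
          metric_relabel_configs:
            # Removing a rule like this resolved the warning in the case above:
            # with it in place, series that differed only by "id" collapse into
            # duplicate points that carry different timestamps.
            - action: labeldrop
              regex: id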

Contributor

github-actions bot commented Dec 4, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Dec 4, 2024