Duplicated metrics for restarted Pods #2844

Open
mrliptontea opened this issue Apr 8, 2021 · 10 comments

@mrliptontea

I believe I have encountered a bug where multiple values are exported for the same Pod at the same point in time when that Pod has been restarted.

I have been doing some load tests against an app in K8s and I noticed something. The Pod had a memory limit of 1Gi, and while I was attacking the app with requests the Pods restarted a few times.

When I looked at the graphs in Grafana, it seemed like the Pods were using around 2.6GiB of memory, far over their limit. That didn't make much sense, so I investigated the query, which led me to this issue.

Querying the container_memory_working_set_bytes metric in Prometheus, I got the following:

container_memory_working_set_bytes{namespace="my-app", container="my-app"}
# result:
container_memory_working_set_bytes{container="my-app", endpoint="https-metrics", id="/kubepods/burstable/pod80ea171a-411f-48cf-a603-04bd71a03c1e/372804cd3c321ca3f908265296407d3e9f6bd06568aab5f7be7f7047081dd7dc", image="my-image", instance="172.22.218.162:10250", job="kubelet", metrics_path="/metrics/cadvisor", name="k8s_my-app_web-5dfd4896b4-zc7ll_my-app_80ea171a-411f-48cf-a603-04bd71a03c1e_7", namespace="my-app", node="node1", pod="web-5dfd4896b4-zc7ll"}	1038004224
container_memory_working_set_bytes{container="my-app", endpoint="https-metrics", id="/kubepods/burstable/pod80ea171a-411f-48cf-a603-04bd71a03c1e/8d37738784860ae70513378dba0df77acc15e52001cae5d73ab6799533d06a4d", image="my-image", instance="172.22.218.162:10250", job="kubelet", metrics_path="/metrics/cadvisor", name="k8s_my-app_web-5dfd4896b4-zc7ll_my-app_80ea171a-411f-48cf-a603-04bd71a03c1e_8", namespace="my-app", node="node1", pod="web-5dfd4896b4-zc7ll"}	815841280
container_memory_working_set_bytes{container="my-app", endpoint="https-metrics", id="/kubepods/burstable/pod80ea171a-411f-48cf-a603-04bd71a03c1e/89d3021410ad8d22f36b53ac714ad962c34a5debb4f6de60c7056620d0721156", image="my-image", instance="172.22.218.162:10250", job="kubelet", metrics_path="/metrics/cadvisor", name="k8s_my-app_web-5dfd4896b4-zc7ll_my-app_80ea171a-411f-48cf-a603-04bd71a03c1e_9", namespace="my-app", node="node1", pod="web-5dfd4896b4-zc7ll"}	948023296
container_memory_working_set_bytes{container="my-app", endpoint="https-metrics", id="/kubepods/burstable/pod80ea171a-411f-48cf-a603-04bd71a03c1e/4300c0a162a494e32ab02c5b3741d91cc6f4b22cc9049a8cdc3d008b57e2dd8b", image="my-image", instance="172.22.218.162:10250", job="kubelet", metrics_path="/metrics/cadvisor", name="k8s_my-app_web-5dfd4896b4-zc7ll_my-app_80ea171a-411f-48cf-a603-04bd71a03c1e_10", namespace="my-app", node="node1", pod="web-5dfd4896b4-zc7ll"}	85966848
container_memory_working_set_bytes{container="my-app", endpoint="https-metrics", id="/kubepods/burstable/podb7ecf824-6797-4183-9566-434142df3757/e384a937574526eebeee4438132a0e882c0517bce4ca84db14497d598cd93a09", image="my-image", instance="172.22.106.15:10250", job="kubelet", metrics_path="/metrics/cadvisor", name="k8s_my-app_web-5dfd4896b4-x4nsr_my-app_b7ecf824-6797-4183-9566-434142df3757_6", namespace="my-app", node="node2", pod="web-5dfd4896b4-x4nsr"}	980393984
container_memory_working_set_bytes{container="my-app", endpoint="https-metrics", id="/kubepods/burstable/podb7ecf824-6797-4183-9566-434142df3757/e5c6aec2cc6cf7abbd2163cd195b181954df28fc134478e2cfd27283c2a7838f", image="my-image", instance="172.22.106.15:10250", job="kubelet", metrics_path="/metrics/cadvisor", name="k8s_my-app_web-5dfd4896b4-x4nsr_my-app_b7ecf824-6797-4183-9566-434142df3757_7", namespace="my-app", node="node2", pod="web-5dfd4896b4-x4nsr"}	915144704
container_memory_working_set_bytes{container="my-app", endpoint="https-metrics", id="/kubepods/burstable/podb7ecf824-6797-4183-9566-434142df3757/fcce62cda4d8300c82f4552ddd069b59e8de30c31ece187a6349fe786f182e7a", image="my-image", instance="172.22.106.15:10250", job="kubelet", metrics_path="/metrics/cadvisor", name="k8s_my-app_web-5dfd4896b4-x4nsr_my-app_b7ecf824-6797-4183-9566-434142df3757_8", namespace="my-app", node="node2", pod="web-5dfd4896b4-x4nsr"}	799879168

Notice how individual restarts of the same Pod are recorded at the same point in time - see the name labels ending in _7, _8, _9, and _10, for example. These are four instances of the exact same Pod, but in reality they never ran at the same time (they are restarts). Adding these values together gives about 2.6GiB, which is the number I saw in Grafana. Other graphs confirm that memory usage on the nodes never registered a 2.6GiB increase; they only showed about a 1GiB increase, which matches my limit.

I use Amazon EKS with Kubernetes version v1.19.6-eks-49a6c0 which I believe uses cAdvisor v0.37.3.
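
One way to confirm the duplication (a sketch, assuming the label set shown above) is to count how many working-set series exist per pod and container at the same instant; anything greater than 1 means overlapping series from restarts:

count by (namespace, pod, container) (
  container_memory_working_set_bytes{namespace="my-app", container="my-app"}
) > 1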

@ZhangSIming-blyq

ZhangSIming-blyq commented Mar 24, 2022

What's the status of this issue? I'm hitting the same problem.
[chart screenshot]
In the chart above, the overlapping parts are where the pod is restarting, and during that time there are two series that differ only in the id label (the cgroup path). If I use a PromQL query like:

sum(rate(container_cpu_usage_seconds_total{xxx}[$interval])) by (pod)

to aggregate per pod, the value comes out doubled. Please let me know if this has already been fixed, thanks.

@JoeAshworth

Did you ever solve this? We're experiencing exactly the same thing.

@ZhangSIming-blyq

ZhangSIming-blyq commented Nov 29, 2022

I have solved this by changing

sum(rate(container_cpu_usage_seconds_total{xxx}[$interval])) by (pod)

to

max(rate(container_cpu_usage_seconds_total{xxx}[$interval])) by (pod)

Now that I think about it, it actually kind of makes sense...
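
A variant that keeps the per-container breakdown while still collapsing the duplicate id series might look like the sketch below (the matchers here are assumptions, not taken from the screenshots above):

max by (pod, container) (
  rate(container_cpu_usage_seconds_total{container!="", container!="POD"}[$interval])
)

max works on the assumption that the duplicate series describe the same workload, so only one of them should be counted; sum would add the stale series on top of the live one.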

@LeoHsiao1

LeoHsiao1 commented May 10, 2023

When a container within a Pod terminates due to OOM, k8s will automatically create a new container based on the restartPolicy.

It seems that cAdvisor caches the monitoring metrics of the old container for about 4 minutes, so sum(container_memory_working_set_bytes{container!~"POD|", pod="xx"}) adds up the memory of both the new container and the old one, making it look like the Pod is using twice as much memory as usual.

To avoid this problem, I use the following query expression to monitor the memory usage of each Pod:

container_memory_working_set_bytes{container='', pod!=''}

container='' selects the root cgroup node of each Pod, whose resource usage equals the sum of its individual containers.
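
As a usage example (a sketch assuming the standard kubelet/cAdvisor labels), the same pod-level cgroup series can be aggregated further, e.g. per namespace:

sum by (namespace) (
  container_memory_working_set_bytes{container="", pod!=""}
)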

@mrliptontea
Author

I never actually managed to find a solution; whenever I'm dealing with OOM kills I'm wary of this problem, so I never trust peak memory usage readings.

But I suppose something like sum(max by (container) (...)) would work for both single-container and multi-container (sidecar) pods.
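
Spelled out, such a query might look like the sketch below; the label matchers are assumptions and may need adjusting per cluster:

sum by (namespace, pod) (
  max by (namespace, pod, container) (
    container_memory_working_set_bytes{container!="", container!="POD"}
  )
)

The inner max collapses the duplicate per-restart series of each container, and the outer sum then adds a pod's containers together.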

@jtnz

jtnz commented Nov 1, 2023

I think I'm seeing something similar. When using Karpenter (AWS EKS) for auto-scaling, Karpenter adds the node.kubernetes.io/exclude-from-external-load-balancers label when a node is about to go away. This causes duplicate series to show up in these cAdvisor/kubelet container metrics, which is a nightmare for Prometheus queries, especially when using binary operators (e.g. /, *, etc.). Your query might be fine, then you hit a window where there are duplicates and get errors about needing group_left/group_right to deal with one-to-many or many-to-one matching. A hack solution is to always apply some aggregation, e.g. max, sum, or the like.
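
For example (a sketch that assumes kube-state-metrics is also scraped), aggregating both sides of a binary operation first sidesteps the matching errors when duplicate cAdvisor series appear:

  max by (namespace, pod, container) (container_memory_working_set_bytes{container!="", container!="POD"})
/
  max by (namespace, pod, container) (kube_pod_container_resource_limits{resource="memory"})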

@jtnz

jtnz commented Nov 2, 2023

It seems that cAdvisor will cache the monitoring metrics of the old container for 4 minutes

@LeoHsiao1 Just curious, where did you find this? I'm trying to understand the situation better and whether this is expected behavior or a bug. I tried to reproduce it locally by running a cAdvisor container, watching /metrics, and starting and stopping an nginx container. When it stopped, the metrics disappeared right away, so I'm not sure whether this matches the behavior of the cAdvisor built into the kubelet.

@LeoHsiao1

@jtnz Hi,
The 4 minutes is my speculation based on the Prometheus charts; I don't have a documented source for it.

@renepupil

renepupil commented Dec 7, 2023

How should I interpret two values reported for the same time and the same container but with DIFFERENT name labels?

Is the used memory the sum of these two (915144704 + 799879168) or the maximum of these two values (max(915144704, 799879168))?

My guess is the maximum, and that it's simply a duplicate report with the same timestamp but different microseconds.
But another possibility is that the container was restarting and the two values belong to "different" containers with overlapping lifetimes; in that case a sum would be correct, as both containers actually used the reported memory bytes...

container_memory_working_set_bytes{container="my-app", endpoint="https-metrics", id="/kubepods/burstable/podb7ecf824-6797-4183-9566-434142df3757/e5c6aec2cc6cf7abbd2163cd195b181954df28fc134478e2cfd27283c2a7838f", image="my-image", instance="172.22.106.15:10250", job="kubelet", metrics_path="/metrics/cadvisor", name="k8s_my-app_web-5dfd4896b4-x4nsr_my-app_b7ecf824-6797-4183-9566-434142df3757_7", namespace="my-app", node="node2", pod="web-5dfd4896b4-x4nsr"}	915144704
container_memory_working_set_bytes{container="my-app", endpoint="https-metrics", id="/kubepods/burstable/podb7ecf824-6797-4183-9566-434142df3757/fcce62cda4d8300c82f4552ddd069b59e8de30c31ece187a6349fe786f182e7a", image="my-image", instance="172.22.106.15:10250", job="kubelet", metrics_path="/metrics/cadvisor", name="k8s_my-app_web-5dfd4896b4-x4nsr_my-app_b7ecf824-6797-4183-9566-434142df3757_8", namespace="my-app", node="node2", pod="web-5dfd4896b4-x4nsr"}	799879168

I appreciate your feedback.

@lerminou

Same here, any news on this?
