Duplicated metrics for restarted Pods #2844

Open
mrliptontea opened this issue Apr 8, 2021 · 10 comments

@mrliptontea

I believe I have encountered a bug where multiple values are exported for the same Pod at the same point in time when that Pod has been restarted.

I have been doing some load tests against an app in K8s and I noticed something. The Pod had a memory limit of 1Gi, and while I was attacking the app with requests the Pods restarted a few times.

When I looked at the graphs in Grafana, it seemed like the Pods were using around 2.6GiB of memory, far over their limit. That didn't make much sense, so I investigated the query, which led me to this issue.

Querying the container_memory_working_set_bytes metric in Prometheus, I got the following:

container_memory_working_set_bytes{namespace="my-app", container="my-app"}
# result:
container_memory_working_set_bytes{container="my-app", endpoint="https-metrics", id="/kubepods/burstable/pod80ea171a-411f-48cf-a603-04bd71a03c1e/372804cd3c321ca3f908265296407d3e9f6bd06568aab5f7be7f7047081dd7dc", image="my-image", instance="172.22.218.162:10250", job="kubelet", metrics_path="/metrics/cadvisor", name="k8s_my-app_web-5dfd4896b4-zc7ll_my-app_80ea171a-411f-48cf-a603-04bd71a03c1e_7", namespace="my-app", node="node1", pod="web-5dfd4896b4-zc7ll"}	1038004224
container_memory_working_set_bytes{container="my-app", endpoint="https-metrics", id="/kubepods/burstable/pod80ea171a-411f-48cf-a603-04bd71a03c1e/8d37738784860ae70513378dba0df77acc15e52001cae5d73ab6799533d06a4d", image="my-image", instance="172.22.218.162:10250", job="kubelet", metrics_path="/metrics/cadvisor", name="k8s_my-app_web-5dfd4896b4-zc7ll_my-app_80ea171a-411f-48cf-a603-04bd71a03c1e_8", namespace="my-app", node="node1", pod="web-5dfd4896b4-zc7ll"}	815841280
container_memory_working_set_bytes{container="my-app", endpoint="https-metrics", id="/kubepods/burstable/pod80ea171a-411f-48cf-a603-04bd71a03c1e/89d3021410ad8d22f36b53ac714ad962c34a5debb4f6de60c7056620d0721156", image="my-image", instance="172.22.218.162:10250", job="kubelet", metrics_path="/metrics/cadvisor", name="k8s_my-app_web-5dfd4896b4-zc7ll_my-app_80ea171a-411f-48cf-a603-04bd71a03c1e_9", namespace="my-app", node="node1", pod="web-5dfd4896b4-zc7ll"}	948023296
container_memory_working_set_bytes{container="my-app", endpoint="https-metrics", id="/kubepods/burstable/pod80ea171a-411f-48cf-a603-04bd71a03c1e/4300c0a162a494e32ab02c5b3741d91cc6f4b22cc9049a8cdc3d008b57e2dd8b", image="my-image", instance="172.22.218.162:10250", job="kubelet", metrics_path="/metrics/cadvisor", name="k8s_my-app_web-5dfd4896b4-zc7ll_my-app_80ea171a-411f-48cf-a603-04bd71a03c1e_10", namespace="my-app", node="node1", pod="web-5dfd4896b4-zc7ll"}	85966848
container_memory_working_set_bytes{container="my-app", endpoint="https-metrics", id="/kubepods/burstable/podb7ecf824-6797-4183-9566-434142df3757/e384a937574526eebeee4438132a0e882c0517bce4ca84db14497d598cd93a09", image="my-image", instance="172.22.106.15:10250", job="kubelet", metrics_path="/metrics/cadvisor", name="k8s_my-app_web-5dfd4896b4-x4nsr_my-app_b7ecf824-6797-4183-9566-434142df3757_6", namespace="my-app", node="node2", pod="web-5dfd4896b4-x4nsr"}	980393984
container_memory_working_set_bytes{container="my-app", endpoint="https-metrics", id="/kubepods/burstable/podb7ecf824-6797-4183-9566-434142df3757/e5c6aec2cc6cf7abbd2163cd195b181954df28fc134478e2cfd27283c2a7838f", image="my-image", instance="172.22.106.15:10250", job="kubelet", metrics_path="/metrics/cadvisor", name="k8s_my-app_web-5dfd4896b4-x4nsr_my-app_b7ecf824-6797-4183-9566-434142df3757_7", namespace="my-app", node="node2", pod="web-5dfd4896b4-x4nsr"}	915144704
container_memory_working_set_bytes{container="my-app", endpoint="https-metrics", id="/kubepods/burstable/podb7ecf824-6797-4183-9566-434142df3757/fcce62cda4d8300c82f4552ddd069b59e8de30c31ece187a6349fe786f182e7a", image="my-image", instance="172.22.106.15:10250", job="kubelet", metrics_path="/metrics/cadvisor", name="k8s_my-app_web-5dfd4896b4-x4nsr_my-app_b7ecf824-6797-4183-9566-434142df3757_8", namespace="my-app", node="node2", pod="web-5dfd4896b4-x4nsr"}	799879168

Notice how individual restarts of the same Pod are recorded at the same point in time - see the name labels ending in _7, _8, _9, and _10, for example. These are four instances of the exact same Pod, but in reality they never ran at the same time (they are restarts). Adding these values together gives about 2.6GiB, which is the number I saw in Grafana. Other graphs confirm that memory usage on the nodes never registered a 2.6GiB increase; they only showed about a 1GiB increase, which matches my limit.

I use Amazon EKS with Kubernetes version v1.19.6-eks-49a6c0 which I believe uses cAdvisor v0.37.3.
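
One way to confirm the duplication (a sketch, assuming the label set shown above) is to count how many working-set series exist per pod and container at the same instant; anything greater than 1 means overlapping series from restarts:

count by (namespace, pod, container) (
  container_memory_working_set_bytes{namespace="my-app", container="my-app"}
) > 1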

@ZhangSIming-blyq

ZhangSIming-blyq commented Mar 24, 2022

What's the status of this issue? I'm hitting the same problem.
[chart screenshot]
In the chart above, the overlapping parts are where the pod is restarting, and during that time there are two series that differ only in the id label (the cgroup path). If I use a PromQL query like:

sum(rate(container_cpu_usage_seconds_total{xxx}[$interval])) by (pod)

to aggregate per pod, the value comes out doubled. Please let me know if this has already been fixed, thanks.

@JoeAshworth

Did you ever solve this? We're experiencing exactly the same thing.

@ZhangSIming-blyq

ZhangSIming-blyq commented Nov 29, 2022

I have solved this by changing

sum(rate(container_cpu_usage_seconds_total{xxx}[$interval])) by (pod)

to

max(rate(container_cpu_usage_seconds_total{xxx}[$interval])) by (pod)

Now that I think about it, it actually kind of makes sense...
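
A variant that keeps the per-container breakdown while still collapsing the duplicate id series might look like the sketch below (the matchers here are assumptions, not taken from the screenshots above):

max by (pod, container) (
  rate(container_cpu_usage_seconds_total{container!="", container!="POD"}[$interval])
)

max works on the assumption that the duplicate series describe the same workload, so only one of them should be counted; sum would add the stale series on top of the live one.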

@LeoHsiao1

LeoHsiao1 commented May 10, 2023

When a container within a Pod terminates due to OOM, k8s will automatically create a new container based on the restartPolicy.

It seems that cAdvisor caches the monitoring metrics of the old container for about 4 minutes, so sum(container_memory_working_set_bytes{container!~"POD|", pod="xx"}) adds up the memory of both the new container and the old one, making it look like the Pod is using twice as much memory as usual.

To avoid this problem, I use the following query expression to monitor the memory usage of each Pod:

container_memory_working_set_bytes{container='', pod!=''}

container='' selects the root cgroup node of each Pod, whose resource usage equals the sum of its individual containers.
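
As a usage example (a sketch assuming the standard kubelet/cAdvisor labels), the same pod-level cgroup series can be aggregated further, e.g. per namespace:

sum by (namespace) (
  container_memory_working_set_bytes{container="", pod!=""}
)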

@mrliptontea
Author

I never actually managed to find a solution; whenever I'm dealing with OOM kills I'm wary of this problem, so I never trust peak memory usage readings.

But I suppose something like sum(max by (container) (...)) would work for both single-container and multi-container (sidecar) pods.
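
Spelled out, such a query might look like the sketch below; the label matchers are assumptions and may need adjusting per cluster:

sum by (namespace, pod) (
  max by (namespace, pod, container) (
    container_memory_working_set_bytes{container!="", container!="POD"}
  )
)

The inner max collapses the duplicate per-restart series of each container, and the outer sum then adds a pod's containers together.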

@jtnz

jtnz commented Nov 1, 2023

I think I'm seeing something similar. When using Karpenter (AWS EKS) for auto-scaling, Karpenter adds the node.kubernetes.io/exclude-from-external-load-balancers label when a node is about to go away. This causes duplicate series to show up in these cAdvisor/kubelet container metrics, which is a nightmare for Prometheus queries, especially when using binary operators (e.g. /, *, etc.). Your query might be fine, then you hit a window where there are duplicates and get errors about needing group_left/group_right to deal with one-to-many or many-to-one matching. A hack solution is to always apply some aggregation, e.g. max, sum, or the like.
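
For example (a sketch that assumes kube-state-metrics is also scraped), aggregating both sides of a binary operation first sidesteps the matching errors when duplicate cAdvisor series appear:

  max by (namespace, pod, container) (container_memory_working_set_bytes{container!="", container!="POD"})
/
  max by (namespace, pod, container) (kube_pod_container_resource_limits{resource="memory"})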

@jtnz

jtnz commented Nov 2, 2023

It seems that cAdvisor will cache the monitoring metrics of the old container for 4 minutes

@LeoHsiao1 Just curious, where did you find this? I'm trying to understand the situation better and whether this is expected behavior or a bug. I tried to reproduce it locally by running a cAdvisor container, watching /metrics, and starting and stopping an nginx container. When it stopped, the metrics disappeared right away, so I'm not sure whether this matches the behavior of the cAdvisor built into the kubelet.

@LeoHsiao1

@jtnz Hi,
The 4 minutes is my speculation based on the Prometheus charts; I don't have a documented source for it.

@renepupil

renepupil commented Dec 7, 2023

How should I interpret two values reported for the same time and the same container but with DIFFERENT name labels?

Is the used memory the sum of these two (915144704 + 799879168) or the maximum of these two values (max(915144704, 799879168))?

My guess is the maximum, and that it's simply a duplicate report with the same timestamp but different microseconds.
But another possibility is that the container was restarting and the two values belong to "different" containers with overlapping lifetimes; in that case a sum would be correct, as both containers actually used the reported memory bytes...

container_memory_working_set_bytes{container="my-app", endpoint="https-metrics", id="/kubepods/burstable/podb7ecf824-6797-4183-9566-434142df3757/e5c6aec2cc6cf7abbd2163cd195b181954df28fc134478e2cfd27283c2a7838f", image="my-image", instance="172.22.106.15:10250", job="kubelet", metrics_path="/metrics/cadvisor", name="k8s_my-app_web-5dfd4896b4-x4nsr_my-app_b7ecf824-6797-4183-9566-434142df3757_7", namespace="my-app", node="node2", pod="web-5dfd4896b4-x4nsr"}	915144704
container_memory_working_set_bytes{container="my-app", endpoint="https-metrics", id="/kubepods/burstable/podb7ecf824-6797-4183-9566-434142df3757/fcce62cda4d8300c82f4552ddd069b59e8de30c31ece187a6349fe786f182e7a", image="my-image", instance="172.22.106.15:10250", job="kubelet", metrics_path="/metrics/cadvisor", name="k8s_my-app_web-5dfd4896b4-x4nsr_my-app_b7ecf824-6797-4183-9566-434142df3757_8", namespace="my-app", node="node2", pod="web-5dfd4896b4-x4nsr"}	799879168

I appreciate your feedback.

@lerminou

Same here, any news on this?
