k8sattributesprocessor missing to consistently fetch pod attributes #13119
Comments
Hi @rajatvig. Thanks for reporting the issue. It's possible that the issue is caused by #8465, which changed the representation of the internal cache keys. @rajatvig, if you can provide any more guidance on how to reproduce the issue, that would be awesome. @sumo-drosiek, please let us know if you have a chance to look into this. Otherwise I can take it.
@dmitryax We have a very straightforward OTel Collector setup: collectors running as DaemonSets with passthrough on, which send to collectors running as a gateway that add the rest of the attributes. The RBAC is as documented, and there is very little load in some of the clusters. Podinfo is the sample application we have deployed, and we see misses for one of its two pods almost all the time until we restart the collector or the pod. The configuration is just the straightforward agent/gateway setup described above.
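For readers trying to reproduce this, the topology described above roughly corresponds to the minimal sketch below. This is an illustrative assumption, not the reporter's actual configuration; receivers, exporters, and pipeline wiring are omitted.

```yaml
# Agent collector (DaemonSet): passthrough mode only stamps the originating
# pod IP onto the telemetry; no Kubernetes API lookups happen on the node.
processors:
  k8sattributes:
    passthrough: true
---
# Gateway collector: looks the pod up (by default via the pod IP / connection)
# and attaches the metadata. The metadata list here is illustrative.
processors:
  k8sattributes:
    extract:
      metadata:
        - k8s.pod.name
        - k8s.namespace.name
        - k8s.node.name
```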
I suspect it is the internal cache, as I see the augmentation requests for both pods arrive (at different collectors), and one pod is always found while the other never is. For larger clusters, the miss rate is higher and is spread across all gateway collector pods.
Thank you for providing the details, @rajatvig. I will take a look into this.
@dmitryax I was off until today and will be off next week, so not much time to investigate :/
I can confirm this issue; we had to revert back to 0.54, as probably 2/3 of our traces didn't have k8s attributes attached.
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure which component this issue relates to, please ping the code owners. Pinging code owners: see Adding Labels via Comments if you do not have permissions to add labels yourself.
This is fixed on our end. |
@rajatvig |
We are on 0.64 now and the issue isn't there anymore. |
This issue appears to crop up even for the latest versions. Should this be closed? |
Describe the bug
Since the 0.55.0 release of the Collector, the k8sattributesprocessor has been behaving inconsistently when it comes to fetching resource attributes for pods. It was working fine in the 0.54.0 release.
Looking at the commit log between the two releases, #8465 is the only change I can see that could have altered the behaviour, but I am unable to pinpoint what exactly is wrong.
For a deployment with 2 pods, one collector always looks up the metadata for one pod successfully, while telemetry for the other pod goes to another collector and never gets anything attached.
Restarting the collector pod or the deployment pod moves the problem to another collector pod, and it just keeps bouncing around that way.
With debug logging turned on, I do see the requests land at the collector, but it is not able to find the pod in its in-memory cache. RBAC is fine, since some pods are getting the attributes, and the configuration did not change between the releases.
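For context, the RBAC the processor documentation calls for is roughly the following; this is a hedged sketch (names and namespace are assumptions), and some attributes may require access to additional resources depending on the processor version.

```yaml
# ClusterRole granting the collector's service account read access to pods
# and namespaces, which the k8sattributes processor watches to build its cache.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector-k8sattributes   # name is an assumption
rules:
  - apiGroups: [""]
    resources: ["pods", "namespaces"]
    verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otel-collector-k8sattributes
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: otel-collector-k8sattributes
subjects:
  - kind: ServiceAccount
    name: otel-collector   # assumption: whichever service account the gateway pods use
    namespace: default     # assumption
```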
I tried adding pod_association rules (roughly as sketched below), but that did not help either.
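As an illustration of the kind of rules involved (an assumption, not the exact rules that were tried), pod_association in the 0.55–0.57 line used a flat list of rules; newer releases nest these under a `sources` list.

```yaml
processors:
  k8sattributes:
    pod_association:
      # Match the pod by the k8s.pod.ip resource attribute stamped by the
      # passthrough agent running on the node.
      - from: resource_attribute
        name: k8s.pod.ip
      # Fall back to the source IP of the incoming connection.
      - from: connection
```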
What version did you use?
Version: 0.55.0
We upgraded to 0.57.2 and it has not helped, though the misses are slightly fewer.
What config did you use?
Environment
Running on GKE using the vanilla upstream Docker image.