Couldn't get resource list for external.metrics.k8s.io/v1beta1 #486
Comments
This appears to be happening in 1.5 too. |
I noticed that after installing the CloudWatch adapter ( |
Same issue with 1.14 |
Same issue with |
Any updates on this issue? |
What version of the CNI?
I'd recommend updating to v1.5.5. |
Happens on v1.5.5 too, but does not seem to affect anything. |
1.6 is out. Has anyone verified if this issue is resolved? |
I've deployed version 1.6 on our clusters and it seems to have been resolved! Maybe someone can verify this for their situation? |
Just updated the plugin in all 4 of my clusters and I can confirm 1.6 resolves this issue |
Undersigned. All my clusters no longer present the issue with 1.6. This ticket can be closed. |
Thanks for the feedback, folks! Closing out. |
I'm seeing this when going from 1.6.1 to 1.6.2, and rolling back to 1.6.1 seemed to have no effect. I've tried extending the probes as described in #872, since I noticed the same 1s timeout issue, but the result is the same. There's no eksctl here.
|
We identified our issue: the IAM role with the CNI policy had an incorrect target, so that role and its permissions could not be used. |
We're still getting this on 1.6.3. Any ideas? |
I am using 1.6.3 and the issue persists for me as well. It only happens on my tainted nodes. |
@nesl247 and @anupash147 is ipamd not starting up at all on these affected nodes, or does it eventually start up? |
@mogren let's reopen this and investigate the issue further. I'm not sure what we actually did to mitigate the issue |
We have to understand two things
Not sure if (1) affects (2), but (1) might give some hints for answering (2). |
For 1, it would probably be one of 2 things:
@SaranBalaji90 It is starting. This doesn't seem to cause any issues that we can tell. |
This error ( I agree that vpc-cni doesn't actually need this information, which is why the OP found that vpc-cni continued just fine despite this "error" message. |
We have the same issue, and for us, the v1beta1.external.metrics.k8s.io apiservice has been created for the datadog cluster agent. I'm not sure where to go from here. |
Just ignore the "error"? It's purely noise. (I think the "failed to start" issue above was found to be unrelated?) Confusion is bad and we should absolutely fix the code to not report this "error" - which will require a change to client-go. ... but at the same time, there's no actual functionality impact here, so it's going to be a low priority for everyone. If you have this error yourself, and you want it to go away: Your best bet is going to be to fix the metrics service (or whatever aggregated api server is reported in the error) or remove the relevant apiservice registration ( |
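To see which aggregated API server sits behind a group named in the error, the APIService registrations can be inspected. Below is a minimal Python sketch, assuming the output of `kubectl get apiservice -o json` has already been loaded into a dict. The `SAMPLE` data, including the `monitoring/example-adapter` backend, is hypothetical and only mimics the shape of that output.

```python
from typing import List, Tuple


def aggregated_apiservices(apiservice_list: dict) -> List[Tuple[str, str, bool]]:
    """Return (name, backing service, available) for each aggregated APIService.

    APIServices served by kube-apiserver itself have no spec.service and are skipped.
    """
    rows = []
    for item in apiservice_list.get("items", []):
        service = item.get("spec", {}).get("service")
        if not service:
            continue  # local API group, not an aggregated API server
        conditions = item.get("status", {}).get("conditions", [])
        available = any(
            c.get("type") == "Available" and c.get("status") == "True"
            for c in conditions
        )
        backend = f"{service.get('namespace')}/{service.get('name')}"
        rows.append((item["metadata"]["name"], backend, available))
    return rows


# Hypothetical sample shaped like `kubectl get apiservice -o json` output.
SAMPLE = {
    "items": [
        {
            "metadata": {"name": "v1.apps"},
            "spec": {"group": "apps", "version": "v1"},
            "status": {"conditions": [{"type": "Available", "status": "True"}]},
        },
        {
            "metadata": {"name": "v1beta1.external.metrics.k8s.io"},
            "spec": {
                "group": "external.metrics.k8s.io",
                "version": "v1beta1",
                "service": {"namespace": "monitoring", "name": "example-adapter"},
            },
            "status": {"conditions": [{"type": "Available", "status": "True"}]},
        },
    ]
}

for name, backend, available in aggregated_apiservices(SAMPLE):
    print(name, backend, "available" if available else "unavailable")
```

An entry that reports as available while the error is still logged is consistent with the observation in this thread that the metrics services themselves are not failing.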
Just as an FYI, the metrics services, in our case metrics-server and datadog, are not reporting any errors just in case this is ever determined to actually be an issue. |
It will be really helpful if someone can paste the output of For example,
|
The error indicates that there are no custom metrics and that the resource list for that API was empty. As @anguslees pointed out, this shouldn't have any functional impact, and I see that this error log for an empty resource list was addressed/removed in K8S 1.14+. I'm interested to know your K8S versions, @hden @homme @nesl247 @anupash147 |
For us it's been 1.14-1.17. |
From #1055 (comment), it looks like reading the docker socket is taking time. I'd like to see what others are seeing. |
If anyone is seeing If the |
same issue on EKS
EDIT: For me the errors are caused by outdated aws-cni deployments. You might have an outdated version (like me), check here and update your cluster accordingly. This solved the error and now everything works okay. |
Since issue #950 has been closed, will update here.
Our deployment file matches the latest example config for 1.7 |
I'm seeing this in CNI version 1.7.5 also:
There is no such resource as Interestingly I don't see this error on a newer 1.17 cluster. Only on 1.17 cluster that has existed since 1.14. |
@max-rocket-internet I'm also seeing something similar, but in my case the cluster was created on 1.17.
Kubernetes version: aws-node: kube-proxy: |
Cause
The issue happens when the external metrics API is enabled (likely because the prometheus adapter is deployed) but no external metric is declared. This is a totally valid scenario but is not handled correctly. You can check if you're in this situation by running the following command and checking that the
This behaviour comes from k8s/client-go and was fixed in November 2018 by this commit: kubernetes/client-go@5865254#diff-8cc59cc0342ea488b3db0c5a07f4bffda3edb171ca58c6fc0265d10e515018dfL134
Line 50 in 0d5bc7e
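The condition described above (the API group is registered but declares no metrics) can be recognised from the group's discovery document. Below is a minimal Python sketch, assuming the document has already been fetched, for instance with `kubectl get --raw /apis/external.metrics.k8s.io/v1beta1`. That is one way to obtain it and not necessarily the command the comment had in mind. The function name and the `EMPTY_DOC` and `NON_EMPTY_DOC` payloads are illustrative, and `example_metric` is a made-up metric name.

```python
import json


def external_metrics_is_empty(discovery_json: str) -> bool:
    """Return True when an APIResourceList document declares no resources."""
    doc = json.loads(discovery_json)
    if doc.get("kind") != "APIResourceList":
        raise ValueError(f"unexpected kind: {doc.get('kind')!r}")
    # A missing or null "resources" field is treated the same as an empty list.
    return not doc.get("resources")


# Illustrative payload: the API is registered but no external metric is declared.
EMPTY_DOC = json.dumps({
    "kind": "APIResourceList",
    "apiVersion": "v1",
    "groupVersion": "external.metrics.k8s.io/v1beta1",
    "resources": [],
})

# Illustrative payload: one (hypothetical) external metric is declared.
NON_EMPTY_DOC = json.dumps({
    "kind": "APIResourceList",
    "apiVersion": "v1",
    "groupVersion": "external.metrics.k8s.io/v1beta1",
    "resources": [
        {"name": "example_metric", "namespaced": True,
         "kind": "ExternalMetricValueList", "verbs": ["get"]},
    ],
})

print(external_metrics_is_empty(EMPTY_DOC))
print(external_metrics_is_empty(NON_EMPTY_DOC))
```

If the first case applies to a cluster, the log line matches the empty-list scenario described here rather than a failing metrics backend.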
How to fix
To fix this issue one has to update k8s/client-go to a version that contains the commit mentioned earlier.
Workaround
If you don't want the prometheus adapter
Properly remove the prometheus-adapter chart; it will unregister the external and custom metrics APIs from the apiserver.
If you do want the prometheus adapter
We don't control the CNI version AWS deploys when we request an EKS cluster, so as long as the library is not updated we'll have to live with it. The easiest way to make sure the error stops popping up is to declare a dummy external metric. For example with the prometheus-adapter chart one could do:
|
Client-go is updated and the operator-sdk dependency is removed in #1419. Release 1.8 has these fixes. Closing the issue for now; please try it and let us know if the issue still exists. |
Image:
602401143452.dkr.ecr.ap-northeast-1.amazonaws.com/amazon-k8s-cni:v1.4.1
Symptoms
All nodes are ready and healthy. Secondary IPs are being allocated. But the aws-node pod periodically prints the following error message.
Why is it trying to access the external metrics anyway?
Extra information