aws-node pod does not start correctly the first time #1702
Comments
Can you please share the node logs? You can run this script: sudo bash /opt/cni/bin/aws-cni-support.sh
Hi,
I'd suggest reading https://medium.com/keikoproj/rapid-auto-scaling-on-eks-part-1-bb4de84fc599 - it might be that kube-proxy hasn't started yet by the time the CNI tries to start, in which case it can't connect to the control plane.
Yes, as @backjo mentioned, kube-proxy is taking time here. aws-node successfully started at -
One option is to set "--bindAddress" to 127.0.0.1. This fixes the address family as IPv4, so kube-proxy won't wait to get the node IP, but it will break IPv6. Refs: #1078 (comment) and https://gist.github.com/M00nF1sh/84d380b4e08017a5bc958658f7010914. We are working on including this in the default kube-proxy manifest.
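For illustration, here is a minimal sketch of what this setting could look like, assuming kube-proxy is driven by a KubeProxyConfiguration file. In the configuration file the field is named bindAddress; the equivalent command-line flag is --bind-address. Where the file lives is an assumption here (on EKS it is typically the kube-proxy-config ConfigMap in the kube-system namespace), and this is not a copy of the manifest in the linked gist.

```yaml
# Fragment of a kube-proxy configuration file (KubeProxyConfiguration).
# Only bindAddress is shown; every other field is omitted and stays as it is.
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
# 127.0.0.1 pins the address family to IPv4, so kube-proxy does not
# wait for the node IP at startup. This is not suitable for IPv6 clusters.
bindAddress: 127.0.0.1
```

After changing the configuration, the kube-proxy pods need to be restarted to pick it up. Note that a managed add-on may overwrite manual edits, which is the concern raised later in this thread.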
@xtroncode - Did the above workaround work for you?
Hi @jayanthvn, I haven't been able to try it out yet. Will try it out today and let you know. Thanks.
Hi @jayanthvn, I tried setting
Hi @xtroncode, I assume you are using the kube-proxy managed add-on? We are working on making it part of the default manifest.
OK, thanks. Is there any workaround for this until it is added as the default?
@xtroncode - the workaround for now is to include this -
Will close this issue for now; please reach out if you need any more information.
Hi @jayanthvn - Is there any update on when this will be included as default? Thanks
@ChrisRamsayITV - I will check with the team and get back to you next week.
Hi @jayanthvn, is there any update on when
The EKS addon for kube-proxy introduced regressions of #124 and #209. We will apply the recommended overrides from aws/containers-roadmap#657 and aws/amazon-vpc-cni-k8s#1702 in the manifests until the EKS addon applies these by default or allows you to override config in the addon.
@jayanthvn Any news on that? Is there an issue we can vote on to push the fix? :)
The default kube-proxy manifest for 1.22 will have this change. The release calendar can be found here: https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html
What happened:
When a new node is started in a nodegroup, it takes a long time to be marked Ready because the aws-node (CNI) pod does not start correctly the first time and has to undergo 1-2 restarts. The restarts are also delayed because the initial delay for the liveness probe is set to 60 sec. If we increase the failure threshold for the liveness probe, the aws-node pod is still not marked Ready after 7-8 minutes (possibly longer; it does not seem to run at all, as the readiness probes never succeed).
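To make the probe settings mentioned above concrete, here is a hedged sketch of the relevant fields in a container spec. Only initialDelaySeconds: 60 comes from this report. The periodSeconds and failureThreshold values shown are the Kubernetes defaults and are assumptions, since the actual values in the aws-node manifest and the raised threshold the reporter tried are not stated here. The probe handler is omitted.

```yaml
# Illustrative fragment only; not the actual aws-node manifest.
livenessProbe:
  # (probe handler such as exec/httpGet omitted)
  initialDelaySeconds: 60   # from the report: no liveness check for the first 60 s
  periodSeconds: 10         # Kubernetes default; assumed, not from the report
  failureThreshold: 3       # Kubernetes default; assumed, not from the report
```

With settings like these, a container that fails to start is restarted only after the initial delay has elapsed and failureThreshold consecutive probes have failed, which is why each failed start adds well over a minute before the node can become Ready.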
What you expected to happen:
We expect the nodes to be Ready in under a minute.
How to reproduce it (as minimally and precisely as possible):
We are facing this on a newly created EKS cluster, so it should be easily reproducible.
Environment:
Kubernetes version (use kubectl version): Server Version: version.Info{Major:"1", Minor:"21+", GitVersion:"v1.21.2-eks-0389ca3", GitCommit:"8a4e27b9d88142bbdd21b997b532eb6d493df6d2", GitTreeState:"clean", BuildDate:"2021-07-31T01:34:46Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}
CNI Version: v1.9.1-eksbuild.1
OS (e.g. cat /etc/os-release): Amazon Linux 2
Kernel (e.g. uname -a): 5.4.149-73.259.amzn2.x86_64