amazonvpc is not working with Ubuntu 22.04 (Jammy) #15720
Comments
Thank you, I got it.
Let's keep it open for some time.
This still seems to be happening with: ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20230919
@colathro this will continue happening until someone from AWS fixes aws/amazon-vpc-cni-k8s#2103.
It also prevents a new kops cluster with networking=amazonvpc from coming up healthy. In my case, the core-dns-xx and ebs-csi-node pods kept crashing. For core-dns the log read: "plugin/error timeout when trying to connect to Amazon Provided DNS server". For ebs-csi-node the error was that it was unable to get the Node (it was trying on 100.64. - not sure why). The workaround is to use the 20.04 image instead. The error messages are so cryptic that it took me a while to figure out.
I ran into this as well while upgrading a test cluster from kubernetes 1.26.5 to 1.27.6 using kops 1.28. The error from ebs-plugin container of the ebs-csi-node running on Ubuntu 22.04 is shown below. Reverting the node images to Ubuntu 20.04 (ubuntu-focal-20.04-amd64-server-20230502) allowed the rolling-restart with --cloudonly to cleanly restart the affected control-plane nodes.
Can't kops work around this issue by simply NOT updating to Ubuntu 22.04 for instances running in AWS? Seems silly to keep breaking everyone's clusters like this.
kOps is not just about clusters using the AWS VPC CNI. All other CNIs and components work fine with Ubuntu 22.04. It is probably a good idea to add something that locks clusters with the AWS VPC CNI to Ubuntu 20.04.
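One way to apply that kind of lock manually today is to pin the instance group image explicitly. A minimal sketch of the relevant InstanceGroup manifest fragment, assuming an instance group named "nodes" and Canonical's public AMI naming convention; the instance group name, and the exact Focal build to use, are assumptions you should verify against the images listed in the kOps docs:

```yaml
# Hypothetical InstanceGroup fragment: pins nodes to Ubuntu 20.04 (Focal)
# instead of 22.04 (Jammy) while the AWS VPC CNI issue is unresolved.
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes  # assumed instance group name
spec:
  # Canonical's public Focal AMI path; this build date is one reported
  # in this thread, not a recommendation.
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20230502
```

Editing the instance group (kops edit ig), applying the change, and rolling the nodes matches the workaround several commenters above reported as effective.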
It's worth marking it as unstable (https://github.com/kubernetes/kops/blob/master/docs/operations/images.md), as we tried to upgrade the Ubuntu version and faced issues in the cluster.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages untriaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
/remove-lifecycle stale
/cc @moshevayner
This is related to #16255
/assign |
/kind bug
1. What kops version are you running? The command kops version will display this information.

I tried the same thing with the master branch (a8fa895).

2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running or provide the Kubernetes version specified as a kops flag.

3. What cloud provider are you using?
AWS
4. What commands did you run? What is the simplest way to reproduce this issue?
5. What happened after the commands executed?
A Kubernetes cluster is created, but some pods are not working, so the kops validate command fails. For example, cert-manager and ebs-csi-node report errors.
ebs-csi-node:
cert-manager-webhook:
6. What did you expect to happen?

All pods work fine, and kops validate succeeds.

7. Please provide your cluster manifest. Execute kops get --name my.example.com -o yaml to display your cluster manifest. You may want to remove your cluster name and other sensitive information.

8. Please run the commands with most verbose logging by adding the -v 10 flag. Paste the logs into this report, or in a gist and provide the gist link here.
9. Anything else do we need to know?
If I specify cilium as the CNI in networking, it works fine (I tried Cilium with ENI, and it also works fine). If I change the image to Ubuntu 20.04 (I tried 099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20211015), it works fine with amazonvpc.

In conclusion, I suspect the combination of amazonvpc and Ubuntu 22.04.
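For reference, the Cilium workaround described above corresponds to a small change in the cluster spec's networking section. A minimal sketch, assuming kOps's documented networking block; reading "Cilium with ENI" as Cilium's eni IPAM mode is an interpretation, not something the report states explicitly:

```yaml
# Hypothetical cluster spec fragment: use Cilium instead of the AWS VPC CNI.
spec:
  networking:
    cilium: {}
    # Or, for ENI-backed pod IPs (one reading of "Cilium with ENI" above):
    # cilium:
    #   ipam: eni
```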