
Use external cloud provider for EKS Local deployments #1111

Merged · 1 commit merged into awslabs:master on Jan 11, 2023

Conversation

@vpineda1996 (Contributor) commented Nov 23, 2022

Issue #, if available:

#1096

Description of changes:
This change introduces a new environment variable, KUBELET_CLOUD_PROVIDER, consumed by the kubelet systemd unit. It is set in the kubelet systemd drop-in config files and is either external for EKS Local clusters or aws for regular EKS clusters.
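As a rough sketch only (not the exact PR diff), the selection and drop-in generation inside bootstrap.sh could look like the lines below; the --enable-local-outpost flag and the 20-kubelet-cloud-provider.conf drop-in name are taken from the testing output further down, while the variable names here are illustrative:

# Illustrative sketch; the actual logic lives in bootstrap.sh as changed by this PR.
if [[ "$ENABLE_LOCAL_OUTPOST" == "true" ]]; then
    KUBELET_CLOUD_PROVIDER="external"   # EKS Local (Outposts) clusters
else
    KUBELET_CLOUD_PROVIDER="aws"        # regular EKS clusters
fi

# Expose the value to the kubelet unit through a systemd drop-in; kubelet.service
# can then pass it along as --cloud-provider $KUBELET_CLOUD_PROVIDER.
mkdir -p /etc/systemd/system/kubelet.service.d
cat <<EOF > /etc/systemd/system/kubelet.service.d/20-kubelet-cloud-provider.conf
[Service]
Environment='KUBELET_CLOUD_PROVIDER=$KUBELET_CLOUD_PROVIDER'
EOF
systemctl daemon-reload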

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Testing Done

To check that the changes work as intended, I created an EKS Local cluster and a regular EKS cluster. Using make 1.22, I built a 1.22 AMI (ami-041739dd141250db3).

EKS Local Cluster
  1. Used the following user-data to bootstrap the worker node:
#!/bin/bash
set -o xtrace
/etc/eks/bootstrap.sh aws-REDACTED-provider-integ-vpineda1996-basic-5cc917cd --b64-cluster-ca REDACTED --apiserver-endpoint https://REDACTED.aws.dev:443 --enable-local-outpost true --cluster-id REDACTED --container-runtime containerd
  2. After making sure the node joined the cluster, I validated that the kubelet arguments indicate external as the cloud provider.
sh-4.2$ hostname
ip-10-0-24-148.us-west-2.compute.internal
sh-4.2$ ps aux | grep kubelet
root      3283  0.9  1.3 1670144 109452 ?      Ssl  17:59   0:07 /usr/bin/kubelet --cloud-provider external --config /etc/kubernetes/kubelet/kubelet-config.json --kubeconfig /var/lib/kubelet/kubeconfig --container-runtime remote --container-runtime-endpoint unix:///run/containerd/containerd.sock --node-ip=10.0.24.148 --pod-infra-container-image=602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/pause:3.5 --v=2 --bootstrap-kubeconfig /var/lib/kubelet/bootstrap-kubeconfig
ssm-user  7790  0.0  0.0 119428   964 pts/0    S+   18:12   0:00 grep kubelet
sh-4.2$ systemctl status kubelet
● kubelet.service - Kubernetes Kubelet
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubelet-args.conf, 20-kubelet-cloud-provider.conf, 30-kubelet-extra-args.conf
   Active: active (running) since Wed 2022-11-23 17:59:07 UTC; 13min ago
     Docs: https://github.com/kubernetes/kubernetes
  Process: 3231 ExecStartPre=/sbin/iptables -P FORWARD ACCEPT -w 5 (code=exited, status=0/SUCCESS)
 Main PID: 3283 (kubelet)
    Tasks: 12
   Memory: 87.2M
   CGroup: /system.slice/kubelet.service
           └─3283 /usr/bin/kubelet --cloud-provider external --config /etc/kubernetes/kubelet/kubelet-config.json --kubeconfig /var/lib/kubelet/kubeconfig --container-ru...
sh-4.2$

bash-4.2# k get nodes
NAME                                        STATUS     ROLES                  AGE   VERSION
ip-10-0-24-105.us-west-2.compute.internal   Ready      <none>                 14m   v1.22.15-eks-fb459a0
ip-10-0-24-148.us-west-2.compute.internal   Ready      <none>                 14m   v1.22.15-eks-fb459a0
ip-10-0-24-55.us-west-2.compute.internal    NotReady   control-plane,master   19m   v1.22.10-eks-7dc61e8
ip-10-0-25-240.us-west-2.compute.internal   NotReady   control-plane,master   16m   v1.22.10-eks-7dc61e8
ip-10-0-25-247.us-west-2.compute.internal   NotReady   control-plane,master   23m   v1.22.10-eks-7dc61e8
ip-10-0-25-70.us-west-2.compute.internal    Ready      <none>                 14m   v1.22.15-eks-fb459a0
bash-4.2#
EKS Cluster
  1. Used the following user-data to bootstrap the EKS node:
sudo /etc/eks/bootstrap.sh --apiserver-endpoint 'https://REDACTED.gr7.us-west-2.eks.amazonaws.com' --b64-cluster-ca 'REDACTED' 'test-vpineda1996-cluster'
  2. After making sure the node joined the cluster, I validated that the kubelet arguments indicate aws as the cloud provider.
[ec2-user@ip-172-31-55-157 ~]$ ps aux | grep kubelet
root     13311  0.8  2.7 1823836 106448 ?      Ssl  17:15   0:05 /usr/bin/kubelet --cloud-provider aws --config /etc/kubernetes/kubelet/kubelet-config.json --kubeconfig /var/lib/kubelet/kubeconfig --container-runtime docker --network-plugin cni --node-ip=172.31.55.157 --pod-infra-container-image=602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/pause:3.5 --v=2


See this guide for recommended testing for PRs. Some tests may not apply. Completing tests and providing additional validation steps are not required, but doing so is recommended and may reduce review time and time to merge.

@vpineda1996 (Contributor, Author)

If you need me to run additional tests please let me know!

@vpineda1996 vpineda1996 changed the title [DRAFT][DO NOT MERGE] Use external cloud provider for EKS Local deployments [DO NOT MERGE] Use external cloud provider for EKS Local deployments Nov 23, 2022
@vpineda1996 vpineda1996 marked this pull request as ready for review November 23, 2022 18:17
@bwagner5 (Contributor)

How does the Node's provider ID get filled in with the external cloud-provider? I thought you had to pass the provider ID to kubelet when using the external CP?

@vpineda1996 (Contributor, Author) commented Nov 28, 2022

Nope, you don't need to pass in the provider ID. What the kubelet does need is the Node's private DNS name. The code path is as follows:

  1. The CCM's cloud-node controller finds the new node and starts "reconciling" it (for lack of a better word).
  2. The CCM notices that there is no ProviderId, so it calls the GetInstanceProviderId method, which delegates to the AWS cloud provider.
  3. Inside the AWS cloud provider, we try to get the Provider ID from the Node object but (obviously) fail, since the node has no Provider ID set yet; so we fall back to the node's private EC2 DNS name via a DescribeInstances call.

Note that there is a big "IF" condition for all of this logic to work out of the box, specifically in the function that maps a node name to its private DNS name:

// mapNodeNameToPrivateDNSName maps a k8s NodeName to an AWS Instance PrivateDNSName
// This is a simple string cast

// It is not safe to assume so for --cloud-provider=external kubelets. Because
// then kubelet dictates its own node name with its OS hostname (or
// --hostname-override) and that hostname won't always be private DNS name.
// This AWS cloud provider can initialize a node so long as the node's name
// satisfies its InstanceID implementation, i.e. as long as the instance id can
// be derived from the node name.

However, for the EKS AMI we do have control over the --node-ip that we assign to the k8s node, and the hostname comes from AL2 (which defaults to the instance's private DNS name). Hence this use case works for all EKS Local deployments.
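For completeness, one way to confirm on a live cluster that the CCM eventually populated the provider ID on such a node (the node name comes from the testing output above; the instance ID and zone in the expected output are hypothetical):

kubectl get node ip-10-0-24-148.us-west-2.compute.internal -o jsonpath='{.spec.providerID}'
# Expected shape: aws:///us-west-2a/i-0123456789abcdef0   (hypothetical values)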

@vpineda1996 vpineda1996 changed the title [DO NOT MERGE] Use external cloud provider for EKS Local deployments Use external cloud provider for EKS Local deployments Dec 12, 2022
@cartermckinnon (Member)

I'm fine with merging this given it only applies to Outpost, but @vpineda1996 can you change the description so that #1096 isn't closed by this? Or update that issue to be specific to Outpost?

@vpineda1996 (Contributor, Author)

I will leave #1096 open because it's still a valid issue: EKS should eventually support the new cloud controller manager.

@cartermckinnon cartermckinnon merged commit b95c3e6 into awslabs:master Jan 11, 2023