The Kubernetes API server certificate expires one year after it is issued, and once it expires the OpenPAI cluster becomes unavailable. For more details, please refer to Certificate Management with kubeadm.
If the admin has set up the certificate expiration checker in Alert Manager, the admin email receiver will get an alert email before the certificate expires.
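If no alert is configured, the remaining lifetime can also be checked by hand. The following is a minimal sketch, not part of OpenPAI: the certificate path is an assumption (kubeadm's default location), `days_between` is a hypothetical helper, and `date -d` assumes GNU date.

```shell
# Sketch: report how many days remain before a certificate expires.
# CERT is an assumed path (kubeadm's default); override it if your layout differs.
CERT="${CERT:-/etc/kubernetes/pki/apiserver.crt}"

# days_between START_EPOCH END_EPOCH
# Prints the whole days from START to END (truncated toward zero, negative if END is earlier).
days_between() {
  echo $(( ($2 - $1) / 86400 ))
}

# Only run the check when the certificate and openssl are actually present.
if [ -r "$CERT" ] && command -v openssl >/dev/null 2>&1; then
  # openssl prints a line of the form "notAfter=<date>"; keep the part after "=".
  not_after=$(openssl x509 -enddate -noout -in "$CERT" | cut -d= -f2)
  # Convert the date to epoch seconds (GNU date syntax).
  expiry_epoch=$(date -d "$not_after" +%s)
  echo "$CERT expires in $(days_between "$(date +%s)" "$expiry_epoch") days"
fi
```

A negative number means the certificate has already expired and the renewal steps below are needed.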
If the certificate has already expired, users will see an error in the web portal.
On the master node, use the following commands to generate the new certificates:
# On master - See https://kubernetes.io/docs/setup/certificates/#all-certificates
sudo kubeadm alpha certs renew apiserver
sudo kubeadm alpha certs renew apiserver-etcd-client
sudo kubeadm alpha certs renew apiserver-kubelet-client
sudo kubeadm alpha certs renew front-proxy-client
On the master node, use the following commands to generate the new kubeconfig files:
sudo kubeadm alpha kubeconfig user --org system:masters --client-name kubernetes-admin > admin.conf
sudo kubeadm alpha kubeconfig user --client-name system:kube-controller-manager > controller-manager.conf
sudo kubeadm alpha kubeconfig user --org system:nodes --client-name system:node:$(hostname) > kubelet.conf
sudo kubeadm alpha kubeconfig user --client-name system:kube-scheduler > scheduler.conf
# chown and chmod so they match existing files
# Replace <user> with your current user name (e.g. root, core)
sudo chown <user> {admin,controller-manager,kubelet,scheduler}.conf
sudo chmod 600 {admin,controller-manager,kubelet,scheduler}.conf
# Move to replace existing kubeconfigs
sudo mv admin.conf /etc/kubernetes/
sudo mv controller-manager.conf /etc/kubernetes/
sudo mv kubelet.conf /etc/kubernetes/
sudo mv scheduler.conf /etc/kubernetes/
# Restart the master components
sudo kill -s SIGHUP $(pidof kube-apiserver)
sudo kill -s SIGHUP $(pidof kube-controller-manager)
sudo kill -s SIGHUP $(pidof kube-scheduler)
# Verify master component certificates - should all be 1 year in the future
# Cert from api-server
echo -n | openssl s_client -connect localhost:6443 2>&1 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout | grep Not
# Cert from controller manager
echo -n | openssl s_client -connect localhost:10257 2>&1 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout | grep Not
# Cert from scheduler
echo -n | openssl s_client -connect localhost:10259 2>&1 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout | grep Not
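The three checks above differ only in the port, so they can be wrapped in a small helper. This is a sketch under the same assumptions as the commands above (it runs on the master and uses the same ports); the function names are hypothetical, and `openssl x509 -noout -dates` is used in place of `-text | grep Not` to print only the validity dates.

```shell
# extract_pem: keep only the PEM certificate block from s_client output
# (the same sed filter used in the commands above).
extract_pem() {
  sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p'
}

# show_cert_dates PORT: print the notBefore/notAfter dates of the
# certificate served on localhost:PORT.
show_cert_dates() {
  echo -n | openssl s_client -connect "localhost:$1" 2>&1 | extract_pem | openssl x509 -noout -dates
}

# check_all: run the check for the api-server (6443), the controller
# manager (10257) and the scheduler (10259).
check_all() {
  for port in 6443 10257 10259; do
    echo "== port $port =="
    show_cert_dates "$port"
  done
}
```

Calling `check_all` on the master should print a `notAfter` date about one year in the future for each port.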
On the master node, use the following commands to generate the new kubelet.conf file:
sudo kubeadm alpha kubeconfig user --org system:nodes --client-name system:node:$(hostname) > kubelet.conf
# Replace <user> with your current user name (e.g. root, core)
sudo chown <user> kubelet.conf
sudo chmod 600 kubelet.conf
# Stop kubelet
sudo systemctl stop kubelet
# Delete files
sudo rm /var/lib/kubelet/pki/*
# Copy file
sudo mv kubelet.conf /etc/kubernetes/
# Restart
sudo systemctl start kubelet
# Uncordon
kubectl uncordon $(hostname)
# Check kubelet
echo -n | openssl s_client -connect localhost:10250 2>&1 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | openssl x509 -text -noout | grep Not
On the master node, use the following command to generate a new token:
sudo kubeadm token create
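Because the token is pasted into a `sed` expression in the next step, it can help to capture it in a variable and sanity-check its shape first. The sketch below assumes the usual kubeadm bootstrap token format (six lowercase alphanumeric characters, a dot, then sixteen more); `is_bootstrap_token` is a hypothetical helper, not a kubeadm command.

```shell
# is_bootstrap_token TOKEN
# Succeeds if TOKEN has the shape of a kubeadm bootstrap token:
# 6 lowercase alphanumerics, a dot, then 16 lowercase alphanumerics.
is_bootstrap_token() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Usage on the master (commented out because it needs kubeadm and sudo):
# TOKEN=$(sudo kubeadm token create)
# if is_bootstrap_token "$TOKEN"; then
#   echo "token: $TOKEN"
# else
#   echo "unexpected output: $TOKEN" >&2
# fi
```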
Use a playbook to update the certificates on the worker nodes in batch. Create a file named renew-worker-certs.yaml with the following content, replacing <The generated token in above step> with the token generated in the previous step:
---
- hosts: all
  tasks:
    - name: join k8s
      shell: |
        systemctl stop kubelet
        rm /etc/kubernetes/kubelet.conf
        rm /var/lib/kubelet/pki/*
        sed -i "s/token: .*/token: <The generated token in above step>/" /etc/kubernetes/bootstrap-kubelet.conf
        systemctl start kubelet
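To see what the `sed` line in the playbook does before running it on every worker, the same substitution can be tried on a throwaway file. This is only an illustration: the sample file content is made up and is not a real bootstrap-kubelet.conf, `set_bootstrap_token` is a hypothetical wrapper, and `sed -i` without a suffix assumes GNU sed, as the playbook does.

```shell
# set_bootstrap_token FILE TOKEN
# Replace everything after "token: " on any matching line in FILE with TOKEN,
# using the same expression as the playbook.
set_bootstrap_token() {
  sed -i "s/token: .*/token: $2/" "$1"
}

# Demo on a temporary file with a made-up fragment.
demo=$(mktemp)
printf 'users:\n- name: kubelet-bootstrap\n  user:\n    token: oldold.0000000000000000\n' > "$demo"
set_bootstrap_token "$demo" "abcdef.0123456789abcdef"
# Show the rewritten line; indentation and the other lines are left untouched.
grep 'token:' "$demo"
rm -f "$demo"
```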
If you don't have the hosts.yml file, run the following command from the OpenPAI source code to generate one:
contrib/kubespray/script/k8s_generator.py -l layout.yaml -c config.yaml -o <output_folder>
Then run the following command:
ansible-playbook -i hosts.yml --limit '!master-node' --become --become-user root renew-worker-certs.yaml
Delete the token on the master node. If you skip this step, the token stays valid until it expires after 24 hours.
# On master node
sudo kubeadm token delete TOKEN-FROM-CREATION-ON-MASTER
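If the token value is no longer at hand, `kubeadm token list` shows the tokens that still exist. The helper below is a sketch that assumes the listing starts with a header row and has the token in the first column; `list_tokens` is a hypothetical name.

```shell
# list_tokens
# Read `kubeadm token list` output on stdin and print only the token column.
# Assumes line 1 is a header and the token is the first field of each row.
list_tokens() {
  awk 'NR > 1 && NF > 0 { print $1 }'
}

# Usage on the master (commented out because it needs kubeadm and sudo):
# sudo kubeadm token list | list_tokens
# sudo kubeadm token delete <token printed above>
```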