This guide walks you through the standard Kubernetes installation with kubeadm and mainly follows the official Kubernetes documentation.
It is assumed that you are somewhat familiar with Linux and have a basic understanding of Kubernetes.
Besides the standard installation, it also covers the following components:
- Install Calico as the CNI plugin
- Install MetalLB as a bare-metal load balancer
- Configure an NFS share on the master node and deploy the NFS volume provisioner
The guide gives an educational overview of the different components and should be used only for learning or lab purposes; it is far from a production environment! Really, if you found this guide via Google, don't use it to set up your production cluster!
This guide uses 1 master and 1 worker node, hence it requires 2 machines, either bare metal or virtual.
If you deploy VMs then follow this minimal specification per VM:
- 2 vCPUs (in Azure, a B2s is perfect)
- 4 GB RAM
- 30 GB vHDD
- Ubuntu 22.04 LTS (go with this version)
- Shared subnet
- SSH access
If you deploy to a cloud provider like Azure, it is better to deploy into a dedicated vNet so we avoid messing something up.
Once you have your VMs/BMs, install Ubuntu 22.04 and set up networking with fixed IP addresses. Then follow these steps to prepare the nodes.
Execute these steps on BOTH machines!
- Use your favorite SSH client (like PuTTY) and log in to both VMs.
- Plan your network. You have a subnet where you deployed your BMs/VMs. In my case this is 192.168.0.0/24.
- Your BMs/VMs have their own IP addresses. Note them (you can also use the ip a command to get the IPs).
  - My Master node's IP is 192.168.0.128
  - My Worker node's IP is 192.168.0.129
- We will also need some IP addresses for our exposed services. I reserve this range: 192.168.0.140 - 192.168.0.149
- Note that if you deployed your VMs in the cloud, these addresses won't automatically be available in the vNet, but that is okay for this lesson.
- Create environment variables based on the IP addresses we noted.
- Create the "MasterIP" with the Master node's IP address
MasterIP="192.168.0.128"
- Create the "MasterName" and add a readable name for it
MasterName="kube-master"
- Create the "IngressRange" and add the IP range what we reserved to the K8s services
IngressRange="192.168.0.140-192.168.0.149"
- Create the "NFSCIDR" with the subnet CIDR
NFSCIDR="192.168.0.128/29"
- Create the "PodCIDR" with a random subnet CIDR which doesn't overlap with your network
PodCIDR="172.16.0.0/16"
- Create the "ServiceCIDR" with a random subnet CIDR which doesn't overlap with your network
ServiceCIDR="172.17.0.0/16"
- Finally, create the "K8sVersion" variable with the Kubernetes version we wish to install (a quick check of all the variables follows after this list)
K8sVersion="v1.28"
- Create the "MasterIP" with the Master node's IP address
- Enable automatic service restarts without prompting, then install updates on both machines:
sudo sed -i 's/#$nrconf{restart} = '"'"'i'"'"';/$nrconf{restart} = '"'"'a'"'"';/g' /etc/needrestart/needrestart.conf
sudo apt update
sudo apt upgrade -y
- Install basic tools
sudo apt install -y \
  apt-transport-https \
  ca-certificates \
  gnupg2 \
  lsb-release \
  mc \
  curl \
  software-properties-common \
  net-tools \
  nfs-common \
  dstat \
  git \
  htop \
  nano \
  bash-completion \
  vim \
  jq
- Kubelet doesn't work properly with memory swap enabled, and swap also blocks the standard installation, hence we need to disable it.
sudo sed -i '/swap/ s/^\(.*\)$/#\1/g' /etc/fstab
sudo swapoff -a
- In a production environment we would have proper DNS, but here we just update the /etc/hosts file so the nodes can resolve the master node's name and IP.
echo "$MasterIP $MasterName" | sudo tee -a /etc/hosts
- For the overlay networking we need to load 2 kernel modules and also make them permanent, hence we add them to the /etc/modules file
sudo modprobe overlay
sudo modprobe br_netfilter
echo overlay | sudo tee -a /etc/modules
echo br_netfilter | sudo tee -a /etc/modules
- Some kernel fine-tuning is also needed to make the networking work properly.
sudo tee /etc/sysctl.d/kubernetes.conf<<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
fs.inotify.max_user_instances=524288
EOF
sudo sysctl --system
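To confirm that the modules are loaded and the sysctl values are active, you can re-read exactly the keys we just set (the first three should all be 1):
lsmod | grep -E 'overlay|br_netfilter'
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward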
At this point Linux is prepared for installing the Kubernetes cluster components.
Kubernetes supports several container runtimes. This guide installs and configures containerd, one of the most widely used runtimes. Containerd is free and open source.
Execute these steps on BOTH machines!
- Add the repository and install it
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install -y containerd.io
- Containerd comes with a default configuration, but it is not fully suitable for us. Both systemd and containerd can manage cgroups, but it is better to have a single manager. We leave cgroup management to systemd and change this setting in containerd's config file: first we generate a default config file, then we change the relevant parameter, and finally we restart the agent.
sudo bash -c "containerd config default > /etc/containerd/config.toml"
sudo sed -i "s+SystemdCgroup = false+SystemdCgroup = true+g" /etc/containerd/config.toml
sudo systemctl daemon-reload
sudo systemctl restart containerd
sudo systemctl enable containerd
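To double-check that the cgroup change took effect and the runtime is healthy, you can inspect the file we just edited and the service state:
grep SystemdCgroup /etc/containerd/config.toml
systemctl is-active containerd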
Containerd is up and running. It is time to install the Kubernetes components. Note that each K8s minor version has its own repository.
Execute these steps on BOTH machines!
- Add the repository and install the 3 tools
curl -fsSL https://pkgs.k8s.io/core:/stable:/${K8sVersion}/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/${K8sVersion}/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt update
sudo apt -y install kubelet kubeadm kubectl
- We would also like to avoid accidental updates and container restarts, hence we put these tools on hold. (If you run your daily apt upgrade command, it will ignore these components. Because you run it daily, RIGHT??? :) )
sudo apt-mark hold kubelet kubeadm kubectl
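You can verify that the packages are held and check which versions were installed (these are standard apt and kubeadm/kubectl calls):
apt-mark showhold
kubeadm version
kubectl version --client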
- There is an auto-completion helper for kubectl which is a must-have. (Note that one of the commands will fail, which is okay.)
echo 'source <(kubectl completion bash)' >> /home/*/.bashrc
echo 'source <(kubectl completion zsh)' >> /home/*/.zshrc
These were the last components we needed on all machines. Now we put the worker into the parking lane and focus on the Master node.
Helm is a package manager that we will use to easily deploy some of the components, hence we need to install it on the Master node.
Execute these steps only on the Master node!
- Add the repository and install the tool.
curl https://baltocdn.com/helm/signing.asc | gpg --dearmor | sudo tee /usr/share/keyrings/helm.gpg > /dev/null
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/helm.gpg] https://baltocdn.com/helm/stable/debian/ all main" | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
sudo apt-get update
sudo apt-get install -y helm
In this section we install the Kubernetes core components with the help of kubeadm.
Execute these steps only on the Master node!
- Enable kubelet just to be sure (on the Worker node kubeadm will restart it, so it is not needed there).
sudo systemctl enable kubelet
- Pull the base images for the K8s components
sudo kubeadm config images pull
- Create a configuration file which contains the ServiceCIDR and the PodCIDR, and sets systemd as the cgroup driver. This config file is used only for the installation.
cat << EOF > kubeadm.conf
kind: ClusterConfiguration
apiVersion: kubeadm.k8s.io/v1beta3
networking:
  dnsDomain: cluster.local
  serviceSubnet: $ServiceCIDR
  podSubnet: $PodCIDR
controlPlaneEndpoint: $MasterName
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
cgroupDriver: systemd
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  strictARP: true
EOF
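If you want to preview what kubeadm would do without changing anything on the node, kubeadm supports a dry run; this is an optional check and not required for the installation:
sudo kubeadm init --config kubeadm.conf --dry-run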
- Initialize Kubernetes
sudo kubeadm init --config kubeadm.conf
- Copy the kubeconfig file to your own home folder
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
- HURRAY, K8s is installed. Let's try it out
kubectl get nodes
AHHH, it is NotReady :( ... no worries, we will correct it in the next chapter.
- Kubernetes by default taints the master node so that it runs only the core containers, but we would like to use it for normal workloads too (see, I told you it is not production grade), hence we need to remove the taint. (Note, the command may report an error for a node that doesn't have the taint, which is normal.)
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
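To confirm the taint is gone, you can inspect the node objects; the Taints field should now show none:
kubectl describe nodes | grep -i taints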
As the Kubernetes API is already working, we can connect the other VM to the cluster. If you check the output of the kubeadm init command, you will see a kubeadm join ... command. Copy it and paste it into the Worker node. Note that you need to run it with sudo.
- If you missed the kubeadm output, you can generate a new join command. Run this on the Master node:
kubeadm token create --print-join-command
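The printed command looks roughly like the one below; the endpoint, token and hash are placeholders here, so always copy the exact command from your own output:
sudo kubeadm join <control-plane-endpoint>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>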
- Copy the join command and run it with sudo on the Worker node.
- On the Master node, check the nodes.
kubectl get nodes
The nodes are still NotReady, but that is okay for now.
A Kubernetes cluster requires a network plugin to assign IP addresses to the pods. We will use Calico with a very simple configuration. Calico supports different networking methods; here we will go with a simple overlay network. Check Calico's installation guide here: https://docs.tigera.io/calico/latest/getting-started/kubernetes/quickstart
Execute these steps only on the Master node!
- Check your pods. CoreDNS should be in Pending state
kubectl get pods -A
- Deploy Calico directly from the internet
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/tigera-operator.yaml
Yes, correct: Calico itself runs on top of the Kubernetes cluster as containers and provides the networking feature to Kubernetes. It is so much fun here :)
- Download the default config file and modify the PodCIDR to our CIDR. Then create the custom resources in Kubernetes
curl https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/custom-resources.yaml -s -o /tmp/custom-resources.yaml
sed -i "s+192.168.0.0/16+$PodCIDR+g" /tmp/custom-resources.yaml
sed -i "s+blockSize: 26+blockSize: 24+g" /tmp/custom-resources.yaml
kubectl create -f /tmp/custom-resources.yaml
rm /tmp/custom-resources.yaml
- Wait until all your pods reach Running state. (Note, press CTRL + c to stop watch mode.)
watch kubectl get pods -A
- Check your nodes
kubectl get nodes
HURRAY, your cluster and nodes are finally Ready. It was a long run ... but we are still not done.
There are different options to expose K8s services to the external world. If we want to use the LoadBalancer service type in an on-premises environment, we need a (software-based) load balancer. In this case the service gets a routable IP address from the subnet.
Note that this won't work out of the box in a public cloud environment, as the IP addresses won't be registered in the vNet, but that is a different story.
Needless to say by now, but execute these steps only on the Master node!
- We can use Helm to deploy MetalLB, so add its repo
helm repo add metallb https://metallb.github.io/metallb
- We need to create and prepare its namespace with some labels
kubectl create ns metallb-system
kubectl label namespace metallb-system pod-security.kubernetes.io/enforce=privileged
kubectl label namespace metallb-system pod-security.kubernetes.io/audit=privileged
kubectl label namespace metallb-system pod-security.kubernetes.io/warn=privileged
kubectl label namespace metallb-system app=metallb
- Deploy MetalLB
helm install metallb metallb/metallb -n metallb-system --wait \
  --set crds.validationFailurePolicy=Ignore
Note that the last option is only needed because of a bug in MetalLB. See more details here: metallb/metallb#1597
- We also need to configure MetalLB
cat <<EOF | kubectl apply -f -
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: local-pool
  namespace: metallb-system
spec:
  addresses:
  - $IngressRange
EOF
cat <<EOF | kubectl apply -f -
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: advertizer
  namespace: metallb-system
EOF
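A quick way to test MetalLB is to expose a throwaway deployment as a LoadBalancer service and check that it receives an address from the pool. The nginx deployment and the lb-test name below are only for illustration; replace <EXTERNAL-IP> with the address shown by the service:
kubectl create deployment lb-test --image=nginx
kubectl expose deployment lb-test --port=80 --type=LoadBalancer
kubectl get svc lb-test
curl http://<EXTERNAL-IP>
kubectl delete svc,deployment lb-test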
This is far from production grade, as NFS doesn't provide proper authentication. The only protection is restricting access by subnet (and we grant access to the whole subnet). Nevertheless, it is an easy-to-configure Persistent Volume backend for a lab.
We will use the Master node as the storage provider.
Needless to say by now, but execute these steps only on the Master node!
- Install the NFS server
sudo apt install -y nfs-kernel-server
- Create a local folder and make it readable and writable by anybody
sudo mkdir -p /mnt/k8s-pv-data
sudo chown -R nobody:nogroup /mnt/k8s-pv-data/
sudo chmod 777 /mnt/k8s-pv-data/
- Add the folder to the /etc/exports file
sudo tee -a /etc/exports<<EOF
/mnt/k8s-pv-data ${NFSCIDR}(rw,sync,no_subtree_check)
EOF
- Reload the export configuration and restart the NFS service
sudo exportfs -a
sudo systemctl restart nfs-kernel-server
With this step the folder and the NFS export are prepared on the node.
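You can verify the export from the Master node itself; showmount comes with the NFS packages we installed earlier and simply lists what the server exposes:
showmount -e localhost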
- Check the StorageClasses on the K8s cluster. (Yep, it should be empty)
kubectl get sc
- We will deploy the NFS provisioner plugin with Helm, hence we need to add the repository
helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
- Install the NFS provisioner and point it to the NFS server (the Master node)
helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
  -n kube-system \
  --set nfs.server=$MasterIP \
  --set nfs.path=/mnt/k8s-pv-data \
  --set storageClass.name=default \
  --set storageClass.defaultClass=true
- Check the StorageClasses again
kubectl get sc
Now you can create PersistentVolumeClaims, and the provisioner will automatically create a folder on the Master node and seamlessly attach it to your pods. A quick test is shown below.
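A minimal sketch to verify dynamic provisioning: the claim name test-pvc is only for illustration and relies on the default StorageClass created above. The claim should reach Bound state and a matching folder should appear under /mnt/k8s-pv-data.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF
kubectl get pvc test-pvc
ls /mnt/k8s-pv-data
kubectl delete pvc test-pvc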
We can collect very useful metrics from the nodes and the pods, but this requires a service to collect them and make them visible to us.
Needless to say by now, but execute these steps only on the Master node!
- Add the Helm repo
helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
- Deploy it via Helm
helm upgrade --install metrics-server metrics-server/metrics-server \
  --set args={--kubelet-insecure-tls} \
  -n kube-system
Note that we need the --kubelet-insecure-tls extra argument because the kubelets use self-signed certificates.
- It takes some time for the metrics server to catch up, but in a few minutes we can see some results
kubectl top nodes
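Once node metrics appear, pod-level usage works as well, and you can also confirm that the metrics API is registered (v1beta1.metrics.k8s.io is the APIService served by metrics-server):
kubectl top pods -A
kubectl get apiservice v1beta1.metrics.k8s.io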
You are doing great so far, but very likely here comes the hardest part ... installing Nvidia drivers on Linux. If you deploy the cluster on your own machine with an Nvidia GPU, or you use an equivalent VM from the cloud, then you need to install the driver on the host and the Kubernetes Device Plugin to handle the card. You can install them by hand, one by one, or use the Nvidia GPU Operator, which does everything for you.
I suggest taking Option-1, but if you prefer more control over your deployment then you can go with Option-2 as well.
Read more details here: https://github.com/NVIDIA/gpu-operator
And here: https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/index.html
- Add the Nvidia Helm repository
helm repo add nvidia https://nvidia.github.io/gpu-operator
helm repo update
- Deploy the Operator
helm upgrade \
  --install \
  nvidia-operator \
  nvidia/gpu-operator \
  -n kube-system \
  --set operator.defaultRuntime="containerd" \
  --set driver.usePrecompiled="true" \
  --set driver.version="535" \
  --wait
- Wait until all components are up and running. (Note, press CTRL + c to stop watch mode.)
watch kubectl get pods -n kube-system -l app.kubernetes.io/managed-by=gpu-operator
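Once everything is running, you can verify that the GPU is schedulable. The pod below is only an illustrative throwaway test (the gpu-test name and the CUDA image tag are assumptions; any recent nvidia/cuda base image should work). Give it a minute to pull the image before checking the logs:
kubectl describe nodes | grep -i nvidia.com/gpu
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:12.2.0-base-ubuntu22.04
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1
EOF
kubectl logs gpu-test
kubectl delete pod gpu-test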
Sometimes you may find that the Operator cannot download the proper images, or cannot install the driver because you have Secure Boot enabled. The best solution here is to install the GPU driver ourselves and then use the GPU Operator for the rest.
- Install the Nvidia Driver
sudo apt install -y nvidia-driver-535
- It is advised to reboot now. If you have Secure Boot enabled then you MUST reboot.
sudo reboot
- Add the Nvidia Helm repository
helm repo add nvidia https://nvidia.github.io/gpu-operator
helm repo update
- Deploy the Operator
helm upgrade \
  --install \
  nvidia-operator \
  nvidia/gpu-operator \
  -n kube-system \
  --set operator.defaultRuntime="containerd" \
  --set driver.enabled="false" \
  --wait
- Wait until all components are up and running. (Note, press CTRL + c to stop watch mode.)
watch kubectl get pods -n kube-system -l app.kubernetes.io/managed-by=gpu-operator
Well, installing the Nvidia driver on Linux is not the easiest task, hence these steps might not lead you to full success.
- Install the GPU driver
sudo apt install -y \
  nvidia-driver-535 \
  nvidia-cuda-toolkit \
  libnvidia-compute-535-server
- Deploy the Nvidia Device Plugin
kubectl create -f https://github.com/kubernetes/kubernetes/raw/master/cluster/addons/device-plugins/nvidia-gpu/daemonset.yaml
kubectl label nodes $MasterName cloud.google.com/gke-accelerator=gpu
Alternatively, you can install the latest CUDA SDK from the official Nvidia repo. This is more recent than the one in the Ubuntu repository but might have some integration issues.
- Remove the current CUDA-related packages.
sudo apt remove -y nvidia-cuda-toolkit libnvidia-compute-535-server
sudo apt autoremove -y
- Install the CUDA SDK packages
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get install -y cuda
Very likely you also installed a new kernel at the very beginning, and we made a lot of configuration changes, hence it is best to restart both nodes to validate that our cluster survives a reboot.
Run: sudo reboot
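After both nodes are back, a short check on the Master node confirms that the cluster survived the reboot; everything listed here was installed earlier in the guide:
kubectl get nodes
kubectl get pods -A
kubectl get svc -A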