Roughly stack-ranked by importance and ease of implementation.
You can work your way down this list linearly for maximum return on investment for however much time you have.
Many config templates and examples are available in the excellent HariSekhon/Kubernetes-configs repo referenced throughout this page - for faster implementation, see what has actually worked and just edit a few lines specific to your environment.
- Healthchecks
- Horizontal Pod Autoscaler
- Pod Disruption Budget
- Pod Anti-Affinity
- Ingress
- Applications
- DNS - Automatic DNS Records for Apps
- Secrets - Automated Secrets
- Namespaces
- Pod Security Policies
- Governance, Security & Best Practices
- Find Deprecated API objects to replace
- Helm
Readiness / liveness probes are critically important for the following reasons:
- Readiness Probes
- only direct traffic to pods which are fully initialized and functioning
- don't let users see errors from pods which have been recently migrated / restarted, which happens frequently on Kubernetes clusters
- Liveness Probes
- restart pods which are stuck after encountering state errors either at runtime or at initialization time (eg. a pull from a config source or a database connection failing to establish during startup)
- this is the only probe that will restart the pod to reset its state to overcome such issues
- Startup Probes
- newer versions of Kubernetes provide a dedicated check for startup. This is useful for apps with long initialization times where you don't want to set high timings on Readiness probes, since high readiness timings would delay dropping later-malfunctioning pods out of the Kubernetes internal service load balancing in good time, sending them requests in the interim which may be surfaced as errors to users
See the deployment and statefulset templates:
HariSekhon/Kubernetes-configs - deployment.yaml
HariSekhon/Kubernetes-configs - statefulset.yaml
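As a rough illustration, a hedged sketch of what the probe section of a container spec can look like - the `/healthz` path, port and timings are placeholders, use whatever endpoint and timings your app actually needs:

```yaml
# fragment of a pod template's containers section - paths, ports and timings are illustrative
containers:
  - name: myapp
    image: myapp:1.2.3
    ports:
      - containerPort: 8080
    startupProbe:             # gives slow-starting apps time without inflating the readiness timings
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 10
      failureThreshold: 30    # 30 x 10s = up to 5 minutes to finish initializing
    readinessProbe:           # only send traffic to pods that are fully initialized and functioning
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 5
      failureThreshold: 3
    livenessProbe:            # restart pods that get stuck to reset their state
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 10
      failureThreshold: 3
```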
Make sure your pods scale up to meet traffic demands and scale down off-peak so as not to waste resources and cloud costs.
HariSekhon/Kubernetes-configs - horizontal-pod-autoscaler.yaml
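A minimal sketch of an autoscaling/v2 HPA scaling a Deployment on CPU utilization - the name, replica bounds and 70% target are placeholders to size to your own traffic profile:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 3          # keep enough replicas for HA off-peak
  maxReplicas: 10         # cap peak scale and cloud spend
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```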
Ensure the Kubernetes scheduler doesn't take down more pods than you can afford to lose - whether for High Availability or for scaling capacity - so you can still serve full traffic at the current scaling level.
Set your pod disruption budget according to your capacity and app's ability to handle a certain number of pods being unavailable at a given time due to being migrated (killed and restarted on another node):
This is doubly important if you're running:
- apps with strict quorum requirements
- apps with sharded replicas (common with NoSQL systems)
- eg. Elasticsearch, SolrCloud, Cassandra, MongoDB, Couchbase, where an outage of 2 nodes could often cause partial outages via shard unavailability, incomplete results or query failures
HariSekhon/Kubernetes-configs - pod-disruption-budget.yaml
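A minimal sketch - the label selector and threshold are placeholders, set them to what your app's quorum or capacity can actually tolerate losing at once:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp
spec:
  maxUnavailable: 1      # or use minAvailable, e.g. "80%", for larger fleets
  selector:
    matchLabels:
      app: myapp
```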
Ensure your pod replicas are spread across nodes for maximum availability and stability.
By default the Kubernetes scheduler will attempt to do a basic spread of pods across nodes but Pod Anti-Affinity rules enhance this in the following ways:
- spread across different servers to protect against random hardware failure of a single server causing an outage
- spread across different cloud availability zones to protect against a single datacenter outage eg. power failure / fire / flood / networking issue
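A hedged sketch of pod anti-affinity rules in a pod template implementing both of the spreads above - labels and weights are placeholders, and `preferred` rather than `required` scheduling is used so pods still schedule when a perfect spread isn't possible:

```yaml
# fragment of a pod template spec
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:                            # spread replicas across different servers
          labelSelector:
            matchLabels:
              app: myapp
          topologyKey: kubernetes.io/hostname
      - weight: 50
        podAffinityTerm:                            # spread replicas across availability zones
          labelSelector:
            matchLabels:
              app: myapp
          topologyKey: topology.kubernetes.io/zone
```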
In the cloud, choose between running your pods on full-priced on-demand nodes or on discounted preemptible / spot instances, which are much cheaper.
If your application can take random pod migrations such as a horizontally scaled web farm, then use preemptible or spot instances to save significant money on your cloud budget.
This is part of basic best practice cloud cost optimization.
If you have an app like Jenkins server which is a single point of failure then you should definitely run it on stable on-demand nodes unless you like having several minute outages of your Jenkins UI and job scheduler while the Jenkins server pod is restarted on another node.
Jenkins, for example, takes several minutes to start up - you don't want this happening every day on GCP preemptible nodes or randomly on AWS spot instances.
Some apps like coordination services or clustered shared data services may not fare well if randomly restarted in uncontrolled numbers, as spot instances may do to them.
Pod Disruption Budgets can't help here as they only control the Kubernetes scheduler's decision about how many pods to reap and redeploy elsewhere at one time. The Kubernetes scheduler, and therefore Pod Disruption Budgets, have no control over the cloud's lower-level decision to reap spot instances at any time, meaning any number of nodes could be taken out at once upon a surge in demand for spot instances.
Do not run quorum coordination services on spot / preemptible instances for this reason as you could lose too many of them at the same time, causing a complete quorum outage and impacting all other applications depending on them for coordination.
No spot / preemptible for:
- Coordination Services:
- NoSQL data sharding services:
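One hedged way to keep such workloads on stable nodes is a node affinity against the spot / preemptible node labels - the label keys below are the ones GKE applies, with an EKS equivalent shown in the comment, but verify the labels your own node pools actually carry:

```yaml
# fragment of a pod template spec - pin single-point-of-failure / quorum apps to on-demand nodes
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: cloud.google.com/gke-spot           # GKE spot nodes carry this label
              operator: NotIn
              values: ["true"]
            - key: cloud.google.com/gke-preemptible    # legacy GKE preemptible nodes
              operator: NotIn
              values: ["true"]
            # on EKS managed node groups the equivalent would be:
            #   - key: eks.amazonaws.com/capacityType
            #     operator: In
            #     values: ["ON_DEMAND"]
```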
You may also choose to ensure certain apps are not deployed alongside other performance-hungry apps, to optimize the performance available to them.
HariSekhon/Kubernetes-configs - deployment.yaml
HariSekhon/Kubernetes-configs - statefulset.yaml
Set up a stable HTTPS entrypoint to your apps with DNS and SSL.
Set up Cert Manager for Automatic Certificate Management using the popular free Let's Encrypt certificate authority.
You can also use your cloud certificate authority if your corporate policy dictates.
HariSekhon/Kubernetes-configs - cert-manager
Ensure each app has an ingress address to be reachable via a URL.
Otherwise you'll have to waste time kubectl port-forward tunneling to access it each time.
If you are stuck doing that because you haven't gotten all your Ingress magic set up yet, then you may want to use HariSekhon/DevOps-Bash-tools - kubectl_port_forward.sh.
In some cases this can't be avoided, such as Spark jobs launched by Informatica due to having the UI on randomly launched job driver pods.
If your ingress controllers are working, set up your app ingresses by editing this config:
HariSekhon/Kubernetes-configs - ingress.yaml
See also the various app-specific ingresses already configured in the HariSekhon/Kubernetes-configs repo under */overlay/ingress.yaml.
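A minimal networking.k8s.io/v1 Ingress sketch - the hostname, ingress class, TLS secret name and backend service are placeholders, and the cert-manager annotation assumes you have created a ClusterIssuer called `letsencrypt` as per the Cert Manager setup above:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt   # have Cert Manager issue and renew the TLS cert
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - myapp.example.com
      secretName: myapp-tls
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp
                port:
                  number: 80
```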
Set up ArgoCD to automatically deploy, update and repair your Kubernetes configs from the saved good config in git ie. 'GitOps'.
HariSekhon/Kubernetes-configs - argocd
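A hedged sketch of an ArgoCD Application pointing at a directory in your config repo - the repo URL, path and namespaces are placeholders for your own GitOps layout:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/kubernetes-configs   # your GitOps repo
    targetRevision: HEAD
    path: myapp/overlay
  destination:
    server: https://kubernetes.default.svc
    namespace: myapp
  syncPolicy:
    automated:
      prune: true        # delete live objects that were removed from git
      selfHeal: true     # revert live config drift back to the git config
```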
Setting appropriate resource requests and limits is critical to both performance and reliability.
Otherwise, apps will end up over-contended - degrading their performance or being outright killed by Linux's OOM Killer to save the host from crashing - resulting in sudden pod recreations on other nodes and possible service disruptions.
See resources sections in
HariSekhon/Kubernetes-configs - deployment.yaml
HariSekhon/Kubernetes-configs - statefulset.yaml
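For reference, a hedged sketch of the shape of those resources sections - the numbers are placeholders to be tuned with Goldilocks as described below:

```yaml
# fragment of a pod template's containers section
containers:
  - name: myapp
    image: myapp:1.2.3
    resources:
      requests:           # what the scheduler reserves for the pod on a node
        cpu: 100m
        memory: 256Mi
      limits:             # hard caps - exceeding the memory limit gets the container OOM-killed
        cpu: "1"
        memory: 512Mi
```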
But what to set your resource requests and limits to?
Install Goldilocks to generate VPAs for resource recommendations with a nice dashboard.
It will tell you exactly how much your app is using so you can tune its resource requests and limits after setting an initial estimate of your best guess.
HariSekhon/Kubernetes-configs - Goldilocks
Install External DNS to automatically create DNS records for your apps.
It integrates with many popular DNS providers such as Cloudflare, AWS Route53, GCP Cloud DNS etc.
HariSekhon/Kubernetes-configs - External DNS
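Once External DNS is running it picks up hostnames from your Ingress rules automatically, or from an annotation on a Service - a hedged sketch, with the hostname as a placeholder:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp
  annotations:
    external-dns.alpha.kubernetes.io/hostname: myapp.example.com   # DNS record External DNS will create
spec:
  type: LoadBalancer
  selector:
    app: myapp
  ports:
    - port: 80
      targetPort: 8080
```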
Install one of the following:
External Secrets integrates with and pulls secrets from external secret management backends:
HariSekhon/Kubernetes-configs - External Secrets
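A hedged sketch of an ExternalSecret pulling a value from an external store into a native Kubernetes secret - the `external-secrets.io/v1beta1` API version, store name and key paths are assumptions to adjust to your installation and backend:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: myapp-db
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: my-secret-store        # a SecretStore / ClusterSecretStore defined separately for your backend
    kind: ClusterSecretStore
  target:
    name: myapp-db               # the Kubernetes Secret object to create
  data:
    - secretKey: password        # key inside the generated Kubernetes Secret
      remoteRef:
        key: prod/myapp/db       # path of the secret in the external backend
        property: password
```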
Sealed Secrets by Bitnami is a simpler solution in which you encrypt a secret using a private key unique to the cluster which results in a blob that is safe to store in Git because it can only be decrypted by the cluster to regenerate the Kubernetes secret object.
The drawback of this approach is that the secret must be generated for each cluster, whereas External Secrets config can be inherited across clusters. Additionally, if a Sealed Secrets cluster (or more accurately the Sealed Secrets installation with the private key on that cluster) is destroyed and recreated, the sealed secrets are unrecoverable and you must regenerate all the secrets.
This makes it no good for fast DR or recreation of Kubernetes clusters unless you can also back up and restore the Sealed Secrets private keys for the cluster.
HariSekhon/Kubernetes-configs - Sealed Secrets
On multi-tenant Kubernetes clusters, create a namespace for each app / team and limit the amount of CPU and RAM resources they are allowed to request from the cluster's Kubernetes scheduler in their app resource requests.
This will prevent one team or app from greedily using up all the cluster resources and allow for better resource planning.
HariSekhon/Kubernetes-configs - resource-quota.yaml
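A minimal sketch of a per-namespace quota - the numbers are placeholders to size against your cluster capacity and the team's realistic needs:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 64Gi
    limits.cpu: "40"
    limits.memory: 128Gi
    pods: "100"
```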
Limit Ranges set default resource requests and limits for apps within the namespace.
Make these frugal and force people to right-size their apps in a couple of quick iterations at deployment time using Goldilocks.
HariSekhon/Kubernetes-configs - limit-range.yaml
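A hedged sketch of a frugal default - the values are placeholders, deliberately small so teams are pushed to declare their own right-sized requests and limits:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: team-a
spec:
  limits:
    - type: Container
      defaultRequest:       # applied when a container declares no requests
        cpu: 100m
        memory: 128Mi
      default:              # applied when a container declares no limits
        cpu: 500m
        memory: 256Mi
```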
Restrict communications between namespaces containing different apps and teams.
This is equivalent to old school internal firewalling between different LAN subnets inside the Kubernetes cluster.
If one app in one namespace were to get compromised, there is no reason to allow it to be used as a launching pad to attack adjacent apps in the cluster.
This will also force teams to document the network connections and services their app is using in order for you to permit their network access.
HariSekhon/Kubernetes-configs - network-policy.yaml
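A hedged sketch of a policy that only allows ingress traffic from pods within the same namespace - you would typically add further rules for anything you explicitly permit, such as your ingress controller's namespace:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: same-namespace-only
  namespace: team-a
spec:
  podSelector: {}           # applies to every pod in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}   # allow traffic only from pods in this same namespace
```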
Deprecated in newer versions of Kubernetes and removed entirely in Kubernetes 1.25 in favour of Pod Security Admission.
HariSekhon/Kubernetes-configs - pod-security-policy.yaml
Install Polaris for a recommendations dashboard full of best practices.
HariSekhon/Kubernetes-configs - Polaris
Run Pluto against your cluster before Kubernetes cluster upgrades.
The following scripts in the popular DevOps Bash Tools repo are useful:
- pluto_detect_kustomize_materialize.sh - recursively materializes all kustomization.yaml and runs Pluto on each directory to work around this issue
- pluto_detect_helm_materialize.sh - recursively materializes all helm Chart.yaml and runs Pluto on each directory to work around this issue
- pluto_detect_kubectl_dump_objects.sh - dumps all live Kubernetes objects to /tmp and runs Pluto on them to detect deprecated API objects on the cluster from any source
People who deploy directly from the Helm CLI should be aware that this is PoC territory.
You must wrap Helm in Kustomize or ArgoCD or similar to detect live object config drift!
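One hedged way of doing the Kustomize wrapping is the `helmCharts` field in a kustomization.yaml, which inflates the chart so the rendered objects are tracked like any other config (requires building with `--enable-helm`; the chart name, repo and version here are placeholders):

```yaml
# kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: monitoring
helmCharts:
  - name: kube-prometheus-stack
    repo: https://prometheus-community.github.io/helm-charts
    version: 55.0.0                    # pin the chart version - see the update script below
    releaseName: kube-prometheus-stack
    valuesFile: values.yaml
```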
Use kustomize_update_helm_chart_versions.sh in the popular DevOps Bash Tools repo.
Migrated from HariSekhon/Kubernetes-configs repo