This repository contains Kareem's New Kubernetes Deployment configuration. It brings up an empty cluster in AWS (Amazon Web Services), ready to manipulate. It is intended to eventually reflect a production-ready implementation and is fairly stable, but it is still in testing and may not work if the documentation is not followed. This template uses Terraform outputs from the Base VPC and Bastion template, so it should be deployed after (and torn down before) that initial template.
This is essentially my own 'Kubernetes from scratch' setup, made to work (eventually) on any public cloud provider (Intended: Amazon Web Services, Microsoft Azure, Google Cloud).
It contains:
- Documentation for setup and management
- A deployable Kubernetes cluster
- Default configuration and settings to build the environment
- Scripts to create, update, and clean up infrastructure
- Demo details of things to do with Kubernetes
Additional documentation for setup can be found in docs as it becomes available.
It is best to start with the base VPC template's setup doc, then this repo's setup doc, to set up an environment.
After following the setup docs, you may want to check the demo deploys doc for details on how to set up a full-featured environment. It has examples of how to set up Kube objects like Ingress, Helm, and Rancher.
Requirements:
- kubectl 1.12.2+
- Terraform 0.11.7+
- Ansible 2.4.1.0+
- An AWS account
  - No special AWS limits are required.
- An AWS Route53 hosted zone in your account
  - A hosted zone is configured by default in this deployment.
  - This is required to set up the DNS addresses that make interacting with the cluster easier.
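How the hosted zone is consumed lives in `dns.tf`; purely as an illustration (the zone name, record name, and target below are placeholders, not this repo's actual values), a Route53 zone lookup plus an ELB-pointing record in Terraform looks roughly like this:

```hcl
# Illustrative sketch only; the real lookup and records live in dns.tf.
data "aws_route53_zone" "main" {
  name = "example.com."
}

resource "aws_route53_record" "kube_api" {
  zone_id = "${data.aws_route53_zone.main.zone_id}"
  name    = "kube-api.example.com"
  type    = "CNAME"
  ttl     = "300"

  # Typically the cluster ELB's DNS name; placeholder value here.
  records = ["my-api-elb-1234.eu-west-1.elb.amazonaws.com"]
}
```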
- If you want a clean cluster, without any demo resources or rendered demo templates, delete the `demo_stuff.tf` file before doing anything.
- If you don't have/use a Route53 hosted zone, delete the `dns.tf` file before doing anything, and ignore any references that are not ELB DNS addresses.
- If you delete the `dns.tf` and/or `demo_stuff.tf` file(s), it is assumed you know what you're doing; some of the demo deploys (like Ingress, which requires a routable URL) will not work.
- There is no need to remove the custom URLs from the `variables.tf` or `terraform.tfvars` files. References in the demo deploys will break if they are removed, so it is best to just ignore them. Variables in `terraform.tfvars` override `variables.tf` (a minimal sketch of this precedence follows).
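As a minimal sketch of that precedence (the values below are made up):

```hcl
# variables.tf -- declares a variable with a default
variable "cluster_name" {
  default = "knkd-test"
}
```

```hcl
# terraform.tfvars -- any value set here wins over the default above
cluster_name = "knkd-prod"
```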
- SSH/TLS keypair creation for the cluster
  - The SSH keypair name is inherited from the output of the parent base VPC template.
  - The SSH keypair is auto-generated into the /config dir of the base VPC template.
  - All TLS certs are created and placed in the /config/ssl dir.
  - All TLS certs are uploaded to an AWS S3 bucket for backup and for use within the cluster (a rough sketch of the idea follows this list).
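The actual cert generation is handled by this template's own provisioning (and, per the TODO list, it may not be fully Terraform-native yet), but as a rough, assumption-laden sketch of the idea using the Terraform `tls` provider and the documented `s3_backup_bucket` variable:

```hcl
# Illustrative only: resource names and cert parameters below are
# assumptions, not this repo's actual values.
resource "tls_private_key" "root_ca" {
  algorithm = "RSA"
  rsa_bits  = 2048
}

resource "tls_self_signed_cert" "root_ca" {
  key_algorithm         = "RSA"
  private_key_pem       = "${tls_private_key.root_ca.private_key_pem}"
  is_ca_certificate     = true
  validity_period_hours = 8760
  allowed_uses          = ["cert_signing", "key_encipherment", "server_auth", "client_auth"]

  subject {
    common_name = "kube-ca"
  }
}

# Back the cert up to S3 for use within the cluster
resource "aws_s3_bucket_object" "root_ca_cert" {
  bucket  = "${var.s3_backup_bucket}"
  key     = "ssl/ca.pem"
  content = "${tls_self_signed_cert.root_ca.cert_pem}"
}
```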
- Diagram: TBC
- AWS (available from base VPC template)
  - VPC
  - Internet Gateway
  - Route Tables
  - Subnets
- Kubernetes Cluster (this template)
  - S3 Buckets
  - EFS storage
  - Route53 DNS
  - ASG Controller
  - ASG ETCD
  - ASG Worker
- Etcd Server (this template)
  - docker.service
  - etcd.service (now etcd3 :) )
  - cfn-signal.service
- Master/Controller Server (this template)
  - docker-bootstrap.service
  - flannel.service
  - kube-apiserver.service
  - kube-controller-manager.service
  - kube-scheduler.service
  - kube-proxy.service
- Worker Nodes (this template)
  - docker-bootstrap.service
  - flannel.service
  - docker.service
  - kubelet.service
  - kube-proxy.service
This template should be ready to go, but you may want to customise some elements before deploying and at future redeploys/updates. The most common tasks will likely be scaling the cluster and changing instance types/sizes (see the sketch below).
It is not recommended to change core cluster functionality and references after the cluster template is deployed; this is indicated in notes and template comments.
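Scaling is typically just a `terraform.tfvars` change followed by a plan/apply. The map keys below are assumptions; check `variables.tf` for the real ones:

```hcl
# terraform.tfvars -- illustrative only; these keys are assumed, the real
# structure of `instances` and `instance_types` is defined in variables.tf.
instances = {
  "worker_min" = "3"
  "worker_max" = "6"
}

instance_types = {
  "worker" = "t2.large"
}
```

Run `terraform plan` to review the resulting ASG changes before `terraform apply`.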
Versions tested:
- kubectl: 1.12.2 (in testing: 1.13.x)
- terraform: 0.8.8 up to 0.11.7
- ansible-playbook: 2.2.1.0, 2.4.1.0
Terraform Inputs:
- This template accepts and requires a number of outputs from the base template.
- You will want to deploy the base VPC template first.
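This repo's exact wiring isn't shown here, but a common pattern for consuming another template's outputs (assumed, not confirmed, to be what happens here) is a `terraform_remote_state` data source:

```hcl
# Assumed pattern: read the base VPC template's outputs from its remote
# state in S3. Bucket, key, and region below are placeholders.
data "terraform_remote_state" "base_vpc" {
  backend = "s3"

  config {
    bucket = "my-terraform-state"
    key    = "base-vpc/terraform.tfstate"
    region = "eu-west-1"
  }
}

# Terraform 0.11 syntax: outputs become attributes of the data source,
# e.g. an assumed `ssh_keypair_name` output from the base template:
# key_name = "${data.terraform_remote_state.base_vpc.ssh_keypair_name}"
```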
- Security
  - SSH keypair
    - There is no need to generate one; the base VPC template generates it, and its output is used by this template.
  - ROOT_CA/ETCD/API/Admin/Worker TLS certs
    - There is no need to manually generate these; they are auto-generated at deploy time by this template, using TLS resources.
    - Available in the "config/ssl" dir.
- Set up basic cluster variables (see the example `terraform.tfvars` sketch after this list)
  - cluster_name
  - cluster_name_short
  - s3_state_bucket
  - s3_backup_bucket
  - dns_urls
- Set up advanced cluster variables
  - instance_types
  - instances
  - kubernetes
  - cluster_tags
  - efs_storage
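A minimal illustrative `terraform.tfvars`, with made-up values (check `variables.tf` for each variable's real structure):

```hcl
# terraform.tfvars -- illustrative values only
cluster_name       = "knkd.example.com"
cluster_name_short = "knkd"
s3_state_bucket    = "knkd-terraform-state"
s3_backup_bucket   = "knkd-cluster-backup"

# dns_urls, instance_types, instances, kubernetes, cluster_tags, and
# efs_storage: see variables.tf for their structure before setting them.
```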
TODO:
- security
  - find better way for TLS cert distribution
    - encrypted EFS store?
    - encrypted S3?
    - hashicorp vault?
    - simple DB storage?
  - parameterise SSL .cnf template
  - translate SSL provisioning to terraform native
  - terraform provision instance SSH keypair
  - LetsEncrypt-enabled host-based routing working
  - ingress demo - Nginx + LetsEncrypt
  - ingress demo - Traefik
- documentation
  - setup doc with example cli commands
  - demo doc with example cli commands
- create working demo of Kube services, including ELB ingress
  - core services - kube-dns, dashboard & efs-storage
  - ingress demo - Nginx
  - ingress demo - Traefik
  - ingress demo - host-based routing
  - ingress demo - kube-ingress-aws
- etcd concerns
  - resolve etcd provisioning
  - all etcd communication using TLS certs
  - rebuild etcd image with open logic
  - etcd-aws-py docker-image ready
    - update etcd to latest 3.2.x+
    - update etcd to latest 3.3.x+
    - etcd 3.x auto-backups
    - etcd 3.x auto-restores
  - etcd-aws-go docker-image ready
    - update etcd to latest 3.x+
    - etcd 3.x auto-backups
    - etcd 3.x auto-restores
- kubernetes
  - versions working/tested
    - confirm working on kube 1.9.10 stable
    - confirm working on kube 1.10.10 stable
    - confirm working on kube 1.11.4 stable
    - confirm working on kube 1.12.2 stable
    - update/test kube to latest 1.x.x alpha/beta (1.13.0-beta.1)
  - cluster autoscaling (cluster-autoscaler? kube-aws-autoscaler?)
    - autoscaler IAM policy
  - figure out friggin v1.8.x RBAC!
  - RBAC: get basic roles organised/documented
  - RBAC: get kubelet node role organised (requires deploy-time provisioning certs)
  - Helm: documented secure Helm install
  - Rancher: documented secure Rancher install
- terraform
  - update terraform to latest 0.10.x
  - update terraform to latest 0.11.x
  - fix some terraform code inconsistencies
  - translate etcd/controller/worker ASGs to terraform native
- AWS-specific
  - Traefik ingress demo - AWS Route53 policy
  - Nginx ingress demo - kube-ingress-aws IAM policy
  - all security groups tightened
  - secure ability to expose API server for multi-cloud
  - test multi-cloud deployment
- Azure-specific
  - develop multi-cloud extension
- Google-specific
  - develop multi-cloud extension
- other
  - confirm working on Ubuntu 16.04
  - confirm working on Ubuntu 18.04
    - FYI: Ubuntu 18.04 changes DNS and requires `sudo ln -sf /run/systemd/resolve/resolv.conf /etc/resolv.conf`
  - FYI: Kube 1.8.x worker kubelet swap is undesired; requires `--fail-swap-on=false`