add 'getting started' section to README.md (#671)
cjerad authored Aug 16, 2022
1 parent 3a06701 commit 27486d6
Showing 3 changed files with 100 additions and 7 deletions.
97 changes: 93 additions & 4 deletions README.md
@@ -17,15 +17,104 @@ This project ensures that the Kubernetes control plane responds appropriately to
- Webhook feature to send shutdown or restart notification messages
- Unit & Integration Tests

## Differences from v1

The first major version of AWS Node Termination Handler (NTH) originally operated as a DaemonSet deployed to every desired node in the cluster (IMDS Mode); later, we added the option to deploy a single pod that reads events for the entire cluster from an SQS queue (Queue Processor Mode). Both modes relied heavily on Helm for configuration, so changing the configuration meant updating the deployment.

This second major version of NTH aims to refine the Queue Processor Mode. Only a single pod is deployed, and configuration is done using a new custom resource called a *Terminator*. A *Terminator* holds most of the configuration: where NTH should fetch events from, which actions to take for a given event type, which nodes to act upon, and where to send webhook notifications. Multiple *Terminators* may be deployed, modified, or removed without needing to redeploy NTH itself.

## Getting Started

### Infrastructure Setup
### 1. Set Up Infrastructure

TBD
#### 1.1. Create an IAM OIDC Provider

Your EKS cluster must have an IAM OIDC Provider. Follow the steps in [Create an IAM OIDC provider for your cluster](https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html) to determine whether your EKS cluster already has an IAM OIDC Provider and, if necessary, create one.
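
A quick check along the following lines may help decide whether this step is needed. It is a minimal sketch, assuming the AWS CLI is already configured for the cluster's account and region; the function name `has_oidc_provider` is made up for illustration and is not part of NTH.

```sh
# Hypothetical helper: succeeds if an IAM OIDC provider is registered
# for the cluster's OIDC issuer.
# Usage: has_oidc_provider <CLUSTER NAME>
has_oidc_provider() {
    # The issuer URL ends in ".../id/<ID>"; keep only the trailing <ID>.
    issuer=$(aws eks describe-cluster --name "$1" \
        --query "cluster.identity.oidc.issuer" --output text) || return 1
    oidc_id=${issuer##*/}
    [ -n "$oidc_id" ] || return 1
    # A registered provider's ARN ends with the same <ID>.
    aws iam list-open-id-connect-providers --output text | grep -q "$oidc_id"
}
```

If the check fails, `eksctl utils associate-iam-oidc-provider --cluster <CLUSTER NAME> --approve` is one way to create the provider.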

#### 1.2. Create the NTH Service Account

##### 1.2.1. Create the IAM Policy

Download the service account policy template for AWS CloudFormation at https://github.com/aws/aws-node-termination-handler/releases/download/v2.0.0-alpha/infrastructure.yaml
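
The release assets referenced in this guide can also be fetched from the command line. The sketch below assumes `curl` is installed; `nth_download` is a hypothetical helper name, and the release tag is the `v2.0.0-alpha` tag used in the URLs of this guide.

```sh
# Hypothetical helper: download a v2.0.0-alpha release asset into the
# current directory, keeping the asset's file name.
# Usage: nth_download infrastructure.yaml
nth_download() {
    curl --fail --location --remote-name \
        "https://github.com/aws/aws-node-termination-handler/releases/download/v2.0.0-alpha/$1"
}
```

The same helper would work for the other assets mentioned later, such as `queue-infrastructure.yaml` and `terminator.yaml.tmpl`.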

Then create the IAM Policy by deploying the AWS CloudFormation stack:
```sh
aws cloudformation deploy \
    --template-file infrastructure.yaml \
    --stack-name nth-service-account \
    --capabilities CAPABILITY_NAMED_IAM
```

##### 1.2.2. Create the Service Account

Use either the AWS CLI or the AWS Console to look up the ARN of the IAM Policy for the service account.
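
With the AWS CLI, one possible way to obtain the ARN is to build it from the account ID. This is a sketch under several assumptions: the stack was deployed without a `ClusterName` override (so the policy is named `nth-service-account`), the policy uses the default `/` path, and the account is in the standard `aws` partition. `nth_policy_arn` is a hypothetical helper name.

```sh
# Hypothetical helper: print the ARN of the customer-managed policy.
# Assumes the standard "aws" partition and the default "/" policy path.
# Usage: nth_policy_arn [POLICY NAME]   (defaults to nth-service-account)
nth_policy_arn() {
    account_id=$(aws sts get-caller-identity \
        --query "Account" --output text) || return 1
    echo "arn:aws:iam::${account_id}:policy/${1:-nth-service-account}"
}
```

Running `aws iam get-policy --policy-arn "$(nth_policy_arn)"` afterwards confirms that a policy with that ARN actually exists.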

Create the cluster service account using the following command:
```sh
eksctl create iamserviceaccount \
    --cluster <CLUSTER NAME> \
    --namespace <NAMESPACE> \
    --name "nth-service-account" \
    --role-name "nth-service-account" \
    --attach-policy-arn <SERVICE ACCOUNT POLICY ARN> \
    --role-only \
    --approve
```

### 2. Deploy NTH

Get the ARN of the service account role:
```sh
eksctl get iamserviceaccount \
    --cluster <CLUSTER NAME> \
    --namespace <NAMESPACE> \
    --name "nth-service-account"
```
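
If you prefer to capture the ARN in a shell variable, querying IAM directly is an alternative. This sketch assumes the role was created with `--role-name "nth-service-account"` as in step 1.2.2; `nth_role_arn` is a hypothetical helper name.

```sh
# Hypothetical helper: print the ARN of the service account's IAM role.
# Usage: ROLE_ARN=$(nth_role_arn)
nth_role_arn() {
    aws iam get-role --role-name "${1:-nth-service-account}" \
        --query "Role.Arn" --output text
}
```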

Add the AWS `eks-charts` Helm repository and deploy the chart:
```sh
helm repo add eks https://aws.github.io/eks-charts

helm upgrade \
    --install \
    nth \
    eks/aws-node-termination-handler-2 \
    --namespace <NAMESPACE> \
    --create-namespace \
    --set aws.region=<AWS REGION> \
    --set serviceAccount.name="nth-service-account" \
    --set serviceAccount.annotations.eks\\.amazonaws\\.com/role-arn=<SERVICE ACCOUNT ROLE ARN>
```

For a full list of inputs see the Helm chart `README.md`.
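
After the chart is installed, it may be worth confirming that the role annotation reached the service account. The sketch below assumes the chart creates a service account named `nth-service-account` in the chosen namespace, as the `--set` flags above suggest; `nth_annotated_role` is a hypothetical helper name.

```sh
# Hypothetical helper: print the IAM role ARN annotated on the service account.
# Usage: nth_annotated_role <NAMESPACE>
nth_annotated_role() {
    # Dots inside the annotation key are escaped for kubectl's JSONPath.
    kubectl get serviceaccount nth-service-account --namespace "$1" \
        --output jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'
}
```

The printed value should match the service account role ARN passed to `helm upgrade`.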

### 3. Create a Terminator

#### 3.1. Create an SQS Queue

NTH reads events from one or more SQS Queues. If you already have an SQS Queue available then you may skip this step.

*Note:* Multiple Terminators may read from a single SQS Queue. A Terminator deletes a message only if a matching node was found in the cluster. The SQS Queue's visibility timeout setting can help ensure that a message is delivered to only one Terminator at a time.

You may create your own SQS Queue, but an AWS CloudFormation template is available that creates an SQS Queue along with commonly used Amazon EventBridge rules. Download it from https://github.com/aws/aws-node-termination-handler/releases/download/v2.0.0-alpha/queue-infrastructure.yaml

Then deploy the stack:

```sh
aws cloudformation deploy \
    --template-file queue-infrastructure.yaml \
    --stack-name nth-queue \
    --parameter-overrides \
        ClusterName=<CLUSTER NAME> \
        QueueName=<QUEUE NAME>
```
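
Once the queue exists, its URL and visibility timeout can be inspected or adjusted with the AWS CLI. This is a minimal sketch; both function names are hypothetical, and a suitable timeout value depends on your workload, so none is recommended here.

```sh
# Hypothetical helper: print the URL of an SQS Queue by name.
# Usage: nth_queue_url <QUEUE NAME>
nth_queue_url() {
    aws sqs get-queue-url --queue-name "$1" \
        --query "QueueUrl" --output text
}

# Hypothetical helper: set the queue's visibility timeout in seconds.
# Usage: nth_set_visibility_timeout <QUEUE NAME> <SECONDS>
nth_set_visibility_timeout() {
    queue_url=$(nth_queue_url "$1") || return 1
    aws sqs set-queue-attributes --queue-url "$queue_url" \
        --attributes "VisibilityTimeout=$2"
}
```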

#### 3.2. Define and deploy a Terminator

### Installation and Configuration
You may download a template file from https://github.com/aws/aws-node-termination-handler/releases/download/v2.0.0-alpha/terminator.yaml.tmpl and then edit it to fill in the required fields and your desired configuration.

For a full list of inputs see the Helm chart [README.md](./charts/aws-node-termination-handler-2/README.md).
Deploy the Terminator:
```sh
kubectl apply -f <FILENAME>
```
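
To catch mistakes in the edited file before it takes effect, the apply step can be wrapped with a server-side dry run and a follow-up read. This is a sketch only; `nth_apply_terminator` is a hypothetical helper name, and it assumes the Terminator custom resource definition was installed by the Helm chart.

```sh
# Hypothetical helper: validate, apply, and then display a Terminator.
# Usage: nth_apply_terminator <FILENAME>
nth_apply_terminator() {
    # --dry-run=server asks the API server to validate without persisting.
    kubectl apply --dry-run=server -f "$1" &&
        kubectl apply -f "$1" &&
        # Reading back by file avoids having to know the resource's name.
        kubectl get -f "$1"
}
```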

## Metrics

2 changes: 0 additions & 2 deletions resources/eks-cluster.yaml.tmpl
@@ -4,8 +4,6 @@ metadata:
  name: ${CLUSTER_NAME}
  region: ${AWS_REGION}
  version: "1.22"
  tags:
    karpenter.sh/discovery: ${CLUSTER_NAME}
managedNodeGroups:
  - instanceType: m5.large
    amiFamily: AmazonLinux2
8 changes: 7 additions & 1 deletion resources/infrastructure.yaml
@@ -6,6 +6,10 @@ Parameters:
  ClusterName:
    Description: EKS Cluster Name
    Type: String
    Default: ""

Conditions:
  IncludeClusterName: !Not [!Equals [!Ref ClusterName, ""]]

Resources:
  ServiceAccountPolicy:
@@ -14,7 +18,9 @@ Resources:
        by the AWS Node Termination Handler controller process to interact with AWS resources.
    Type: AWS::IAM::ManagedPolicy
    Properties:
      ManagedPolicyName: !Sub "${ClusterName}-serviceaccount"
      ManagedPolicyName: !Sub
        - "nth${s}-service-account"
        - s: !If [IncludeClusterName, !Sub "-${ClusterName}", ""]
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
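
The naming rule encoded by the new `IncludeClusterName` condition can be restated as a small shell function. This is only an illustrative sketch of the `!Sub`/`!If` logic above; `nth_policy_name` is a hypothetical helper and is not part of the template or the project.

```sh
# Hypothetical helper mirroring the template's ManagedPolicyName logic.
# Usage: nth_policy_name [CLUSTER NAME]
nth_policy_name() {
    if [ -n "${1:-}" ]; then
        # ClusterName given: the "-<ClusterName>" infix is included.
        echo "nth-$1-service-account"
    else
        # ClusterName empty (the new default): no infix.
        echo "nth-service-account"
    fi
}

nth_policy_name              # prints "nth-service-account"
nth_policy_name my-cluster   # prints "nth-my-cluster-service-account"
```

With the empty default, deploying the stack without a `ClusterName` override yields a policy named `nth-service-account`, the same name the README uses for the service account and role in its `eksctl` commands.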