Add support for ASG lifecycle hooks #4347

hintofbasil · 2021-10-19T09:31:51Z

What feature/behavior/change do you want?

Add support for autoscaling lifecycle hooks to node groups.

A sample configuration, based on the CloudFormation documentation, could look like the following. It supports all lifecycle hook configuration options.

cluster:
  ...
  nodeGroups:
    - name: spotNodeGroup
      lifecycleHooks:
        - defaultResult: <String>
          heartbeatTimeout: <Integer>
          lifecycleHookName: <String>
          lifecycleTransition: <String>
          notificationMetadata: <String>
          notificationTargetARN: <String>
          roleARN: <String>
      ...

Why do you want this feature?

Node groups already support capacity rebalancing; however, this does not automatically drain the nodes before termination. The AWS Node Termination Handler, an official solution to this problem from AWS, requires lifecycle hooks to be added in order to work.

Lifecycle hooks shouldn't be required for managed node groups, as those automatically drain workloads on a rebalance event.

aclevername · 2021-10-20T15:04:11Z

Thanks for opening the issue @hintofbasil. Could you provide an example of the configuration you would write if we introduced this functionality? I'm not very familiar with the Node Termination Handler, but it looks very interesting and is perhaps related to #4214 (comment).

hintofbasil · 2021-10-20T17:10:03Z

Lifecycle hooks are something I've never worked with before, but AWS support has recommended them for this issue. As such, I can only guess at what sort of configuration we would want. This is my best guess:

cluster:
  ...
  nodeGroups:
    - name: spotNodeGroup
      lifecycleHooks:
        - defaultResult: ABANDON
          lifecycleHookName: node-drain
          lifecycleTransition: EC2_INSTANCE_TERMINATING
      ...

It is possible we would also need to set some other values, such as HeartbeatTimeout, NotificationTargetARN, or RoleARN, but I don't see an immediate need for them with my current understanding of the Node Termination Handler (NTH).
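For comparison, the best-guess configuration above corresponds roughly to one AWS CLI call per hook. The following is a minimal sketch, not eksctl behaviour: the helper name `print_hook_cmd` and the ASG name `my-k8s-asg` are placeholders, and adding the `autoscaling:` prefix to the transition is an assumption about how the config value might be translated (the Auto Scaling API itself expects the prefixed form). The helper only prints the command and never contacts AWS.

```shell
# Hypothetical helper: print (not run) the put-lifecycle-hook call matching
# one entry of the proposed lifecycleHooks list.
# Arguments: ASG name, hook name, transition, default result, [heartbeat timeout]
print_hook_cmd() {
  asg_name=$1
  hook_name=$2
  transition=$3
  default_result=$4
  heartbeat=${5:-}

  # The Auto Scaling API expects an "autoscaling:" prefix on the transition.
  # Adding it when missing is an assumption about the config-to-API mapping.
  case $transition in
    autoscaling:*) ;;
    *) transition="autoscaling:$transition" ;;
  esac

  cmd="aws autoscaling put-lifecycle-hook"
  cmd="$cmd --auto-scaling-group-name '$asg_name'"
  cmd="$cmd --lifecycle-hook-name '$hook_name'"
  cmd="$cmd --lifecycle-transition '$transition'"
  cmd="$cmd --default-result '$default_result'"
  # Pass a heartbeat timeout only when one was given; otherwise the
  # service default applies.
  if [ -n "$heartbeat" ]; then
    cmd="$cmd --heartbeat-timeout '$heartbeat'"
  fi
  printf '%s\n' "$cmd"
}

# Values taken from the best-guess YAML; the ASG name is a placeholder.
print_hook_cmd my-k8s-asg node-drain EC2_INSTANCE_TERMINATING ABANDON
```

Passing a fifth argument, for example `300`, appends `--heartbeat-timeout '300'`, which is the kind of extra value mentioned above.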

I'm not sure there is a strong connection between this and the issue you linked. The NTH will perform a drain excluding DaemonSets, just like eksctl delete nodegroup will. Really, the NTH is closer to matching eksctl delete nodegroup functionality for single-node terminations.
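To illustrate what a single-node drain that excludes DaemonSets looks like when done by hand, here is a small sketch. It is not what NTH or eksctl run internally; the helper name `print_drain_cmd`, the node name, and the 120s default timeout are all assumptions chosen for the example. The helper only prints the command.

```shell
# Hypothetical helper: print the kubectl command for draining one node
# while leaving DaemonSet-managed pods in place.
# Arguments: node name, [timeout as a duration, default 120s]
print_drain_cmd() {
  node=$1
  timeout=${2:-120s}
  # kubectl drain cordons the node first, then evicts the remaining pods;
  # --ignore-daemonsets skips pods owned by a DaemonSet.
  printf "kubectl drain '%s' --ignore-daemonsets --timeout=%s\n" "$node" "$timeout"
}

# Placeholder node name.
print_drain_cmd ip-10-0-0-1.ec2.internal
```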

github-actions · 2021-11-20T01:45:35Z

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions · 2021-11-25T01:46:07Z

This issue was closed because it has been stalled for 5 days with no activity.

t0rr3sp3dr0 · 2021-11-26T02:05:41Z

Can we reopen this issue? I'm also looking into using AWS Node Termination Handler in Queue Processor mode with a cluster managed by eksctl.

I think eksctl could perform the first two steps described in the configuration guide (https://github.com/aws/aws-node-termination-handler#infrastructure-setup):

1. Setup a Termination Lifecycle Hook on an ASG (https://github.com/aws/aws-node-termination-handler#1-setup-a-termination-lifecycle-hook-on-an-asg)

aws autoscaling put-lifecycle-hook \
  --lifecycle-hook-name 'my-k8s-term-hook' \
  --auto-scaling-group-name 'my-k8s-asg' \
  --lifecycle-transition 'autoscaling:EC2_INSTANCE_TERMINATING' \
  --default-result 'CONTINUE' \
  --heartbeat-timeout '300'

2. Tag the ASGs (https://github.com/aws/aws-node-termination-handler#2-tag-the-asgs)

aws autoscaling create-or-update-tags \
  --tags 'ResourceId=my-auto-scaling-group,ResourceType=auto-scaling-group,Key=aws-node-termination-handler/managed,Value=,PropagateAtLaunch=true'
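Since a cluster usually has one ASG per node group, the two steps above would need to be repeated for each group. A rough sketch of that loop follows, assuming the hook name and settings from the guide are reused unchanged; the helper name `print_nth_setup` and the ASG names are placeholders. It prints the commands for review rather than running them.

```shell
# Hypothetical helper: print both NTH setup commands for every ASG name
# passed as an argument. Nothing is sent to AWS.
print_nth_setup() {
  hook_name='my-k8s-term-hook'   # placeholder name from the guide
  for asg in "$@"; do
    # Step 1: termination lifecycle hook on the ASG.
    printf "aws autoscaling put-lifecycle-hook --lifecycle-hook-name '%s' --auto-scaling-group-name '%s' --lifecycle-transition 'autoscaling:EC2_INSTANCE_TERMINATING' --default-result 'CONTINUE' --heartbeat-timeout '300'\n" "$hook_name" "$asg"
    # Step 2: tag that marks the ASG as managed by NTH.
    printf "aws autoscaling create-or-update-tags --tags 'ResourceId=%s,ResourceType=auto-scaling-group,Key=aws-node-termination-handler/managed,Value=,PropagateAtLaunch=true'\n" "$asg"
  done
}

# Placeholder ASG names.
print_nth_setup my-k8s-asg my-other-asg
```

Each ASG yields two lines of output, one per step, so the result can be inspected before anything is executed.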

hintofbasil added the kind/feature New feature or request label Oct 19, 2021

github-actions bot added the stale label Nov 20, 2021

github-actions bot closed this as completed Nov 25, 2021

t0rr3sp3dr0 mentioned this issue Feb 11, 2022

Add support for ASG lifecycle hooks #4774

Closed
