Drain nodegroups during cluster deletion #4205

Merged: 4 commits, Sep 22, 2021

@nikimanoledaki nikimanoledaki commented Sep 9, 2021

Description

We drain nodegroups before deleting them individually, but not when we delete the whole cluster. Draining a nodegroup before deleting it matters because it evicts the pods (and stops them from being rescheduled on those nodes) and lets resources such as IP addresses be released. Deleting nodes without draining them first can block the deletion of some resources and leave others, such as IP addresses, behind.

This PR focuses on unmanaged nodegroups because managed nodegroups get drained by EKS anyway.
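
To illustrate the ordering described above, here is a minimal sketch rather than the code in this PR. NodeGroup, unmanagedOnly and deleteCluster are hypothetical names made up for the example, and aborting on the first drain error is an assumption of the sketch, not necessarily what eksctl does.

```go
package main

import "fmt"

// NodeGroup is a hypothetical stand-in for a nodegroup found in the cluster;
// it is not eksctl's actual type.
type NodeGroup struct {
	Name    string
	Managed bool // true when EKS manages (and drains) the nodegroup itself
}

// unmanagedOnly returns the nodegroups that have to be drained explicitly.
func unmanagedOnly(all []NodeGroup) []NodeGroup {
	var out []NodeGroup
	for _, ng := range all {
		if !ng.Managed {
			out = append(out, ng)
		}
	}
	return out
}

// deleteCluster drains every unmanaged nodegroup and only then runs the deletion.
// drain and del are injected so the ordering can be exercised without a cluster.
// Assumption: the first drain error aborts the whole operation.
func deleteCluster(all []NodeGroup, drain func(NodeGroup) error, del func() error) error {
	for _, ng := range unmanagedOnly(all) {
		if err := drain(ng); err != nil {
			return fmt.Errorf("draining nodegroup %q: %w", ng.Name, err)
		}
	}
	return del()
}

func main() {
	groups := []NodeGroup{{Name: "ng-managed", Managed: true}, {Name: "ng-unmanaged"}}
	err := deleteCluster(groups,
		func(ng NodeGroup) error { fmt.Println("draining", ng.Name); return nil },
		func() error { fmt.Println("deleting cluster stacks"); return nil },
	)
	if err != nil {
		fmt.Println("error:", err)
	}
}
```

Passing drain and del in as functions keeps the ordering testable on its own, which is in the same spirit as the test backfill mentioned below.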

I did a bit of refactoring to backfill tests. I also extracted the CloudFormation (CF) call that fetches allStacks, since it is also used in the code path that deletes unowned clusters.

Related issues: #1849, #536, #523, #3726

Demo

The following demo is for an eksctl-created cluster with unmanaged nodegroups.

./eksctl delete cluster --name drain-unmanaged
2021-09-13 11:19:04 [ℹ]  eksctl version 0.68.0-dev+fbe5484e.2021-09-13T10:40:28Z
2021-09-13 11:19:04 [ℹ]  using region us-west-2
2021-09-13 11:19:04 [ℹ]  deleting EKS cluster "drain-unmanaged"
2021-09-13 11:19:07 [ℹ]  will drain 1 nodegroup(s) in cluster "drain-unmanaged"
2021-09-13 11:19:11 [ℹ]  cordon node "ip-192-168-25-121.us-west-2.compute.internal"
2021-09-13 11:19:11 [ℹ]  cordon node "ip-192-168-94-224.us-west-2.compute.internal"
2021-09-13 11:19:12 [!]  ignoring DaemonSet-managed Pods: kube-system/aws-node-shq67, kube-system/kube-proxy-jl5g2
2021-09-13 11:19:13 [!]  ignoring DaemonSet-managed Pods: kube-system/aws-node-s2nm4, kube-system/kube-proxy-2jnjc
2021-09-13 11:19:15 [!]  ignoring DaemonSet-managed Pods: kube-system/aws-node-shq67, kube-system/kube-proxy-jl5g2
2021-09-13 11:19:16 [!]  ignoring DaemonSet-managed Pods: kube-system/aws-node-s2nm4, kube-system/kube-proxy-2jnjc
2021-09-13 11:19:17 [!]  pod eviction error ("error evicting pod: kube-system/coredns-86d9946576-9n75b: pods \"coredns-86d9946576-9n75b\" not found") on node ip-192-168-94-224.us-west-2.compute.internal
2021-09-13 11:19:22 [!]  ignoring DaemonSet-managed Pods: kube-system/aws-node-shq67, kube-system/kube-proxy-jl5g2
2021-09-13 11:19:24 [!]  ignoring DaemonSet-managed Pods: kube-system/aws-node-s2nm4, kube-system/kube-proxy-2jnjc
2021-09-13 11:19:24 [✔]  drained all nodes: [ip-192-168-25-121.us-west-2.compute.internal ip-192-168-94-224.us-west-

[etc]
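
The log above shows the shape of a drain: each node is cordoned, DaemonSet-managed pods are ignored, and the remaining pods are evicted. The sketch below models that on in-memory data. Pod, Node, errNotFound and drain are hypothetical names, not Kubernetes or eksctl types, and treating a "not found" eviction error as "already gone" is a simplifying assumption (the real run logs a warning and carries on).

```go
package main

import (
	"errors"
	"fmt"
)

// Pod and Node are simplified, hypothetical models, not Kubernetes API types.
type Pod struct {
	Name      string
	DaemonSet bool // DaemonSet-managed pods are left in place during a drain
}

type Node struct {
	Name          string
	Unschedulable bool
	Pods          []Pod
}

// errNotFound models an eviction that fails because the pod is already gone.
var errNotFound = errors.New("pod not found")

// drain cordons the node, then evicts every pod that is not DaemonSet-managed.
// Assumption: a not-found error counts as success, since nothing is left to evict.
func drain(n *Node, evict func(Pod) error) error {
	n.Unschedulable = true // cordon first so evicted pods cannot be scheduled here again

	var kept []Pod
	for _, p := range n.Pods {
		if p.DaemonSet {
			kept = append(kept, p) // ignored, as in the log
			continue
		}
		if err := evict(p); err != nil && !errors.Is(err, errNotFound) {
			return fmt.Errorf("evicting %s from %s: %w", p.Name, n.Name, err)
		}
	}
	n.Pods = kept
	return nil
}

func main() {
	node := &Node{
		Name: "ip-192-168-94-224.us-west-2.compute.internal",
		Pods: []Pod{
			{Name: "kube-system/aws-node-s2nm4", DaemonSet: true},
			{Name: "kube-system/kube-proxy-2jnjc", DaemonSet: true},
			{Name: "kube-system/coredns-86d9946576-9n75b"},
		},
	}
	// The fake evictor reports the coredns pod as already deleted.
	err := drain(node, func(p Pod) error { return errNotFound })
	fmt.Println(err, node.Unschedulable, len(node.Pods))
}
```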

I also tested this manually on an unowned cluster with unmanaged nodegroups.

Checklist

  • Added tests that cover your change (if possible)
  • Added/modified documentation as required (such as the README.md, or the userdocs directory)
  • Manually tested
  • Made sure the title of the PR is a good description that can go into the release notes
  • (Core team) Added labels for change area (e.g. area/nodegroup) and kind (e.g. kind/improvement)

BONUS POINTS checklist: complete for good vibes and maybe prizes?! 🤯

  • Backfilled missing tests for code in same general area 🎉
  • Refactored something and made the world a better place 🌟

@aclevername aclevername left a comment

LGTM! Left a few comments 😄

@nikimanoledaki
Contributor Author

Added cfg.NodeGroups = []*api.NodeGroup{} before appending the nodegroups from the stacks, to avoid duplicates (@aclevername's suggestion).
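
For readers skimming the thread, a small sketch of why the reset matters. NodeGroup, ClusterConfig and populateFromStacks are simplified, hypothetical stand-ins for illustration, not eksctl's api package or the function changed in this PR.

```go
package main

import "fmt"

// NodeGroup and ClusterConfig are simplified stand-ins, not eksctl's api types.
type NodeGroup struct{ Name string }

type ClusterConfig struct{ NodeGroups []*NodeGroup }

// populateFromStacks rebuilds cfg.NodeGroups from the nodegroup stack names.
func populateFromStacks(cfg *ClusterConfig, stackNames []string) {
	// Reset first: without this, nodegroups already present in cfg would be
	// appended a second time and drained (or deleted) twice.
	cfg.NodeGroups = []*NodeGroup{}
	for _, name := range stackNames {
		cfg.NodeGroups = append(cfg.NodeGroups, &NodeGroup{Name: name})
	}
}

func main() {
	// cfg already lists ng-1, for example from a config file.
	cfg := &ClusterConfig{NodeGroups: []*NodeGroup{{Name: "ng-1"}}}
	populateFromStacks(cfg, []string{"ng-1", "ng-2"})
	fmt.Println(len(cfg.NodeGroups)) // prints 2, not 3
}
```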

@aclevername aclevername left a comment

LGTM ✨

@nikimanoledaki nikimanoledaki merged commit b4d0de2 into eksctl-io:main Sep 22, 2021
@nikimanoledaki nikimanoledaki deleted the drain-ng branch September 22, 2021 14:55