Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fails to destroy asg #176

Closed
1 of 4 tasks
tomdavidson opened this issue Oct 21, 2018 · 9 comments
Closed
1 of 4 tasks

fails to destroy asg #176

tomdavidson opened this issue Oct 21, 2018 · 9 comments

Comments

@tomdavidson
Copy link

I have issues

More than what will be helped here ... The terrafrom-aws-eks module fails to destroy.

I'm submitting a...

  • bug report
  • feature request
  • support request
  • kudos, thank you, warm fuzzy

What is the current behavior?

terraform destroy exits non-zero :

* module.eks.aws_autoscaling_group.workers (destroy): 1 error(s) occurred:
* aws_autoscaling_group.workers: group still has 3 instances

If this is a bug, how to reproduce? Please include a code sample if relevant.

Apply a plan with the following module config:

module "eks" {
  source                 = "terraform-aws-modules/eks/aws"
  version                = "1.7.0"
  cluster_name           = "${local.cluster_name}"
  cluster_delete_timeout = "25m"
  config_output_path     = "${local.config_path}"
  subnets                = "${local.subnets}"
  tags                   = "${local.tags}"
  vpc_id                 = "${local.vpc_id}"

  workers_group_defaults = {
    additional_security_group_ids = ""          # A comma delimited list of additional security group ids to include in worker launch config
    asg_desired_capacity          = "3"         # Desired worker capacity in the autoscaling group.
    asg_max_size                  = "6"         # Maximum worker capacity in the autoscaling group.
    asg_min_size                  = "3"         # Minimum worker capacity in the autoscaling group.
    autoscaling_enabled           = true        # Sets whether policy and matching tags will be added to allow autoscaling.
    enable_monitoring             = true        # Enables/disables detailed monitoring.
    instance_type                 = "t3.medium" # Size of the workers instances.
    protect_from_scale_in         = true        # Prevent AWS from scaling in, so that cluster-autoscaler is solely responsible.
    public_ip                     = false       # Associate a public ip address with a worker
    root_volume_size              = "100"       # root volume size of workers instances.
    root_volume_type              = "gp2"       # root volume type of workers instances, can be 'standard', 'gp2', or 'io1'
    target_group_arns             = ""          # A comma delimited list of ALB target group ARNs to be associated to the ASG
  }
}

Then attempt to terraform destroy

What's the expected behavior?

That terraform destroy successfully deletes the asg and exits 0.

Are you able to fix this problem and submit a PR? Link here if you have already.

Willing to help, but could use some direction pinpointing the issue.

Environment details

  • Affected module version: 1.7.0
  • OS: ubuntu 18.04
  • Terraform version: 0.11.8

Any other relevant info

protect_from_scale_in = true seems to be the culprit. If I provision without it, I can successfully destroy.

@max-rocket-internet
Copy link
Contributor

protect_from_scale_in = true seems to be the culprit. If I provision without it, I can successfully destroy.

Cool, then problem solved, right? If you want to enable protect_from_scale_in then what could we do in this module? You've already decided to protect your instances from termination. If you want to be able to destroy your ASG and instances then don't enable that setting 🙂

@tomdavidson
Copy link
Author

tomdavidson commented Oct 25, 2018

That is not very satisfying. A plan that can not be destroyed seems defective. Deploying the cluster with it enable and then doing an update to disable and then trying the destroy also fails.

If this is as designed and not a bug, I think a more descriptive on the functionality and impact would be helpful.

Thanks.

@Vlaaaaaaad
Copy link

Just hit this and this is not a behavior I expected.
We are planning to use this module to create ephemeral clusters and being able to delete them is vital. At the same time, we very much want to run cluster-autoscaler so we do need to enable protect_from_scale_in.

What's the best way we could do this? How are you all handling this problem?

The best way I can think of is adding a local-exec or remote-exec provisioner with when = "destroy" that just deletes the remaining instances/ disables scale-in protection on the ASGs.
Do you have any other ideas? Would such a PR be accepted?

@Vlaaaaaaad
Copy link

Vlaaaaaaad commented Jan 8, 2019

Hi,

We have another module wrapping this module and @bmihaescu created the workaround I mentioned above:

resource "null_resource" "eks-predestroy" {
    provisioner "local-exec" {
    when = "destroy"

    interpreter = ["/bin/bash", "-c"]

    command = <<CMD
ASGName=$(aws autoscaling describe-auto-scaling-groups | grep "${local.cluster_name}" -A 100 -B 100 | grep AutoScalingGroupName | tr -d '" ,' | cut -f2 -d ":")

for asg in $ASGName; do
    InstanceID=$(aws autoscaling describe-auto-scaling-groups --auto-scaling-group-name "$asg" | grep InstanceId | tr -d '" ,' | cut -f2 -d ":")

    for instance in $InstanceID; do
        aws autoscaling set-instance-protection --instance-ids "$instance" --auto-scaling-group-name "$asg" --no-protected-from-scale-in

    done
done
CMD
    }
}

It's a bit crude and could be converted to use jq but it seems to work fine for us and the EKS cluster gets deleted properly on destroy.
Is there any value in upstreaming this script? Should we create a PR?

LE: aaaand due to hashicorp/terraform#13549 this only works on terraform destroy and not on tf apply when the module is deleted 😞

@RothAndrew
Copy link
Contributor

This is totally 100% "working as intended". If you set protect-from-scale-in to true you have to manually terminate the instances. If I set protect-from-scale-in to true I am intentionally and explicitly saying I don't want anything to happen to those instances until I intentionally and explicitly tell the ASG otherwise.

@RothAndrew
Copy link
Contributor

Changing the aws_autoscaling_group to fail fast if protect-from-scale-in is true should be the right choice here as proposed here

@max-rocket-internet
Copy link
Contributor

This is totally 100% "working as intended". If you set protect-from-scale-in to true you have to manually terminate the instances.

Exactly. You can't have it both ways.

@AnthonyFoiani-at
Copy link

Given how many folks have encountered this, I'd like the developers to reconsider. I see at least three arguments for being able to destroy ASGs declared with protect_from_scale_in = true:

  1. I deeply appreciate declarative configuration, but by its nature, it can only be a reflection of the steady state. The huge value to Terraform is its ability to take a stack in its current running state and see if there is any drift from the declared state; if differences are found, Terraform can safely evolve the stack to match the requested state.

    This means that the live state will transiently not match the declared state: not all such operations are atomic.

    Looking through that lens, it seems reasonable to have a mental model where moving from the "live" state to a "destroyed" state would require massaging some properties to temporarily not match the declared state.

  2. Consider object-oriented programming. Every method should preserve class and instance invariants on entry and exit from the method -- but within the methods (and especially within the constructor and destructor!) those invariants don't hold. With this perspective, viewing protect_from_scale_in = true as an instance invariant makes sense, but the constructor might have to do multiple steps to get it there, and likewise the destructor might need to take multiple steps to destroy it.

  3. This flag indicates that we don't want the instances destroyed by Amazon AutoScaling. Looking at the name, it has nothing to do with whether Terraform should be careful about them! The documentation even explicitly says (emphasis mine):

    Indicates whether the instance is protected from termination by Amazon EC2 Auto Scaling when scaling in.

    Destruction / tear-down is a very distinct operation from scaling in! (And if we can only implement graceful destroy through scaling down to 0, that's IMHO still a leaky abstraction that Terraform should be bridging, not exposing.)

Given these three arguments -- and the many duplicate requests / confusions in this and related threads -- I ask the authors to reconsider the viewpoint above.

The existing behavior might be matching the AWS semantics 1-to-1, but that's arguably less useful than adopting the viewpoints I give earlier -- and isn't Terraform all about making AWS more useful?

If the authors think that this kind of one-way switch / extra protection is valuable, maybe adopt some other mechanism?

@github-actions
Copy link

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 21, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants