Unable to use instance_template with instance_group_manager anymore #4934

migibert · 2019-11-18T14:00:47Z

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment
If an issue is assigned to the "modular-magician" user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to "hashibot", a community member has claimed the issue already.

Terraform Version

Terraform v0.12.15

provider.google v2.20.0
provider.google-beta v2.20.0

Affected Resource(s)

google_compute_instance_template

Terraform Configuration Files

Reproducer code is available here: https://gist.github.com/migibert/a51ad6521f565f4060aedd93b4337c33#file-reproducer-tf

Debug Output

https://gist.github.com/migibert/a51ad6521f565f4060aedd93b4337c33#file-tf-output-log

Panic Output

No Panic Output

Expected Behavior

I am using instance templates with images and instance group manager to manage an immutable infrastructure pattern.

I have been using this configuration for about a year and it used to run without any problems.

Actual Behavior

When I update the base image for a template, a cycle is detected and an error is raised, preventing the infrastructure from being updated.

Error: Cycle: google_compute_instance_template.igm-primary (destroy deposed 9a0c0b62), google_compute_instance_template.igm-canary (destroy deposed 885f672d), google_compute_instance_group_manager.igm-basic

Steps to Reproduce

terraform apply to create the basic configuration (an IGM with 2 templates using a base image)
Update the base image (let's say, update it to debian-10)
terraform apply to update the configuration (an IGM with 2 templates using a base image)

Important Factoids

I tried with both a user account and a service account, but neither works.

My hypothesis is that it is related to lifecycle create_before_destroy instruction because here is the workaround I found...

Update the base image and apply

Error: Cycle: google_compute_instance_template.igm-primary (destroy deposed 9a0c0b62), google_compute_instance_template.igm-canary (destroy deposed 885f672d), google_compute_instance_group_manager.igm-basic

Update the base image and apply targeting only the templates

Error: Error deleting instance template: googleapi: Error 400: The instance_template resource 'projects//global/instanceTemplates/reproducer-stable-20191118131559890900000002' is already being used by 'projects//zones/us-central1-c/instanceGroupManagers/reproducer', resourceInUseByAnotherResource

Error: Error deleting instance template: googleapi: Error 400: The instance_template resource 'projects//global/instanceTemplates/reproducer-canary-20191118131559890900000001' is already being used by 'projects//zones/us-central1-c/instanceGroupManagers/reproducer', resourceInUseByAnotherResource

BUT! Despite the errors, it creates the templates (it just fails to delete the old ones).

Apply targeting the instance group manager

OK

Apply without targeting any resource

google_compute_instance_template.igm-canary: Destroying... [id=reproducer-canary-20191118131559890900000001]
google_compute_instance_template.igm-primary: Destroying... [id=reproducer-stable-20191118131559890900000002]
google_compute_instance_template.igm-canary: Destruction complete after 4s
google_compute_instance_template.igm-primary: Destruction complete after 4s

Apply complete! Resources: 0 added, 0 changed, 2 destroyed.

References


slevenick · 2019-11-18T19:01:13Z

Interesting....

I was unable to reproduce in Terraform v0.12.13, but when I upgraded to 0.12.15 I see the errors about cycles as well.

I'll do some digging, but a potential workaround would be to downgrade to 0.12.13 and see if that works for you.

slevenick · 2019-11-18T19:11:58Z

Looks like this is caused by hashicorp/terraform#23374

It was introduced in terraform core version 0.12.14, so downgrading to 0.12.13 should work until the fix is ready.

japgolly · 2019-11-18T21:21:58Z

"downgrading to 0.12.13 should work until the fix is ready."

I've got a similar issue and tried downgrading but unfortunately that doesn't work:

Error: Error loading state: state snapshot was created by Terraform v0.12.15, which is newer than current v0.12.13; upgrade to Terraform v0.12.15 or greater to work with this state

slevenick · 2019-11-19T00:49:17Z

Looks like the upstream PR was merged in, so I would guess this will be fixed in the next release of terraform core.

Unfortunately there isn't a way to work around this within the provider itself

migibert · 2019-11-19T08:59:00Z

terraform core version v0.12.16 has been released, including the PR that fixes the issue causing the problem. Thanks for pointing me to the correct upstream issue!

It looks better but it does not seem to completely fix the issue:

Here is the output with the same scenario (a change in the base image):

Plan: 2 to add, 1 to change, 2 to destroy.

Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.

Enter a value: yes

google_compute_instance_template.igm-canary: Creating...
google_compute_instance_template.igm-primary: Creating...
google_compute_instance_template.igm-primary: Creation complete after 5s [id=reproducer-stable-20191119084605404500000001]
google_compute_instance_template.igm-primary: Destroying... [id=reproducer-stable-20191118135449093000000001]
google_compute_instance_template.igm-canary: Creation complete after 5s [id=reproducer-canary-20191119084605404500000002]
google_compute_instance_group_manager.igm-basic: Modifying... [id=/us-central1-c/reproducer]
google_compute_instance_group_manager.igm-basic: Still modifying... [id=/us-central1-c/reproducer, 10s elapsed]
google_compute_instance_group_manager.igm-basic: Still modifying... [id=/us-central1-c/reproducer, 20s elapsed]
google_compute_instance_group_manager.igm-basic: Modifications complete after 28s [id=/us-central1-c/reproducer]
google_compute_instance_template.igm-canary: Destroying... [id=reproducer-canary-20191118135449093000000002]
google_compute_instance_template.igm-canary: Destruction complete after 3s

Error: Error deleting instance template: googleapi: Error 400: The instance_template resource 'projects/***/global/instanceTemplates/reproducer-stable-20191118135449093000000001' is already being used by 'projects/***zones/us-central1-c/instanceGroupManagers/reproducer', resourceInUseByAnotherResource

Then it works fine on the second execution (because the only remaining actions are deletions).
Is it still a terraform-core issue?

The debug log is here: https://gist.github.com/migibert/c88cfac6020761c9d7f903251d047574

slevenick · 2019-11-19T18:06:50Z

This looks like a provider problem now!

What is happening is that terraform builds a graph of operations that need to occur to get to the intended state, in this case that looks something like:

              create new template
               /           \
delete old template     update instance group manager

Where we create the new template before anything else, but then updating the IGM and deleting the old template happen in parallel. This is an issue because the API requires us to update the IGM before deleting the old template, as the IGM references the old template. I imagine something changed in how terraform core builds the graph between the last couple versions which is why we are seeing this now. I would guess that we were getting lucky in the ordering of operations before, causing the update to the IGM to happen before the delete.
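The ordering problem described above can be illustrated with a small sketch. It is not how terraform core builds or walks its graph; it only enumerates which execution orders satisfy a set of "must happen before" edges. The node names and the extra `required_edges` entry are assumptions made for illustration.

```python
from itertools import permutations


def valid_orders(nodes, edges):
    """Return every ordering of nodes that respects each (before, after) edge."""
    orders = []
    for order in permutations(nodes):
        position = {node: index for index, node in enumerate(order)}
        if all(position[before] < position[after] for before, after in edges):
            orders.append(order)
    return orders


NODES = ["create new template", "update IGM", "delete old template"]

# Edges as drawn in the diagram: both follow-up steps depend only on the create.
observed_edges = [
    ("create new template", "update IGM"),
    ("create new template", "delete old template"),
]

# Hypothetical extra edge expressing what the API requires:
# the IGM must stop referencing the old template before it is deleted.
required_edges = observed_edges + [("update IGM", "delete old template")]

if __name__ == "__main__":
    print("Orders allowed by the observed graph:")
    for order in valid_orders(NODES, observed_edges):
        print("  " + " -> ".join(order))
    print("Orders allowed once the API constraint is an edge:")
    for order in valid_orders(NODES, required_edges):
        print("  " + " -> ".join(order))
```

With only the two observed edges, "delete old template" may run before "update IGM", which is the order the API rejects. Adding the third edge leaves a single valid order.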

I believe I can fix this by adding a retry to the delete of the instance template, so that it will wait long enough for the IGM to be updated to not reference it anymore
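A minimal sketch of the retry idea, written in Python rather than the provider's Go and not taken from the provider's code. `ResourceInUseError`, `delete_with_retry`, and `make_fake_delete` are hypothetical names; the fake delete simply fails a fixed number of times to stand in for the API returning resourceInUseByAnotherResource while the IGM still references the template.

```python
import time


class ResourceInUseError(Exception):
    """Hypothetical stand-in for the API's resourceInUseByAnotherResource error."""


def delete_with_retry(delete_fn, attempts=5, delay_seconds=2.0, sleep=time.sleep):
    """Call delete_fn until it succeeds or attempts run out.

    Returns the number of calls made. Re-raises the last error if every
    attempt fails because the resource is still in use.
    """
    for attempt in range(1, attempts + 1):
        try:
            delete_fn()
            return attempt
        except ResourceInUseError:
            if attempt == attempts:
                raise
            sleep(delay_seconds)


def make_fake_delete(failures_before_success):
    """Build a fake delete call that fails a fixed number of times, then succeeds."""
    remaining = {"count": failures_before_success}

    def fake_delete():
        if remaining["count"] > 0:
            remaining["count"] -= 1
            raise ResourceInUseError("template still referenced by the instance group manager")

    return fake_delete


if __name__ == "__main__":
    # Simulate the IGM update finishing after two failed delete attempts.
    calls = delete_with_retry(make_fake_delete(2), delay_seconds=0.1)
    print(f"delete succeeded after {calls} calls")
```

The trade-off of this approach is that it hides the ordering problem behind a wait: if the IGM update never completes, the delete still fails once the attempts are exhausted.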

slevenick · 2019-11-19T21:03:06Z

Not entirely sure this is a provider issue anymore. I've filed an issue upstream about the change in behavior between 0.12.13 and 0.12.16.

migibert · 2019-11-20T09:08:44Z

Thanks for investigating, I will closely monitor the upstream issue!

slevenick · 2019-11-26T22:56:23Z

Going to close this out, as it should be fixed in the next version of terraform core via that upstream issue and fix.

ghost · 2020-03-29T13:54:51Z

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 hashibot-feedback@hashicorp.com. Thanks!

ghost added the bug label Nov 18, 2019

slevenick added the upstream-terraform label Nov 18, 2019

slevenick self-assigned this Nov 18, 2019

slevenick mentioned this issue Nov 19, 2019

0.12.16 create_before_destroy behavior changes hashicorp/terraform#23422

Closed

slevenick closed this as completed Nov 26, 2019

ghost locked and limited conversation to collaborators Mar 29, 2020
