Community Note
Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
Please do not leave "+1" or "me too" comments; they generate extra noise for issue followers and do not help prioritize the request.
If you are interested in working on this issue or have submitted a pull request, please leave a comment.
If an issue is assigned to the "modular-magician" user, it is either in the process of being autogenerated or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to "hashibot", a community member has claimed the issue already.
Terraform Version
0.12
Affected Resource(s)
google_container_cluster
Terraform Configuration Files
Debug Output
https://gist.github.com/ejschoen/24b2178bed67d5538e630a41b6f6dfec
Expected Behavior
Cluster creation should have succeeded, and cluster delete/recreate should have succeeded.
Actual Behavior
Timeout from client.
Steps to Reproduce
This is not reliably reproducible. The behavior is subtly different each time it happens, but the common thread across all of these failures is that the GCP API call appears not to return within the time allotted by the provider client (which can be raised per resource, as sketched below).
terraform apply
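The client-side limit can be raised per resource with a timeouts block. A minimal sketch, assuming the affected resource is google_container_cluster; the resource name and location here are hypothetical, not taken from my actual config:

    resource "google_container_cluster" "primary" {
      name               = "example-cluster" # hypothetical
      location           = "us-central1"     # hypothetical
      initial_node_count = 1

      # Raise the provider client's operation timeouts so a slow GCP
      # API call is less likely to abort the apply mid-operation.
      timeouts {
        create = "60m"
        update = "60m"
        delete = "60m"
      }
    }

This doesn't fix the underlying slowness, of course; it just gives long-running cluster operations more headroom before the client gives up.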
Important Factoids
I recently upgraded to Terraform 0.12, and have noticed that after a failed create attempt, a subsequent terraform apply marks the existing cluster (which did finish creating) as tainted and recreates it. In other cases, terraform apply tries to create the cluster without deleting it first, resulting in a 409 error from GCP. This hasn't happened often enough for me to tell whether the outcome depends on when the timeout occurred during the initial creation, i.e., while creating the cluster itself or while deleting the default node pool.
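For context, those two phases come from the documented pattern of removing the default node pool at create time and managing a separate pool. A rough sketch of that shape (names are hypothetical, not taken from my config):

    resource "google_container_cluster" "primary" {
      name     = "example-cluster" # hypothetical
      location = "us-central1"     # hypothetical

      # The default pool is created and then deleted immediately; that
      # delete is a second long-running call that can also time out.
      remove_default_node_pool = true
      initial_node_count       = 1
    }

    resource "google_container_node_pool" "primary_nodes" {
      name       = "example-pool" # hypothetical
      location   = "us-central1"
      cluster    = google_container_cluster.primary.name
      node_count = 2
    }

Whether the timeout lands on the cluster create itself or on that follow-up node-pool delete could plausibly explain the two different failure modes.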
I'm submitting this as a new issue, but it's related to #3168, and Hashibot wants a new issue linked back to it. I noticed issue #3752 and wonder whether it's related, too.
References
#3168
#3752
From your debug logs (thanks for providing them!) it looks like you caught a timeout error when trying to delete the default node pool. It also looks like we haven't wrapped that particular call in retry logic yet 😳, so it will stop the Create call if it fails. I'll add the retry wrapper shortly.
RE tainting: since the cluster failed to remove the default node pool during Create, I think that tainting + recreating is the correct behavior. In theory we should be able to catch most of the failures and persist the ID to state, so that Terraform will know the resource was created in a partial state.
Thanks! For what it's worth, I was having intermittent connectivity issues, most likely related to Google's cloud outage this past weekend. But I've seen the timeout before when all of their services were ostensibly running well.
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 hashibot-feedback@hashicorp.com. Thanks!