add ability to wait for operation completion #1131

Closed
saturnism opened this issue Feb 27, 2018 · 7 comments · Fixed by #1197

@saturnism

Affected Resource(s)

Please list the resources as a list, for example:

  • all

If this issue appears to affect multiple resources, it may be an issue with Terraform's core, so please mention this.

Terraform Configuration Files

resource "google_project_iam_member" "datastore_user" {
  role   = "roles/datastore.user"
  member = "serviceAccount:${google_service_account.todo_account.email}"
  // Wait 1 minute before provisioning; it would be nice to wait for the actual completion instead.
  provisioner "local-exec" {
    command = "echo sleep 60s to propagate all permissions; sleep 60"
  }
}

// Create Pod
resource "kubernetes_pod" "test" {
 depends_on = ["google_project_iam_member.datastore_user"]
 count = "${var.mode=="pod" ? 1 : 0}"
 metadata {
   name = "todo-backend"
   namespace = "${kubernetes_namespace.todo_backend.metadata.0.name}"
 }

Expected Behavior

What should have happened? Rather than waiting for an arbitrary number of seconds (which is flaky), Terraform should provide an option to wait for operation completion.

Actual Behavior

What actually happened? The service account provisioning operation is kicked off, but because it is asynchronous, the Terraform resource returns immediately as if it had already been created.

Steps to Reproduce

Please list the steps required to reproduce the issue, for example:

  1. terraform apply

Important Factoids

Is there anything atypical about your accounts that we should know? N/A

References

Are there any other GitHub issues (open or closed) or Pull Requests that should be linked here? N/A

@danawillow
Contributor

danawillow commented Feb 27, 2018

So my opinion here is that the resource should always wait for completion, rather than making it a separate option. This is already how we handle resources with proper asynchronous APIs (they return an Operation, and we poll until the Operation says complete).

The issue here is that the ResourceManager/IAM API doesn't work that way: it just returns the newly created Policy, even if that change hasn't fully propagated. We could poll, reading the policy back until it matches the one that we set, although I'm a bit nervous that eventual consistency might also be at play. For example, if setting the policy happens in data center A and we then read it back from A, it'll say it has it. But the k8s check might happen in data center B before the policy has made it to B. I don't know enough about how IAM works under the hood to know whether that's a possibility. If it is, I'm not sure there's a way to solve it with the API we have now. However, we can definitely make progress by polling until the policy matches what we set.
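
A minimal Go sketch of the "poll until the policy matches what we set" idea. The `Policy` type, the `readPolicy` callback, and the `waitForMember` name are hypothetical stand-ins for illustration, not the provider's actual code or the real IAM client types. A single matching read also does not rule out the cross-data-center lag described above.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Policy is a simplified, hypothetical stand-in for an IAM policy:
// it maps a role to the members bound to that role.
type Policy map[string][]string

// hasMember reports whether member is bound to role in p.
func hasMember(p Policy, role, member string) bool {
	for _, m := range p[role] {
		if m == member {
			return true
		}
	}
	return false
}

// waitForMember calls readPolicy every interval until the returned policy
// binds member to role, readPolicy fails, or timeout elapses.
func waitForMember(readPolicy func() (Policy, error), role, member string, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		p, err := readPolicy()
		if err != nil {
			return err
		}
		if hasMember(p, role, member) {
			return nil
		}
		if time.Now().After(deadline) {
			return errors.New("timed out waiting for policy to include " + member)
		}
		time.Sleep(interval)
	}
}

func main() {
	const role = "roles/datastore.user"
	const member = "serviceAccount:todo@example.iam.gserviceaccount.com" // made-up address

	reads := 0
	// Fake reader (assumption for the demo): the binding shows up on the third read.
	fake := func() (Policy, error) {
		reads++
		if reads < 3 {
			return Policy{}, nil
		}
		return Policy{role: {member}}, nil
	}

	err := waitForMember(fake, role, member, 10*time.Millisecond, time.Second)
	fmt.Println(err, reads) // prints: <nil> 3
}
```

In a real provider the callback would wrap the actual policy read, and the timeout would presumably come from the resource's configured timeouts.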

@danawillow danawillow added the bug label Feb 27, 2018
@paddycarver
Contributor

(I've solved this in other providers by polling until I get N results back that match what I expect, or a timeout occurs.)
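
One way to read this approach, sketched in Go: keep checking until the expected result has been seen N times in a row, or give up at a timeout. Requiring consecutive matches is an assumption about what "N results" means here, and `pollUntilStable` and its parameters are hypothetical names, not taken from any provider.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// pollUntilStable calls check every interval and returns nil once check has
// reported true `required` times in a row. A false result resets the streak.
// It returns an error if check fails or if timeout elapses first.
func pollUntilStable(check func() (bool, error), required int, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	streak := 0
	for {
		ok, err := check()
		if err != nil {
			return err
		}
		if ok {
			streak++
			if streak >= required {
				return nil
			}
		} else {
			streak = 0
		}
		if time.Now().After(deadline) {
			return errors.New("timed out before seeing enough consecutive successes")
		}
		time.Sleep(interval)
	}
}

func main() {
	// Simulated flaky backend (assumption for the demo): answers disagree at first.
	results := []bool{false, true, false, true, true, true}
	calls := 0
	check := func() (bool, error) {
		ok := results[calls%len(results)]
		calls++
		return ok, nil
	}

	err := pollUntilStable(check, 3, 10*time.Millisecond, time.Second)
	fmt.Println(err, calls) // prints: <nil> 6
}
```

Resetting the streak on a miss is what makes this stricter than a single successful read: an isolated success from one replica is not enough to return.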

@sennerholm

I had the original problem that @saturnism filed this bug for.

When I don't have the sleep, I get permission errors when the test case in the pod tries to connect to Datastore.
The full repo is: https://github.com/sennerholm/node-todo-backend/tree/master/terraform/todo-backend
and the Terragrunt configuration is: https://github.com/sennerholm/terraform-infrastructure-live

I don't know how the policy is distributed within Google Cloud, or whether the API goes against a single master or is executed in europe-west1.

Datastore is also part of Google App Engine, and I don't know how well it's integrated with the rest of the Google infrastructure; it might have worked much better if I had used some other backend.

I have also found this:
https://cloud.google.com/iam/docs/testing-permissions. I'm not sure whether I could use it to check the permission before I start the pod.
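
The linked page describes `testIamPermissions`, which takes a list of permissions and returns the subset the caller holds. A rough Go sketch of the comparison step follows; `missingPermissions` is a hypothetical helper, the permission names are only illustrative, and no real API call is made. Since the check applies to the caller, it would presumably have to run with the service account's own credentials to be meaningful here.

```go
package main

import "fmt"

// missingPermissions returns the entries of wanted that are absent from granted.
// With a testIamPermissions-style call, granted is the subset of the requested
// permissions that the caller actually holds.
func missingPermissions(wanted, granted []string) []string {
	have := make(map[string]bool, len(granted))
	for _, p := range granted {
		have[p] = true
	}
	var missing []string
	for _, p := range wanted {
		if !have[p] {
			missing = append(missing, p)
		}
	}
	return missing
}

func main() {
	wanted := []string{"datastore.entities.get", "datastore.entities.create"}
	// Hypothetical response: only one permission is visible so far.
	granted := []string{"datastore.entities.get"}
	fmt.Println(missingPermissions(wanted, granted)) // prints: [datastore.entities.create]
}
```

An empty result would suggest the permissions are visible from wherever the check ran, which is not necessarily where the pod will run.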

Sincerely
Mikael

@danawillow
Contributor

@paddycarver do you have an example that comes to mind? I can look around myself for one (or just try it myself) but if you have one handy that would certainly help :)

@morgante

morgante commented Mar 2, 2018

This is potentially an issue with the policy API itself, as I sometimes have issues with permissions not propagating even when done through the Cloud Console.

@paddycarver
Contributor

@danawillow here's an example I wrote for the Vault provider, to ensure that the AWS access keys it returned are probably done propagating before it returned them: https://github.com/terraform-providers/terraform-provider-vault/blob/d8921672d5e07aa78e4a6497478fafaa1b707537/vault/data_source_aws_access_credentials.go#L129-L169

@ghost

ghost commented Mar 29, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 hashibot-feedback@hashicorp.com. Thanks!

@ghost ghost locked and limited conversation to collaborators Mar 29, 2020