Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

google_compute_instance resource churn when setting network_interface.subnetwork to a subnetwork name #19339

Closed
peikk0 opened this issue Sep 2, 2024 · 11 comments · Fixed by GoogleCloudPlatform/magic-modules#11653, hashicorp/terraform-provider-google-beta#8128 or #19400

Comments

@peikk0
Copy link

peikk0 commented Sep 2, 2024

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
  • Please do not leave +1 or me too comments, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.
  • If an issue is assigned to a user, that user is claiming responsibility for the issue.
  • Customers working with a Google Technical Account Manager or Customer Engineer can ask them to reach out internally to expedite investigation and resolution of this issue.

Terraform Version & Provider Version(s)

Terraform v1.9.5
on amd64

  • provider registry.terraform.io/hashicorp/google v5.43.1
  • provider registry.terraform.io/hashicorp/google-beta v5.43.1

Affected Resource(s)

google_compute_instance

Terraform Configuration

resource "google_compute_instance" "my-instance" {
  name = "my-instance"
  ...

  network_interface {
    subnetwork = data.google_compute_subnetwork.my-subnet.name
  }
}

Debug Output

No response

Expected Behavior

No diff for existing google_compute_instance resources.

Actual Behavior

  # google_compute_instance.my-instance will be updated in-place
  ~ resource "google_compute_instance" "my-instance" {
        id                        = "projects/my-project/zones/us-central1-b/instances/my-instance"
        name                      = "my-instance"
        # (23 unchanged attributes hidden)
      ~ network_interface {
            name                        = "nic0"
          ~ subnetwork                  = "https://www.googleapis.com/compute/v1/projects/my-project/regions/us-central1/subnetworks/my-subnet" -> "my-subnet"
            # (9 unchanged attributes hidden)
        }
        # (5 unchanged blocks hidden)
    }

Steps to reproduce

  1. terraform apply
  2. terraform plan

Important Factoids

No diff with v5.42.0, there was a diff too with v5.43.0 though.

References

Subnetwork names are a valid value according to the documentation: https://registry.terraform.io/providers/hashicorp/google/5.43.1/docs/resources/compute_instance#subnetwork

b/364903020

@peikk0 peikk0 added the bug label Sep 2, 2024
@github-actions github-actions bot added forward/review In review; remove label to forward service/compute-instances labels Sep 2, 2024
@karolgorc
Copy link

karolgorc commented Sep 3, 2024

The data.google_compute_subnetwork.my-subnet.self_link is using name instead of a self_link as input for some reason.

 ~ subnetwork                  = "https://www.googleapis.com/compute/v1/projects/my-project/regions/us-central1/subnetworks/my-subnet" -> "my-subnet"

Couldn't simply recreate this with something like

data "google_compute_subnetwork" "default" {
  name = "default"
  project = var.project_id
}

resource "google_compute_instance" "default" {
  name         = "test"
  machine_type = "n2-standard-2"
  zone         = var.zone
  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-11"
    }
  }
  network_interface {
    subnetwork = data.google_compute_subnetwork.default.self_link
  }
}

Doing something like this shows all of the expected values

output "data" {
  value = data.google_compute_subnetwork.default.self_link
}

output "instance" {
  value = google_compute_instance.default.network_interface.0.subnetwork
}

Tried to replicate this on newest provider versions and on 5.43.1 and couldn't hit this issue. Juding by this I'd say that this is an issue caused by something in your environment. Most likely the fact that your data block is returing name instead of self_link

Have you tried applying this configuration? If so does the diff persist?
Is this a newly created terraform configuration or did it happen while upgrading the provider version?
Can you provide a debug log of a terraform run? You can do it by setting a TF_LOG variable to DEBUG

TF_LOG=DEBUG terraform plan
TF_LOG=DEBUG terraform apply
#etc.

@peikk0
Copy link
Author

peikk0 commented Sep 3, 2024

@karolgorc oh yes you're right, it's actually the other way around (I'll put that on working late 😂), we are actually setting it to the name of the subnet (with subnetwork = google_compute_subnetwork.subnetwork.name), it seems to be stored as the resolved self-link in the TF state when applying, and so we get a diff self_link -> name on the next plan again. I'll update the title/description.

Have you tried applying this configuration? If so does the diff persist?

Yes, we get the diff again after applying.

Is this a newly created terraform configuration or did it happen while upgrading the provider version?

It's an existing configuration, it happened when upgrading from v5.42 to v5.43.

Can you provide a debug log of a terraform run? You can do it by setting a TF_LOG variable to DEBUG

I'll look at the debug logs tomorrow. 🙏🏻

@peikk0 peikk0 changed the title google_compute_instance resource churn when setting network_interface.subnetwork to a self-link google_compute_instance resource churn when setting network_interface.subnetwork to a subnetwork name Sep 3, 2024
@karolgorc
Copy link

A quick fix that will get rid of the diff would be to replace google_compute_subnetwork.subnetwork.name with google_compute_subnetwork.subnetwork.self_link. This should produce an empty plan because then the values will be equal.

This happens because the Google API translates the name to self_link outside of terraform when an Instance is created and we suppress that diff here.

"subnetwork": {
  Type:             schema.TypeString,
  Optional:         true,
  Computed:         true,
  DiffSuppressFunc: tpgresource.CompareSelfLinkOrResourceName, // <----- here
  Description:      `The name or self_link of the subnetwork attached to this interface.`,
},

This suppression works when i'm trying to simulate this provider upgrade and produces an empty plan. My only guess for why it doesn't happen on your configuration is that somehow the state got corrupted when upgrading, not really sure tough.

I wonder if you can reproduce this in a sterile environment with a clean configuration. If that quick fix isn't enough for you then providing a clean config with steps to recreate the issue would be ideal 🙂

@peikk0
Copy link
Author

peikk0 commented Sep 4, 2024

Ok I managed to reproduce it with a simple example, it seems to happen only if you also set network_interface.network_ip = "".

terraform {
  required_version = "~> 1.9"

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.43.0"
    }
  }
}

variable "project_id" {
  type    = string
  default = "my-project"
}

variable "region" {
  type    = string
  default = "us-east1"
}

variable "zone" {
  type    = string
  default = "us-east1-b"
}

provider "google" {
  project = var.project_id
}

data "google_compute_subnetwork" "default" {
  name    = "default"
  project = var.project_id
  region  = var.region
}

resource "google_compute_instance" "default" {
  name         = "test"
  machine_type = "n2-standard-2"
  zone         = var.zone
  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-11"
    }
  }
  network_interface {
    subnetwork = data.google_compute_subnetwork.default.name
    network_ip = ""
  }
}
❯ terraform apply
data.google_compute_subnetwork.default: Reading...
data.google_compute_subnetwork.default: Read complete after 1s [id=projects/my-project/regions/us-east1/subnetworks/default]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # google_compute_instance.default will be created
  + resource "google_compute_instance" "default" {
      + can_ip_forward       = false
      + cpu_platform         = (known after apply)
      + current_status       = (known after apply)
      + deletion_protection  = false
      + effective_labels     = (known after apply)
      + guest_accelerator    = (known after apply)
      + id                   = (known after apply)
      + instance_id          = (known after apply)
      + label_fingerprint    = (known after apply)
      + machine_type         = "n2-standard-2"
      + metadata_fingerprint = (known after apply)
      + min_cpu_platform     = (known after apply)
      + name                 = "test"
      + project              = "my-project"
      + self_link            = (known after apply)
      + tags_fingerprint     = (known after apply)
      + terraform_labels     = (known after apply)
      + zone                 = "us-east1-b"

      + boot_disk {
          + auto_delete                = true
          + device_name                = (known after apply)
          + disk_encryption_key_sha256 = (known after apply)
          + kms_key_self_link          = (known after apply)
          + mode                       = "READ_WRITE"
          + source                     = (known after apply)

          + initialize_params {
              + image                  = "debian-cloud/debian-11"
              + labels                 = (known after apply)
              + provisioned_iops       = (known after apply)
              + provisioned_throughput = (known after apply)
              + size                   = (known after apply)
              + type                   = (known after apply)
            }
        }

      + confidential_instance_config (known after apply)

      + network_interface {
          + internal_ipv6_prefix_length = (known after apply)
          + ipv6_access_type            = (known after apply)
          + ipv6_address                = (known after apply)
          + name                        = (known after apply)
          + network                     = (known after apply)
          + network_ip                  = (known after apply)
          + stack_type                  = (known after apply)
          + subnetwork                  = "default"
          + subnetwork_project          = (known after apply)
        }

      + reservation_affinity (known after apply)

      + scheduling (known after apply)
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

google_compute_instance.default: Creating...
google_compute_instance.default: Still creating... [10s elapsed]
google_compute_instance.default: Still creating... [20s elapsed]
google_compute_instance.default: Creation complete after 22s [id=projects/my-project/zones/us-east1-b/instances/test]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
❯ terraform plan
data.google_compute_subnetwork.default: Reading...
data.google_compute_subnetwork.default: Read complete after 2s [id=projects/my-project/regions/us-east1/subnetworks/default]
google_compute_instance.default: Refreshing state... [id=projects/my-project/zones/us-east1-b/instances/test]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # google_compute_instance.default will be updated in-place
  ~ resource "google_compute_instance" "default" {
        id                   = "projects/my-project/zones/us-east1-b/instances/test"
        name                 = "test"
        tags                 = []
        # (22 unchanged attributes hidden)

      ~ network_interface {
            name                        = "nic0"
          ~ subnetwork                  = "https://www.googleapis.com/compute/v1/projects/my-project/regions/us-east1/subnetworks/default" -> "default"
            # (9 unchanged attributes hidden)
        }

        # (3 unchanged blocks hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Note: You didn't use the -out option to save this plan, so Terraform can't guarantee to take exactly these actions if you run "terraform apply" now.

No diff if I remove network_ip = "". And I've verified again that it doesn't happen with v5.42, only with v5.43 and also v6.0.

@peikk0
Copy link
Author

peikk0 commented Sep 4, 2024

So it's likely related to #19135

@peikk0
Copy link
Author

peikk0 commented Sep 4, 2024

It works fine with setting it to null (and I'm fixing our modules to do that), it's only happening when set to an empty string.

@karolgorc
Copy link

karolgorc commented Sep 4, 2024

Ok, i can reproduce the issue now. Will check out the conditions that make this happen.

Also tested out the quick fix I've provided earlier. Providing self_link does work if you just need to get rid of this diff for now
image

Also the network_ip parameter is optional. You can just remove it and it will get rid of the diff as well
image

@peikk0
Copy link
Author

peikk0 commented Sep 4, 2024

Also the network_ip parameter is optional. You can just remove it and it will get rid of the diff as well

Yes, we actually have a conditional value on it, which is why it was set to "" in those cases:

  network_interface {
    subnetwork         = var.subnetwork_name != "" ? var.subnetwork_name : one(google_compute_subnetwork.subnetwork[*].name)
    subnetwork_project = var.project
    network_ip         = var.static_private_ip ? google_compute_address.static-ip-address[count.index].address : ""
  }

@ggtisc ggtisc self-assigned this Sep 5, 2024
@ggtisc
Copy link
Collaborator

ggtisc commented Sep 5, 2024

resource "google_compute_instance" "default" {
  name         = "test"
  machine_type = "n2-standard-2"
  zone         = var.zone
  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-11"
    }
  }
  network_interface {
    subnetwork = data.google_compute_subnetwork.default.name
    network_ip = ""
  }
}

Confirmed permadiff issue with the shared configuration

@ggtisc ggtisc removed their assignment Sep 5, 2024
@ggtisc ggtisc removed the forward/review In review; remove label to forward label Sep 5, 2024
@karolgorc
Copy link

karolgorc commented Sep 6, 2024

EDIT: found the issue in our code.

not really sure how to explain it but it works after some changes 😄

  1. The forceNewIfNetworkIPNotUpdatableFunc is executed as a part of CustomizeDiff
  2. network_ip recognizes a change for some reason when set to "" and is set as a ForceNew field
  3. But the API doesn't recognize the change and there aren't any parameters returned from the API that could cause the ForceNew
  4. This overrides all of the DiffSuppress functions and the other fields cause diff

Fixes to apply until the PR is merged

  • use self_link instead of name
  • use null instead of ""

Will submit a PR with the fix in a few moments

Copy link

github-actions bot commented Oct 7, 2024

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 7, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.