
Data resource forces update in modules with count #25759

Closed
atimush opened this issue Aug 6, 2020 · 12 comments
Labels
bug waiting for reproduction unable to reproduce issue without further information

Comments

@atimush

atimush commented Aug 6, 2020

Terraform Version

Terraform v0.13.0-rc1

Terraform Configuration Files

The submodule (./modules/eks_helm_components):

data "aws_region" "current" {}

resource "helm_release" "cluster_autoscaler" {
  name       = "cluster-autoscaler"
  namespace  = "kube-system"
  repository = "https://kubernetes-charts.storage.googleapis.com"
  chart      = "cluster-autoscaler"
  version    = lookup({ for k, v in var.helm_components : "chart_version" => v["chart_version"] if k == "cluster_autoscaler" }, "chart_version")
  values = [
    yamlencode(lookup({ for k, v in var.helm_components : "values_default" => v["values_default"] if k == "cluster_autoscaler" }, "values_default")),
    yamlencode(lookup({ for k, v in var.helm_components : "values" => v["values"] if k == "cluster_autoscaler" }, "values")),
    ### The below values can not be overridden by user
    yamlencode({
      awsRegion = data.aws_region.current.name
      autoDiscovery = {
        clusterName = var.cluster_name
      }
    })]
}
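
As an aside, each of the three lookup(...)/for expressions above builds a single-key map only to look the same key up again. A hedged, roughly equivalent sketch (assuming var.helm_components always contains a cluster_autoscaler key carrying these fields; this is not the reporter's code):

```hcl
# Hypothetical simplification: assumes var.helm_components["cluster_autoscaler"]
# always exists and provides chart_version, values_default, and values.
locals {
  ca = var.helm_components["cluster_autoscaler"]
}

# version = local.ca["chart_version"]
# values  = [yamlencode(local.ca["values_default"]), yamlencode(local.ca["values"]), ...]
```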

The root module:

module "eks_helm_components_2" {
  source                      = "./modules/eks_helm_components"
  count                       = var.cluster_2 != null ? 1 : 0
  cluster_name                = var.cluster_1.name
  providers = {
    helm = helm.cluster_1
  }
  helm_components = { for k, v in local.helm_components_properties : k => merge(v, lookup(var.cluster_2.base_components, k, {})) }
}

Debug Output

module.test.module.eks_helm_components_2[0].data.aws_region.current will be read during apply
  (config refers to values not yet known)
 <= data "aws_region" "current"  {
      + current     = (known after apply)
      ~ description = "Europe (Ireland)" -> (known after apply)
      ~ endpoint    = "ec2.eu-west-1.amazonaws.com" -> (known after apply)
      ~ id          = "eu-west-1" -> (known after apply)
      ~ name        = "eu-west-1" -> (known after apply)
    }

module.test.module.eks_helm_components_2[0].helm_release.cluster_autoscaler will be updated in-place
  ~ resource "helm_release" "cluster_autoscaler" {
        atomic                     = false
        chart                      = "cluster-autoscaler"
        cleanup_on_fail            = false
        create_namespace           = false
        dependency_update          = false
        disable_crd_hooks          = false
        disable_openapi_validation = false
        disable_webhooks           = false
        force_update               = false
        id                         = "cluster-autoscaler"
        lint                       = false
        max_history                = 0
        metadata                   = [
            {
                chart     = "cluster-autoscaler"
                name      = "cluster-autoscaler"
                namespace = "kube-system"
                revision  = 1
                values    = jsonencode(
                    {
                        autoDiscovery  = {
                            clusterName = "test"
                        }
                        awsRegion      = "eu-west-1"
                        cloudProvider  = "aws"
                        image          = {
                            tag = "v1.17.1"
                        }
                        nameOverride   = "cluster-autoscaler"
                    }
                )
                version   = "7.3.4"
            },
        ]
        name                       = "cluster-autoscaler"
        namespace                  = "kube-system"
        recreate_pods              = false
        render_subchart_notes      = true
        replace                    = false
        repository                 = "https://kubernetes-charts.storage.googleapis.com"
        reset_values               = false
        reuse_values               = false
        skip_crds                  = false
        status                     = "deployed"
        timeout                    = 300
      ~ values                     = [
          - <<~EOT
                "cloudProvider": "aws"
                "image":
                  "tag": "v1.17.1"
                "nameOverride": "cluster-autoscaler"
            EOT,
          - jsonencode({}),
          - <<~EOT
                "autoDiscovery":
                  "clusterName": "test"
                "awsRegion": "eu-west-1"
            EOT,
        ] -> (known after apply)
        verify                     = false
        version                    = "7.3.4"
        wait                       = true
    }

Crash Output

Expected Behavior

Terraform should not force an update of the Helm resource on every plan unless the data resource actually changes.

Actual Behavior

Terraform forces an in-place update of the Helm resource on every plan.

Steps to Reproduce

  1. terraform init
  2. terraform apply
  3. terraform plan

Additional Context

The state file contains "current": null for the data source. We run Terraform in Terraform Cloud.

"mode": "data",
      "type": "aws_region",
      "name": "current",
      "provider": "provider[\"registry.terraform.io/hashicorp/aws\"]",
      "instances": [
        {
          "schema_version": 0,
          "attributes": {
            "current": null,
            "description": "Europe (Ireland)",
            "endpoint": "ec2.eu-west-1.amazonaws.com",
            "id": "eu-west-1",
            "name": "eu-west-1"
          }
        }
      ]

References

@atimush atimush added bug new new issue not yet triaged labels Aug 6, 2020
@danieldreier
Contributor

@jbardin can you help me understand this? Is this a matter of the aws_region data source being evaluated during apply rather than during plan?

@danieldreier danieldreier added the waiting-response An issue/pull request is waiting for a response from the community label Aug 13, 2020
@jbardin
Member

jbardin commented Aug 13, 2020

Yes, it appears that data.aws_region.current is being deferred to apply, but I'm not certain what is causing this, since the data source has no configuration arguments that could be unknown. The only other thing that should cause it to be deferred is a depends_on reference to something with changes, but the only other resource shown is a direct dependent, and there are no depends_on attributes in the sample config.

The "current": null field in the state is expected with aws provider versions <3.0.0, since that field has been deprecated but not yet removed.

If this is still happening with 0.13.0, I would need a trace log or reproduction to learn more.
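
For context, the depends_on deferral mentioned above can be sketched with a hypothetical minimal config (not from the report): a data source that depends on a resource with pending changes is read during apply rather than during plan.

```hcl
# Hypothetical sketch: the null_resource changes on every plan because
# timestamp() is unknown until apply, so the data source that depends_on
# it is deferred ("will be read during apply").
resource "null_resource" "gate" {
  triggers = {
    always = timestamp() # unknown at plan time, forcing a change each run
  }
}

data "aws_region" "current" {
  depends_on = [null_resource.gate] # deferral trigger
}
```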

@danieldreier
Contributor

Thanks @jbardin! @atimush, do you think you can help generate a simplified reproduction case that we can run directly? The code you've provided has dependencies I can't see, so we'd need something we can copy-paste or git clone. Ideally, this would have a similar structure but leave out helm and eks, because the effort of setting up an EKS + helm infrastructure is pretty significant.

@qwerty1979bg

@danieldreier

Note: this might turn out to be an unrelated issue

It seems that at least this specific AWS data source (aws_vpcs) forces an update in v0.13.0. It does not have to be in a module or use count.

provider "aws" {
  version = "3.2.0"
  region  = "us-east-1"
}

data "aws_vpcs" "this" {
}

resource "null_resource" "test" {
}

This code applies only once, as expected, on TF v0.12.29.

With TF v0.13.0 it tries to read the data source during apply.

  # data.aws_vpcs.this will be read during apply
  # (config refers to values not yet known)

(Note: The AWS provider version is the same in both cases - 3.2.0)

@remilapeyre
Contributor

Hi @qwerty1979bg, I think the issue you are having is reported at hashicorp/terraform-provider-aws#14579.

@ghost ghost removed waiting-response An issue/pull request is waiting for a response from the community labels Aug 16, 2020
@qwerty1979bg

@remilapeyre, thanks! This is an exact match! :)

@atimush,
Your code uses data "aws_region" "current" {}, which is in the list of affected resources in hashicorp/terraform-provider-aws#14579

@apparentlymart
Contributor

Thanks for the pointer, @remilapeyre!

I believe the issue over in the AWS provider affects the aws_regions (plural) data source, but not the aws_region singular one, because the singular one sets its id to the name of the single region it looks up.

My sense is that this problem arose because a plural data source doesn't really have a single id string to use, yet the Terraform SDK requires that every data source result include a non-empty id, so the developers of these data sources used the current time as a placeholder ID value.

However, because aws_region does converge on a stable result (it uses the current region name as the id), I think the behavior reported here is separate from hashicorp/terraform-provider-aws#14579 and warrants some independent investigation.

With that said, as @danieldreier said the next step here would be to find a simplified configuration that we can run to reproduce the problem. Hopefully there's a smaller reproduction case that includes only aws_region and potentially an additional placeholder resource like null_resource to exercise its result, which would then allow us to observe it without needing a full Kubernetes cluster.
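
One possible shape for such a reproduction, sketched here as an untested starting point (the module path and resource names are made up):

```hcl
# Untested sketch of a minimal reproduction: aws_region inside a counted
# module, with its result consumed by a null_resource.

# main.tf
module "repro" {
  source = "./modules/repro"
  count  = 1
}

# modules/repro/main.tf
data "aws_region" "current" {}

resource "null_resource" "use_region" {
  triggers = {
    region = data.aws_region.current.name
  }
}
```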

@apparentlymart apparentlymart added the waiting-response An issue/pull request is waiting for a response from the community label Aug 19, 2020
@jbardin
Member

jbardin commented Aug 20, 2020

To be clear, the non-deterministic ID issues described in terraform-provider-aws#14579 are only the symptom. There are two primary causes we have found: data sources that alter non-computed attributes, and the SDK altering zero block values. Since those two situations cover all the known cases so far, we should probably just wait and confirm the status of this issue with the upcoming aws provider and core releases.

@atimush
Author

atimush commented Aug 23, 2020

Hello guys,
Thank you for your efforts, and apologies for the late reply. I've been trying to put together a simple local configuration to simulate the issue, but all runs are perfectly fine :(

Still, the existing EKS modules report the same issue. I noticed that the problem is not limited to the aws_region data source but affects any data resources (even some of the AWS IAM policy documents). I would assume it could be something related to the providers block:
providers = { helm = helm.cluster_1 }

Providing below some of the traces that might help:

2020/08/23 19:37:07 [TRACE] vertex "module.g8s.module.eks_helm_components_1.data.aws_region.current": expanding dynamic subgraph
2020/08/23 19:37:07 [TRACE] Executing graph transform *terraform.ResourceCountTransformer
2020/08/23 19:37:07 [TRACE] ResourceCountTransformer: adding module.g8s.module.eks_helm_components_1[0].data.aws_region.current as *terraform.NodeRefreshableDataResourceInstance
2020/08/23 19:37:07 [TRACE] Completed graph transform *terraform.ResourceCountTransformer with new graph:
  module.g8s.module.eks_helm_components_1[0].data.aws_region.current - *terraform.NodeRefreshableDataResourceInstance  ------
2020/08/23 19:37:07 [TRACE] Executing graph transform *terraform.OrphanResourceInstanceCountTransformer
2020/08/23 19:37:07 [TRACE] Completed graph transform *terraform.OrphanResourceInstanceCountTransformer (no changes)
2020/08/23 19:37:07 [TRACE] Executing graph transform *terraform.AttachStateTransformer
2020/08/23 19:37:07 [TRACE] Completed graph transform *terraform.AttachStateTransformer (no changes)
2020/08/23 19:37:07 [TRACE] Executing graph transform *terraform.TargetsTransformer
2020/08/23 19:37:07 [TRACE] Completed graph transform *terraform.TargetsTransformer (no changes)
2020/08/23 19:37:07 [TRACE] Executing graph transform *terraform.ReferenceTransformer
2020/08/23 19:37:07 [DEBUG] ReferenceTransformer: "module.g8s.module.eks_helm_components_1[0].data.aws_region.current" references: []
2020/08/23 19:37:07 [TRACE] Completed graph transform *terraform.ReferenceTransformer (no changes)
2020/08/23 19:37:07 [TRACE] Executing graph transform *terraform.RootTransformer
2020/08/23 19:37:07 [TRACE] Completed graph transform *terraform.RootTransformer (no changes)
2020/08/23 19:37:07 [TRACE] vertex "module.g8s.module.eks_helm_components_1.data.aws_region.current": entering dynamic subgraph
2020/08/23 19:37:07 [TRACE] dag/walk: visiting "module.g8s.module.eks_helm_components_1[0].data.aws_region.current"
2020/08/23 19:37:07 [TRACE] vertex "module.g8s.module.eks_helm_components_1[0].data.aws_region.current": starting visit (*terraform.NodeRefreshableDataResourceInstance)
2020/08/23 19:37:07 [TRACE] vertex "module.g8s.module.eks_helm_components_1[0].data.aws_region.current": evaluating
2020/08/23 19:37:07 [TRACE] [walkRefresh] Entering eval tree: module.g8s.module.eks_helm_components_1[0].data.aws_region.current
2020/08/23 19:37:07 [TRACE] eval: *terraform.EvalSequence
2020/08/23 19:37:07 [TRACE] eval: *terraform.EvalGetProvider

Feel free to close the case if we cannot find the root cause of this behaviour. I will try to update in the upcoming hours if I am able to reproduce it in a simpler way.

Thank you,
Andrei

@ghost ghost removed waiting-response An issue/pull request is waiting for a response from the community labels Aug 23, 2020
@jbardin
Member

jbardin commented Sep 1, 2020

@atimush

You mention that this happens with a module count in conjunction with the data sources. Without a repro we can't tell exactly what's going on, but some general debugging may help. I suspect the data source being shown in the plan is simply a side-effect of what's going on, and not the reason. From the partial config here, I would suspect it's being caused by something in the helm_components inputs being unknown.

  1. First thing to check: is this still a problem with 0.13.1?
  2. Does the situation change when hard-coding the count value?
  3. Does hard-coding the awsRegion value change anything?

Reducing the problem to a single config change that triggers the diff could give us a clue as to what is going on.
The relevant section of the trace log would be from the plan creation, but the entire log would be more helpful if possible.

Thanks

@jbardin jbardin added waiting for reproduction unable to reproduce issue without further information and removed new new issue not yet triaged labels Sep 1, 2020
@jbardin
Member

jbardin commented Oct 13, 2020

There hasn't been any new activity here, so I'm going to close this out.
Similar issues should be rectified in 0.14, as the merging of refresh+plan has allowed for more data resource lifecycle improvements.

Thanks!

@jbardin jbardin closed this as completed Oct 13, 2020
@ghost

ghost commented Nov 13, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked as resolved and limited conversation to collaborators Nov 13, 2020