[Bug]: data sources refresh every time which is causing resource to be replaced unnecessarily #29421
We are encountering a similar issue. We have two modules that are used together to create a Lambda function. I've simplified the examples to keep this concise.

```hcl
# main.tf
module "iam_role" {
  # ...
}

module "lambda_function" {
  iam_role = module.iam_role.arn
  # ...
}
```

Inside the `lambda_function` module, we look up the AWS region and join it with a few user values to create the function name.

```hcl
# lambda_function module
data "aws_region" "current" {}

resource "random_id" "main" {
  byte_length = 2
}

resource "aws_lambda_function" "main" {
  function_name = join("-", [var.input_1, data.aws_region.current.name, var.input_2, random_id.main.hex])
}
```

The problem we're encountering: when an inconsequential input to the `iam_role` module is changed, it triggers the data resource inside the lambda module to re-read, and then recreates the function (with the same name once the apply completes). We also use a random string in the name, and that resource does NOT get changed. For instance, if we simply add a new tag to the `iam_role`, the data resource is marked as needing to be read during apply due to a dependency.
|
Same issue with |
Same issue with Terraform version: |
Hey @avnerv 👋 Thank you for taking the time to raise this (and to everyone else for the ongoing discussion). The behavior you're experiencing here is described in the Terraform Data Sources documentation under "Data Resource Behavior". If a data source depends on another managed resource (including modules) that has changes pending, the data source cannot be read until apply time. When data from the data source is then used to set the value of an argument on another resource, that value will in turn not be known until apply time. For certain arguments on certain resources, this is treated as a reason to replace the resource; for example, changing the |
I am experiencing the same behaviour. We created a Terraform module that uses data sources. We recently updated the module and created a new version release; when the input to the module is changed, the data sources are read again, resulting in the behaviour @justinretzolk described above. While I understand this is Terraform behaviour, combined with the fact that certain attributes of AWS resources are immutable (which leads to resources being replaced), this behaviour remains undesirable and renders the point of using data sources moot. Is there a known solution or workaround for the data source problem, or is there a way to force data sources to be "pre-applied" before the resources get applied? |
I had to remove almost all of the data providers in the project because of this. My understanding is that Hashicorp wants things to be more deterministic but even in cases where the data provider is always the same like with availability zones, the data provider causes the resources to be re-created. What exactly is Hashicorp trying to achieve here? This is too big of an issue for this to be just a bug. |
Hi all,

Looking at the original example, I see that the configuration uses a data source to read back the same subnets that are already being managed by `aws_subnet` resources elsewhere in the same configuration. I would suggest solving this by making use of the data Terraform already has in memory as a result of planning and applying the `aws_subnet` resources. For example (assuming both of those are `aws_subnet` resources using `count`, so that each reference produces a list):

```hcl
locals {
  subnets_by_name = {
    for subnet in concat(aws_subnet.public, aws_subnet.infra) :
    subnet.tags_all["Name"] => subnet
  }
}

resource "aws_route_table_association" "this" {
  for_each = { for key in var.route_table_routes : key.name => key }

  subnet_id      = local.subnets_by_name[each.value.subnet_name].id
  route_table_id = aws_route_table.this[each.key].id
}
```

Any situation where you use a data source to read back an object that the same configuration is already managing is prone to this problem. For any situation with these symptoms that isn't caused by trying to re-read the same object another part of the configuration is already managing, the usual answer will be to specify your dependencies more precisely. In particular, there's no reason for an empty |
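The advice above about specifying dependencies more precisely can be sketched as follows. This is an illustration only; the module and resource names are hypothetical, not taken from the original report. A module-level `depends_on` forces every data source inside the module to wait for the dependency, whereas passing the specific value the module needs keeps the rest of the module readable at plan time:

```hcl
# Coarse: this defers ALL data sources inside the module whenever
# aws_iam_role.lambda has pending changes, even trivial ones like a tag.
module "lambda_function" {
  source     = "./lambda_function"
  depends_on = [aws_iam_role.lambda]
}

# Precise: only the iam_role argument depends on the role. Data sources
# inside the module that don't use it can still be read at plan time.
module "lambda_function_precise" {
  source   = "./lambda_function"
  iam_role = aws_iam_role.lambda.arn
}
```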
Like I said, data providers are nearly useless now. So many rules and expectations of developers just to make them behave is a really poor developer experience. There is no reason a data provider could not run at plan time and determine what changes, if any, would be made to the state. Literally no one on my team would be able to understand these rules and still get value out of data providers. It's too obscure a design requirement. |
At the moment we are working around this issue with a `lifecycle` block on resources that consume this problematic AWS data source:

```hcl
lifecycle {
  ignore_changes = [
    availability_zone,
  ]
}
```
|
I'm having a similar issue with modules that have explicit dependencies. If I set a module to depend on something, this triggers as well. In those cases, does `aws_region` inherit the dependency and trigger even though it's effectively a constant within the same region? I've had to work around this by injecting the region as a constant into modules instead of being able to query it, but it's a real pain. For example:

```hcl
resource "some_resource_type" "some_resource_name" {
  # ...
}

module "that_contains_a_region_data_source" {
  depends_on = [some_resource_type.some_resource_name]
  # ...
}
```

^ This ends with a persistent diff on anything that relies on the region (I'm assuming because of this issue). Ideally I'd be able to use `aws_region` in this case without having to pass the region through. |
I'm facing the same issue with the following data types:
|
@apparentlymart I'm not sure whether what you wrote here applies to my case. I created a submodule which should create backup vaults. I'm passing my provider and want to determine the KMS key to use.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
      configuration_aliases = [
        aws.backup,
      ]
    }
  }
}

data "aws_region" "this" {
  provider = aws.backup
}

resource "aws_backup_vault" "backup" {
  provider    = aws.backup
  name        = var.source_account.id
  kms_key_arn = data.aws_region.this.name == "us-east-1" ? var.aws_kms_key.arn_home_region : var.aws_kms_key.arn_backup_region
}
```

I get the same "will be read during apply" message for the data source. This results in the key not being known to Terraform, and therefore Terraform wants to replace (destroy and recreate) the backup vault. |
@woodcockjosh wrote:
Same. In a new module I had to remove all of them except for two. |
@apparentlymart Just checking in on this; I'm curious about the folks reporting this with `aws_region`, which shouldn't depend on any resources. |
Terraform v1.7.4

Similar issue, where a re-create happens with no change to the data retrieved from a data source:

results in:

This is just one example. Another case is where I use count based on a data resource:

Results in:

The number of security groups is the same; they are static in this case. Why can't they be retrieved and counted during the plan phase? I've written a small Python script that does the plan, finds the resources that need to be read, and outputs the apply command targeting those particular resources.

This fixes the issue for my particular case: run the script above, run the resulting targeted apply, and the next run is a normal apply -> all green, no changes. It is ugly, but it works :/ Edit: Some good news: at least during apply it won't re-create resources that are matching:
|
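The script itself wasn't included in the comment above, so here is a minimal sketch of the approach it describes, written in Python. It assumes the machine-readable plan produced by `terraform show -json tfplan`; the field names follow Terraform's documented JSON plan format, and the sample addresses are made up for illustration:

```python
def deferred_data_sources(plan: dict) -> list[str]:
    """Return addresses of data sources that will only be read during apply.

    In Terraform's JSON plan format, a data source whose read is deferred
    appears in "resource_changes" with mode "data" and a "read" action.
    """
    return [
        change["address"]
        for change in plan.get("resource_changes", [])
        if change.get("mode") == "data"
        and "read" in change.get("change", {}).get("actions", [])
    ]


def apply_command(targets: list[str]) -> str:
    """Build a terraform apply command targeting only the given addresses."""
    return "terraform apply " + " ".join(f"-target='{t}'" for t in targets)


# Demo on a hand-written plan fragment (shape follows the JSON plan
# format; the addresses are hypothetical):
sample_plan = {
    "resource_changes": [
        {
            "address": "module.cw.data.aws_region.current",
            "mode": "data",
            "change": {"actions": ["read"]},
        },
        {
            "address": "aws_cloudwatch_log_group.this",
            "mode": "managed",
            "change": {"actions": ["create"]},
        },
    ]
}
print(apply_command(deferred_data_sources(sample_plan)))
# → terraform apply -target='module.cw.data.aws_region.current'
```

Running this against real output from `terraform plan -out=tfplan && terraform show -json tfplan`, then executing the printed command, reproduces the "targeted pre-apply" workaround described above.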
All of the resources in a child module must inherit whatever dependencies the module itself has, because Terraform must resolve the module call before it can resolve anything inside the module. Modules acquire their own dependencies through use of |
Anyone have a solution for this? We are also having the same issue. |
When did this change, exactly? We have a module that is in use across multiple applications (though with varying versions of the AWS provider and Terraform, to be sure), and this problem only just hit us on our latest application, which is generally using more current versions of each. Basically, the module accepts a list of subnets and, in the process of creating a security group, fetches the subnet data from AWS to get the associated `vpc_id`.

On each plan now, Terraform wants to destroy and recreate this security group because it thinks the `vpc_id` might change (when it doesn't). If I move this data resource to the parent module and pass in the VPC directly as a variable, Terraform is happy not to recreate the underlying resource, because it doesn't actually change. The workaround is to update the module to accept a `vpc_id` directly, but it still seems odd that this worked before and has apparently stopped. It was already mentioned, but I'm also seeing this. Versions: hashicorp/aws v4.67.0, terraform v1.7.2. ETA: Checked another project with a very similar setup. It uses an older version of the same module, though on comparison nothing substantially different stands out. We just plan/applied that project and the data resource for the subnet did not trigger a diff. Versions: hashicorp/aws v3.76.0, terraform v1.3.7. |
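The workaround described above (accepting `vpc_id` directly) can be sketched like this. The module layout and names are illustrative assumptions, not taken from the original post:

```hcl
# Root module: the data source lives here, where it has no pending
# dependencies, so it can be read at plan time.
data "aws_subnet" "selected" {
  id = var.subnet_id
}

module "security_group" {
  source = "./security_group"

  # Pass the resolved value in, instead of looking it up inside the
  # module, where a module-level dependency would defer the read.
  vpc_id = data.aws_subnet.selected.vpc_id
}
```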
In our case, we create the broker first, then use a `dns_a_record_set` data source to get the IP, which is passed to the NLB. Terraform assumes that any update to the broker will change the IP address, but in our case it doesn't. |
Also running into the same issue for various resources, some of which can be updated in place, and others which need to be recreated entirely from scratch. After putting some thought into this issue, I'm pretty sure this isn't AWS's fault; rather, this is a HashiCorp/Terraform issue. As others have mentioned in this thread, it doesn't make any logical sense for a data block *not* to be queried while developing the execution plan (just like how other state is pulled to determine whether resources need to be created in the first place). |
I see this "known after apply" for even a trivial case with a child resource:

A terraform plan with no changes in the code results in the following:

How can this simple policy not be known at plan time? Many other resources don't have this problem, but `aws_iam_policy` and Lambda resources seem particularly problematic with regard to unnecessary terraform plan noise. |
I think @justinretzolk already answered the reason; I am putting an example here. Here is a simple directory structure. Assume there are two modules, `alarm` and `cw`.

cw/main.tf:

```hcl
variable "name_prefix" {
  type = string
}

data "aws_region" "current" {}

resource "aws_cloudwatch_log_group" "this" {
  name = "${var.name_prefix}/application/sample-${data.aws_region.current.name}" # depends on data resource
}
```

cw/versions.tf:

```hcl
terraform {
  required_version = ">= 1.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0"
    }
  }
}
```

terraform.tf:

```hcl
terraform {
  required_version = "~> 1.8.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.47"
    }
  }
  backend "s3" {
    region         = "eu-north-1"
    bucket         = "terraform-state-blah-eu-north-1"
    key            = "mock-test.tfstate"
    profile        = "aws-dev"
    dynamodb_table = "terraform-state-lock"
  }
}

provider "aws" {
  region  = "eu-north-1"
  profile = "aws-dev"
}

module "alarm" {
  source = "./alarm"
}

# NOTE: I am calling the same module twice, once without depends_on and once with depends_on
module "cw_without_depends_on" {
  source      = "./cw"
  name_prefix = "without-depends-on"
}

module "cw_with_depends_on" {
  source      = "./cw"
  name_prefix = "with-depends-on"
  depends_on  = [module.alarm] # <------- See this
}
```

For this, the following plan is generated:

```
  # module.cw_with_depends_on.data.aws_region.current will be read during apply
  # (depends on a resource or a module with changes pending)
 <= data "aws_region" "current" {
      + description = (known after apply)
      + endpoint    = (known after apply)
      + id          = (known after apply)
      + name        = (known after apply) # <--- due to depends_on
    }

  # module.cw_with_depends_on.aws_cloudwatch_log_group.this will be created
  + resource "aws_cloudwatch_log_group" "this" {
      + arn               = (known after apply)
      + id                = (known after apply)
      + log_group_class   = (known after apply)
      + name              = (known after apply) # <--- name NOT resolved, because aws_region resolution is deferred
      + name_prefix       = (known after apply)
      + retention_in_days = 0
      + skip_destroy      = false
      + tags_all          = (known after apply)
    }

  # module.cw_without_depends_on.aws_cloudwatch_log_group.this will be created
  + resource "aws_cloudwatch_log_group" "this" {
      + arn               = (known after apply)
      + id                = (known after apply)
      + log_group_class   = (known after apply)
      + name              = "without-depends-on/application/sample-eu-north-1" # <--- name resolved
      + name_prefix       = (known after apply)
      + retention_in_days = 0
      + skip_destroy      = false
      + tags_all          = (known after apply)
    }
```

Hope this helps. |
I was able to replicate the issue as well. This is a huge issue. I have a workaround at hand for my case, but a single resource requiring a refresh should not cause ALL data sources in the module to refresh. This is a bug. |
Same issue with `data "aws_ec2_transit_gateway_attachment"`. Terraform version: v1.5.4 |
I see this issue with changes in `default_tags`, which can be annoying if you have a lot of policy documents in a single directory.

```
  # data.aws_iam_policy_document.default will be read during apply
  # (depends on a resource or a module with changes pending)
 <= data "aws_iam_policy_document" "default" {
      + id            = (known after apply)
      + json          = (known after apply)
      + minified_json = (known after apply)

      + statement {
          + actions   = [
              + "events:PutEvents",
            ]
          + effect    = "Allow"
          + resources = [
              + "arn:aws:events:us-east-1:snip:event-bus/guardduty",
            ]
          + sid       = "snip"

          + principals {
              + identifiers = [
                  + "snip",
                ]
              + type        = "AWS"
            }
        }
    }

  # aws_cloudwatch_event_bus_policy.default will be updated in-place
  ~ resource "aws_cloudwatch_event_bus_policy" "default" {
        id     = "guardduty"
      ~ policy = jsonencode(
            {
              - Statement = [
                  - {
                      - Action    = "events:PutEvents"
                      - Effect    = "Allow"
                      - Principal = {
                          - AWS = "arn:aws:iam::snip:root"
                        }
                      - Resource  = "arn:aws:events:us-east-1:snip:event-bus/guardduty"
                      - Sid       = "snip"
                    },
                ]
              - Version   = "2012-10-17"
            }
        ) -> (known after apply)
        # (1 unchanged attribute hidden)
    }
```
|
Hey all 👋 As Martin and I outlined above, the requirement to delay reading a data source until apply time is dictated by Terraform Core itself, and is something the AWS provider has no control over. For what it's worth, in my six years at HashiCorp (all supporting Terraform in some way), my recollection is that this behavior has remained consistent. Regardless, since this is Core functionality that applies to all providers, discussions about whether the behavior should change don't really fit this repository. I feel that leaving this issue open any longer would suggest that the AWS provider team needs to investigate this further, or that this might be a bug. Since neither of those things is true, I'm going to close this issue. As always, if you run into unexpected behavior in the future, please do let us know. |
Warning This issue has been closed, meaning that any additional comments are hard for our team to see. Please assume that the maintainers will not see them. Ongoing conversations amongst community members are welcome, however, the issue will be locked after 30 days. Moving conversations to another venue, such as the AWS Provider forum, is recommended. If you have additional concerns, please open a new issue, referencing this one where needed. |
@justinretzolk Thanks for the comment and the additional information around the terraform-core issue. Is there an upstream issue in hashicorp/terraform or discussion that can be tracked to follow up on this? |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. |
Terraform Core Version
1.3.6

AWS Provider Version
3.76.1

Affected Resource(s)

Expected Behavior
In this scenario, when I create a new subnet (such as `aws_subnet.infra`, which is referred to as `management`), I anticipate that Terraform will only create the resources that are related to it, such as:

Actual Behavior
What occurs is that Terraform replaces existing `aws_route_table_association` resources, even if they haven't been modified, when a new subnet is created.

Relevant Error/Panic Output Snippet
No response

Terraform Configuration Files