aws_msk_scram_secret_association overrides/removes existing secret associations #16791
I experienced this when I submitted my original PR; it happens because the API always returns all the associated secrets, regardless of how they were associated. With most Terraform providers, you would normally read the remote resource and store everything returned, but that makes a big assumption about how you manage resources, i.e. all through Terraform. Here is the code: https://github.com/hashicorp/terraform-provider-aws/pull/15302/files#diff-2836c697ab96fb1c60919893bfb0dfd0405fb9138d7d9640375ea3fc8686ae81R69. The way I got around this was to filter the secrets based on the inputs and the response from the API in my original PR: https://github.com/dblooman/terraform-provider-aws/blob/58e10f98553d811047dd6f33b1c1d01359814a46/aws/resource_aws_msk_scram_secret.go#L74. I think there are other examples of this behaviour in the provider, so there might be a helper to filter these resources; perhaps @anGie44 knows. |
Ahh yes, that param is tricky given we're keeping track of secrets associated via the resource vs. those associated outside Terraform or via another resource. I see your original rationale makes sense, @dblooman -- let me propose a fix to include the logic you had added (it's a bit unconventional, but we'll see if it's a maintainable alternative). Apologies that it got left out of the original feature! |
Do I understand correctly that you can't use a for_each expression with this resource, because only the last iteration of the loop will remain? |
Yes, and you also can't use this resource in multiple modules (e.g. maybe each of your microservices is a separate module and you want to handle MSK credentials independently, not centrally). |
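To make that concrete, here is a minimal sketch of the for_each pattern that runs into the problem (all names are placeholders, and MSK-specific requirements such as a customer-managed KMS key on the secrets are omitted). Each instance submits only its own secret, but because the resource reads back every secret associated with the cluster, each subsequent plan proposes stripping the secrets added by the other instances:
resource "aws_secretsmanager_secret" "user" {
  # MSK SCRAM secrets must use the "AmazonMSK_" name prefix
  for_each = toset(["AmazonMSK_service_a", "AmazonMSK_service_b"])
  name     = each.key
}
resource "aws_msk_scram_secret_association" "per_user" {
  # One association resource per secret; whichever instance applies last "wins"
  for_each        = aws_secretsmanager_secret.user
  cluster_arn     = aws_msk_cluster.example.arn
  secret_arn_list = [each.value.arn]
} |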
Are there any updates on this? |
Is there any update on this? I'm experiencing this issue and need the ability to assign secrets to MSK per microservice deployment. AWS Provider version: 3.55.0. We have a module that generates the secret and then associates it with the MSK cluster. Once that's done, we use the Mongey kafka provider to set the ACLs for a new user associated with the MSK cluster. Code from the module:
variables.tf
variable "secret_identifier" {
type = string
description = ""
}
variable "username" {
type = string
description = "Username to be set in the secret as the username to access MSK"
}
variable "auto_generate_password" {
type = bool
description = "Toggle whether the password to be used with the specified username should be autogenerated"
default = false
}
variable "password" {
type = string
description = "Password to be stored in the secret to be the password used to access MSK"
default = null
sensitive = true
}
variable "msk_cluster_arn" {
type = string
description = "ARN of the msk cluster where the user will be attached to"
}
variable "kms_key_id" {
type = string
description = "KMS Key ID used to encrypt the secret that holds the username and password. This is generally created in the msk module"
}
variable "kafka_acls" {
type = list(object({
resource_name_pattern = string
resource_pattern_type = string
resource_type = string
acl_operation = string
acl_permission_type = string
}))
description = "List of configuration to set access control list for the specified user under creation. The kafka provider needed for this should be configured to use admin credentials"
default = []
validation {
condition = alltrue([
for v in var.kafka_acls : contains(["Prefixed", "Any", "Match", "Literal"], v.resource_pattern_type)
])
error_message = "Resource Pattern Type can only be one of the following values: Prefixed, Any, Match, Literal."
}
validation {
condition = alltrue([
for v in var.kafka_acls : contains(["Unknown", "Any", "Topic", "Group", "Cluster", "TransactionalID"], v.resource_type)
])
error_message = "Resource Type can only be one of the following values: Unknown, Any, Topic, Group, Cluster, TransactionalID."
}
validation {
condition = alltrue([
for v in var.kafka_acls : contains(["Unknown", "Any", "All", "Read", "Write", "Create", "Delete", "Alter", "Describe", "ClusterAction", "DescribeConfigs", "AlterConfigs", "IdempotentWrite"], v.acl_operation)
])
error_message = "ACL operation can only be one of the following values: Unknown, Any, All, Read, Write, Create, Delete, Alter, Describe, ClusterAction, DescribeConfigs, AlterConfigs, IdempotentWrite."
}
validation {
condition = alltrue([
for v in var.kafka_acls : contains(["Unknown", "Any", "Allow", "Deny"], v.acl_permission_type)
])
error_message = "ACL Permission Type can only be one of the following values: Unknown, Any, Allow, Deny."
}
}
main.tf
locals {
password = var.auto_generate_password == true && var.password == null ? random_password.auto_generated_password[0].result : var.password
}
resource "aws_secretsmanager_secret" "this" {
name = "AmazonMSK_${var.secret_identifier}"
kms_key_id = var.kms_key_id
recovery_window_in_days = 0 //Force delete as we don't have permissions to restore secrets
}
resource "random_password" "auto_generated_password" {
count = var.auto_generate_password == true && var.password == null ? 1 : 0
length = 16
special = true
upper = true
lower = true
override_special = "!@#$_-"
}
resource "aws_secretsmanager_secret_version" "this" {
secret_id = aws_secretsmanager_secret.this.id
secret_string = jsonencode({ username = var.username, password = local.password })
}
resource "aws_secretsmanager_secret_policy" "this" {
secret_arn = aws_secretsmanager_secret.this.arn
policy = data.aws_iam_policy_document.secret_policy.json
}
resource "aws_msk_scram_secret_association" "this" {
depends_on = [
aws_secretsmanager_secret_version.this,
aws_secretsmanager_secret_policy.this
]
cluster_arn = var.msk_cluster_arn
secret_arn_list = [aws_secretsmanager_secret.this.arn]
}
resource "kafka_acl" "this" {
depends_on = [aws_msk_scram_secret_association.this]
for_each = { for acl in var.kafka_acls : "${acl.resource_name_pattern}-${var.username}-${acl.acl_operation}" => acl }
acl_principal = "User:${var.username}"
acl_host = "*"
acl_operation = each.value.acl_operation
acl_permission_type = each.value.acl_permission_type
resource_name = each.value.resource_name_pattern
resource_type = each.value.resource_type
resource_pattern_type_filter = each.value.resource_pattern_type
} |
Any updates on this? I have the same issue, but this is independent of whether I am using a |
Hey Terraform team, any update or timeline for this problem? |
Facing this issue too, my workaround is to look up all the secrets and add them:
|
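The snippet itself isn't shown above, but a minimal sketch of that kind of workaround, assuming every SCRAM secret for the cluster follows the mandatory AmazonMSK_ name prefix and using the aws_secretsmanager_secrets data source (resource names are placeholders), could look like:
data "aws_secretsmanager_secrets" "msk" {
  filter {
    name   = "name"
    values = ["AmazonMSK_"]
  }
}
resource "aws_msk_scram_secret_association" "all" {
  cluster_arn = aws_msk_cluster.example.arn
  # Send the union of everything already matching the naming convention plus the new
  # secret, so the full list pushed to the API doesn't drop other services' secrets
  secret_arn_list = distinct(concat(
    tolist(data.aws_secretsmanager_secrets.msk.arns),
    [aws_secretsmanager_secret.this.arn],
  ))
  depends_on = [aws_secretsmanager_secret_version.this]
}
This still races if two stacks apply at the same time, since whichever list reaches the API last wins. |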
We tried the MSK user association resource with the scenario below. @gdavison, any thoughts/pointers? |
Alright, I've been struggling trying to get this working with a SOA setup where our secrets are decentralized. Here's what I did that seems to be working for now. We've got a module that is responsible for generating user credentials as well as associating those secrets with the cluster and then generating ACLs for the newly generated user. A service imports this module into their project, specifies the topics they want access to and everything else is taken care of behind the scenes. Like everyone here, I was running into issues with the secrets association overwriting everything that currently existed. Tried a couple of the above hacks with middling success. |
Can you share some more insight into the solution, maybe with an example, so it solves most people's issues related to MSK user association using Terraform? |
Certainly.
Lambda code
|
Hi. Is there any update on this? This does effectively prevent the use of SCRAM associations in a non-centralized manner. Thanks |
We've opted for a shell-based solution, but it is really the same as @kiley-poole's Lambda idea. So we have some code to trigger the script:
resource "terraform_data" "secret_association" {
triggers_replace = [
aws_secretsmanager_secret.new_identity.id,
data.aws_msk_cluster.msk_cluster.arn
]
provisioner "local-exec" {
working_dir = "${path.module}/src"
interpreter = ["/bin/bash"]
command = "associate-secret.sh"
environment = {
MSK_CLUSTER_ARN = data.aws_msk_cluster.msk_cluster.arn
IDENTITY_SECRET_ARN = aws_secretsmanager_secret.new_identity.arn
}
}
}
locals {
# Try and ensure this variable is not set until terraform_data.secret_association has executed so outputs are not prematurely available
ensure_secret_associated = terraform_data.secret_association.output == "FORCE_DEPENDENCY_ON_SECRET_BEING_ASSOCIATED" ? true : true
}
output "identity_secret_arn" {
description = "ARN of Secret containing Identity credentials"
value = local.ensure_secret_associated ? aws_secretsmanager_secret.new_identity.arn : null
}
That last bit fixed an issue where Terraform eagerly returns module outputs before the association has actually been performed. The shell script itself looks like:
#!/bin/bash
: "${MSK_CLUSTER_ARN:?}"
: "${IDENTITY_SECRET_ARN:?}"
_fn_run_task() {
RESULT=$(aws kafka batch-associate-scram-secret --cluster-arn "${MSK_CLUSTER_ARN}" --secret-arn-list "${IDENTITY_SECRET_ARN}")
ERROR=$?
echo "${RESULT}"
if { [[ $ERROR -ne 0 ]] || [[ "${RESULT}" =~ "ErrorMessage" ]]; } && [[ ! "${RESULT}" =~ "secret is already associated" ]]; then
return 1
else
if [[ "${RESULT}" =~ "secret is already associated" ]]; then
echo -e "\033[0;36mIgnoring error about already associated Secret\033[0m"
fi
return 0
fi
}
# Sleep upfront to hopefully reduce need for retry at all and have a less confusing log
sleep 10
MAX_ATTEMPTS=2
RETRY_COUNT=0
while ! _fn_run_task; do
RETRY_COUNT=$((RETRY_COUNT + 1))
if [[ "${RETRY_COUNT}" -lt "${MAX_ATTEMPTS}" ]]; then
echo -e "\033[0;36mRetrying Secret association (attempt $((RETRY_COUNT + 1)) of ${MAX_ATTEMPTS})...\033[0m"
sleep 5
else
break
fi
done
if [[ ${RETRY_COUNT} -ge "${MAX_ATTEMPTS}" ]]; then
echo -e "\033[1;31mSecret association failed after ${RETRY_COUNT} attempts\033[0m"
exit 1
fi
I added the retry handling because sometimes there was a slight delay before the Secret was actually available for association; in another situation we create a Secret immediately after creating the Cluster itself, and even though Terraform returns once that operation completes, there is a definite delay before the cluster is usable. This does assume you can run the AWS CLI directly, so it needs to be in your Docker image, for example. |
Hello, what about introducing a new resource that associates a single secret, along these lines:
resource "aws_msk_single_scram_secret_association" "some-secret-association" {
cluster_arn = data.aws_msk_cluster.some-cluster.arn
secret_arn = aws_secretsmanager_secret.some-secret.arn
}
Changing the existing resource would introduce a breaking change, unless we make the secret_arn_list field optional and introduce a second secret_arn field; one of the two fields would need to be set for validation to succeed. If we go with the new resource, feel free to suggest a suitable name. What would be the best approach going forward? |
Thank you @jandillenkofer. |
Thanks @jandillenkofer! while having the |
What would be the difference between having |
The difference is in what happens on removal: when removing the single-secret resource, only that one association is dropped, without touching anything else attached to the cluster. I created this fork with an example implementation of the new resource. Changing the existing resource is complicated as well, because we would need to migrate the old state of the existing resource to a new state with multiple entries, each holding just one secret_arn. And it would definitely be a breaking change. |
In our use case, we would also favour |
From @jandillenkofer's initial explanation I understood that it would be possible to keep only one resource, but still maintain each SCRAM association separately. Wouldn't it be possible to keep only the current resource and have it manage only the secrets that are actually passed to it? This is what I have in mind:
# Associating just a single secret:
resource "aws_msk_scram_secret_association" "single" {
cluster_arn = aws_msk_cluster.example.arn
secret_arn_list = [aws_secretsmanager_secret.single.arn]
depends_on = [aws_secretsmanager_secret_version.single]
}
# Associating a bunch of secrets:
resource "aws_msk_scram_secret_association" "bulk" {
cluster_arn = aws_msk_cluster.example.arn
secret_arn_list = [
aws_secretsmanager_secret.bulk[0].arn,
aws_secretsmanager_secret.bulk[1].arn,
aws_secretsmanager_secret.bulk[2].arn
]
depends_on = [aws_secretsmanager_secret_version.bulk]
}
After applying this, there would be a total of 4 associations made. PS: Not sure if it's important, but no other resource name in the AWS provider has |
@ppihus, to complement my previous explanation: I'm sure it would still be useful to keep the current behaviour for the use cases where you need to control the full set (to remove everything, overwrite it, or make sure the content is exactly that list). I think that to have both behaviours (full and independent) we would need either separate resources or, at the least, a flag/field that modifies the current resource (e.g. mode = incremental/full, or another suitable name). |
I understand, but this violates the key principle of idempotency and creates a real risk of configuration drift because of the "last writer wins" pattern. Are there any other resources in the AWS provider that allow this kind of behaviour? |
So @ppihus, I've been thinking about this. What about keeping the resource name the same but adding a new attribute that is just a single secret_arn? So like this:
|
Can you elaborate on why this would be better than just having one item in the secret_arn_list? Also, a question about a theoretical scenario. What if you initially have:
resource "aws_msk_scram_secret_association" "example" {
cluster_arn = aws_msk_cluster.example.arn
secret_arn = aws_secretsmanager_secret.single.arn
depends_on = [aws_secretsmanager_secret_version.single]
}
but then change this resource to:
resource "aws_msk_scram_secret_association" "example" {
cluster_arn = aws_msk_cluster.example.arn
secret_arn_list = [aws_secretsmanager_secret.single.arn]
depends_on = [aws_secretsmanager_secret_version.single]
}
Would Terraform recreate this association even though the end result is exactly the same? |
@ppihus I see your point there. I honestly think the problem is with how it's stored in the state right now, because if you only link in one item, the state ends up holding everything else that is already associated, thus breaking everything on the next run. Maybe the real fix here is making sure that the state only cares about what was passed in and not everything else that comes back from the API. |
I just found the following comment from @bflad in a PR from 2020.
Would this be a fitting solution to our problem? |
Hi. Has there been any development on this issue, or a workaround? Thanks |
Warning This issue has been closed, meaning that any additional comments are hard for our team to see. Please assume that the maintainers will not see them. Ongoing conversations amongst community members are welcome, however, the issue will be locked after 30 days. Moving conversations to another venue, such as the AWS Provider forum, is recommended. If you have additional concerns, please open a new issue, referencing this one where needed. |
This functionality has been released in v5.81.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading. For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you! |
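For anyone arriving here later, a minimal usage sketch, assuming the released fix follows the single-secret resource shape proposed earlier in this thread (check the provider documentation for the exact resource name and attributes; the names below are placeholders):
resource "aws_msk_single_scram_secret_association" "example" {
  # Manages exactly one cluster/secret pair
  cluster_arn = aws_msk_cluster.example.arn
  secret_arn  = aws_secretsmanager_secret.example.arn
}
Each instance manages a single association, so independent modules or stacks can associate their own secrets without overwriting each other's. |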
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. |
Terraform CLI and Terraform AWS Provider Version
Affected Resource(s)
aws_msk_scram_secret_association
Terraform Configuration Files
Please include all Terraform configurations required to reproduce the bug. Bug reports without a functional reproduction may be closed without investigation.
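The reporter's configuration is not reproduced above; a minimal illustrative setup of the kind this issue describes (all names are placeholders) is two independently managed association resources, for example in separate stacks or modules, each submitting only its own secret for the same cluster:
# Stack/module A
resource "aws_msk_scram_secret_association" "service_a" {
  cluster_arn     = data.aws_msk_cluster.shared.arn
  secret_arn_list = [aws_secretsmanager_secret.service_a.arn]
}
# Stack/module B
resource "aws_msk_scram_secret_association" "service_b" {
  cluster_arn     = data.aws_msk_cluster.shared.arn
  secret_arn_list = [aws_secretsmanager_secret.service_b.arn]
}
After both are applied, a subsequent plan of either stack proposes removing the other stack's secret, which is the behaviour described below.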
Expected Behavior
Existing secret associations that are not defined in this Terraform stack (for example, ones defined in another stack) should not be removed.
This is needed to avoid having to centrally manage all the secrets associated with an MSK cluster.
Actual Behavior
Existing secrets are removed from the MSK Cluster in the plan:
Looks like the provider is trying to remove all the secrets that are not defined in this aws_msk_scram_secret_association.
Steps to Reproduce
terraform apply
terraform plan
References
Pull request: