Use kubernetes provider to manage aws auth #355
Conversation
Interesting! Previously this was avoided because there was no depends_on for providers. How does this work now? How does it work with 5 separate clusters defined?
I actually never tested that. We only ever have one cluster per Terraform state. I will try to test this somewhere this week. I guess you should be able to do this using provider aliases (https://www.terraform.io/docs/configuration/providers.html#alias-multiple-provider-instances), but then I should add the possibility to pass one through a variable.
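A rough sketch of that alias idea (illustrative only, not part of this PR; per-cluster provider arguments are omitted):

```hcl
# Hypothetical sketch: one aliased kubernetes provider per cluster, passed into
# each module instance via the providers map.
provider "kubernetes" {
  alias = "cluster_a"
  # host, cluster_ca_certificate, token for cluster A
}

provider "kubernetes" {
  alias = "cluster_b"
  # host, cluster_ca_certificate, token for cluster B
}

module "eks_a" {
  source = "terraform-aws-modules/eks/aws"
  # ... inputs for cluster A ...

  providers = {
    "kubernetes" = "kubernetes.cluster_a"
  }
}
```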
I have done some testing and it is very easy to do with more than one cluster. For an example see this comment: I also always thought it was not possible to have depends_on for a provider. The depends_on field is still not supported, but you can pass variables into the provider that depend on other resources. I got the example from here: https://www.terraform.io/docs/providers/aws/d/eks_cluster_auth.html. I added an example in the eks test fixtures. Is this enough or should we document this somewhere?
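For reference, the pattern from the linked docs looks roughly like this (a sketch with illustrative names; the fixture in the PR may differ slightly):

```hcl
data "aws_eks_cluster" "cluster" {
  name = module.eks.cluster_id
}

data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_id
}

# The provider's credentials come from data sources that reference the module,
# so Terraform only resolves them once the cluster exists; no explicit
# depends_on on the provider is needed.
provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  load_config_file       = false
}
```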
@max-rocket-internet That's correct. We've worked around this by implementing the aws-auth part in a separate module; the module then ensures the dependency (in this case the Kubernetes/EKS cluster needing to be created/available) is met by taking, for example, the cluster ID as a variable.
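A minimal sketch of that workaround (hypothetical module path and inputs):

```hcl
# Hypothetical: aws-auth management lives in its own module. Passing outputs of
# the EKS module as inputs creates the implicit dependency, so the configmap is
# only applied once the cluster has been created.
module "aws_auth" {
  source = "./modules/aws-auth" # hypothetical path

  cluster_id       = module.eks.cluster_id
  cluster_endpoint = module.eks.cluster_endpoint
}
```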
So we would have to move the aws-auth part out of this module for it to work correctly then?
You don't have to move the aws-auth part outside of this module. Terraform is smart enough to notice that the provider depends on a data object, which depends on a resource, and it waits nicely until the resource is created. I have tested this branch with a setup I am using at a client.
No, you can use a nested module for that :)
AFAIK the issue with a provider depending on resource creation that's done by another provider hasn't been fixed. See the previously mentioned hashicorp/terraform#2430
@simonvanderveldt A terraform plan works. You can easily try it by checking out my branch and doing a terraform plan of eks_test_fixture in the examples folder. Just to be sure I double checked and ran terraform plan locally and it works.
Sure but you have to apply it, from scratch. Just a plan is not enough 😄
OK!
Exactly, that's my point in my first comment.
If that works, then cool 👍 but it should be part of this PR then?
Can you provide instructions on how to do this? Does it really matter anyway as it would just be applying the same configmap data again? i.e. nothing is lost.
@max-rocket-internet No worries, I tested a terraform apply from scratch as well. I just hadn't tested only the terraform plan, so I double-checked it. But as I said earlier, I have tested the branch with a brand new terraform apply, not of the example but of another setup that uses this module. If you want we could have a Google Hangouts chat about this PR to make communication easier?
To be 100% sure I also just tested the
@@ -28,11 +28,6 @@ output "cluster_security_group_id" {
  value = "${local.cluster_security_group_id}"
}

output "config_map_aws_auth" {
Can we just keep this as an output? I'm sure someone will still want it. e.g. so they can have the file in their version control or something.
I added the configmap as an output, but it is a Terraform resource now, which is not the same as the rendered YAML, and I can't think of an easy way to achieve that.
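A sketch of what the resource-based output could look like (illustrative; it exposes the resource's attributes rather than the rendered YAML the old output returned):

```hcl
output "config_map_aws_auth" {
  # The data of the managed kubernetes_config_map resource (created with count),
  # not a rendered YAML document as before.
  value = kubernetes_config_map.aws_auth.*.data
}
```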
Sorry for the delay! OK let's get this merged but I'm still a little confused here. Do we have any issues at all with this PR? Here's my summary, correct me if I'm wrong:
Apart from the rebase, what do we need to do to get this merged? What else needs to be tested?
Let me check that applying the configmap from Terraform when it already exists is not a problem; that way no migration is needed. And I will look into making the output work as before, but I am not sure if that is possible. For the rest, I have tested this with multiple clusters, the examples folder and another setup, so I think it has been tested enough :) I don't have time right now but I should be able to do this somewhere near the end of the week.
Just tested applying when the aws-auth configmap already exists.
So we have to provide a migration path.
Fun story: providers initialized with data sources don't work with terraform import. That may make the migration path interesting too. https://www.terraform.io/docs/commands/import.html#provider-configuration
hey, I started to do something similar on my own branch before noticing this PR. With Terraform 0.12 you can use for loops to avoid template generation, not sure if that would be of interest. I prefer avoiding templates where possible. Here's what I had put together if you are interested:

locals {
  aws_auth_role_arns = [
    for role in compact(concat(
      coalescelist(
        aws_iam_instance_profile.workers_launch_template_mixed.*.role,
        data.aws_iam_instance_profile.custom_worker_group_launch_template_mixed_iam_instance_profile.*.role_name,
        [""]
      ),
      coalescelist(
        aws_iam_instance_profile.workers_launch_template.*.role,
        data.aws_iam_instance_profile.custom_worker_group_launch_template_iam_instance_profile.*.role_name,
        [""]
      ),
      coalescelist(
        aws_iam_instance_profile.workers.*.role,
        data.aws_iam_instance_profile.custom_worker_group_iam_instance_profile.*.role_name,
        [""]
      )
    )) : "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/${role}"
  ]
}

resource "kubernetes_config_map" "aws_auth" {
  count = var.manage_aws_auth ? 1 : 0

  metadata {
    name      = "aws-auth"
    namespace = "kube-system"
  }

  data = {
    mapRoles = jsonencode(concat([
      for role_arn in local.aws_auth_role_arns : {
        rolearn  = role_arn
        username = "system:node:{{EC2PrivateDNSName}}"
        groups   = ["system:bootstrappers", "system:nodes"]
      }
    ], [
      for v in var.map_roles : {
        rolearn  = v.role_arn
        username = v.username
        groups   = [v.group]
      }
    ]))

    mapUsers = jsonencode([
      for v in var.map_users : {
        userarn  = v.user_arn
        username = v.username
        groups   = [v.group]
      }
    ])

    mapAccounts = jsonencode(var.map_accounts)
  }

  depends_on = [
    aws_eks_cluster.this
  ]
}
@tristanpemble I really like that approach and I'd really like to see something like that merged. Maybe creating a new PR with that would help? Right now we are using Atlantis to manage cross-account EKS clusters and we're forced to set I can help with testing next week.
Is this one still a thing? I would have loved to see it come in the v6 release.
Hi guys, I used this on a post-Kubernetes stage: I deployed Helm Tiller to prepare EKS for Helm app deployments. Do we have to do some additional testing, or what is the status? Thanks
@sppwf The problem here is described by #355 (comment) and #355 (comment). So we need to provide a migration path for users who are already managing the aws-auth configmap.
Recently I ran into an issue where having the AWS provider assume a role to create an EKS cluster would cause the local-exec kubectl to fail. The local-exec only had access to the env var credentials of the entity that assumed the role, not the assumed role's credentials. Because of this, I figured I'd take a stab at updating the vendored copy of this module (v6.0.1) to use the Kubernetes provider. Here are the snippets:
Still doesn't address the migration issue though.
  name = module.eks.cluster_id
}

data "aws_eks_cluster_auth" "cluster" {
It seems there is an inconsistency in the usage of the aws_eks_cluster_auth data source. The first run works perfectly, but subsequent runs fail with unauthorized. My guess here is that once Terraform has stored the token in the data source's state, it is already expired for the next runs. I couldn't find a way to force the token to update on each run. I've opened this issue: hashicorp/terraform-provider-aws#10362
Are you sure that it is not working in that context? We faced the same problem and it was caused by the depends_on in the "aws_eks_cluster_auth" data source, which is present in your sample but not in this code. If the dependency is induced by referencing the output of the module, it seems to work.
The problem wasn't the depends_on; it was that the cluster name must be taken from the aws_eks_cluster resource, and not from a var or local.
I would love to see this merged. How can we move forward on this PR? How can I help?
This would be a big help for those of us in Terraform Cloud who want to manage the aws-auth configmap.
Maybe I'm missing something, but it doesn't seem to work for me. I was trying to do something similar with my configuration (in my case, setting a k8s service account) and found this PR. I tried setting up the provider with the following data sources:

data "aws_eks_cluster" "this" {
  name = module.tools_cluster.cluster_id
}

data "aws_eks_cluster_auth" "this" {
  name = module.tools_cluster.cluster_id
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.this.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.this.token
  load_config_file       = false
  alias                  = "my_alias"
  version                = "~> 1.9"
}

module "eks" {
  # source  = "terraform-aws-modules/eks/aws"
  source = "git@github.com:stijndehaes/terraform-aws-eks.git?ref=feature/aws-auth-k8s"
  # version = "7.0.0"

  cluster_name = "foobar"
  vpc_id       = module.vpc.vpc_id
  subnets      = module.vpc.private_subnets

  worker_groups = [
    {
      instance_type        = "t3.medium"
      asg_max_size         = 6
      asg_min_size         = 3
      asg_desired_capacity = 3
      autoscaling_enabled  = true
    }
  ]

  tags = local.aws_tags

  providers = {
    "kubernetes" = "kubernetes.my_alias"
  }
}

When the apply gets to the Kubernetes resources I get this error:
@wperron, |
This commit changes the way aws-auth is managed. Before, a local file was used to generate the template and a null resource to apply it. This is now switched to the Terraform Kubernetes provider.
@max-rocket-internet I added some documentation on the breaking change and how to upgrade. It is very short but should contain all the information. I also rebased and retested the PR, and everything is working. I used the basic example to test everything.
Really sorry for the delays. Time is money 💸 Anyway, I've tested this and it works great! Love it. I imported the configmap like this:
And it worked perfectly. I also tested adding and removing a user. All good 🎉
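For anyone wanting to do the same, an import along these lines should work (a sketch, not necessarily the exact command used above; the resource address depends on how the module is instantiated):

```shell
# Hypothetical example: adopt the existing aws-auth configmap into Terraform
# state so it is not recreated. Adjust "module.eks" to match your configuration.
terraform import 'module.eks.kubernetes_config_map.aws_auth[0]' kube-system/aws-auth
```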
@@ -18,9 +18,29 @@ Read the [AWS docs on EKS to get connected to the k8s dashboard](https://docs.aw

## Usage example

A full example leveraging other community modules is contained in the [examples/basic directory](https://github.com/terraform-aws-modules/terraform-aws-eks/tree/master/examples/basic). Here's the gist of using it via the Terraform registry:
A full example leveraging other community modules is contained in the [examples/basic directory](https://github.com/terraform-aws-modules/terraform-aws-eks/tree/master/examples/basic).
Keep this line
Please do not forget to set the provider to the EKS cluster. This is needed to provision the aws_auth configmap in
But I think we remove all of this and put it in the changelog. If we set a precedent of putting change-related notes in the README, it will get messy quite fast.
@@ -0,0 +1,14 @@
# Upgrading from version <= 7.x to 8.0.0
I think we move these notes to the changelog as this will be the first place people look for upgrade notes.
Historically we haven't written much more than a breaking note for things like this, but I think we should start writing more detail in the changelog, and this change is the perfect example 😃
@@ -16,6 +16,7 @@ project adheres to [Semantic Versioning](http://semver.org/).

### Changed

- Updated instance_profile_names and instance_profile_arns outputs to also consider launch template as well as asg (by @ankitwal)
- **Breaking:** Configure the aws-auth configmap using the terraform kubernetes providers. Read the [docs](docs/upgrading-to-aws-auth-kubernetes-provider.md) for more info (by @sdehaes)
Maybe add a new subheading under here, like ### Notes, and detail the import and other notes.
I requested a few cosmetic changes just around where to place the upgrade notes. Let's resolve the conflicts and then we can merge ASAP 🙂 Thanks again @stijndehaes for your work on this, it's super appreciated 🙏
Actually, in the interest of moving things forward quicker, I'll just merge this and fix the docs myself 🚀
With this change I'm getting
As an aside, I am using the Terraform Kubernetes provider to manage some CMs for like... very easy service discovery and I can confirm that it's often timing out (cluster not yet ready, DNS not yet propagated, and so on) or failing with unauthorized (no idea why). A second plan and apply usually works.
Interesting. Whether you are using BUT before there was
I'm sure it is, and with this PR we can't do any retries anymore; in my opinion the previous implementation was ugly, but it worked as it should.
Yeah, we often observe EKS master timeouts for a minute or two after creation.
PR o'clock
Description
This changes the aws-auth configmap to be managed by the Kubernetes provider. This is a breaking change, as already-created aws-auth configmaps will have to be moved into the Terraform state.
Checklist
terraform fmt and terraform validate both work from the root and examples/eks_test_fixture directories (look in CI for an example)