
Use kubernetes provider to manage aws auth #355

Conversation

stijndehaes
Contributor

PR o'clock

Description

This changes the aws auth configmap to be managed by the kubernetes provider. This is a breaking change as already created aws auth configmaps will have to be moved into the terraform state.

Checklist

  • terraform fmt and terraform validate both work from the root and examples/eks_test_fixture directories (look in CI for an example)
  • Tests for the changes have been added and passing (for bug fixes/features)
  • Test results are pasted in this PR (in lieu of CI)
  • I've added my change to CHANGELOG.md
  • Any breaking changes are highlighted above

@max-rocket-internet
Contributor

Interesting!

Previously this was avoided because there was no depends_on for providers.

How does this work now? How does it work with 5 separate clusters defined?

@stijndehaes
Contributor Author

I actually never tested that. We always have only one cluster per Terraform state. I will try to test this somewhere this week. I guess you should be able to do this using provider aliases: https://www.terraform.io/docs/configuration/providers.html#alias-multiple-provider-instances

But then I should add the possibility to pass one through a variable.
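
For what it's worth, a minimal sketch of a two-cluster setup with provider aliases could look like this (the module source and all names here are assumed for illustration, they are not part of this PR):

provider "kubernetes" {
  alias = "cluster1"
  # host, cluster_ca_certificate and token for cluster 1 go here
}

provider "kubernetes" {
  alias = "cluster2"
  # host, cluster_ca_certificate and token for cluster 2 go here
}

module "eks_cluster1" {
  source = "terraform-aws-modules/eks/aws"
  # ... inputs for cluster 1 ...

  providers = {
    kubernetes = kubernetes.cluster1
  }
}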

@stijndehaes
Contributor Author

stijndehaes commented May 4, 2019

I have done some testing and it is very easy to do with more than one cluster. For an example, see this comment:
hashicorp/terraform#18682 (comment)

I also always thought it was not possible to have depends_on for a provider, and the depends_on field is indeed still not supported. But you can pass values into the provider configuration that depend on other resources. I got the example from here: https://www.terraform.io/docs/providers/aws/d/eks_cluster_auth.html.
Since the data sources depend on the cluster creation, everything waits nicely until the whole cluster is created before creating any resources on it.

I added an example in the eks test fixtures. Is this enough or should we document this somewhere?
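
The pattern in that example is roughly the following (a minimal sketch, with the module name eks assumed):

data "aws_eks_cluster" "cluster" {
  name = module.eks.cluster_id
}

data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_id
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  load_config_file       = false
}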

@simonvanderveldt

Previously this was avoided because there was no depends_on for providers.

How does this work now? How does it work with 5 separate clusters defined?

@max-rocket-internet That's correct. We've worked around this by implementing the aws-auth part in a separate module; the module then ensures the dependency (in this case, that the Kubernetes/EKS cluster has been created and is available) is met by taking, for example, the cluster ID as a variable.

@max-rocket-internet
Contributor

That's correct. We've worked around this by implementing the aws-auth part in a separate module

So we would have to move the aws-auth part out of this module for it to work correctly then?

@stijndehaes
Contributor Author

That's correct. We've worked around this by implementing the aws-auth part in a separate module

So we would have to move the aws-auth part out of this module for it to work correctly then?

You don't have to move the aws-auth part outside of this module. Terraform is smart enough to notice that the provider depends on a data source, which depends on a resource, and it waits nicely until the resource is created. I have tested this branch with a setup I am using for a client.

stijndehaes force-pushed the feature/aws-auth-k8s branch from e5d63ea to 647de9e on May 7, 2019 17:27
@simonvanderveldt

So we would have to move the aws-auth part out of this module for it to work correctly then?

No, you can use a nested module for that :)

You don't have to move the aws-auth part outside of this module. Terraform is smart enough to notice that the provider depends on a data source, which depends on a resource, and it waits nicely until the resource is created. I have tested this branch with a setup I am using for a client.

AFAIK the issue with a provider depending on resource creation that's done by another provider hasn't been fixed. See the previously mentioned hashicorp/terraform#2430
Does it work if there are no resources and you just run a terraform plan?

@stijndehaes
Contributor Author

@simonvanderveldt A terraform plan works. You can easily try it by checking out my branch and doing a terraform plan of eks_test_fixture in the examples folder. Just to be sure I double checked and ran terraform plan locally and it works.

@max-rocket-internet
Contributor

A terraform plan works.

Sure but you have to apply it, from scratch. Just a plan is not enough 😄

We've worked around this by implementing the aws-auth part in a separate module

OK!

AFAIK the issue with a provider depending on resource creation that's done by another provider hasn't been fixed

Exactly, that's my point in my first comment.

you can use a nested module for that :)

If that works, then cool 👍 but should be part of this PR then?

already created aws auth configmaps will have to be moved into the terraform state.

Can you provide instructions on how to do this? Does it really matter anyway as it would just be applying the same configmap data again? i.e. nothing is lost.

@stijndehaes
Contributor Author

@max-rocket-internet No worries, I tested a terraform apply from scratch as well. I just hadn't tested only the terraform plan, so I double-checked it. But as I said earlier, I have tested the branch with a brand new terraform apply: not of the example, but of another setup where I used this module.

If you want we could have a google hangouts chat about this PR to make communication easier?

@stijndehaes
Contributor Author

To be 100% sure, I also just tested examples/eks_test_fixture. This one also works correctly; the configmap is nicely created after the cluster has been created.

@@ -28,11 +28,6 @@ output "cluster_security_group_id" {
value = "${local.cluster_security_group_id}"
}

output "config_map_aws_auth" {
Contributor

Can we just keep this as an output? I'm sure someone will still want it. e.g. so they can have the file in their version control or something.

Contributor Author

I added the configmap as an output, but it is a Terraform resource now, which is not the same as the rendered YAML. I can't think of an easy way to achieve that.

@max-rocket-internet
Contributor

Sorry for the delay!

OK let's get this merged but I'm still a little confused here. Do we have any issues at all with this PR? Here's my summary, correct me if I'm wrong:

  1. The depends_on-for-providers issue doesn't matter because it just works
  2. Multiple clusters in the same code base also work with this PR?
  3. Changes to the configmap are applied to the cluster correctly

Apart from rebase, what do we need to do to get this merged? What else needs to be tested?

@stijndehaes
Contributor Author

Let me check that applying the configmap from Terraform when it already exists is not a problem. That way no migration is needed.

And I will look into making the output work as before, but I am not sure if that is possible.

Other than that, I have tested this with multiple clusters, the examples folder and another setup, so I think it has been tested enough :)

I don't have time right now, but I should be able to do this sometime near the end of the week.

@stijndehaes
Contributor Author

Just tested applying when the aws-auth configmap already exists.
This does not work:

* kubernetes_config_map.aws_auth: configmaps "aws-auth" already exists

So we have to provide a migration path.
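
For what it's worth, the migration would presumably be a one-off state import along these lines (a sketch: the module instance name eks is assumed, and the [0] index assumes the resource is created with a manage_aws_auth count):

terraform import module.eks.kubernetes_config_map.aws_auth[0] kube-system/aws-auth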

@dpiddockcmp
Contributor

Fun story: providers initialized with data sources don't work with terraform import. If you merge this please add, or link to, documentation of how to use terraform import with this module.

May make the migration path interesting too.

https://www.terraform.io/docs/commands/import.html#provider-configuration
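
If that turns out to bite, one possible workaround (an assumption on my side, not something tested in this thread) is to point the kubernetes provider at a kubeconfig only for the one-off import run, so the provider configuration does not depend on data sources:

provider "kubernetes" {
  # Temporary configuration for the import run only; revert to the
  # data-source-based configuration afterwards.
  load_config_file = true
  config_path      = "~/.kube/config"
}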

@tristanpemble

tristanpemble commented Jun 30, 2019

Hey, I started to do something similar on my own branch before noticing this PR. With Terraform 0.12 you can use for loops to avoid template generation; not sure if that would be of interest. I prefer avoiding templates where possible.

Here's what I had put together, if you are interested:

locals {
  aws_auth_role_arns = [
    for role in compact(concat(
      coalescelist(
        aws_iam_instance_profile.workers_launch_template_mixed.*.role,
        data.aws_iam_instance_profile.custom_worker_group_launch_template_mixed_iam_instance_profile.*.role_name,
        [""]
      ),
      coalescelist(
        aws_iam_instance_profile.workers_launch_template.*.role,
        data.aws_iam_instance_profile.custom_worker_group_launch_template_iam_instance_profile.*.role_name,
        [""]
      ),
      coalescelist(
        aws_iam_instance_profile.workers.*.role,
        data.aws_iam_instance_profile.custom_worker_group_iam_instance_profile.*.role_name,
        [""]
      )
    )) : "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/${role}"
  ]
}

resource "kubernetes_config_map" "aws_auth" {
  count = var.manage_aws_auth ? 1 : 0

  metadata {
    name      = "aws-auth"
    namespace = "kube-system"
  }

  data = {
    mapRoles = jsonencode(concat([
      for role_arn in local.aws_auth_role_arns : {
        rolearn  = role_arn
        username = "system:node:{{EC2PrivateDNSName}}"
        groups   = ["system:bootstrappers", "system:nodes"]
      }
    ], [
      for v in var.map_roles : {
        rolearn  = v.role_arn
        username = v.username
        groups   = [v.group]
      }
    ]))

    mapUsers = jsonencode([
      for v in var.map_users : {
        userarn  = v.user_arn
        username = v.username
        groups   = [v.group]
      }
    ])

    mapAccounts = jsonencode(var.map_accounts)
  }

  depends_on = [
    aws_eks_cluster.this
  ]
}

@Vlaaaaaaad

@tristanpemble I really like that approach and would like to see something like it merged. Maybe creating a new PR with that would help?

Right now we are using Atlantis to manage cross-account EKS clusters and we're forced to set manage_aws_auth = false.

I can help with testing next week.

@pierresteiner
Contributor

Is this one still a thing? I would have loved to see it coming in the v6 release.

@sppwf
Contributor

sppwf commented Sep 25, 2019

Hi guys,

I used this approach in a post-Kubernetes stage: I deployed Helm Tiller to prepare EKS for Helm app deployments, and it worked fine for me. So we can try using the Kubernetes provider to manage the configmap. That would be a huge improvement, as we would no longer need kubectl and aws-iam-authenticator on the box that creates the EKS cluster.

Do we have to do some additional testing, or what is the status?

Thanks
Sergiu

@barryib
Member

barryib commented Sep 26, 2019

@sppwf The problem here is described by #355 (comment) and #355 (comment)

So we need to provide a migration path for users who are already managing the aws-auth configmap with this module (i.e. with kubectl).

@erumble

erumble commented Sep 26, 2019

Recently I ran into an issue where having the AWS provider assume a role to create an EKS cluster would cause the local-exec kubectl to fail. The local-exec only had access to the env var credentials of the entity that assumed the role, not the assumed role's credentials. Because of this, I figured I'd take a stab at updating the vendored copy of this module (v6.0.1) to use the kubernetes_config_map resource. Turns out it was much easier than I thought.

Here are the snippets:

# aws_auth.tf
resource "kubernetes_config_map" "aws_auth" {
  count = var.manage_aws_auth ? 1 : 0

  metadata {
    name      = "aws-auth"
    namespace = "kube-system"
  }

  data = {
    mapAccounts = yamlencode(var.map_accounts)
    mapRoles    = yamlencode(concat(local.workers_group_role_maps, var.map_roles))
    mapUsers    = yamlencode(var.map_users)
  }
}
# locals.tf
locals {
  # ... other local defs

  workers_group_role_maps = [
    for name in distinct(concat(
      aws_iam_instance_profile.workers_launch_template.*.role,
      data.aws_iam_instance_profile.custom_worker_group_launch_template_iam_instance_profile.*.role_name,
      aws_iam_instance_profile.workers.*.role,
      data.aws_iam_instance_profile.custom_worker_group_iam_instance_profile.*.role_name,
      )) : {
      rolearn  = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/${name}"
      username = "system:node:{{EC2PrivateDNSName}}"
      groups = [
        "system:bootstrappers",
        "system:nodes",
      ]
    }
  ]
}

Still doesn't address the migration issue though.

name = module.eks.cluster_id
}

data "aws_eks_cluster_auth" "cluster" {


It seems there is an inconsistency in the usage of aws_eks_cluster_auth.
The first run works perfectly, but subsequent runs fail with unauthorized.
My guess is that once Terraform has stored the token in the data source, it is already expired for the next runs. I couldn't find a way to force the token to refresh on each run.
I've opened this issue: hashicorp/terraform-provider-aws#10362

Contributor

Are you sure that it is not working in that context? We faced the same problem and it was caused by the depends_on in the aws_eks_cluster_auth data source, which is present in your sample but not in this code.

If the dependency is induced by referencing the output of the module, it seems to work.


The problem wasn't the depends_on; it was that the cluster name must be read from the aws_eks_cluster resource, not from a var or local.
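
A minimal sketch of the distinction being described (module name assumed):

# Works: the cluster name is read from the module output, i.e. from the
# aws_eks_cluster resource, so the token is only fetched once the cluster exists.
data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_id
}

# Reported as problematic: the name comes from a plain var or local, so the
# data source is not tied to the cluster resource.
# data "aws_eks_cluster_auth" "cluster" {
#   name = var.cluster_name
# }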

@barryib
Member

barryib commented Oct 25, 2019

I would love to see this merged. How can we move forward on this PR? How can I help?

@mzupan

mzupan commented Oct 31, 2019

This would be a big help for those of us on Terraform Cloud who want to manage the aws-auth configmap.

@wperron

wperron commented Nov 1, 2019

Maybe I'm missing something, but it doesn't seem to work for me. I was trying to do something similar with my configuration (in my case, setting up a k8s service account) and found this PR. I tried setting up the provider with the aws_eks_cluster_auth data block just like the proposed change, and it failed. I then tried to simply apply the configuration as shown in the PR, using the fork as the source for the module, and it still crashed. This is how I set up my config:

data "aws_eks_cluster" "this" {
    name = module.tools_cluster.cluster_id
}

data "aws_eks_cluster_auth" "this" {
    name = module.tools_cluster.cluster_id
}

provider "kubernetes" {
    host                   = data.aws_eks_cluster.this.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority.0.data)
    token                  = data.aws_eks_cluster_auth.this.token
    load_config_file       = false
    alias                  = "my_alias"
    version                = "~> 1.9"
}

module "eks" {
    # source        = "terraform-aws-modules/eks/aws"
    source        = "git@github.com:stijndehaes/terraform-aws-eks.git?ref=feature/aws-auth-k8s"
    # version       = "7.0.0"
    cluster_name  = "foobar"
    vpc_id        = module.vpc.vpc_id
    subnets       = module.vpc.private_subnets
    worker_groups = [
        {
            instance_type        = "t3.medium"
            asg_max_size         = 6
            asg_min_size         = 3
            asg_desired_capacity = 3
            autoscaling_enabled  = true
        }
    ]

    tags = local.aws_tags

    providers = {
        "kubernetes" = "kubernetes.my_alias"
    }
}

When the apply gets to the Kubernetes resources I get this error:

Error: Post https://<cluster-id>.gr7.us-east-1.eks.amazonaws.com/api/v1/namespaces/kube-system/configmaps: net/http: TLS handshake timeout

  on .terraform/modules/my-cluster/aws_auth.tf line 35, in resource "kubernetes_config_map" "aws_auth":
  35: resource "kubernetes_config_map" "aws_auth" {

@CarpathianUA

@wperron,
Reading your code snippet, it looks like you gather the certificate from the provisioned EKS cluster via the aws_eks_cluster data source and then try to use it for the kubernetes provider?

stijndehaes force-pushed the feature/aws-auth-k8s branch 2 times, most recently from f85fa4a to 03a3686 on November 11, 2019 15:22
This commit changes the way aws-auth is managed. Before, a local file
was used to generate the template and a null resource to apply it. This
is now switched to the Terraform Kubernetes provider.
@stijndehaes
Contributor Author

@max-rocket-internet I added some documentation on the breaking change and how to upgrade. It is very short but should contain all the information. I also rebased and retested the PR; everything is working. I used the basic example to test everything.

@max-rocket-internet
Contributor

Really sorry for delays. Time is money 💸

Anyway, I've tested this and it works great! Love it. I imported the configmap like this:

terraform import module.cluster1.kubernetes_config_map.aws_auth[0] kube-system/aws-auth

And it worked perfectly. I also tested adding and removing a user. All good 🎉

@@ -18,9 +18,29 @@ Read the [AWS docs on EKS to get connected to the k8s dashboard](https://docs.aw

## Usage example

A full example leveraging other community modules is contained in the [examples/basic directory](https://github.com/terraform-aws-modules/terraform-aws-eks/tree/master/examples/basic). Here's the gist of using it via the Terraform registry:
A full example leveraging other community modules is contained in the [examples/basic directory](https://github.com/terraform-aws-modules/terraform-aws-eks/tree/master/examples/basic).
Contributor

Keep this line

@@ -18,9 +18,29 @@ Read the [AWS docs on EKS to get connected to the k8s dashboard](https://docs.aw

## Usage example

A full example leveraging other community modules is contained in the [examples/basic directory](https://github.com/terraform-aws-modules/terraform-aws-eks/tree/master/examples/basic). Here's the gist of using it via the Terraform registry:
A full example leveraging other community modules is contained in the [examples/basic directory](https://github.com/terraform-aws-modules/terraform-aws-eks/tree/master/examples/basic).
Please do not forget to set the provider to the EKS cluster. This is needed to provision the aws_auth configmap in
Contributor

But I think we should remove all of this and put it in the changelog. If we set a precedent of putting change-related notes in the README, it will get messy quite fast.

@@ -0,0 +1,14 @@
# Upgrading from version <= 7.x to 8.0.0
Contributor

I think we should move these notes to the changelog, as that will be the first place people look for upgrade notes.

Historically we haven't written much more than "breaking" for things like this, but I think we should start writing more detail in the changelog, and this change is the perfect example 😃

@@ -16,6 +16,7 @@ project adheres to [Semantic Versioning](http://semver.org/).
### Changed

- Updated instance_profile_names and instance_profile_arns outputs to also consider launch template as well as asg (by @ankitwal)
- **Breaking:** Configure the aws-auth configmap using the terraform kubernetes providers. Read the [docs](docs/upgrading-to-aws-auth-kubernetes-provider.md) for more info (by @sdehaes)
Contributor

Maybe add a new subheading under here like ### Notes and detail the import and other notes.

@max-rocket-internet
Contributor

I requested a few cosmetic changes just around where to place the upgrade notes. Let's resolve the conflicts and then we can merge ASAP 🙂

Thanks again @stijndehaes for your work on this, it's super appreciated 🙏

@max-rocket-internet
Contributor

Actually, in the interest of moving things forward quicker, I'll just merge this and fix the docs myself 🚀

max-rocket-internet merged commit 9363662 into terraform-aws-modules:master on Nov 28, 2019
@ivanich
Contributor

ivanich commented Dec 2, 2019

With this change I'm getting:

Error: Post https://FGFFFFFFFJDJKDNDHEJFBF.yl4.us-east-1.eks.amazonaws.com/api/v1/namespaces/kube-system/configmaps: dial tcp xx.xx.xx.xx:443: i/o timeout

  on .terraform/modules/eks/aws_auth.tf line 45, in resource "kubernetes_config_map" "aws_auth":
  45: resource "kubernetes_config_map" "aws_auth" {

Are you guys sure it was a good idea to merge this PR?

@Vlaaaaaaad

As an aside, I am using the Terraform Kubernetes provider to manage some CMs for very easy service discovery, and I can confirm that it's often timing out (cluster not yet ready, DNS not yet propagated, and so on) or failing with unauthorized (no idea why). A second plan and apply usually works.

@max-rocket-internet
Contributor

Interesting.

Whether you are using kubectl like before or the new provider config that is now merged, it wouldn't make any difference if you can't connect to the API endpoint or if DNS is not propagated.

BUT before there was a for i in $(seq 1 10); do ... retry loop, so it would retry 10 times. Perhaps that's the issue here?
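
For context, a rough sketch of the kind of retry wrapper the previous local-exec implementation used (illustrative only, not the exact module code):

resource "null_resource" "update_config_map_aws_auth" {
  provisioner "local-exec" {
    # Retry kubectl apply up to 10 times while the fresh EKS endpoint becomes reachable.
    command = "for i in $(seq 1 10); do kubectl apply -f aws-auth-configmap.yaml && break || sleep 10; done"
  }
}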

@ivanich
Contributor

ivanich commented Dec 2, 2019

BUT before there was a for i in $(seq 1 10); do ... retry loop, so it would retry 10 times. Perhaps that's the issue here?

I'm sure it is, and with this PR we can't do any retries anymore. In my opinion the previous implementation was ugly, but it worked as it should.

@max-rocket-internet
Contributor

@rchernobelskiy

Yeah, we often observe EKS master timeouts for a minute or two after creation.

@github-actions

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked as resolved and limited the conversation to collaborators on Nov 20, 2022