eks: ack prop for potential cluster replacement #30107

Open
1 of 2 tasks
pahud opened this issue May 8, 2024 · 0 comments
Labels: @aws-cdk/aws-eks (Related to Amazon Elastic Kubernetes Service), effort/medium (Medium work item – several days of effort), feature-request (A feature should be added or improved.), p2

pahud commented May 8, 2024

Describe the feature

Updating certain props of aws-eks.Cluster causes the whole cluster to be replaced, as the cluster resource handler shows:

// if there is an update that requires replacement, go ahead and just create
// a new cluster with the new config. The old cluster will automatically be
// deleted by cloudformation upon success.
if (updates.replaceName || updates.replaceRole || updates.replaceVpc) {
  // if we are replacing this cluster and the cluster has an explicit
  // physical name, the creation of the new cluster will fail with "there is
  // already a cluster with that name". this is a common behavior for
  // CloudFormation resources that support specifying a physical name.
  if (this.oldProps.name === this.newProps.name && this.oldProps.name) {
    throw new Error(`Cannot replace cluster "${this.oldProps.name}" since it has an explicit physical name. Either rename the cluster or remove the "name" configuration`);
  }
  return this.onCreate();
}

Users can hardly tell this from cdk diff or cdk deploy, because the cluster resource is actually a custom resource and all they see is a change to the custom resource props.

The diff only shows that the custom resource would change, when in fact the existing cluster would be torn down and replaced, resulting in data loss.

I think we should have a gatekeeper prop that defaults to false, so that cluster replacement only happens when you explicitly set it to true.

Use Case

as above

Proposed Solution

One option is a removalPolicy prop on the cluster that defaults to RETAIN, so that cluster replacement only happens when the value is DESTROY.

Another option is to have a replaceOnUpdate or allowReplaceOnUpdate prop for eks.Cluster which defaults to false.
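
To make the second option concrete, here is a minimal sketch of how such a guard might look. Everything in it is hypothetical: `allowReplaceOnUpdate` is only the prop name proposed above, and `UpdateMap`, `ReplacementGuardOptions`, and `assertReplacementAllowed` are illustrative names, not existing aws-cdk APIs. The three flags mirror the ones checked in the handler snippet earlier in this issue.

```ts
// Hypothetical sketch only: these types and functions do not exist in aws-cdk.
// The flags mirror the ones used in the handler snippet quoted in this issue.
interface UpdateMap {
  replaceName: boolean;
  replaceRole: boolean;
  replaceVpc: boolean;
}

interface ReplacementGuardOptions {
  // Proposed gatekeeper prop (assumed name); replacement is blocked unless true.
  allowReplaceOnUpdate?: boolean;
}

// Returns the names of the flags that would force a cluster replacement.
function replacementReasons(updates: UpdateMap): string[] {
  const reasons: string[] = [];
  if (updates.replaceName) { reasons.push('replaceName'); }
  if (updates.replaceRole) { reasons.push('replaceRole'); }
  if (updates.replaceVpc) { reasons.push('replaceVpc'); }
  return reasons;
}

// Throws when the update would replace the cluster and the user has not
// explicitly acknowledged it; otherwise does nothing.
function assertReplacementAllowed(updates: UpdateMap, options: ReplacementGuardOptions = {}): void {
  const reasons = replacementReasons(updates);
  if (reasons.length === 0) {
    return; // in-place update, nothing to acknowledge
  }
  if (options.allowReplaceOnUpdate !== true) {
    throw new Error(
      `This update would replace the cluster (${reasons.join(', ')}). ` +
      'Set allowReplaceOnUpdate to true to acknowledge the replacement.',
    );
  }
}
```

A handler could call a check like this before falling through to the create path, so that a deployment fails loudly instead of silently tearing down the existing cluster.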

Other Information

No response

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

CDK version used

all

Environment details (OS name and version, etc.)

all

@pahud pahud added feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels May 8, 2024
@github-actions github-actions bot added the @aws-cdk/aws-eks Related to Amazon Elastic Kubernetes Service label May 8, 2024
@pahud pahud added p1 effort/medium Medium work item – several days of effort and removed needs-triage This issue or PR still needs to be triaged. labels May 8, 2024
@pahud pahud self-assigned this May 8, 2024
@pahud pahud changed the title eks: ack prop for cluster replacement eks: ack prop for potential cluster replacement May 8, 2024
mergify bot pushed a commit that referenced this issue Jun 12, 2024
### Background

Amazon EKS originally used a `ConfigMap` for access management. In aws-eks, the `AwsAuth` class uses kubectl from the kubectl-lambda-layer to create the `aws-auth` ConfigMap. The ConfigMap has been difficult to maintain because the EKS API does not support it, but the `AwsAuth` class has made it smooth in CDK.

At AWS re:Invent 2023 we [announced](https://aws.amazon.com/blogs/containers/a-deep-dive-into-simplified-amazon-eks-access-management-controls/) access API support, which simplifies access management and replaces the traditional ConfigMap. CloudFormation now has [AccessConfig](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-eks-cluster.html#cfn-eks-cluster-accessconfig) with [AuthenticationMode](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-eks-cluster-accessconfig.html#cfn-eks-cluster-accessconfig-authenticationmode) and `BootstrapClusterCreatorAdminPermissions`.

The `AuthenticationMode` supports `CONFIG_MAP` (the default), `API_AND_CONFIG_MAP`, and `API`. Users can set the mode on cluster creation or switch it on update. When the mode includes API support, users have to define `AccessEntry` resources that map access policies to IAM principals. This PR introduces the `AccessEntry` and `AccessPolicy` classes to simplify that, with an experience similar to the [iam.ManagedPolicy](https://github.com/aws/aws-cdk/blob/3928eae1ee92a03ba9959288f05f59d6bd5edcba/packages/aws-cdk-lib/aws-iam/lib/managed-policy.ts#L104) class. This PR also introduces the `grantAccess()` method, which lets a cluster grant access to a specific principal and abstracts away the complexity.

Overview of the API experience from this PR:

```ts
const cluster = new eks.Cluster(this, 'Cluster', {
  vpc,
  mastersRole: clusterAdminRole,
  version: eks.KubernetesVersion.V1_30,
  kubectlLayer: new KubectlV29Layer(this, 'KubectlLayer'),
  authenticationMode: eks.AuthenticationMode.API_AND_CONFIG_MAP,
});

// Cluster Admin role for this cluster
cluster.grantAccess('clusterAdminAccess', clusterAdminRole.roleArn, [
  eks.AccessPolicy.fromAccessPolicyName('AmazonEKSClusterAdminPolicy', {
    accessScopeType: eks.AccessScopeType.CLUSTER,
  }),
]);

// EKS Admin role for specified namespaces of this cluster
cluster.grantAccess('eksAdminRoleAccess', eksAdminRole.roleArn, [
  eks.AccessPolicy.fromAccessPolicyName('AmazonEKSAdminPolicy', {
    accessScopeType: eks.AccessScopeType.NAMESPACE,
    namespaces: ['foo', 'bar'],
  }),
]);

// EKS Admin Viewer role for specified namespaces of this cluster
cluster.grantAccess('eksAdminViewRoleAccess', eksAdminViewRole.roleArn, [
  eks.AccessPolicy.fromAccessPolicyName('AmazonEKSAdminViewPolicy', {
    accessScopeType: eks.AccessScopeType.NAMESPACE,
    namespaces: ['foo', 'bar'],
  }),
]);
```


### Issue # (if applicable)

Closes #28588

This PR introduces the `authenticationMode`, `AccessEntry` and `AccessPolicy` for both the `Cluster` and `FargateCluster` constructs.

- [x] bump `@aws-sdk/client-eks` to [v3.476.0](https://github.com/aws/aws-sdk-js-v3/releases/tag/v3.476.0) (the minimum version with EKS Cluster Access Management support)
- [x] make sure it deploys with the new AccessConfig support for a new cluster
- [x] make sure an existing cluster can update by adding this new prop
- [x] make sure it deploys with a new FargateCluster
- [x] make sure an existing FargateCluster can update by adding this new prop
- [x] make sure it works with CfnAccessEntry L1 resources
- [x] AccessEntry L2 construct support
- [x] AccessPolicy class
- [x] bootstrapClusterCreatorAdminPermissions
- [x] unit tests
- [x] integ tests
- [x] update README
- [x] add PR notes

### Notes

1. Switching authentication modes on an existing cluster is a one-way operation:

undefined(CONFIG_MAP) -> API_AND_CONFIG_MAP -> API

You can switch from undefined or CONFIG_MAP to API_AND_CONFIG_MAP, and then from API_AND_CONFIG_MAP to API. You cannot go in the opposite direction: you cannot switch back to CONFIG_MAP or API_AND_CONFIG_MAP from API, and you cannot switch back to CONFIG_MAP from API_AND_CONFIG_MAP (see [here](https://aws.amazon.com/blogs/containers/a-deep-dive-into-simplified-amazon-eks-access-management-controls/)). This PR adds the relevant checks in the custom resource and adds a docstring to the `authenticationMode` prop.

2. Switching `bootstrapClusterCreatorAdminPermissions` causes cluster replacement; the README and the construct prop docstring call this out as a heads-up. This option is [available](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-eks-cluster-accessconfig.html#cfn-eks-cluster-accessconfig-bootstrapclustercreatoradminpermissions) in CFN, where it also triggers replacement on resource update. I have created #30107 for further improvement.

3. This feature does not support AWS China regions at the moment because the JS SDK version in the Lambda node18 runtime in China regions is `3.462.0`, while this feature requires SDK [3.476.0](https://github.com/aws/aws-sdk-js-v3/releases/tag/v3.476.0) or above. It's `3.552.0` in us-east-1. Use [this example](https://docs.aws.amazon.com/lambda/latest/dg/lambda-nodejs.html#nodejs-sdk-included) to check the version.
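
The one-way rule in note 1 can be sketched as a small validation function. This is an illustrative sketch, not the check this PR adds to the custom resource: the `AuthenticationMode` string type, `MODE_ORDER`, and `isAllowedModeChange` are assumed names. It also assumes, conservatively, that skipping a step (CONFIG_MAP straight to API) is rejected, since the note only describes step-by-step transitions.

```ts
// Illustrative sketch; names are assumptions, not the handler's actual code.
type AuthenticationMode = 'CONFIG_MAP' | 'API_AND_CONFIG_MAP' | 'API';

// Modes in the only direction they may change, per note 1.
const MODE_ORDER: AuthenticationMode[] = ['CONFIG_MAP', 'API_AND_CONFIG_MAP', 'API'];

// An undefined mode is treated as CONFIG_MAP, matching "undefined(CONFIG_MAP)".
function isAllowedModeChange(
  oldMode: AuthenticationMode | undefined,
  newMode: AuthenticationMode | undefined,
): boolean {
  const from = MODE_ORDER.indexOf(oldMode ?? 'CONFIG_MAP');
  const to = MODE_ORDER.indexOf(newMode ?? 'CONFIG_MAP');
  // Allow no change or exactly one step forward. Rejecting a direct
  // CONFIG_MAP -> API jump is an assumption of this sketch.
  return to === from || to === from + 1;
}
```

For example, `isAllowedModeChange(undefined, 'API_AND_CONFIG_MAP')` is accepted, while `isAllowedModeChange('API', 'CONFIG_MAP')` is rejected.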


### Reason for this change



### Description of changes



### Description of how you validated changes



### Checklist
- [x] My code adheres to the [CONTRIBUTING GUIDE](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) and [DESIGN GUIDELINES](https://github.com/aws/aws-cdk/blob/main/docs/DESIGN_GUIDELINES.md)

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
@pahud pahud added p2 and removed p1 labels Jun 18, 2024
@pahud pahud removed their assignment Jun 18, 2024
mazyu36 pushed a commit to mazyu36/aws-cdk that referenced this issue Jun 22, 2024
@pahud pahud added p3 and removed p3 labels Sep 27, 2024