aws-eks: Updating KubernetesManifest deletes it instead #33406
Comments
IMO, you're changing the construct identifier, which is why it gets recreated - that's standard CDK behaviour, not a no-op.
Thank you for the detailed report. After investigating the code, I can confirm this is a significant issue with how CloudFormation's resource replacement sequence interacts with Kubernetes resource management.

Root Cause: This is what's happening under the hood, and you are right:

Short-term Workarounds:
```ts
import { CfnResource, RemovalPolicy } from 'aws-cdk-lib';
import * as eks from 'aws-cdk-lib/aws-eks';

const manifest = new eks.KubernetesManifest(cluster, 'MyManifest', {
  // ... other props ...
});

// Retain the underlying custom resource so that deleting the old resource
// during a replacement does not delete the live Kubernetes objects.
(manifest.node.defaultChild as CfnResource).applyRemovalPolicy(RemovalPolicy.RETAIN);
```

Long-term Fix: We need to modify how the custom resource handles deletions. Possible approaches:
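One possible direction, sketched below: make the delete handler verify ownership before removing each object, so that objects a replacement resource has already re-adopted survive cleanup. This is a hypothetical sketch, not the actual handler implementation; `kubectlGet`, `kubectlDelete`, and the ownership label are illustrative stand-ins.

```ts
// Hypothetical sketch of an ownership-checked delete in the manifest handler.
// kubectlGet, kubectlDelete, and the 'owned-by' label are illustrative
// stand-ins, not real CDK or Kubernetes APIs.
interface K8sObject {
  apiVersion: string;
  kind: string;
  metadata: { name: string; namespace?: string; labels?: Record<string, string> };
}

declare function kubectlGet(obj: K8sObject): Promise<K8sObject | undefined>;
declare function kubectlDelete(obj: K8sObject): Promise<void>;

async function onDelete(physicalResourceId: string, oldManifest: K8sObject[]): Promise<void> {
  for (const obj of oldManifest) {
    const live = await kubectlGet(obj); // undefined if the object is already gone
    if (!live) {
      continue;
    }
    // If a replacement resource has already re-applied this object, its
    // ownership label points at the new physical resource ID; deleting it
    // here would wipe an object the replacement still owns.
    const owner = live.metadata.labels?.['cdk.example/owned-by'];
    if (owner === undefined || owner === physicalResourceId) {
      await kubectlDelete(obj);
    }
  }
}
```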
I will bring this up to the team for further input.
@rantoniuk I am fine with replacement - a bit of downtime while switching between resources is totally okay. You are correct that this is not really a "no-op" on the CFN side; however, the issue is that the replacement deletes the Kubernetes resources entirely and leaves them missing.

Edit: updated the issue to reflect this.

@pahud Appreciate you taking a look :)
Describe the bug
Updating a KubernetesManifest resource through CDK can actually cause it to get deleted.
During a resource replacement, if `overwrite: true` is set and the previous manifest has any overlap with the new manifest, the overlapping section is lost. When the manifest is unchanged, the entire set of resources is deleted. The issue cannot be mitigated by a rollback or code revert and will repeat on any subsequent update.

Regression Issue
Last Known Working CDK Version
No response
Expected Behavior
Replacing a KubernetesManifest should at most delete and re-create the underlying EKS manifest resources. A minimal update to a KubernetesManifest should not cause a loss of cluster functionality due to missing Kubernetes resources.
Current Behavior
Updates to a KubernetesManifest that are applied as a replacement cause the corresponding cluster resources to be wiped. Rollbacks and reverts do not bring the cluster back to a healthy state.

Given Manifest A (previous) and Manifest B (new) are based on the same YAML, replacing the KubernetesManifest resource looks like this:

1. CloudFormation creates the replacement resource first, which applies Manifest B (with `overwrite: true`, this re-adopts the objects already created by Manifest A).
2. During cleanup, CloudFormation deletes the old resource, which deletes the Kubernetes objects defined by Manifest A.
3. Because A and B define the same objects, the cleanup removes the objects B just applied, leaving them missing from the cluster.
Reproduction Steps
Setup
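A minimal setup consistent with the report might look like the sketch below (cluster sizing, the `sleeper` pod spec, and all construct ids are illustrative assumptions, not the reporter's actual code):

```ts
import * as eks from 'aws-cdk-lib/aws-eks';
import { Stack, StackProps } from 'aws-cdk-lib';
import { Construct } from 'constructs';

export class ReproStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // Any cluster works; the version is illustrative.
    const cluster = new eks.Cluster(this, 'Cluster', {
      version: eks.KubernetesVersion.V1_30,
    });

    // A trivial "sleeper" pod, matching the pod mentioned in the events below.
    new eks.KubernetesManifest(this, 'SleeperManifest', {
      cluster,
      overwrite: true, // the flag implicated in the report
      manifest: [{
        apiVersion: 'v1',
        kind: 'Pod',
        metadata: { name: 'sleeper' },
        spec: {
          containers: [{
            name: 'sleeper',
            image: 'busybox',
            command: ['sleep', 'infinity'],
          }],
        },
      }],
    });
  }
}
```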
Minimal Change
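Per the first comment above, the minimal change that triggers a replacement can be as small as renaming the construct id; a sketch (ids illustrative):

```ts
// Before: logical id derived from 'SleeperManifest'.
new eks.KubernetesManifest(this, 'SleeperManifest', { cluster, overwrite: true, manifest });

// After: renaming the construct id changes the CloudFormation logical id,
// so CloudFormation performs create-then-delete instead of an in-place
// update, triggering the deletion described above.
new eks.KubernetesManifest(this, 'SleeperManifestV2', { cluster, overwrite: true, manifest });
```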
CloudFormation Events
<stack>
Reverts Are Ineffective
Similar events to the above: the sleeper pod is created and then deleted again.
Possible Solution
Immediate Mitigating Options:
Note: Using `RemovalPolicy.RETAIN` comes with the natural downside of having to clean up dangling resources manually.

Additional Information/Context
Additional Risks:
If we update manifests and there is any overlap between the original and subsequent manifests, CloudFormation may silently delete parts of a manifest. For example, if manifest version 1.0 is deployed and replaced with manifest version 2.0, the intersecting resources (1.0 ∩ 2.0) will be deleted when cleaning up 1.0.
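To make the intersection concrete, a sketch (the object names are illustrative):

```ts
// Hypothetical objects; only the overlap matters.
const deploymentA = { apiVersion: 'apps/v1', kind: 'Deployment', metadata: { name: 'app-a' } };
const deploymentB = { apiVersion: 'apps/v1', kind: 'Deployment', metadata: { name: 'app-b' } };
const sharedConfig = { apiVersion: 'v1', kind: 'ConfigMap', metadata: { name: 'shared-config' } };

const manifestV1 = [deploymentA, sharedConfig]; // deployed first
const manifestV2 = [sharedConfig, deploymentB]; // replaces V1

// Replacement order: V2 is applied first and re-adopts shared-config;
// V1 is then cleaned up, deleting shared-config (the 1.0 ∩ 2.0
// intersection) even though V2 still needs it.
```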
CDK CLI Version
2.160.0
Framework Version
No response
Node.js Version
18
OS
Amazon Linux 2 x86_64
Language
TypeScript
Language Version
5.0.4
Other information
Sev2: P199049085
Tracking: P200043360
Case ID 173931643600782