Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aws_s3_deployment: BucketDeployment is deleted if custom resource lambda fails during update #22670

Closed
rittneje opened this issue Oct 27, 2022 · 6 comments · Fixed by #24428
Closed
Assignees
Labels
@aws-cdk/aws-s3-deployment bug This issue is a bug. effort/small Small work item – less than a day of effort p1

Comments

@rittneje
Copy link

Describe the bug

We have an aws_s3_deployment.BucketDeployment with retain_on_delete=False. Recently, when updating the files in the deployment, the custom resource lambda failed while invaliding the CloudFront distribution due to some transient issue. Due to a bug in the lambda, it incorrectly sent back a bogus physical resource id to CloudFormation, which caused CloudFormation to conclude it had done an update replacement, so then it tried to roll back by deleting the new physical resource id. Since the custom resource lambda does not consider the physical resource id, it just decided to delete the whole deployment, resulting in data loss.

The source of the issue is here:

responseBody['PhysicalResourceId'] = physicalResourceId or context.log_stream_name

If an exception is thrown and caught, no physicalResourceId is passed, so it defaults to the log stream name, which is different from the real uuid-based physical resource id.

Expected Behavior

It should not have sent back a new bogus physical resource id, so CloudFormation would just do a normal reverse update during rollback instead of a delete.

Current Behavior

data loss

Reproduction Steps

Since the original failure was due to some transient CloudFront issue, I don't know how to reliably reproduce this. Our CDK construct looks like this:

        aws_s3_deployment.BucketDeployment(
            self,
            "deployment",
            sources=[aws_s3_deployment.Source.asset(asset_path)],
            destination_bucket=bucket,
            destination_key_prefix=path_prefix,
            distribution=distribution,
            retain_on_delete=False
        )

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.39.1 (build f188fac)

Framework Version

No response

Node.js Version

v16.17.0

OS

Alpine 3.16

Language

Python

Language Version

3.10.6

Other information

No response

@rittneje rittneje added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Oct 27, 2022
@rittneje
Copy link
Author

cc @peterwoodworth

@rittneje
Copy link
Author

rittneje commented Nov 3, 2022

ping @otaviomacedo @peterwoodworth

@rittneje
Copy link
Author

@peterwoodworth Are there plans to fix this bug? It can result in unexpected data loss.

@peterwoodworth
Copy link
Contributor

@rittneje sorry it's taken so long for you to get a response here. We take data loss seriously and will address this soon.

The behavior you're running into is described here under PhysicalResourceId

I'm not super familiar with this custom resource, so I'm not immediately sure why we are setting the PhysicalResourceId to be different in circumstances, or how we can fix this without breaking previous behavior. We'll look into this

@peterwoodworth peterwoodworth added p1 effort/small Small work item – less than a day of effort and removed needs-triage This issue or PR still needs to be triaged. labels Nov 19, 2022
@garyaparker
Copy link
Contributor

@peterwoodworth any update on this issue? We appear to have experienced data lost twice this week due to the combination of this bug and an outage of the CloudFront API.

Is my understanding of the issue correct in that we can set retain_on_delete=True as a temporary fix until the root cause is resolved?

@mergify mergify bot closed this as completed in #24428 Mar 29, 2023
mergify bot pushed a commit that referenced this issue Mar 29, 2023
Send the correct physical id to CloudFormation to prevent unintended object deletion during rollback.

Closes #22670 

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
@github-actions
Copy link

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@aws-cdk/aws-s3-deployment bug This issue is a bug. effort/small Small work item – less than a day of effort p1
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants