Garbage Collection for Assets #64
Comments
How do we generate keys when uploading? Just random? If it were deterministic, we could use lifecycle rules of object versions: https://aws.amazon.com/about-aws/whats-new/2014/05/20/amazon-s3-now-supports-lifecycle-rules-for-versioning/ That way objects are only considered for deletion if they are not 'current'. |
Now that I think about it, versioning would only help if we were regularly re-uploading files. That's probably not the case? |
Lifecycle rules sound like a good option for sure. The object keys are based on the hash of the asset's contents, to avoid re-uploading when the content hasn't changed. |
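To make the content-hash idea concrete, here is a minimal sketch. It assumes SHA-256 over the raw asset bytes and a `.zip` suffix; `asset_key` is a hypothetical helper for illustration, not the CDK's actual implementation.

```python
import hashlib


def asset_key(content: bytes, extension: str = ".zip") -> str:
    """Return an object key derived only from the asset's bytes (hypothetical helper)."""
    return hashlib.sha256(content).hexdigest() + extension


# Identical content maps to the same key, so a second upload can be skipped.
first = asset_key(b"print('hello')")
second = asset_key(b"print('hello')")
# Any change to the content yields a different key, i.e. a new object.
changed = asset_key(b"print('hello, world')")
```

Because the key never changes for unchanged content, an object's age in the bucket says nothing about whether it is still in use, which is why purely age-based cleanup is questioned later in this thread.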
I'm not sure this works if there are two stacks, and only one is deployed with a new version of the asset. In my mind, the other stack should still point to the old version of the asset (should not be automatically updated), but now the asset will be aged out after a while and the undeployed stack will magically break after a certain time. Alternative idea (but much more involved and potentially not Free Tier): a Lambda that runs every N hours which enumerates all stacks, detects the assets that are in use (from stack parameters or in some other way) and clears out the rest? |
But I think this runs afoul of reuse across stacks. |
👍 on garbage collecting lambda that runs every week or so, with ability to opt out and some cost warnings on docs and online |
Seems risky. What if a deployment happens while crawling? |
Yeah perhaps only collect old assets (month old) and we can salt the object key such that if a whole month had passed, it will be a new object? Thinking out loud.... requires a design |
I've got quite a few assets in my bucket now after a month or so of deploying from CD. How do I determine which ones are in use, even manually? I can't seem to figure out the correlation between the names of the files in S3 and anything else I could use to determine what's being used. The lambdas don't point back at them in any way I can see. I want to eventually write a script to do this safely for my use case, but absent a way of telling what's being used I'm stuck. |
S3 now has lifecycle rules that can automatically delete objects a number of days after creation which might be a solution too. |
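As a rough illustration of such a rule, the sketch below only builds the configuration document. The helper name `expire_after_days` and the 30-day value are assumptions; the resulting dict follows the shape accepted by boto3's S3 client method `put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=...)`, which is not called here.

```python
def expire_after_days(days: int, prefix: str = "") -> dict:
    """Build an S3 lifecycle configuration that expires objects `days` after creation."""
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": f"expire-after-{days}-days",
                # An empty prefix applies the rule to every object in the bucket.
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


# Assumed retention period, purely for illustration.
config = expire_after_days(30)
```

Note that a rule like this deletes by age alone and knows nothing about which objects deployed stacks still reference.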
The ones in use are those referenced in active CloudFormation stacks deployed via CDK. Those stack templates will include something like this:
"GenFeedFunc959C5085": {
  "Type": "AWS::Lambda::Function",
  "Properties": {
    "Code": {
      "S3Bucket": {
        "Fn::Sub": "cdk-xxx-assets-${AWS::AccountId}-${AWS::Region}"
      },
      "S3Key": "5946c35f6797cf45370f262c1e5992bc54d8e7dd824e3d5aa32152a2d1e85e5d.zip"
    },
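For anyone scripting this by hand, the lookup could be sketched roughly as follows. It assumes the template has already been loaded as a Python dict (for example, from the CloudFormation GetTemplate API) and only inspects `AWS::Lambda::Function` resources; `referenced_s3_keys` and the sample template are hypothetical, and other resource types that reference assets are not covered.

```python
from typing import Set


def referenced_s3_keys(template: dict) -> Set[str]:
    """Collect literal Code.S3Key values from Lambda functions in one template."""
    keys: Set[str] = set()
    for resource in template.get("Resources", {}).values():
        if resource.get("Type") != "AWS::Lambda::Function":
            continue
        code = resource.get("Properties", {}).get("Code", {})
        key = code.get("S3Key")
        # Skip intrinsic functions such as Fn::Sub; only plain strings are handled.
        if isinstance(key, str):
            keys.add(key)
    return keys


# Made-up template mirroring the fragment above, with a shortened key.
sample_template = {
    "Resources": {
        "GenFeedFunc959C5085": {
            "Type": "AWS::Lambda::Function",
            "Properties": {"Code": {"S3Bucket": "example-bucket", "S3Key": "5946c35f.zip"}},
        },
        "SomeTopic": {"Type": "AWS::SNS::Topic", "Properties": {}},
    }
}
in_use = referenced_s3_keys(sample_template)
```

Running this over every active stack's template and taking the union gives a candidate "in use" set; anything in the bucket outside that set would still need a careful manual check before deletion.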
Unfortunately, that won't help since old objects might still be in use, e.g. when a Lambda wasn't deployed in a while. (It doesn't help that all assets are stored in the same "folder" either.) |
Doesn't lambda copy the resources during deployment? |
The Lambda service will cache functions, but AFAIK there's no guarantee that they will be cached forever. |
I'm fairly certain that Lambda only reads the assets during deployment and that they aren't needed afterwards. You can for example deploy to lambda without using S3 for smaller assets and those aren't stored in there. |
Would love to read about the behavior in the docs somewhere. That 50 MB upload limit must exist for a reason. Haven't found anything so far though. |
I can't find any concrete resources on this, but I haven't found any docs saying that you cannot delete the object after deployment. Also, my Lambdas don't have any permission to read my staging bucket, nor do they mention that the object came from S3, so I doubt it's required to keep the object around. |
I would suggest writing s3 objects as The main issue here is you need to pick your date / prefix only once, and stick with that, for the whole build process. You don't want to upload files at Tue 23:59 and a later process/function looks in Wednesday for the object. Perhaps just having an Option 2 for S3 is to check the ECR is different because it's not transient, it is the long term backing store for Lambda. Just spitballing here, but if there was a "transient ECR repo" with a 7 day deletion policy, you could push new builds to that, and then during cloudformation deploy, those images would then be "copied/hardlinked" to a "runtime ECR repo" with lifetime managed by cloudformation, e.g. removed upon stack update/delete. Maybe the same thing could be accomplished if cloudformation could set ECR Tags that "pin" images in the transient repo that are in use, and tagged images are excluded from lifecycle rules. However, to avoid races, builds have to push something (e.g. at least some tiny change) to the transient ECR repo to refresh the image age (at least if the image is older than a day), so it won't be deleted right before cloudformation starts tracking / pinning it. |
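The "pick the date prefix only once" point can be sketched as below. The key layout (`<date>/<hash>.zip`) and the `BuildContext` class are assumptions made for illustration; the exact format suggested in the comment above is not recoverable from the thread.

```python
from datetime import datetime, timezone
from typing import Optional


class BuildContext:
    """Freeze the date prefix at build start so every later step reuses it."""

    def __init__(self, now: Optional[datetime] = None) -> None:
        started = now or datetime.now(timezone.utc)
        # Computed exactly once; later steps never consult the clock again.
        self.prefix = started.strftime("%Y-%m-%d")

    def key_for(self, content_hash: str) -> str:
        return f"{self.prefix}/{content_hash}.zip"


# A build that starts at 23:59 keeps its prefix even if later steps run after midnight.
build = BuildContext(datetime(2024, 1, 2, 23, 59, tzinfo=timezone.utc))
upload_key = build.key_for("abc123")
lookup_key = build.key_for("abc123")
```

Passing the same `BuildContext` to the upload step and to the deploy step avoids the race where a file is uploaded under one day's prefix and looked up under the next.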
Word of caution: doing just a time based deletion on assets is a little risky. We have had this scenario play out:
So, unsure how you would even accomplish this, but ideally don't delete any S3 assets that are referenced in existing CFN templates. |
Sorry, was reading quickly - sounds like the RFC is going to try to do this, so yah! |
People have been polite for long enough. We spend millions of dollars a year with AWS. Time to put customers' interests first. |
This comment was marked as off-topic.
Look folks I appreciate that this is frustrating, but can we please not flame someone who's trying to do their job. After a few months of reflection the solution for my use case of this construct was to separate the build and deploy stages of my service. Essentially I had two CDK stacks, one of which provisioned an ECR repo to which I could apply a life cycle policy on images, and the other which deployed my service. Then when it came to deployment I simply deployed my first stack, parsed out the ECR repo ARN and piped it into the call to This approach is more in line with the build, release, run philosophy of the 12 Factor App, which I broadly agree with. |
Personally, I have refined the default bootstrap solution that prepares accounts with the CDK toolkit. It includes reasonable retention policies, and I find it does a very good job. |
It seems several solutions clean up the CDK staging bucket by listing deployed CloudFormation templates. This might not be ideal when multiple CDK projects are used in a single AWS account and we want a different setup per project. I have tried a different approach that modifies the Default Stack Synthesizer to use an S3 object key prefix and leverages the cloud assembly output from the CDK app synth method to determine which assets should be kept after a successful deployment. The full description and example Java code is at https://github.com/NewTownData/events-monolith/blob/main/infrastructure/docs/clean-up.md |
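The core of a prefix-scoped cleanup like this can be sketched as a set difference. This is not the linked Java code: `keys_to_delete`, the prefixes, and the sample keys are all hypothetical, and the sketch assumes the bucket listing and the set of keys from the latest cloud assembly have already been gathered elsewhere.

```python
from typing import Iterable, List, Set


def keys_to_delete(bucket_keys: Iterable[str], project_prefix: str, keep: Set[str]) -> List[str]:
    """Return keys under this project's prefix that the latest assembly no longer references."""
    return sorted(
        key
        for key in bucket_keys
        # Keys belonging to other projects in the same bucket are never touched.
        if key.startswith(project_prefix) and key not in keep
    )


# Made-up bucket listing shared by two projects.
listing = [
    "project-a/aaa.zip",
    "project-a/bbb.zip",
    "project-a/ccc.zip",
    "project-b/aaa.zip",
]
# Keys referenced by project A's most recent successful deployment (assumed).
still_needed = {"project-a/bbb.zip"}
stale = keys_to_delete(listing, "project-a/", still_needed)
```

Scoping by prefix keeps each project's cleanup independent, at the cost of losing asset reuse across projects that happen to share identical content.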
Just speaking for myself here, I have a small team and just noticed we're wasting 500GB on CDK assets in various buckets. |
hi everyone, some updates for this feature: aws/aws-cdk#31611 is released in CDK 2.163.0 under an between the two of them and any follow ups, we should be able to close out this feature as completed. |
This is a request for comments about Asset Garbage Collection. See #64 for additional details. APIs are signed off by @rix0rrr . --- _By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license_ --------- Co-authored-by: Rico Hermans <rix0rrr@gmail.com>
The RFC for this issue has been approved and merged (#379). The implementation has also been completed and should be fully released this week in CDK 2.165.0 under the |
@kaizencc Very excited for this new feature, and just tried to run it now and...
My first instinct is to say that the gc command should probably ignore stacks that it doesn't have access to, but leave a warning in the console. Maybe even have a parameter to control this behaviour? Should I open a separate ticket for this? |
Thanks for building this feature. FYI - 5th day in a row running it and still going. Stops every 12 hours because my token gets revoked:
would be great to fix the I can confirm it's working if I go to my ecr and s3 assets and count the number of assets now vs what they were before I began. |
After more investigating, looks like ECR is analyzing the same image ids over and over. Has anyone else seen this behavior? |
I think this was just fixed yesterday: aws/aws-cdk#32679 / aws/aws-cdk#32498 |
Thanks for the heads up. Will keep an eye out for a new release. |
Still seeing unexpected behavior. In one account I see the following:
This continues seemingly indefinitely. It's also strange that the verbose flag stops working. In another account I see the following:
This also continues seemingly indefinitely. Same with verbose flag behavior not working as expected. It's possible I'm not allowing the operation to continue for long enough. Any ideas? |
Since this feature is past the RFC now, I think it'd be better to open bugs in the CDK repo. This RFC has a lot of subscribers and is likely not being actively monitored by maintainers anymore. |
@blimmer thanks. Opened aws/aws-cdk#32742. |
Description
Assets which are uploaded to the CDK's S3 bucket and ECR repository are never deleted. This will incur costs for users in the long term. We should come up with a story on how those should be garbage collected safely.
Initially we should offer cdk gc, which will track down unused assets (e.g. by tracing them back from deployed stacks) and offer to delete them. We can offer an option to automatically run this after every deployment (either in CLI or through CI/CD). Later we can even offer a construct that you deploy to your environment and it can do that for you.
Proposed usage:
Examples:
This command will find all orphaned S3 and ECR assets in a specific AWS environment and will delete them:
This command will garbage collect all assets in all environments that belong to the current CDK app (if cdk.json exists):
Just list orphaned assets: