feat(cli): garbage collect s3 assets (under --unstable flag) #31611

Merged
merged 85 commits into from
Oct 21, 2024
Changes from 80 commits
5e635b9
basic api surface area
kaizencc Sep 23, 2024
785d820
more scaffolding with sdks
kaizencc Sep 23, 2024
b12362a
add cfn api calls
kaizencc Sep 24, 2024
796e8d2
tagging api in use now
kaizencc Sep 26, 2024
52b6c10
new garbo collector
kaizencc Sep 27, 2024
71062e0
parallel delete
kaizencc Sep 27, 2024
6fbcce4
integ test
kaizencc Sep 30, 2024
574e8ef
integ tests mostly work
kaizencc Sep 30, 2024
8bf315d
working integ test
kaizencc Oct 1, 2024
689b283
documentation
kaizencc Oct 1, 2024
96fbb5a
delete parallel
kaizencc Oct 1, 2024
64fe5db
more docs
kaizencc Oct 1, 2024
6f2381d
tagging works now
kaizencc Oct 1, 2024
890daaf
delete logs and add more docs
kaizencc Oct 1, 2024
e85c622
unstable
kaizencc Oct 1, 2024
e3a708e
linter
kaizencc Oct 1, 2024
835e399
Merge branch 'main' into conroy/gc
kaizencc Oct 1, 2024
4a5fd50
yarnlock
kaizencc Oct 2, 2024
3141414
change name to rollbackBufferDays
kaizencc Oct 2, 2024
1ace9ea
Merge branch 'main' into conroy/gc
kaizencc Oct 2, 2024
a6412e9
Merge branch 'conroy/gc' of https://github.com/aws/aws-cdk into conro…
kaizencc Oct 2, 2024
30005d5
buncha renames
kaizencc Oct 2, 2024
778fff0
consolidate api options
kaizencc Oct 2, 2024
e4c5ebb
faciliate actions better
kaizencc Oct 2, 2024
73d8eda
fix refresh stacks
kaizencc Oct 2, 2024
8e936a0
parallel get tags
kaizencc Oct 3, 2024
232f186
remove parallelism
kaizencc Oct 3, 2024
4e6bf9a
Merge branch 'main' into conroy/gc
kaizencc Oct 3, 2024
bcf561f
Merge branch 'conroy/gc' of https://github.com/aws/aws-cdk into conro…
kaizencc Oct 3, 2024
7ff4273
small change
kaizencc Oct 3, 2024
9f7341e
readme
kaizencc Oct 3, 2024
eb204b3
stupid stupid stupid
kaizencc Oct 3, 2024
2250768
review in progress means waiting
kaizencc Oct 7, 2024
f942870
unit tests
kaizencc Oct 7, 2024
18376d8
more unit tests
kaizencc Oct 7, 2024
80440df
robust qualifier
kaizencc Oct 7, 2024
033fed4
update integs
kaizencc Oct 7, 2024
f65f09c
more unit tests
kaizencc Oct 8, 2024
2731756
Merge branch 'main' into conroy/gc
kaizencc Oct 8, 2024
0b58840
finish merge
kaizencc Oct 8, 2024
607fa33
remove dup dep
kaizencc Oct 8, 2024
e898d3f
argh linter
kaizencc Oct 8, 2024
e69fcfc
Merge branch 'main' into conroy/gc
kaizencc Oct 8, 2024
44edfc6
Merge branch 'main' into conroy/gc
kaizencc Oct 8, 2024
2e1b780
minor updates
kaizencc Oct 13, 2024
fd08da5
refactor how stack refresh is done
kaizencc Oct 13, 2024
796248e
add debugs
kaizencc Oct 14, 2024
6a406c3
progress printer
kaizencc Oct 14, 2024
551aa86
lint
kaizencc Oct 14, 2024
411df38
various pr comments
kaizencc Oct 14, 2024
a46aa35
add two missing unit tests
kaizencc Oct 14, 2024
b3c6b99
ignore assets that are changed after gc starts
kaizencc Oct 14, 2024
0f233c5
linters
kaizencc Oct 14, 2024
f76f494
untag
kaizencc Oct 14, 2024
d861414
add integ test
kaizencc Oct 14, 2024
d7ea819
Merge branch 'main' into conroy/gc
kaizencc Oct 14, 2024
d1ac1b1
make some calls synchronous to simplify code
kaizencc Oct 15, 2024
4bf31b6
pr comments
kaizencc Oct 15, 2024
4610ac4
unit tests for large # of objects
kaizencc Oct 15, 2024
8c1c376
Merge branch 'conroy/gc' of https://github.com/aws/aws-cdk into conro…
kaizencc Oct 15, 2024
770ab8d
stupid stupid linter
kaizencc Oct 15, 2024
1728677
Merge branch 'main' into conroy/gc
kaizencc Oct 16, 2024
ffc0e18
pr comment
kaizencc Oct 16, 2024
b9f92b5
unit tests for background refresh
kaizencc Oct 16, 2024
d9d9b41
add prompt before deletion
kaizencc Oct 17, 2024
5d7942f
createdAtBufferDays
kaizencc Oct 17, 2024
4ce3a1a
lint
kaizencc Oct 17, 2024
e4c290c
readme update
kaizencc Oct 17, 2024
22a5365
Merge branch 'main' into conroy/gc
kaizencc Oct 17, 2024
ba28cf2
typos
kaizencc Oct 17, 2024
099c327
pause
kaizencc Oct 17, 2024
b109823
integ test fix
kaizencc Oct 17, 2024
9dc66a2
Merge branch 'main' into conroy/gc
kaizencc Oct 17, 2024
09f212b
Update packages/aws-cdk/lib/api/garbage-collection/stack-refresh.ts
kaizencc Oct 21, 2024
36c83ae
Merge branch 'main' into conroy/gc
kaizencc Oct 21, 2024
533797a
Update packages/aws-cdk/lib/cli.ts
kaizencc Oct 21, 2024
6056bf6
Update packages/aws-cdk/lib/cli.ts
kaizencc Oct 21, 2024
3736e37
start is not async
kaizencc Oct 21, 2024
9deae15
rename cli options
kaizencc Oct 21, 2024
4fec4a6
global unstable
kaizencc Oct 21, 2024
ada0384
Apply suggestions from code review
kaizencc Oct 21, 2024
59edf16
remove console statement
kaizencc Oct 21, 2024
4a75ec4
Merge branch 'conroy/gc' of https://github.com/aws/aws-cdk into conro…
kaizencc Oct 21, 2024
60a8ec5
Merge branch 'main' into conroy/gc
kaizencc Oct 21, 2024
0c851ba
Merge branch 'main' into conroy/gc
mergify[bot] Oct 21, 2024
45 changes: 45 additions & 0 deletions packages/@aws-cdk-testing/cli-integ/lib/with-cdk-app.ts
@@ -331,6 +331,30 @@ export interface CdkModernBootstrapCommandOptions extends CommonCdkBootstrapComm
readonly usePreviousParameters?: boolean;
}

export interface CdkGarbageCollectionCommandOptions {
/**
* The number of days an asset should stay isolated before deletion, to
* guard against some pipeline rollback scenarios
*
* @default 0
*/
readonly rollbackBufferDays?: number;

/**
* The type of asset that is getting garbage collected.
*
* @default 'all'
*/
readonly type?: 'ecr' | 's3' | 'all';

/**
* The name of the bootstrap stack
*
* @default 'CdkToolkit'
*/
readonly bootstrapStackName?: string;
}

export class TestFixture extends ShellHelper {
public readonly qualifier = this.randomString.slice(0, 10);
private readonly bucketsToDelete = new Array<string>();
@@ -464,6 +488,26 @@ export class TestFixture extends ShellHelper {
});
}

public async cdkGarbageCollect(options: CdkGarbageCollectionCommandOptions): Promise<string> {
const args = [
'gc',
'--unstable=gc', // TODO: remove when stabilizing
'--confirm=false',
'--created-buffer-days=0', // Otherwise all assets created during integ tests are too young
];
if (options.rollbackBufferDays) {
args.push('--rollback-buffer-days', String(options.rollbackBufferDays));
}
if (options.type) {
args.push('--type', options.type);
}
if (options.bootstrapStackName) {
args.push('--bootstrapStackName', options.bootstrapStackName);
}

return this.cdk(args);
}

public async cdkMigrate(language: string, stackName: string, inputPath?: string, options?: CdkCliOptions) {
return this.cdk([
'migrate',
@@ -634,6 +678,7 @@ async function ensureBootstrapped(fixture: TestFixture) {
CDK_NEW_BOOTSTRAP: '1',
},
});

ALREADY_BOOTSTRAPPED_IN_THIS_RUN.add(envSpecifier);
}

@@ -0,0 +1,202 @@
import { GetObjectTaggingCommand, ListObjectsV2Command, PutObjectTaggingCommand } from '@aws-sdk/client-s3';
import { integTest, randomString, withoutBootstrap } from '../../lib';

jest.setTimeout(2 * 60 * 60_000); // Includes the time to acquire locks, worst-case single-threaded runtime

integTest(
'Garbage Collection deletes unused assets',
withoutBootstrap(async (fixture) => {
const toolkitStackName = fixture.bootstrapStackName;
const bootstrapBucketName = `aws-cdk-garbage-collect-integ-test-bckt-${randomString()}`;
fixture.rememberToDeleteBucket(bootstrapBucketName); // just in case

await fixture.cdkBootstrapModern({
toolkitStackName,
bootstrapBucketName,
});

await fixture.cdkDeploy('lambda', {
options: [
'--context', `bootstrapBucket=${bootstrapBucketName}`,
'--context', `@aws-cdk/core:bootstrapQualifier=${fixture.qualifier}`,
'--toolkit-stack-name', toolkitStackName,
'--force',
],
});
fixture.log('Setup complete!');

await fixture.cdkDestroy('lambda', {
options: [
'--context', `bootstrapBucket=${bootstrapBucketName}`,
'--context', `@aws-cdk/core:bootstrapQualifier=${fixture.qualifier}`,
'--toolkit-stack-name', toolkitStackName,
'--force',
],
});

await fixture.cdkGarbageCollect({
rollbackBufferDays: 0,
type: 's3',
bootstrapStackName: toolkitStackName,
});
fixture.log('Garbage collection complete!');

// assert that the bootstrap bucket is empty
await fixture.aws.s3.send(new ListObjectsV2Command({ Bucket: bootstrapBucketName }))
.then((result) => {
expect(result.Contents).toBeUndefined();
});
}),
);

integTest(
'Garbage Collection keeps in use assets',
withoutBootstrap(async (fixture) => {
const toolkitStackName = fixture.bootstrapStackName;
const bootstrapBucketName = `aws-cdk-garbage-collect-integ-test-bckt-${randomString()}`;
fixture.rememberToDeleteBucket(bootstrapBucketName); // just in case

await fixture.cdkBootstrapModern({
toolkitStackName,
bootstrapBucketName,
});

await fixture.cdkDeploy('lambda', {
options: [
'--context', `bootstrapBucket=${bootstrapBucketName}`,
'--context', `@aws-cdk/core:bootstrapQualifier=${fixture.qualifier}`,
'--toolkit-stack-name', toolkitStackName,
'--force',
],
});
fixture.log('Setup complete!');

await fixture.cdkGarbageCollect({
rollbackBufferDays: 0,
type: 's3',
bootstrapStackName: toolkitStackName,
});
fixture.log('Garbage collection complete!');

// assert that the bootstrap bucket has the object
await fixture.aws.s3.send(new ListObjectsV2Command({ Bucket: bootstrapBucketName }))
.then((result) => {
expect(result.Contents).toHaveLength(1);
});

await fixture.cdkDestroy('lambda', {
options: [
'--context', `bootstrapBucket=${bootstrapBucketName}`,
'--context', `@aws-cdk/core:bootstrapQualifier=${fixture.qualifier}`,
'--toolkit-stack-name', toolkitStackName,
'--force',
],
});
fixture.log('Teardown complete!');
}),
);

integTest(
'Garbage Collection tags unused assets',
Review comment (Contributor): Can we get one more test verifying that old tagged S3 assets do in fact get deleted when the buffer time is set to a value > 0?
withoutBootstrap(async (fixture) => {
const toolkitStackName = fixture.bootstrapStackName;
const bootstrapBucketName = `aws-cdk-garbage-collect-integ-test-bckt-${randomString()}`;
fixture.rememberToDeleteBucket(bootstrapBucketName); // just in case

await fixture.cdkBootstrapModern({
toolkitStackName,
bootstrapBucketName,
});

await fixture.cdkDeploy('lambda', {
options: [
'--context', `bootstrapBucket=${bootstrapBucketName}`,
'--context', `@aws-cdk/core:bootstrapQualifier=${fixture.qualifier}`,
'--toolkit-stack-name', toolkitStackName,
'--force',
],
});
fixture.log('Setup complete!');

await fixture.cdkDestroy('lambda', {
options: [
'--context', `bootstrapBucket=${bootstrapBucketName}`,
'--context', `@aws-cdk/core:bootstrapQualifier=${fixture.qualifier}`,
'--toolkit-stack-name', toolkitStackName,
'--force',
],
});

await fixture.cdkGarbageCollect({
rollbackBufferDays: 100, // this will ensure that we do not delete assets immediately (and just tag them)
type: 's3',
bootstrapStackName: toolkitStackName,
});
fixture.log('Garbage collection complete!');

// assert that the bootstrap bucket has the object and is tagged
await fixture.aws.s3.send(new ListObjectsV2Command({ Bucket: bootstrapBucketName }))
.then(async (result) => {
expect(result.Contents).toHaveLength(2); // also the CFN template
const key = result.Contents![0].Key;
const tags = await fixture.aws.s3.send(new GetObjectTaggingCommand({ Bucket: bootstrapBucketName, Key: key }));
expect(tags.TagSet).toHaveLength(1);
});
}),
);

integTest(
'Garbage Collection untags in-use assets',
withoutBootstrap(async (fixture) => {
const toolkitStackName = fixture.bootstrapStackName;
const bootstrapBucketName = `aws-cdk-garbage-collect-integ-test-bckt-${randomString()}`;
fixture.rememberToDeleteBucket(bootstrapBucketName); // just in case

await fixture.cdkBootstrapModern({
toolkitStackName,
bootstrapBucketName,
});

await fixture.cdkDeploy('lambda', {
options: [
'--context', `bootstrapBucket=${bootstrapBucketName}`,
'--context', `@aws-cdk/core:bootstrapQualifier=${fixture.qualifier}`,
'--toolkit-stack-name', toolkitStackName,
'--force',
],
});
fixture.log('Setup complete!');

// Artificially add tagging to the asset in the bootstrap bucket
const result = await fixture.aws.s3.send(new ListObjectsV2Command({ Bucket: bootstrapBucketName }));
const key = result.Contents!.filter((c) => c.Key?.split('.')[1] === 'zip')[0].Key; // fancy footwork to make sure we have the asset key
await fixture.aws.s3.send(new PutObjectTaggingCommand({
Bucket: bootstrapBucketName,
Key: key,
Tagging: {
TagSet: [{
Key: 'aws-cdk:isolated',
Value: '12345',
}, {
Key: 'bogus',
Value: 'val',
}],
},
}));

await fixture.cdkGarbageCollect({
rollbackBufferDays: 100, // this will ensure that we do not delete assets immediately (and just tag them)
type: 's3',
bootstrapStackName: toolkitStackName,
});
fixture.log('Garbage collection complete!');

// assert that the isolated object tag is removed while the other tag remains
const newTags = await fixture.aws.s3.send(new GetObjectTaggingCommand({ Bucket: bootstrapBucketName, Key: key }));

expect(newTags.TagSet).toEqual([{
Key: 'bogus',
Value: 'val',
}]);
}),
);
73 changes: 73 additions & 0 deletions packages/aws-cdk/README.md
@@ -25,6 +25,7 @@ The AWS CDK Toolkit provides the `cdk` command-line interface that can be used t
| [`cdk watch`](#cdk-watch) | Watches a CDK app for deployable and hotswappable changes |
| [`cdk destroy`](#cdk-destroy) | Deletes a stack from an AWS account |
| [`cdk bootstrap`](#cdk-bootstrap) | Deploy a toolkit stack to support deploying large stacks & artifacts |
| [`cdk gc`](#cdk-gc) | Garbage collect assets associated with the bootstrapped stack |
| [`cdk doctor`](#cdk-doctor) | Inspect the environment and produce information useful for troubleshooting |
| [`cdk acknowledge`](#cdk-acknowledge) | Acknowledge (and hide) a notice by issue number |
| [`cdk notices`](#cdk-notices) | List all relevant notices for the application |
@@ -876,6 +877,78 @@ In order to remove that permissions boundary you have to specify the
cdk bootstrap --no-previous-parameters
```

### `cdk gc`

CDK Garbage Collection.

> [!CAUTION]
> CDK Garbage Collection is under development and therefore must be opted in via the `--unstable` flag: `cdk gc --unstable=gc`.

> [!WARNING]
> `cdk gc` currently only supports garbage collecting S3 Assets. You must specify `cdk gc --unstable=gc --type=s3` as ECR asset garbage collection has not yet been implemented.

`cdk gc` garbage collects unused S3 assets from your bootstrap bucket via the following mechanism:

- for each object in the bootstrap S3 Bucket, check to see if it is referenced in any existing CloudFormation templates
- if not, it is treated as unused and gc will either tag it or delete it, depending on your configuration.
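
As a rough illustration of the check described above, the sketch below treats an object as unused when its asset hash appears in no template body. The names `BucketObject`, `assetHash`, and `findUnusedObjects` are hypothetical, and deriving the hash from the file name minus its extension is an assumption made for this example, not necessarily how `cdk gc` identifies references.

```typescript
// Hypothetical shape for an object listed from the bootstrap bucket.
interface BucketObject {
  readonly key: string;
}

// Assumption: the asset hash is the file name without its extension,
// e.g. "abc123.zip" -> "abc123".
function assetHash(key: string): string {
  const fileName = key.split('/').pop() ?? key;
  const dot = fileName.lastIndexOf('.');
  return dot > 0 ? fileName.slice(0, dot) : fileName;
}

// Returns the objects whose hash is not mentioned in any template body.
function findUnusedObjects(objects: BucketObject[], templateBodies: string[]): BucketObject[] {
  return objects.filter((obj) => {
    const hash = assetHash(obj.key);
    return !templateBodies.some((body) => body.includes(hash));
  });
}
```

A plain substring match keeps the sketch short; it can only err on the side of keeping an object, since any mention of the hash counts as a reference.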

The most basic usage looks like this:

```console
cdk gc --unstable=gc --type=s3
```

This garbage collects S3 assets from the current bootstrapped environment(s) and deletes them immediately. Because the default bootstrap S3 Bucket is versioned, the final removal of deleted objects is handled by the lifecycle
policy on the bucket.

Before we begin to delete your assets, you will be prompted:

```console
cdk gc --unstable=gc --type=s3

Found X objects to delete based off of the following criteria:
- objects have been isolated for > 0 days
- objects were created > 1 days ago

Delete this batch (yes/no/delete-all)?
```

Since it's quite possible that the bootstrap bucket has many objects, we work in batches of 1000 objects. To skip the
prompt either reply with `delete-all`, or use the `--confirm=false` option.

```console
cdk gc --unstable=gc --type=s3 --confirm=false
```
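
The batching described above can be sketched as a small helper that splits the list of deletable objects into groups of at most 1000. The name `toBatches` is hypothetical and this is only a minimal sketch, not the code `cdk gc` uses.

```typescript
// Splits items into consecutive batches of at most `size` elements.
// The default of 1000 mirrors the batch size mentioned in this README.
function toBatches<T>(items: readonly T[], size: number = 1000): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error(`Batch size must be a positive integer, got ${size}`);
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

With 2500 deletable objects this yields three batches, so a user who keeps answering `yes` would be prompted three times.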

If you are concerned about deleting assets too aggressively, there are multiple levers you can configure:

- `rollback-buffer-days`: the number of days an asset must be marked as isolated before it is eligible for deletion.
- `created-buffer-days`: the number of days an asset must exist before it is eligible for deletion.

When using `rollback-buffer-days`, `cdk gc` tags unused objects with today's date instead of deleting
them. It also checks whether any objects were tagged by previous runs of `cdk gc`
and deletes those that have been tagged for longer than the buffer days.

When using `created-buffer-days`, we simply filter out any assets that have not persisted that number
of days.

```console
cdk gc --unstable=gc --type=s3 --rollback-buffer-days=30 --created-buffer-days=1
```
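
To make the interaction of the two buffers concrete, here is a minimal sketch of the per-asset decision. `AssetState`, `Decision`, and `decide` are hypothetical names; the order of the checks and the `untag` outcome for in-use assets that still carry an old tag are assumptions based on the behavior described in this PR, not a copy of the implementation.

```typescript
type Decision = 'keep' | 'tag' | 'untag' | 'delete';

// Hypothetical view of one asset in the bootstrap bucket.
interface AssetState {
  readonly inUse: boolean;        // referenced by an existing template
  readonly createdAt: Date;       // when the object was created
  readonly isolatedSince?: Date;  // date recorded by an earlier gc run, if any
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function daysBetween(from: Date, to: Date): number {
  return (to.getTime() - from.getTime()) / MS_PER_DAY;
}

function decide(asset: AssetState, now: Date, rollbackBufferDays: number, createdBufferDays: number): Decision {
  // Assets younger than created-buffer-days are filtered out entirely.
  if (daysBetween(asset.createdAt, now) < createdBufferDays) {
    return 'keep';
  }
  if (asset.inUse) {
    // Assumption: an in-use asset with a stale isolation tag has the tag removed.
    return asset.isolatedSince ? 'untag' : 'keep';
  }
  // Without a rollback buffer, unused assets are deleted right away.
  if (rollbackBufferDays === 0) {
    return 'delete';
  }
  // First time this asset is seen as unused: tag it with today's date.
  if (!asset.isolatedSince) {
    return 'tag';
  }
  // Delete only once the asset has been tagged for longer than the buffer.
  return daysBetween(asset.isolatedSince, now) > rollbackBufferDays ? 'delete' : 'keep';
}
```

Under these assumptions, an unused asset first seen today with `--rollback-buffer-days=30` is tagged, and a later run more than 30 days afterwards deletes it.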

You can also configure which actions `cdk gc` performs via the `--action` option. By default, all actions
are performed, but you can instead specify `print`, `tag`, or `delete-tagged`.

- `print` performs no changes to your AWS account, but finds and prints the number of unused assets.
- `tag` tags any newly unused assets, but does not delete any unused assets.
- `delete-tagged` deletes assets that have been tagged for longer than the buffer days, but does not tag newly unused assets.

```console
cdk gc --unstable=gc --type=s3 --action=delete-tagged --rollback-buffer-days=30
```

This will delete assets that have been tagged as unused for more than 30 days, but will not tag additional assets.
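
The action values listed above can be summarized as a small lookup. The sketch below is illustrative only: `GcAction`, `PermittedOperations`, and `permittedOperations` are hypothetical names, and `'full'` is a placeholder chosen here for the default mode in which all actions are performed.

```typescript
// 'full' is a placeholder name for the default "all actions" mode.
type GcAction = 'print' | 'tag' | 'delete-tagged' | 'full';

interface PermittedOperations {
  readonly canTag: boolean;     // may tag newly unused assets
  readonly canDelete: boolean;  // may delete assets tagged past the buffer
}

function permittedOperations(action: GcAction): PermittedOperations {
  switch (action) {
    case 'print':
      return { canTag: false, canDelete: false };
    case 'tag':
      return { canTag: true, canDelete: false };
    case 'delete-tagged':
      return { canTag: false, canDelete: true };
    case 'full':
      return { canTag: true, canDelete: true };
  }
}
```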

### `cdk doctor`

Inspect the current command-line environment and configurations, and collect information that can be useful for