Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(deps): upgrade @guardian/cdk to latest (8.2.0) #155

Merged
merged 4 commits into from
Apr 14, 2021
Merged

Conversation

akash1810
Copy link
Member

@akash1810 akash1810 commented Mar 17, 2021

What does this change?

Updates to @guardian/cdk v8.2.0 🎉 . This has got a number of breaking changes (full diff here).

We'd need to manually apply the CFN template as the following parameters have been renamed (hopefully for the last time!):

  • AMI -> AMIPrism
  • InstanceType -> InstanceTypePrism

How to test

n/a

How can we measure success?

Keeping up to date.

Have we considered potential risks?

n/a

Images

n/a

@akash1810 akash1810 requested a review from a team March 17, 2021 16:00
jacobwinch
jacobwinch previously approved these changes Mar 19, 2021
@akash1810 akash1810 changed the title chore(deps): bump @guardian/cdk to v3.0.0 chore(deps): bump @guardian/cdk to v4.0.0 Mar 19, 2021
@akash1810 akash1810 marked this pull request as draft March 29, 2021 12:46
Looks like some @guardian/cdk constructs are not applying the App tag. I suspect since guardian/cdk#326.

Until that is fixed, we can safely, manually apply it to all constructs in tree from `this` as it's a single app stack.
@akash1810 akash1810 changed the title chore(deps): bump @guardian/cdk to v4.0.0 chore(deps): upgrade @guardian/cdk to latest (8.2.0) Apr 14, 2021
Comment on lines +17 to +23
/*
Looks like some @guardian/cdk constructs are not applying the App tag.
I suspect since https://github.com/guardian/cdk/pull/326.
Until that is fixed, we can safely, manually apply it to all constructs in tree from `this` as it's a single app stack.
TODO: remove this once @guardian/cdk has been fixed.
*/
AppIdentity.taggedConstruct(PrismAccess.app, this);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be seen in the snapshot diff on the first commit.

@akash1810 akash1810 marked this pull request as ready for review April 14, 2021 09:16
@@ -573,7 +573,7 @@ dpkg -i /prism/prism.deb",
},
"Type": "AWS::IAM::Role",
},
"LoadBalancer": Object {
"LoadBalancerBE9EEC3A": Object {
Copy link
Contributor

@jacobwinch jacobwinch Apr 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akash1810 - I think we'll need to set existingLogicalId on the load balancer to avoid a replacement here? I'm a bit confused as to how this logical id has been kept until now, as overrideId doesn't currently appear to be set 😕

Copy link
Member Author

@akash1810 akash1810 Apr 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Argh! Good spot! Will add existingLogicalId.

I think this is the result of guardian/cdk#440. Looking at the removed tests, it looks like if migratedFromCloudFormation was true on the GuStack, the construct's logicalId was not auto-generated; and the value passed into the constructor was used.

I think this is another +1 for guardian/cdk#364 and encapsulating this logic in one place.

I'd also have hoped to see the message in the console, but then I realised I've been using Jest locally which sets NODE_ENV to test - oops guardian/cdk#397!

"Properties": Object {
"GroupDescription": "Allow all outbound traffic on port 443",
"GroupDescription": "Allow all outbound HTTPS traffic",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will result in a replacement 😢 . I wonder if guardian/cdk could send a warning on this event? Though maybe the major version number change is enough of a signal?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, good point. We have (at least) two different types of breaking change:

  1. We've changed the API and you'll need to update some code, but it shouldn't cause any breaking changes to the underlying infrastructure
  2. We've changed something which will have a significant impact on the underlying infrastructure (e.g. replacing a security group, updating load balancer version)

I think everyone will be familiar with type 1, but I wonder if a major version bump will be enough to prepare them for changes of type 2... hopefully these will be rare events once we have our production-ready version though!

Copy link
Contributor

@jacobwinch jacobwinch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work 💯

@akash1810
Copy link
Member Author

akash1810 commented Apr 14, 2021

This change updates security groups. This is normally tricky when they are in-use by instances as the instances need to be cycled out of service.

This manifests itself like this in the CFN event log:

resource sg-0af9f2fea82e07d81 has a dependent object (Service: AmazonEC2; Status Code: 400; Error Code: DependencyViolation; Request ID: af537518-65ca-454c-b652-e2b1d11ca2c9; Proxy: null)

When this happens, I usually perform an app only deploy to cycle the current instances out and launch new ones.

I think I did this too early on CODE, which lead to a few issues. That is, I didn't wait for the DependencyViolation message to perform an app only deploy, instead I deployed earlier at a point when it looks like the IAM policy changes hadn't finished. This resulted in new instances failing to launch as they couldn't get the deb file from S3:

[ 20.646057] cloud-init[1733]: fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden
[ 20.787557] cloud-init[1733]: dpkg: error: cannot access archive '/prism/prism.deb': No such file or directory

I've deployed the CFN change manually to PROD, waiting for the DependencyViolation event before cycling the instances out. This went without issue.

@akash1810 akash1810 merged commit 9da2b20 into main Apr 14, 2021
@akash1810 akash1810 deleted the aa-gu-cdk-3.0.0 branch April 14, 2021 16:49
@jacobwinch
Copy link
Contributor

I think I did this too early on CODE, which lead to a few issues. That is, I didn't wait for the DependencyViolation message to perform an app only deploy, instead I deployed earlier at a point when it looks like the IAM policy changes hadn't finished.

Ah this makes sense now! Thanks for explaining!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants