RFC 322: new CDK Pipelines API #323
Conversation
## Working Backwards

The final API for CDK Pipelines is finally here. Compared to the previous API:
I think it will be better to just describe the final API here (basically, the README for the pipelines module) and maybe at the end add a section "Migrating from Dev Preview?".
Can we take the current README and rewrite it with the new API? It would be a good way to ensure we did not miss anything.
Best of all: your old code--even though it uses no longer recommended idioms--still continues to work, so you can upgrade to the new style at your own pace.
I don't think this is a goal. It's great if we can do this without breaking existing users, but it's not required.
Alternatively, we can offer the old API through a separate class `CdkPipelineLegacy` (and maybe implement it on top of the new API) so that dev-preview users would be able to migrate at their leisure.
No, fair. It's not a goal, but I think we can easily achieve it and it's a nice side effect to not have to break all existing 1st and 3rd party blog posts, videos and code samples when we move on to better solutions.
> The biggest change we want to make but don't have an answer for at the time of this writing is how to give users the
> ability to tweak every individual part of every individual CodeBuild project that gets generated.
Can we offer something like escape hatches for this? Basically loosely-typed way to get access to lower-level constructs and then interact with them
I'm not convinced escape hatches are a sufficient solution.
It also depends on what you mean by "escape hatch". If you mean fall down to L1 and we don't expose anything, it will look like:
pipeline.node.tryFindChild('Pipeline/CodeBuild10dbaa7/Resource').addPropertyOverride(...);
Which is... possible but not great.
If we want to expose L2s, it will look like:
pipeline.assets[3].codeBuild.addToRolePolicy(...);
Which is better, but presumes the `Pipeline` construct knows to expose AWS-specific L2s, which means it's already backend-specific, so not an "escape hatch" in that sense.
Also most of our L2s aren't mutable, so any form of "escape hatching" will necessarily force people down to the L1 level.
Practical problems like "I want all CodeBuild projects to run in a given VPC" don't have a practical answer using escape hatches. There are too many properties that all need to be changed in tandem, with an awkward interface, for a request which is reasonable, obvious and common.
I agree that "I want all CodeBuild projects to run in a VPC" is not something that should be solved via escape hatches. It should be a feature of the CodePipeline backend, exposed through some abstract interface in the "top layer".
```ts
const codePipeline = new cdkp.CodePipelineBackend(this, 'CodePipelineBackend', {
  runAllCodeBuildProjectsInAVpc: vpc,
  someOtherCodeSuiteSettings: 'foo',
});

new CdkPipeline(this, 'Pipeline', {
  backend: codePipeline,
  // ...
});
```
As for escape hatches - I meant the first version. Escape hatches are by definition untyped. Otherwise, they are just part of the API :-). They should be left as a last resort, of course. I think a combination of abstract interfaces at the "top layer" and escape hatches "from the bottom" should be sufficient for 1.0.
I agree, it should work like this. The question is whether that is the API we want to present to users. Users who are just interested in CodePipeline would then be paying a tax to enable use cases they're not interested in.
We can remove that tax again by adding a specialized class like:
```ts
interface CodePipelinePipelineProps {
  runAllCodeBuildProjectsInAVpc: IVpc;
  someOtherCodeSuiteSettings: string;
}

class CodePipelinePipeline {
  constructor(scope, id, props) {
    const pipeline = new GenericPipeline(this, 'Resource', {
      backend: new CodePipelineBackend(props)
    });
  }
}
```
But until we have another backend tested and implemented and are ready to advertise it (which we are definitely not going to do in the current timeline), it's probably safer to just present the above `CodePipelinePipeline` class as the primary interface, have it implemented in terms of the more generic `CdkPipeline`, but hide the internals until we're more sure about them. The fundamental parts will be there to pick up the work later on when we have time to validate them.

(Also, `backend` needs to be a mandatory parameter for the generic pipeline...)
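To make the layering discussed here concrete, the sketch below is a minimal, self-contained toy version. All names (`IBackend`, `GenericPipeline`, `CodePipelineBackend`, `CodePipelinePipeline`) are hypothetical, the construct `scope`/`id` parameters are dropped, and a plain string stands in for an `IVpc`; this is not a released CDK API.

```typescript
// Hypothetical layering sketch: a generic pipeline with a mandatory backend,
// wrapped by a backend-specific convenience class. Not a real CDK API.

interface IBackend {
  readonly backendName: string;
}

interface CodePipelinePipelineProps {
  runAllCodeBuildProjectsInAVpc?: string; // stand-in for an IVpc
}

class CodePipelineBackend implements IBackend {
  readonly backendName = 'CodePipeline';
  constructor(readonly props: CodePipelinePipelineProps = {}) {}
}

class GenericPipeline {
  // 'backend' is mandatory, as the comment above suggests it should be.
  constructor(readonly backend: IBackend) {
    if (backend === undefined) {
      throw new Error('backend is a required parameter');
    }
  }
}

// The backend-specific class users see; the generic machinery stays hidden.
class CodePipelinePipeline extends GenericPipeline {
  constructor(props: CodePipelinePipelineProps = {}) {
    super(new CodePipelineBackend(props));
  }
}

const pipeline = new CodePipelinePipeline({ runAllCodeBuildProjectsInAVpc: 'vpc-123' });
console.log(pipeline.backend.backendName); // prints "CodePipeline"
```

The point of the wrapper is that CodePipeline-only users never see the `backend` parameter, while the generic class keeps the door open for other backends later.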
* We will also need to replace layer 1. This is also pretty obvious: for other backends, the set of supported Sources will be different, how we specify credentials will be different, the customizations offered will be different. The most natural API will be one that maps closely to the platform we're targeting. The `CdkPipeline` class (which currently takes `codepipeline.IAction`s and `IPipeline`s, and will continue to do so for backwards compatibility) will also be reimplemented. For example, to a class like `CdkAppWorkflow`.
This doesn't convince me that the entire "top layer" has to be backend-specific. I think there's a fairly large common denominator across all backends, and I'd be curious to try to come up with a top-layer API that is not backend-specific (perhaps with a few intentional abstract touch points).
For example, if I take the snippet above, I don't see any CodePipeline specific semantics or details in it. I'd expect the exact same code to also apply to GitHub Workflows, no?
Happy to spend some time brainstorming on this together.
Currently, backend-specific things would go in the `pipelineBackend`, `Source`, and build `action`s...

So potentially we can say that `pipelines.Source` is CodePipeline-specific. GitHub sources are going to have CodePipeline-specific things like auth tokens from Secrets Manager, POLL or WebHook configuration (depends on the permissions in the token), which don't necessarily apply to other backends.

And we probably also want to say that the build job is backend-specific, so people can twiddle IAM permissions and the caching settings and the `testReporting` property of CodeBuild to their heart's content.
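To illustrate the kind of CodePipeline-only Source settings mentioned here, a small hypothetical sketch. `CodePipelineGitHubSource` and all of its properties are invented for illustration, not a real API; they just show how auth-token and webhook/poll configuration could live on a backend-specific source type.

```typescript
// Invented names: a Source type that only makes sense for the CodePipeline
// backend, carrying settings other backends would not have.
type TriggerMode = 'WEBHOOK' | 'POLL';

interface CodePipelineGitHubSourceProps {
  repo: string;                 // 'OWNER/REPO'
  authTokenSecretName: string;  // Secrets Manager secret (CodePipeline-specific)
  trigger: TriggerMode;         // depends on the permissions in the token
}

class CodePipelineGitHubSource {
  constructor(readonly props: CodePipelineGitHubSourceProps) {}
}

const source = new CodePipelineGitHubSource({
  repo: 'OWNER/REPO',
  authTokenSecretName: 'github-token',
  trigger: 'WEBHOOK',
});
console.log(source.props.trigger); // prints "WEBHOOK"
```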
```ts
pipeline.addApplicationStage(stage, {
  approvals: [
    cdkp.Approval.shellScript({
      input: source,
```
Is `input` required here?
Actually--yes. A CodeBuild job without any inputs cannot be added to a pipeline. Crazy but true!
On the other hand, no, we can always fake-supply the source input if the user didn't provide any.
"fake-supply" => "sensible default"?
It's sensible up to a point. The CodeBuild job is going to spend time and money downloading an artifact from S3 that it doesn't need. Sensible if you don't care what the artifact is and it's not too big. Not sensible if you don't care what the artifact is and it is.
But best we can do, probably.
Can we bind it to an empty source somehow? An empty S3 bucket, for example?
In the most common case, it will add things like "shell script" actions to the approval workflow; but it might also choose to add actions before "Deploy" (`Approval.securityChanges()`), or it might choose to add actions *in between* `CreateChangeSet` and `DeployChangeSet` pairs.
:-) see my comment above... If a backend ignores CreateChangeSet and DeployChangeSet how will an approval workflow that is executed between them work?
In that case, it will throw an error saying "this kind of validation cannot be used with this type of backend".

Which again is fine, right? Other types of validations WILL work, just not this particular one.
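One possible reading of this error path, as a purely illustrative sketch: the layer wiring in the approval asks the backend whether it supports the requested placement. All names (`ApprovalPlacement`, `IWorkflowBackend`, `GitHubWorkflowsBackend`, `addApproval`) are invented here, not part of any real API.

```typescript
// Invented sketch: a backend declares which validation placements it can
// implement, and unsupported ones fail fast with the error quoted above.
type ApprovalPlacement = 'afterDeploy' | 'betweenChangeSetCreateAndExecute';

interface IWorkflowBackend {
  supportsApproval(placement: ApprovalPlacement): boolean;
}

class GitHubWorkflowsBackend implements IWorkflowBackend {
  // GitHub Workflows has no change set concept, so that placement is out.
  supportsApproval(placement: ApprovalPlacement): boolean {
    return placement !== 'betweenChangeSetCreateAndExecute';
  }
}

function addApproval(backend: IWorkflowBackend, placement: ApprovalPlacement): void {
  if (!backend.supportsApproval(placement)) {
    throw new Error('this kind of validation cannot be used with this type of backend');
  }
  // ...wire the approval into the workflow model...
}

const ghBackend = new GitHubWorkflowsBackend();
addApproval(ghBackend, 'afterDeploy');                        // works
// addApproval(ghBackend, 'betweenChangeSetCreateAndExecute'); // would throw
```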
Who will throw this error? The backend? How will it even know that this validation is defined if it ignores the "Create/DeployChangeSet" actions?
* Prepublish all assets in individual CodeBuild projects
* Prepublish all assets in one CodeBuild project
* Publish assets just before each app deployment
* Publish initial assets as usual but afterwards wait for ECR replication to finish
- Why do we need these degrees of freedom?
- If these are real requirements from users, I'd rather we implement them as "real" features and not as open hooks in the API.

Generally speaking, let's try to avoid "callback APIs" - that's a very fragile contract, and very hard to undo if we get it wrong, because we can never tell what people are doing with it.
I've also heard "trigger a StepFunctions workflow to do the asset publishing".
I definitely don't want to add support for everything on our end. There is lots of value in extensibility.
Supporting random customizations is not a requirement and usually not good API design practice, as it often leads to leaky APIs that are very hard to evolve. If there is a concrete use case with merit, we should consider how to support it and expose the correct API for it.

As for triggering Step Functions, what's the use case? What's the rationale? Or is it just one user's preference?

I'd rather we design our API with the minimal surface area and expand it as new requirements emerge than offer an extensible API that we won't be able to back away from.
* ShellScript
  - May have concepts like shell commands, Docker image, IAM permissions it should be possible to assume
  - Renders to CodeBuild for CodePipeline, a Step in GitHub Actions
A step in GitHub workflows cannot have a specific docker image. Maybe a "job"?
* Execute ChangeSet
  - With or without capturing outputs (backend may reject capturing outputs if it cannot implement that)

In addition to these, a specific implementation may add backend-specific actions.
This is what I was expecting as the pattern to support backend specific things in the top-layer.
```ts
const source = cdkp.Source.gitHub('OWNER/REPO');

const pipeline = new cdkp.CdkPipeline(this, 'Pipeline', {
  synthAction: cdkp.Build.standardNpmSynth({
```
I really don't like the name `synthAction`. Maybe just `build`? (The class is also called `Build`.)

Also, do we really need this `standardNpmSynth` thing? Isn't it just [`npm install`, `npm run build`]? If that's all there is to it, I think it's a redundant abstraction.

I recently ran into the term "mechanical sympathy", which I think applies here. The way I'm thinking about it is that sometimes over-abstracting hides details from the user that make it hard for them to understand how the system behaves.
```
┌──────────────────┬────────────────┬───────────────────────┐
│                  │                │    Generic actions    │
│     Backend,     │    Rollout     │                       │
Do we really need to separate rollout from the workflow core now?
Depends on how you want to look at it. In my head, those are data structures that the user doesn't necessarily directly interact with (the front-level API will be mostly in terms of CDK apps, which get translated into middle-level Actions).
I think I understand. So basically the middle layer is the model from which the backends render their concrete implementation. Did I get that right?
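A toy version of that "middle layer as model" reading, to make the rendering step concrete. Every type and renderer below is invented for illustration; real backends would emit CodePipeline actions or a GitHub workflow file rather than strings.

```typescript
// Invented sketch: a backend-neutral workflow model that each backend
// renders into its own concrete shape.
interface WorkflowStep {
  readonly name: string;
  readonly commands: string[];
}

interface WorkflowModel {
  readonly steps: WorkflowStep[];
}

interface IBackendRenderer {
  render(model: WorkflowModel): string;
}

class CodePipelineRenderer implements IBackendRenderer {
  // Would really emit CodePipeline actions; a string stands in here.
  render(model: WorkflowModel): string {
    return model.steps.map(s => `Action:${s.name}`).join(',');
  }
}

class GitHubWorkflowsRenderer implements IBackendRenderer {
  // Would really emit a workflow YAML with one job per step.
  render(model: WorkflowModel): string {
    return model.steps.map(s => `job:${s.name}`).join('\n');
  }
}

const model: WorkflowModel = {
  steps: [{ name: 'Synth', commands: ['npm ci', 'npx cdk synth'] }],
};
console.log(new CodePipelineRenderer().render(model));    // prints "Action:Synth"
console.log(new GitHubWorkflowsRenderer().render(model)); // prints "job:Synth"
```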
});
```

### Splitting definition and backend
I'd leave only the proposed option and move the rest to "Alternatives Considered" below.
Clear split of definition and instantiation. Very much like StepFunctions.

```ts
const pipelineDefinition = new cdkp.PipelineDefinition({
```
Everything in the CDK is a definition. No need to add that suffix.
```diff
- const pipelineDefinition = new cdkp.PipelineDefinition({
+ const pipelineDefinition = new cdkp.Pipeline({
```
This is a `Definition` to distinguish it from the actual pipeline.

I agree "definition" is not a great word. `Blueprint`?
From a user perspective, this is the Pipeline (same as an `sns.Topic` is a topic). Our naming guidelines are pretty consistent in that sense - we name things based on what they represent in the real world, not based on their mechanical role.
> From a user perspective, this is the Pipeline (same as an sns.Topic is a topic)

What I'm saying is that in this specific code example it's not, because this is proposing a split between the pipeline "form" and the pipeline instantiation by means of 2 different classes.
I understand this option and I still think this should be called `Pipeline`. A CDK pipeline is a concept that represents the deployment/rollout of the CDK app. From a user perspective, this object is the pipeline. The fact that you bind it to a backend later doesn't take away from that mental model.
Co-authored-by: Elad Ben-Israel <benisrae@amazon.com>
```ts
const wave = def.addWave('asdf', {
  stages: [...],
});
wave.addCdkStage(new MyAppStage(...));
```
I would like to comment on the use of `stage` terminology as part of waves. From my experience, customers are familiar with the `deployment` term (especially from the Kubernetes world), but not so much with `stage`. I understand we probably can't change `cdk.Stage` to `cdk.Deployment`, but can we at least change the CDK Pipelines interface from `wave.addCdkStage` to `wave.addDeployment`? 🙂 Example to illustrate below.
```python
pipeline = pipelines.Pipeline()

europe_wave = pipeline.add_wave('Europe')

eu_west_1_env = cdk.Environment(account='111111111111', region='eu-west-1')
# class Deployment(cdk.Stage): ...
eu_west_1_deployment = Deployment(self, APPLICATION_NAME, env=eu_west_1_env)
eu_west_1_smoke_test = pipelines.ShellScriptAction(
    action_name='SmokeTest', commands=['curl https://eu-west-1.my.webservice.com/'])
europe_wave.add_deployment(eu_west_1_deployment, after=[eu_west_1_smoke_test])

eu_west_2_env = cdk.Environment(account='222222222222', region='eu-west-2')
# class Deployment(cdk.Stage): ...
eu_west_2_deployment = Deployment(self, APPLICATION_NAME, env=eu_west_2_env)
eu_west_2_smoke_test = pipelines.ShellScriptAction(
    action_name='SmokeTest', commands=['curl https://eu-west-2.my.webservice.com/'])
europe_wave.add_deployment(eu_west_2_deployment, after=[eu_west_2_smoke_test])
```
+1

It is implemented and released in 1.114.0

@rix0rrr don't you want to merge this?
Revive of #323, which should have been merged initially but wasn't.

---

_By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license_