Cloud Assembly Specification #956

Closed
RomainMuller opened this issue Oct 17, 2018 · 15 comments · Fixed by #2636

RomainMuller (Contributor) commented Oct 17, 2018

In order to confidently design a CI/CD deployment experience that supports assets and multiple stack ordering constraints, the Cloud Assembly document format needs to be defined, at least in an initial version.

Use-cases

  1. An arbitrary number of stacks, some of which may be related
    • Nested stacks, too
  2. Embedded support files (for example: Runtime code for Lambda; Static assets to be loaded to S3; Docker Images; CloudFormation documents for nested stacks; AMI baking instructions; ...)
    • Some of the support files may be too large to "bundle in", so support for external storage (S3, ...) may be relevant
    • Managed build steps: specific build steps for specific asset types (e.g. Python Lambda bundle, NodeJS Lambda bundle, Docker image, ...).
      • Need to consider CodePipeline limits (they can be raised, but the defaults are quite low and would get in the way of an uncapped step count) as well as the CodeBuild billing model (i.e. the cost implications of many fast builds versus one slower build).
    • Self-contained means the assembly can be digitally signed, and provisioning mechanisms can be configured to refuse to handle unsigned or tampered-with assemblies.
  3. Phasing of deployments to honor dependencies between stacks & support files
    • Through use of the CDK CLI tools
    • Using a CI/CD pipeline
  4. Re-use of the exact same artifacts in several stacks (in a CI/CD environment, deploy to QA environment, run integration tests, roll forward to production environment)
  5. Interaction with context (in particular, when doing CI/CD, need to handle missing context correctly).
  6. Workflows similar to:
    1. Build assembly, send to QA
    2. QA tests assembly, signs it if it passes, send to release management
    3. Release management verifies assembly is signed by QA key, deploys it to production

Remarks

  • Phases: build, then synthesize, then package, then deploy.
  • The output of a CDK App execution is not the Cloud Assembly itself, but instructions on how to piece it together (see the sketch below).
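
A hypothetical sketch of what such a manifest could look like; every field name here is an illustrative assumption, not the format that was eventually adopted:

```typescript
// Hypothetical Cloud Assembly manifest shape (illustrative only).
interface AssemblyManifest {
  schemaVersion: string;                     // version of the manifest format itself
  artifacts: Record<string, ArtifactEntry>;  // stacks, assets, nested templates, ...
}

interface ArtifactEntry {
  type: string;                              // e.g. "cloudformation-stack", "s3-asset", "docker-image"
  environment?: string;                      // target account/region, e.g. "aws://123456789012/us-east-1"
  dependencies?: string[];                   // ids of artifacts that must be handled first
  properties?: Record<string, unknown>;      // type-specific data (template path, asset source, ...)
}

// Example: a stack that depends on a Lambda code asset bundled with the assembly.
const example: AssemblyManifest = {
  schemaVersion: '0.1.0',
  artifacts: {
    'my-lambda-code': {
      type: 's3-asset',
      properties: { path: 'assets/my-lambda-code.zip' },
    },
    'MyServiceStack': {
      type: 'cloudformation-stack',
      environment: 'aws://123456789012/us-east-1',
      dependencies: ['my-lambda-code'],
      properties: { templatePath: 'MyServiceStack.template.json' },
    },
  },
};
```
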
eladb (Contributor) commented Oct 18, 2018

A Cloud Assembly is the result of synthesizing a CDK App (or a subset thereof). It is a self-sufficient set of artifacts that can be used to deploy the application to AWS accounts without using the CDK App or toolkit.

I don't think there is a need to support synthesis of subsets of applications. I think a cloud assembly is always 1:1 with an application. Keep it simple.

Deploying a Cloud Assembly

I would start with "Requirements": a list of use cases and scenarios that we want to support. Perhaps even prioritized, or at least defining which ones are supported in the current version of the spec, which are planned to be supported, and which use cases are "non-requirements" (i.e. not covered by this standard).

See #233 for list of requirements.

Specification

A few things come to mind:

  • I don't think we need to define the file system structure inside the assembly. We can leave it to the app to decide how to organize the files. Let's define the manifest as a single entrypoint which includes all the information and references files within the assembly by relative path.
  • I wonder if we can actually treat assets and stacks as the same "thing". Effectively an app includes various "things" that reference each other and have some dependencies between them. Then, our tools (toolkit, CI/CD) can deploy a set of "thing types" (very similar to asset handlers). Try to see if you can generalize it like this.
  • Dependencies are related to references. Ideally, dependencies can be deduced based on references and so the relationship between a stack and its assets is an implicit dependency that stems from the fact that the stack references the assets.

RomainMuller (Contributor, Author) commented:

  • Dependencies are related to references. Ideally, dependencies can be deduced based on references and so the relationship between a stack and its assets is an implicit dependency that stems from the fact that the stack references the assets.

If you consider "assets" and "stacks" to be "things" (which I read as opaque data), then how are you supposed to discover dependencies between all those? This makes a case for each of them declaring their dependencies at assembly time, and for the dependencies being expressed directly in the manifest.
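
For illustration, a minimal sketch (with hypothetical names) of how a deployer could order opaque artifacts purely from dependencies declared in the manifest, without understanding their contents:

```typescript
// Each manifest entry declares its dependencies by id; the deployer only needs
// these declared edges to compute a valid deployment order (topological sort).
interface ManifestArtifact {
  id: string;
  dependencies: string[];  // ids of artifacts that must be handled before this one
}

function deploymentOrder(artifacts: ManifestArtifact[]): string[] {
  const byId = new Map<string, ManifestArtifact>();
  for (const artifact of artifacts) {
    byId.set(artifact.id, artifact);
  }

  const order: string[] = [];
  const state = new Map<string, 'visiting' | 'done'>();

  const visit = (id: string): void => {
    if (state.get(id) === 'done') { return; }
    if (state.get(id) === 'visiting') { throw new Error(`dependency cycle involving ${id}`); }
    state.set(id, 'visiting');
    for (const dep of byId.get(id)?.dependencies ?? []) {
      visit(dep);
    }
    state.set(id, 'done');
    order.push(id);  // dependencies were pushed first, so they come earlier in the order
  };

  for (const artifact of artifacts) {
    visit(artifact.id);
  }
  return order;
}
```
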

RomainMuller (Contributor, Author) commented:

  • I don't think we need to define the file system structure inside the assembly. We can leave it to the app to decide how to organize the files. Let's define the manifest as a single entrypoint which includes all the information and references files within the assembly by relative path.

Yeah, I kinda intended the paths to be more akin to examples than a hard specification of what it should look like. I've changed the table heading to clear things up a bit there.

  • I wonder if we can actually treat assets and stacks as the same "thing". Effectively an app includes various "things" that reference each other and have some dependencies between them. Then, our tools (toolkit, CI/CD) can deploy a set of "thing types" (very similar to asset handlers). Try to see if you can generalize it like this.

Yeah, I like the idea. There is no reason why those should be handled differently.

rix0rrr (Contributor) commented Oct 23, 2018

It is a YAML 1.2 document

Why not JSON? No human will ever need to write it by hand, so why not lowest common denominator data structure?

The type of the artifact, referring to an entry of the artifactTypes attribute of the manifest

Why this level of indirection? Could put the handler directly on the asset type.

  • Topologically sort the assets in the document

We'd then need to declare the dependencies on the Artifact type.


Are we providing for negotiation between the Cloud Assembly producer and consumer? I'm thinking of context parameters between the toolkit and the CDK app. It seems to be out of scope for this document, but it's going to have to be implemented anyway. Will it be some out-of-band mechanism? Also, what about a method to report errors and other metadata? Will that be out-of-band as well?

If so, it seems like this describes what the output will be if synthesis succeeds, but we also need to have a standardized method to describe failing synthesis. Now it might be called something different, but it needs to be standardized as well because multiple tools are going to have to implement it.

rix0rrr (Contributor) commented Oct 23, 2018

Also, are handlers only called once? Or might there be multiple phases for asset handling?

For example: build/package/publish/deploy?

RomainMuller (Contributor, Author) commented:

Why not JSON? No human will ever need to write it by hand, so why not lowest common denominator data structure?

I like computer-generated documents to be human-readable nonetheless, as it makes it easier for people to dive deep into the document in the event they have to. I'm not willing to die on this hill, though...

Why this level of indirection? Could put the handler directly on the asset type.

Handlers may have more configuration than just the path to the command, and they could be used by multiple assets. Call me old-school, but I like to avoid repeating data.
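
A sketch of the indirection being defended here, assuming a manifest where handler definitions live once under `artifactTypes` (with their own configuration) and any number of artifacts point at them by key; all names are hypothetical:

```typescript
// Handler configuration is declared once and shared, instead of being repeated on every asset.
interface HandlerDefinition {
  command: string;                    // e.g. path to the handler executable
  options?: Record<string, unknown>;  // handler-specific configuration, declared in one place
}

interface Manifest {
  artifactTypes: Record<string, HandlerDefinition>;
  artifacts: Array<{ id: string; type: string }>;  // `type` is a key into artifactTypes
}

const manifest: Manifest = {
  artifactTypes: {
    'nodejs-lambda-bundle': { command: 'bundle-nodejs-lambda', options: { minify: true } },
  },
  artifacts: [
    { id: 'api-handler-code', type: 'nodejs-lambda-bundle' },
    { id: 'worker-code', type: 'nodejs-lambda-bundle' },  // re-uses the same handler configuration
  ],
};
```
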

Need to declare the dependencies then on the Artifact type.

This is what Artifact#references is for.

Are we provisioning for negotiation between Cloud Assembly producer and consumer?

That's a good point. I need to add this.

but we also need to have a standardized method to describe failing synthesis.

I'm not sure I understand the need for standardization beyond "returns in error when not successful"?

Also, are handlers only called once?

That's a part I haven't gone over entirely just yet, but yes, there may be multiple phases involved, and the handlers would somehow be told which phase they're invoked for.
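
A hedged sketch of what a phase-aware handler contract could look like, using the build/package/deploy split from the remarks above plus a publish step; the interface is hypothetical:

```typescript
type Phase = 'build' | 'package' | 'publish' | 'deploy';

interface AssetHandler {
  // Invoked once per phase; a handler can no-op for phases it does not care about.
  handle(phase: Phase, artifactId: string): Promise<void>;
}

async function runPhases(
  handlers: Map<string, AssetHandler>,
  artifacts: Array<{ id: string; type: string }>,
): Promise<void> {
  for (const phase of ['build', 'package', 'publish', 'deploy'] as Phase[]) {
    for (const artifact of artifacts) {
      // Look up the handler registered for this artifact type and tell it which phase this is.
      await handlers.get(artifact.type)?.handle(phase, artifact.id);
    }
  }
}
```
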

rix0rrr (Contributor) commented Oct 24, 2018

Currently "missing context" is cause for a reinvocation, for example.

That's not a success (yet) but also not a failure.

And reporting warnings is something that we have, as is reporting errors via stack output (even though the called program does not fail).

RomainMuller (Contributor, Author) commented:

I guess the missing context & warnings are consequences of how one generates the Cloud Assembly, and not features of the assembly itself...

rix0rrr (Contributor) commented Oct 26, 2018

I know. But we need a standard for those as well, and right now we have none. We might be tempted to say today:

"A CDK app is a program that generates a Cloud Assembly."

But if we did, that'd be incomplete. Instead, we have to say something like:

"A CDK app is a program that adheres to the XXX protocol."

And in the XXX protocol it is written somewhere that under some conditions some output somewhere is a Cloud Assembly.

But without the XXX protocol being written down, how do we do that?

eladb (Contributor) commented Oct 28, 2018

I think I agree with @rix0rrr that the protocol needs to allow the app to return an error that says "missing context".
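
For illustration, a sketch of a synthesis result envelope in which "missing context" is a first-class, non-fatal outcome; the shape and names are assumptions, not the cx-api protocol that was actually built:

```typescript
type SynthesisResult =
  | { outcome: 'success'; assemblyDir: string }            // a Cloud Assembly was produced
  | { outcome: 'missing-context'; missingKeys: string[] }  // caller resolves the keys and re-invokes the app
  | { outcome: 'failure'; errors: string[] };              // synthesis failed outright

function handleResult(result: SynthesisResult): void {
  switch (result.outcome) {
    case 'success':
      console.log(`cloud assembly written to ${result.assemblyDir}`);
      break;
    case 'missing-context':
      console.log(`need context for: ${result.missingKeys.join(', ')}; re-run after resolving`);
      break;
    case 'failure':
      console.error(result.errors.join('\n'));
      break;
  }
}
```
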

ejholmes commented Oct 31, 2018

/cc @phobologic

Would be really cool if we can define a general spec here that would work for both CDK and stacker.

The example manifest above looks really similar to how the internal graph is built within stacker, so going from that to a compiled "Cloud Assembly" would be relatively straightforward.

RomainMuller added a commit that referenced this issue Nov 8, 2018
Looking forward to supporting a more integrated deployment experience
with support of multiple stacks (with deployment order constraints
honored) as well as assets, this proposes a specification for a
*Cloud Assembly* that determines a document storage format.

Fixes: #956
RomainMuller (Contributor, Author) commented Nov 8, 2018

I'm moving the document building to a PR (#1119), to make it easier to track the revision history of the document and to facilitate conversation & commenting inline.

eladb mentioned this issue Dec 17, 2018
RomainMuller (Contributor, Author) commented:

The CNAB specification provides a mechanism for bundling arbitrary provisioning logic and configuration, but does not quite specify what that logic should look like. We'd have freedom to design the orchestration in whatever way we want.

An interesting piece of work from Microsoft on that topic is deislabs/porter, which aims at providing composable CNAB bundle fragments.

The element we have to figure out is how to instrument deployments externally to the invocation image (how to pass data in and out of the CNAB, so we can, for example, orchestrate deployment phases using CodePipeline).

eladb (Contributor) commented Feb 5, 2019

@hjander asks:

What will be the future direction regarding the Cloudlet/CNAB topic?

eladb (Contributor) commented Feb 5, 2019

@hjander we are still thinking about the best approach. At the moment, the protocol between CDK apps and the CDK CLI is a proprietary (and ugly) protocol we call the cx-api. It will serve in the short term, but we have a desire to standardize this interface, and the cloud assembly was an attempt at that.

From our conversations with the CNAB folks, it seems like there might be an opportunity to collaborate and define a higher level standard on top of CNAB that will address the requirements of the cloud assembly.

Bottom line: at the moment, we are putting this work on hold, and we will get back to it later, most probably when we do some work on the CLI to clean it up and modularize it.

eladb pushed a commit that referenced this issue Feb 27, 2019
Start moving towards the cloud-assembly specification where an output
of a CDK program is a bunch of artifacts and those are processed by
the toolkit.

Related #956
Related #233
Related #1119
eladb pushed a commit that referenced this issue May 29, 2019
Formalize the concept of a cloud assembly to allow decoupling "synth" and "deploy". The main
motivation is to allow "deploy" to run in a controlled environment without needing to execute the app, for security purposes.

`cdk synth` now produces a self-contained assembly in the output directory, which we call a "cloud assembly". This directory includes synthesized templates (similar to current behavior) but also a "manifest.json" file and the asset staging directories.

`cdk deploy -a assembly-dir` will now skip synthesis and will directly deploy the assembly.

To that end, we modified the behavior of asset staging such that all synth output always goes to the output directory, which is by default `cdk.out` (can be set with `--output`). If there's a single template, it is printed to stdout; otherwise, the name of the output directory is printed.

This PR also includes multiple clean ups and deprecations of stale APIs and modules such as:

- `cdk.AppProps` includes various new options.
- An error is thrown if there are references between stacks that are not in the same app (mostly relevant in test cases).
- Added the token creation stack trace for errors emitted by tokens
- Added `ConstructNode.root` which returns the tree root.
 

**TESTING**: verified that all integration tests passed and added a few tests to verify zip and docker assets as well as cloud assemblies. See: https://github.com/awslabs/cdk-ops/pull/364

Closes #1893 
Closes #2093 
Closes #1954 
Closes #2310 
Related #2073 
Closes #1245
Closes #341
Closes #956
Closes #233

BREAKING CHANGE: *IMPORTANT*: apps created with the CDK version 0.33.0 and above cannot be used with an older CLI version.
- `--interactive` has been removed
- `--numbered` has been removed
- `--staging` is now a boolean flag that indicates whether assets should be copied to the `--output` directory or directly referenced (`--no-staging` is useful for e.g. local debugging with SAM CLI)
- Assets (e.g. Lambda code assets) are now referenced relative to the output directory.
- `SynthUtils.templateForStackName` has been removed (use `SynthUtils.synthesize(stack).template`).
- `cxapi.SynthesizedStack` renamed to `cxapi.CloudFormationStackArtifact` with multiple API changes.
- `cdk.App.run()` now returns a `cxapi.CloudAssembly` instead of `cdk.ISynthesisSession`.
- `cdk.App.synthesizeStack` and `synthesizeStacks` have been removed. The `cxapi.CloudAssembly` object returned from `app.run()` can be used as an alternative to explore a synthesized assembly programmatically (resolves #2016; see the sketch after this list).
- `cdk.CfnElement.creationTimeStamp` may now return `undefined` if there is no stack trace captured.
- `cdk.ISynthesizable.synthesize` now accepts a `cxapi.CloudAssemblyBuilder` instead of `ISynthesisSession`.
- `cdk`: The concepts of a synthesis session and session stores have been removed (`cdk.FileSystemStore`, `cdk.InMemoryStore`, `cdk.SynthesisSession`, `cdk.ISynthesisSession` and `cdk.SynthesisOptions`).
- No more support for legacy manifests (`cdk.ManifestOptions` member `legacyManifest` has been removed)
- All support for build/bundle manifest have been removed (`cxapi.BuildManifest`, `cxapi.BuildStep`, etc).
- Multiple deprecations and breaking changes in the `cxapi` module (`cxapi.AppRuntime` has been renamed to `cxapi.RuntimeInfo`; `cxapi.SynthesizeResponse` and `cxapi.SynthesizedStack` have been removed).
- The `@aws-cdk/applet-js` program is no longer available. Use [decdk](https://github.com/awslabs/aws-cdk/tree/master/packages/decdk) as an alternative.
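
For illustration, a minimal sketch of exploring the synthesized assembly programmatically under the ~0.33 API described above; the exact property names on the assembly and its stack artifacts are assumptions:

```typescript
import { App, Stack } from '@aws-cdk/cdk';

const app = new App();
new Stack(app, 'MyStack');

const assembly = app.run();             // now returns a cxapi.CloudAssembly
for (const stack of assembly.stacks) {  // assumed: a list of CloudFormationStackArtifact
  console.log(stack.id, JSON.stringify(stack.template));
}
```
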