Dependency stacks cause update failure #3414

Closed
lkoniecz opened this issue Jul 24, 2019 · 16 comments
Labels: bug This issue is a bug. · cross-stack Related to cross-stack resource sharing · effort/medium Medium work item – several days of effort · p1 · package/tools Related to AWS CDK Tools or CLI

lkoniecz commented Jul 24, 2019

Note: for support questions, please first reference our documentation, then use Stackoverflow. This repository's issues are intended for feature requests and bug reports.

  • I'm submitting a ...

    • 🪲 bug report
    • 🚀 feature request
    • 📚 construct library gap
    • ☎️ security issue or vulnerability => Please see policy
    • ❓ support request => Please see note at the top of this template.
  • What is the current behavior?
    If the current behavior is a 🪲bug🪲: Please provide the steps to reproduce

stack_1 = FirstStack(
    app=app,
    id='FirstStack'
)

stack_2 = SecondStack(
    app=app,
    id='SecondStack',
    construct_from_stack_1=stack_1.some_construct
)

This creates a dependency via a stack output. When I stop using construct_from_stack_1 (by deleting its usage from stack_2), stack_2 fails to update - for instance:

eks-dev
eks-dev: deploying...
eks-dev: creating CloudFormation changeset...
 0/1 | 12:13:45 | UPDATE_ROLLBACK_IN_P | AWS::CloudFormation::Stack              | eks-dev Export eks-dev:ExportsOutputFnGetAttEksElasticLoadBalancer4FCBC5E7SourceSecurityGroupOwnerAlias211654CC cannot be deleted as it is in use by ports-assignment-dev

 ❌  eks-dev failed: Error: The stack named eks-dev is in a failed state: UPDATE_ROLLBACK_COMPLETE
The stack named eks-dev is in a failed state: UPDATE_ROLLBACK_COMPLETE

Looks like CDK tries to delete resources in the wrong order - removing the output from the source stack first, rather than first removing its usage in the dependent stacks and only then removing the output from the source stack itself.
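
For illustration, the export that CloudFormation refuses to delete can be inspected with the CloudFormation ListImports API to see which stacks still consume it. A minimal boto3 sketch (the export name is copied from the error message above and is illustrative):

import boto3

# Which stacks still import the export that the producer stack is trying to delete?
cfn = boto3.client("cloudformation")

# Export name taken from the error message above (illustrative).
export_name = (
    "eks-dev:ExportsOutputFnGetAttEksElasticLoadBalancer4FCBC5E7"
    "SourceSecurityGroupOwnerAlias211654CC"
)

importers = [
    stack
    for page in cfn.get_paginator("list_imports").paginate(ExportName=export_name)
    for stack in page["Imports"]
]
print(f"{export_name} is still imported by: {importers}")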

  • What is the expected behavior (or behavior of feature suggested)?
    Update removes resources that are no longer used

  • What is the motivation / use case for changing the behavior or adding this feature?

Life-time dependencies are created, which prevents dependent stacks from being updated.

  • Please tell us about your environment:

    • CDK CLI Version: 1.0.0
    • Module Version: 1.0.0
    • OS: [all]
    • Language: [all ]
  • Other information (e.g. detailed explanation, stacktraces, related issues, suggestions how to fix, links for us to have context, eg. associated pull-request, stackoverflow, gitter, etc)

@lkoniecz lkoniecz added the needs-triage This issue or PR still needs to be triaged. label Jul 24, 2019
@eladb eladb added the package/tools Related to AWS CDK Tools or CLI label Aug 13, 2019

evmin02 commented Aug 14, 2019

Observing similar behaviour with Python-based CDK, where I went for the Cfn* set of resources (for pure experimentation purposes):

  • An app contains two stacks:
  • The first stack contains VPC definition (stack-lab-edu2)
  • The second stack contains SecGroup and EC2 (stack-lab-ecc)

The dependency has been declared - stack-lab-ecc depends on stack-lab-edu2.

When EC2 is commented out (diff):

[screenshot]

The deploy fails trying to delete the subnet export from the first stack BEFORE deleting the EC2 instance from the second:

[screenshot]

Why:

[screenshot]

CDK CLI Version: 1.3.0

Python:
aws-cdk.cdk 0.36.1
aws-cdk.core 1.3.0

@abelmokadem
Contributor

@rix0rrr this is the issue I meant. My current workaround for this is to create a "dummy" resource and attach the dependencies to that dummy resource. Something like this:

import cloudformation = require("@aws-cdk/aws-cloudformation");
...
// Get all subnet ids
const subnetIds = props.vpc.isolatedSubnets
  .concat(props.vpc.privateSubnets)
  .concat(props.vpc.publicSubnets)
  .map(subnet => subnet.subnetId);

// Create dummy cloudformation resource with all dependencies attached
const dummyWaitHandle = new cloudformation.CfnWaitConditionHandle(this, "DummyResource");
dummyWaitHandle.cfnOptions.metadata = {
  dependencies: subnetIds
};
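
A rough Python equivalent of the same dummy-resource idea, for anyone following along in Python (a sketch assuming CDK v1's aws_cdk.aws_cloudformation module, written inside a stack's __init__ with vpc passed in like props.vpc above; the construct id and metadata key are arbitrary):

from aws_cdk import aws_cloudformation as cloudformation

# Collect the subnet ids whose cross-stack imports should stay referenced
subnet_ids = [
    subnet.subnet_id
    for subnet in (vpc.isolated_subnets + vpc.private_subnets + vpc.public_subnets)
]

# Dummy resource whose metadata holds the imported values, so the exports
# in the producer stack are still "in use" and are not deleted prematurely
dummy_wait_handle = cloudformation.CfnWaitConditionHandle(self, "DummyResource")
dummy_wait_handle.cfn_options.metadata = {"dependencies": subnet_ids}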


wwwpro commented Sep 23, 2019

I'm encountering this as well, but when trying to update dependent stacks. In one example of my use case, I'm trying to separate the creation of ECS tasks from services. Ideally, I'd like to be able to destroy a service without destroying the corresponding task (and its history).

By placing tasks and services in separate stacks, and just passing the relevant ref/arn information between stacks, I can accomplish destroying a service without destroying the task, but I can't update the task stack, since I'm blocked by the "in use by services" error.

That's just one example. Overall, it helps from a code organization and re-usability standpoint for complex builds to separate and consolidate the creation of stacks according to the resources built. But the "in-use" dependency creates the need to consolidate complex builds into a single long, complex stack, with components that can't be reused, to ensure each component can be updated.
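
A sketch of the task/service split described above, in Python (CDK v1 style; the construct names and sample image are illustrative, and the cross-stack reference is exactly what generates the export that later blocks updates):

from aws_cdk import core, aws_ecs as ecs

class TaskStack(core.Stack):
    def __init__(self, scope, id, **kwargs):
        super().__init__(scope, id, **kwargs)
        self.task_definition = ecs.FargateTaskDefinition(self, "Task")
        self.task_definition.add_container(
            "app",
            image=ecs.ContainerImage.from_registry("amazon/amazon-ecs-sample"),
        )

class ServiceStack(core.Stack):
    def __init__(self, scope, id, *, task_definition, **kwargs):
        super().__init__(scope, id, **kwargs)
        cluster = ecs.Cluster(self, "Cluster")  # creates its own VPC when none is passed
        # Referencing the other stack's task definition creates the
        # export/import pair that later blocks updates to TaskStack.
        ecs.FargateService(self, "Service", cluster=cluster, task_definition=task_definition)

app = core.App()
task_stack = TaskStack(app, "TaskStack")
ServiceStack(app, "ServiceStack", task_definition=task_stack.task_definition)
app.synth()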


wwwpro commented Sep 25, 2019

A little more information on the above...

I'm using the Cfn* constructs almost entirely, as the default VPC construct creates a decidedly expensive (for my purposes, anyway) arrangement, and follow-up components expect that default VPC.

I connect stacks (notably, almost everything shares the CfnVPC I created) using the "Accessing Resources in a Different Stack" method outlined here: https://docs.aws.amazon.com/cdk/latest/guide/resources.html

In that section, it details that the method uses "ImportValue" to transfer information across stacks.

However, when making changes to a "child" stack which is exported, I run into the issue outlined here: https://aws.amazon.com/premiumsupport/knowledge-center/cloudformation-stack-export-name-error/

In that article, it essentially says you should replace the ImportValue function with direct resource references.

I may be missing something, but it doesn't seem possible to create a cross-stack CDK app that outputs the resolved value instead of the ImportValue function.

@NGL321 NGL321 added bug This issue is a bug. cross-stack Related to cross-stack resource sharing and removed needs-triage This issue or PR still needs to be triaged. labels Oct 10, 2019

pmaedel commented Oct 22, 2019

See #4014 for a feature request that would solve this issue.

@nakedible

This bug is caused by the automatic dependency resolution mechanism in the CDK CLI, which means that when you update stack_2, it will automatically update stack_1 first (which obviously fails as stack_2 is still using the exported resource). The solution is really simple - just say cdk deploy -e stack_2, which will update only stack_2, and afterwards you can say cdk deploy stack_1 to clean up the unused export.

This will fail if you at the same time add something to stack_1 that is needed by stack_2. In this case, stack_2 cannot be updated first, but neither can stack_1 because of the export. This is a limitation of CloudFormation that has nothing to do with CDK, and the simplest way to avoid it is just to make smaller changes.

The proper way to solve all problems like this is to use NestedStack instead of Stack. Automated support for that landed in 1.12.0, and it allows CloudFormation to handle this case correctly: first creating all new resources in all stacks in dependency order, then updating all references, and only finally doing a pass to remove all the replaced resources.

Not sure what should actually be done about this in CDK. One solution would be to simply add a note when a stack update fails due to an export being in use: "perhaps try updating the stack with --exclusively".
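
A minimal sketch of the NestedStack approach mentioned above, in Python (assuming aws_cdk.core.NestedStack; the resource and construct names are illustrative):

from aws_cdk import core, aws_s3 as s3

class Producer(core.NestedStack):
    def __init__(self, scope, id, **kwargs):
        super().__init__(scope, id, **kwargs)
        self.bucket = s3.Bucket(self, "Bucket")

class Consumer(core.NestedStack):
    def __init__(self, scope, id, *, bucket, **kwargs):
        super().__init__(scope, id, **kwargs)
        # The reference becomes a parameter/output pair between the nested
        # stacks, which CloudFormation adds, rewires and removes within a
        # single deployment of the parent stack - so removals do not get
        # stuck on a stale export the way top-level cross-stack references do.
        core.CfnOutput(self, "BucketName", value=bucket.bucket_name)

class Parent(core.Stack):
    def __init__(self, scope, id, **kwargs):
        super().__init__(scope, id, **kwargs)
        producer = Producer(self, "Producer")
        Consumer(self, "Consumer", bucket=producer.bucket)

app = core.App()
Parent(app, "Parent")
app.synth()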

@austinbv

Same as #7602

@maxerbubba

My two cents:
A tool to detect these conditions at build time (instead of deploy time) is possible, and would be a big help. Example workflow (I made up some new commands):

# New command that locks down the interface at the user's request:
# This file only contains Imports, Exports:
cdk shrinkwrap --all > my_production_interface.json

# The user should check in their interface:
git add my_production_interface.json && git commit -m "Added current deployment interface"

# Now, regular build will fail if current cdk output is not compatible with interface:
cdk build --interface my_production_interface.json
# FAIL
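
Such a check could be approximated today by reading the synthesized templates in cdk.out. A rough sketch (this is not an existing CDK command, and the "fail when a previously recorded export disappears" rule is just one possible policy):

# check-interface.py - illustrative sketch, not part of the CDK CLI.
# Collects export names and Fn::ImportValue references from cdk.out and,
# when given a previously saved interface file, fails if an export that
# was recorded there is missing from the current synthesis.
import glob
import json
import sys


def walk_imports(node, imports):
    """Recursively collect Fn::ImportValue targets from a template fragment."""
    if isinstance(node, dict):
        if "Fn::ImportValue" in node:
            value = node["Fn::ImportValue"]
            imports.add(value if isinstance(value, str) else json.dumps(value))
        for child in node.values():
            walk_imports(child, imports)
    elif isinstance(node, list):
        for child in node:
            walk_imports(child, imports)


def collect_interface(out_dir="cdk.out"):
    exports, imports = set(), set()
    for path in glob.glob(f"{out_dir}/*.template.json"):
        with open(path) as f:
            template = json.load(f)
        for output in template.get("Outputs", {}).values():
            name = output.get("Export", {}).get("Name")
            if name:
                exports.add(name if isinstance(name, str) else json.dumps(name))
        walk_imports(template, imports)
    return {"exports": sorted(exports), "imports": sorted(imports)}


if __name__ == "__main__":
    current = collect_interface()
    if len(sys.argv) < 2:
        # "shrinkwrap" mode: print the current interface so it can be checked in
        print(json.dumps(current, indent=2))
        sys.exit(0)

    with open(sys.argv[1]) as f:
        locked = json.load(f)
    removed = set(locked["exports"]) - set(current["exports"])
    if removed:
        print(f"Exports removed but still recorded in the locked interface: {sorted(removed)}")
        sys.exit(1)
    print("Current synthesis is compatible with the locked interface")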

@shivlaks shivlaks added the effort/medium Medium work item – several days of effort label Aug 26, 2020
@FirstWhack

@nakedible, given that NestedStack is now deprecated (as is all of the aws-cloudformation package), do you know what the correct way to solve this problem is now?

This seems to be the most basic feature of dependencies. :/


kichik commented Nov 7, 2020

As @nakedible said, one of the workarounds is splitting the deploy into two steps. The -e flag must be used so CDK doesn't deploy all stacks. Here is an example of that.

# first step will remove the usage of the export
cdk deploy --exclusively SecondStack
# second step can now remove the export
cdk deploy --all

@NGL321 NGL321 assigned rix0rrr and unassigned shivlaks Jan 25, 2021

kichik commented Feb 5, 2021

I created the following script to help automate the process. It phases out the exports over two deployments. On the first deployment it restores any removed exports but marks them for removal, which allows that deployment to safely remove their usage. On the second deployment the script actually removes the exports, once no other stack is using them.

To use the script you have to separate the synth and deploy steps and run the script in between them.

cdk synth && python phase-out-ref-exports.py && cdk deploy --app cdk.out --all

It requires permissions to read the stack, so it will probably not work well with cross-account deployments.

# phase-out-ref-exports.py
import json
import os
import os.path
 
from aws_cdk import cx_api
import boto3
import botocore.exceptions
 
 
def handle_template(stack_name, account, region, template_path):
    # get outputs from existing stack (if it exists)
    try:
        # TODO handle different accounts
        print(f"Checking exports of {stack_name}...")
        stack = boto3.client("cloudformation", region_name=region).describe_stacks(StackName=stack_name)["Stacks"][0]
        old_outputs = {
            o["OutputKey"]: o
            for o in stack.get("Outputs", [])
        }
    except botocore.exceptions.ClientError as e:
        print(f"Unable to phase out exports for {stack_name} on {account}/{region}: {e}")
        return
 
    # load new template generated by CDK
    new_template = json.load(open(template_path))
    if "Outputs" not in new_template:
        new_template["Outputs"] = {}
 
    # get output names for both existing and new templates
    new_output_names = set(new_template["Outputs"].keys())
    old_output_names = set(old_outputs.keys())
 
    # phase out outputs that are in old template but not in new template
    for output_to_phase_out in old_output_names - new_output_names:
        # if we already marked it for removal last deploy, remove the output
        if old_outputs[output_to_phase_out].get("Description") == "REMOVE ON NEXT DEPLOY":
            print(f"Removing {output_to_phase_out}")
            continue
 
        if not old_outputs[output_to_phase_out].get("ExportName"):
            print(f"Output has no export name, ignoring {old_outputs[output_to_phase_out]}")
            continue
 
        # add back removed outputs
        print(f"Re-adding {output_to_phase_out}, but removing on next deploy")
        new_template["Outputs"][output_to_phase_out] = {
            "Value": old_outputs[output_to_phase_out]["OutputValue"],
            "Export": {
                "Name": old_outputs[output_to_phase_out]["ExportName"]
            },
            # mark for removal on next deploy
            "Description": "REMOVE ON NEXT DEPLOY",
        }
 
    # replace template
    json.dump(new_template, open(template_path, "w"), indent=4)
 
 
def handle_assembly(assembly):
    for s in assembly.stacks:
        handle_template(s.stack_name, s.environment.account, s.environment.region, s.template_full_path)
 
    for a in assembly.nested_assemblies:
        handle_assembly(a.nested_assembly)
 
 
def main():
    assembly = cx_api.CloudAssembly("cdk.out")
    handle_assembly(assembly)
 

if __name__ == "__main__":
    main()

@BenChaimberg
Contributor

New feature that can be used as a workaround: #12778

I think this issue can probably be closed @rix0rrr
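
For context, #12778 added an exportValue() method on Stack, which lets the producer keep an export alive explicitly while consumers drop their references. A minimal Python sketch of the two-step flow (the bucket and construct names are illustrative):

from aws_cdk import core, aws_s3 as s3

class ProducerStack(core.Stack):
    def __init__(self, scope, id, **kwargs):
        super().__init__(scope, id, **kwargs)
        self.bucket = s3.Bucket(self, "SharedBucket")
        # Step 1: when the consumer stack stops referencing the bucket, keep
        # the export alive manually so the consumer can be deployed first.
        self.export_value(self.bucket.bucket_arn)
        # Step 2: once the consumer has deployed without the reference,
        # remove the export_value() call above and deploy this stack again.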

@rix0rrr rix0rrr closed this as completed Feb 5, 2021

@dilunika

This explains the problem and solution very nicely.
https://www.endoflineblog.com/cdk-tips-03-how-to-unblock-cross-stack-references

@jdiegosierra

This was useful for me. In my case I was using CDK and wanted to remove one stack (let's say stackA) whose outputs were referenced as inputs in other stacks (let's say stackB and stackC). So I broke all cross-stack references by putting the values manually into the CloudFormation templates, and then deployed those templates to every stack (stackB and stackC). Then I removed stackA in CDK, deployed, and it was successful.


ajhool commented Aug 26, 2023

While I recognize that there is a solution for this, it still feels like a workaround rather than the library working as expected. In CICD pipelines where the deployments are more rigid, it's awkward for developers to keep track of this and prepare PRs specifically to address this problem before merging the real PRs with their desired updates.

The current approach feels too aggressive in removing the OutputValue from StackA when StackB deletes the dependency. Just because the OutputValue is no longer needed by StackB, why must it be deleted immediately from StackA? I don't see any harm in keeping the OutputValue on StackA until another deployment cycle. Maybe CDK doesn't have an easy way to track that state?

To me, it makes more sense for CDK to be lazier and simply leave the OutputValue in StackA and perform the update to StackB. Then, whenever StackA is deployed next, if CDK determines that it's safe to do so, remove the OutputValue from StackA. If developers do care about the OutputValue of StackA being deleted immediately, then there could be a flag that would trigger the current error and force the current workaround to be used (e.g. --fail-on-cross-stack-dependency-change). Or, alternatively, a --lazy-delete-export-values flag if my desired behavior is opt-in.

Suggested workflow where developers intuitively work with StackB's dependency on StackA:

Day 1

  1. StackA - creates and exports an S3 bucket ARN - Deploy StackA
  2. StackB - uses the S3 bucket ARN as an env variable to a lambda function - Deploy StackB

Day 2

  1. StackB - remove the lambda function in the code. Shows a cdk diff (removal of lambda function).
  2. StackA - Shows no diff and a deploy would say "No changes".
  3. StackB - Deploy StackB. Nothing happens to StackA.
  4. StackA - Diff StackA - Shows a diff (removal of ExportValue S3 ARN).

Day 2 or 3 or 74

  1. Somebody pushes code and CICD detects that StackA has a diff (removal of ExportValue S3 ARN). CICD deploys StackA to remove the OutputValue. This might be surprising to developers because they see a cdk diff for StackA but don't see any relevant code changes in StackA or StackB. The developers know that this is a quirk of cross-stack dependency management in CDK.

Not sure if others are more comfortable with the current solution, but I still spend a lot of time thinking about these cross-stack dependencies due to this issue and I don't fully understand why the above flow isn't the standard behavior.
