custom-resource: Custom resource as a dependency for another custom resource #30875

Open
kaisic1224 opened this issue Jul 17, 2024 · 2 comments
Labels
@aws-cdk/custom-resources (Related to AWS CDK Custom Resources), bug (This issue is a bug.), effort/medium (Medium work item – several days of effort), p3

Comments

@kaisic1224

kaisic1224 commented Jul 17, 2024

Describe the bug

I am attempting to create a Neptune global database and then add a database cluster inside.

I am creating the global database using the CustomResource class, and the database cluster with the AwsCustomResource class.

The problem is that adding the cluster to the global database fails at creation: the deployment cannot find the global database, even though I specified that the cluster depends on it.

Expected Behavior

I expect the global database to be fully created and available before the cluster is added.

Current Behavior

11:43:46 PM | CREATE_FAILED | Custom::NeptuneRegionalCluster | NeptuneCluster7FC72740
Received response status [FAILED] from custom resource. Message returned: Global cluster global-database-identifier not found (RequestId: 83af0fa8-cea4-44d1-8ec7-9c2948986ce2)

❌ GlobalDB failed: Error: The stack named GlobalDB failed creation, it may need to be manually deleted from the AWS console: ROLLBACK_FAILED (The following resource(s) failed to delete: [NeptuneCluster7FC72740]. ): Received response status [FAILED] from custom resource. Message returned: Global cluster global-cluster-identifier not found (RequestId: 83af0fa8-cea4-44d1-8ec7-9c2948986ce2), Received response status [FAILED] from custom resource. Message returned: Malformed db cluster arn dev-primary-cluster (RequestId: 2bfb458b-cfde-4169-b730-d8cfc0a258f7)
at FullCloudFormationDeployment.monitorDeployment (/usr/local/lib/node_modules/aws-cdk/lib/index.js:451:10568)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async Object.deployStack2 [as deployStack] (/usr/local/lib/node_modules/aws-cdk/lib/index.js:454:199716)
at async /usr/local/lib/node_modules/aws-cdk/lib/index.js:454:181438

❌ Deployment failed: Error: The stack named GlobalDB failed creation, it may need to be manually deleted from the AWS console: ROLLBACK_FAILED (The following resource(s) failed to delete: [NeptuneCluster7FC72740]. ): Received response status [FAILED] from custom resource. Message returned: Global cluster global-cluster-identifier not found (RequestId: 83af0fa8-cea4-44d1-8ec7-9c2948986ce2), Received response status [FAILED] from custom resource. Message returned: Malformed db cluster arn dev-primary-cluster (RequestId: 2bfb458b-cfde-4169-b730-d8cfc0a258f7)
at FullCloudFormationDeployment.monitorDeployment (/usr/local/lib/node_modules/aws-cdk/lib/index.js:451:10568)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async Object.deployStack2 [as deployStack] (/usr/local/lib/node_modules/aws-cdk/lib/index.js:454:199716)
at async /usr/local/lib/node_modules/aws-cdk/lib/index.js:454:181438

Reproduction Steps

neptune.ts

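import { join } from "path";
import { App, CustomResource, Stack } from "aws-cdk-lib";
import { Code, CodeSigningConfig, Runtime } from "aws-cdk-lib/aws-lambda";
import { Platform, SigningProfile } from "aws-cdk-lib/aws-signer";
import { NodejsFunction } from "aws-cdk-lib/aws-lambda-nodejs";
import {
  AwsCustomResource,
  AwsCustomResourcePolicy,
  PhysicalResourceId,
  Provider,
} from "aws-cdk-lib/custom-resources";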

class GlobalDatabaseStack extends Stack {
    globalClusterIdentifier = "global-cluster-identifier"
    engineVersion = "1.2.0.0"

    constructor(scope: App, id: string) {
    super(scope, id);

    const signingProfile = new SigningProfile(this, "SigningProfile", {
      platform: Platform.AWS_LAMBDA_SHA384_ECDSA,
    });

    const codeSigningConfig = new CodeSigningConfig(this, "CodeSigningConfig", {
      signingProfiles: [signingProfile],
    });

    const globalClusterOnEventHandler = new NodejsFunction(
      this,
      "NeptuneGlobalClusterOnEventHandler",
      {
        codeSigningConfig,
        runtime: Runtime.NODEJS_20_X,
        handler: "globalClusterOnEventHandler/globalClusterOnEventHandler.handler",
        code: Code.fromAsset(join(__dirname, "lambda", "globalClusterOnEventHandler", "globalClusterOnEventHandler.zip")),
        bundling: {
          externalModules: ["aws-sdk"],
        },
      }
    );

    const globalClusterProvider = new Provider(
      this,
      "NeptuneGlobalClusterProvider",
      {
        onEventHandler: globalClusterOnEventHandler,
      }
    );

    // create global cluster
    const globalCluster = new CustomResource(this, "NeptuneGlobalDatabase", {
      serviceToken: globalClusterProvider.serviceToken,
      properties: {
        // stack: this,
        GlobalClusterIdentifier: this.globalClusterIdentifier,
        engineVersion: this.engineVersion,
      },
      resourceType: "Custom::NeptuneGlobalCluster",
    });

    // add a cluster in the primary region
    const primaryCluster = new AwsCustomResource(this, "NeptuneCluster", {
      onCreate: {
        action: "CreateDBClusterCommand",
        service: "@aws-sdk/client-neptune",
        physicalResourceId: PhysicalResourceId.of(Date.now().toString()),
        parameters: {
          // required
          DBClusterIdentifier: `dev-primary-cluster`,
          Engine: "neptune",

          DatabaseName: `globalDatabase`,
          EngineVersion: this.engineVersion,
          GlobalClusterIdentifier: this.globalClusterIdentifier, // db name
        },
      },
      onDelete: {
        action: "RemoveFromGlobalClusterCommand",
        service: "@aws-sdk/client-neptune",
        parameters: {
          GlobalClusterIdentifier: this.globalClusterIdentifier,
          DbClusterIdentifier: `dev-primary-cluster`,
        },
      },
      resourceType: "Custom::NeptuneRegionalCluster",
      policy: AwsCustomResourcePolicy.fromSdkCalls({
        resources: AwsCustomResourcePolicy.ANY_RESOURCE
      })
    });

    primaryCluster.node.addDependency(globalCluster);
    }
}

lambda/globalClusterOnEventHandler/globalClusterOnEventHandler.ts

import { AwsCustomResource, AwsCustomResourcePolicy, PhysicalResourceId } from "aws-cdk-lib/custom-resources";
import { CloudFormationCustomResourceEvent, Context } from "aws-lambda";

export const handler = async (
  event: CloudFormationCustomResourceEvent,
  context: Context
) => {
  const { stack, GlobalClusterIdentifier, engineVersion, storageEncrypted } =
    event.ResourceProperties;

  let resp = {
    LogicalResourceId: event.LogicalResourceId,
    StackId: event.StackId,
    RequestId: event.RequestId,
    // PhysicalResourceId: context.functionName,
    Status: "FAILED",
    Reason: "",
    Data: {},
  };

  switch (event.RequestType) {
    case "Create":
      let global;
      try {
        global = new AwsCustomResource(stack, "NeptuneGlobalDatabase", {
          onCreate: {
            action: "CreateGlobalClusterCommand",
            service: "@aws-sdk/client-neptune",
            physicalResourceId: PhysicalResourceId.of(Date.now().toString()),
            parameters: {
              // required
              GlobalClusterIdentifier: GlobalClusterIdentifier, // db name

              Engine: "neptune",
              EngineVersion: engineVersion,
              StorageEncrypted: storageEncrypted,
            },
          },
          policy: AwsCustomResourcePolicy.fromSdkCalls({
            resources: AwsCustomResourcePolicy.ANY_RESOURCE
          })
        });
      } catch (error) {
        resp.Status = "FAILED";
        return resp;
      }

      resp.Status = "SUCCESS";
      resp.Data = {
        cluster: global
      }
      return resp;
    case "Update":
    case "Delete":
      // Update and Delete are not handled in this reproduction.
  }
};

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.146.0 (build b368c78)

Framework Version

No response

Node.js Version

v20.3.0

OS

Debian GNU/Linux 12 (bookworm) on Windows 10 x86_64 Home 22H2 | Kernel version: 5.15.153.1-microsoft-standard-WSL2

Language

TypeScript

Language Version

No response

Other information

No response

@kaisic1224 kaisic1224 added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jul 17, 2024
@github-actions github-actions bot added the @aws-cdk/custom-resources Related to AWS CDK Custom Resources label Jul 17, 2024
@pahud
Contributor

pahud commented Jul 19, 2024

11:43:46 PM | CREATE_FAILED | Custom::NeptuneRegionalCluster | NeptuneCluster7FC72740
Received response status [FAILED] from custom resource. Message returned: Global cluster global-database-identifier not found (RequestId: 83af0fa8-cea4-44d1-8ec7-9c2948986ce2)

Looks like when your custom resource tried to create the regional cluster using the specified global-database-identifier, it could not be found. It's very likely your global cluster was not ready yet.

I would troubleshoot this way:

  1. First, create only the global cluster using the custom resource.
  2. After that custom resource has been created, use the JS SDK or the AWS CLI to create the regional primary cluster with that global-database-identifier and see if it works. This confirms that the cluster can be created through the AWS CLI or SDK at all.
  3. If step 2 works, you should be able to implement the same thing with the custom resource. The key is to make sure the global cluster is ready before you start creating the regional one. When you create a cluster through the SDK, the call probably returns immediately while the operation is still in progress. The trick is to define an isComplete handler in CDK that describes the cluster and checks whether its status is ready. With this design, your custom resource does not report completion immediately; it reports completion only when the isComplete handler does. Your dependent regional resources then start provisioning only once the global cluster is really ready and available.

Generally we recommend using L2 or L1 constructs whenever possible, unless you really have to use a custom resource. But if you really have to, I do hope this trick helps. Let me know if it works for you.
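
For readers following along, the isComplete idea from step 3 could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from this thread: the names `evaluateStatus`, `makeIsCompleteHandler`, and `DescribeStatus` are hypothetical, the event type is a trimmed-down subset of what the Provider framework passes in, and the "available" status string is an assumption about what the describe call reports for a ready global cluster. In a real Lambda, `describeStatus` would wrap an SDK describe call for the global cluster; it is injected here so the polling logic can be read and exercised without AWS access.

```typescript
// Sketch only: all names below are hypothetical and not taken from the issue.

/** Subset of the event the Provider framework passes to an isComplete handler. */
interface IsCompleteEvent {
  RequestType: "Create" | "Update" | "Delete";
  PhysicalResourceId?: string;
  ResourceProperties: { GlobalClusterIdentifier: string };
}

/** Response shape the Provider framework expects from an isComplete handler. */
interface IsCompleteResponse {
  IsComplete: boolean;
  Data?: Record<string, string>;
}

/** Resolves to the cluster status, or undefined if the cluster does not exist. */
type DescribeStatus = (globalClusterIdentifier: string) => Promise<string | undefined>;

/** Pure decision logic: is the requested operation finished for this status? */
export function evaluateStatus(
  requestType: IsCompleteEvent["RequestType"],
  status: string | undefined,
  globalClusterIdentifier: string
): IsCompleteResponse {
  if (requestType === "Delete") {
    // Deletion is finished once the cluster can no longer be found.
    return { IsComplete: status === undefined };
  }
  // Create/Update: assumed to be finished when the status is "available".
  if (status === "available") {
    return { IsComplete: true, Data: { GlobalClusterIdentifier: globalClusterIdentifier } };
  }
  // Any other status (or a cluster that is not visible yet) means "poll again".
  return { IsComplete: false };
}

/** Builds the Lambda handler around an injected status lookup. */
export function makeIsCompleteHandler(describeStatus: DescribeStatus) {
  return async (event: IsCompleteEvent): Promise<IsCompleteResponse> => {
    const id = event.ResourceProperties.GlobalClusterIdentifier;
    const status = await describeStatus(id);
    return evaluateStatus(event.RequestType, status, id);
  };
}
```

A handler built this way would be passed to the Provider as `isCompleteHandler` next to the existing `onEventHandler`; the Provider's `queryInterval` and `totalTimeout` props control how often and for how long it polls. The onEvent handler then only needs to start the creation and return, and CloudFormation treats the custom resource as created only after isComplete reports `IsComplete: true`.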

@pahud pahud added p3 effort/medium Medium work item – several days of effort and removed needs-triage This issue or PR still needs to be triaged. labels Jul 19, 2024
@kaisic1224
Author

This worked perfectly, thank you!
