(certificatemanager): DnsValidatedCertificate timeout while waiting for certificate approval #2914
Comments
Could you give some more steps to how you got to the error message? |
Sure, I've used this code fragment:
new certificatemanager.DnsValidatedCertificate(this, 'id', {
  domainName: 'some-name',
  hostedZone: zone,
});
And during
Therefore I think it's a timing issue: the Lambda code for the DNS validation contains a wait statement of 5 minutes. If I'm right, this may be a bit too short. |
The runtime for the whole execution may not exceed 15 minutes. The function currently waits up to 5 minutes for the DNS record to commit, then up to 5 minutes for the ACM validation to happen. That does not leave much margin. |
Allow the Lambda function to wait up to 9 minutes and 20 seconds before bailing out while waiting for the domain to be validated. It used to wait no more than 5 minutes and would occasionally time out on users. Fixes #2914 (hopefully)
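The timing discussed above can be worked through as simple arithmetic. This is a minimal sketch, assuming the DNS-commit wait stays at 5 minutes after the change (the thread only states the new validation wait). The constant and function names are illustrative and are not taken from the custom resource's actual code.

```typescript
// Hypothetical names; only the durations come from the thread.
const LAMBDA_MAX_SECONDS = 15 * 60;              // Lambda runtime limit: 15 minutes
const DNS_COMMIT_WAIT_SECONDS = 5 * 60;          // assumed unchanged: wait for the DNS record to commit
const OLD_VALIDATION_WAIT_SECONDS = 5 * 60;      // previous wait for ACM validation
const NEW_VALIDATION_WAIT_SECONDS = 9 * 60 + 20; // new wait: 9 minutes 20 seconds

// Seconds left for everything else once both waits are fully used.
function remainingMarginSeconds(validationWaitSeconds: number): number {
  return LAMBDA_MAX_SECONDS - DNS_COMMIT_WAIT_SECONDS - validationWaitSeconds;
}

console.log(remainingMarginSeconds(OLD_VALIDATION_WAIT_SECONDS)); // 300
console.log(remainingMarginSeconds(NEW_VALIDATION_WAIT_SECONDS)); // 40
```

Under these assumptions the longer wait uses almost the whole Lambda budget, so a validation that takes 30 minutes or more would still time out.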
@RomainMuller Thanks, that will probably help in a lot of situations. Unfortunately the certificate manager claims to approve pending certificate requests in at least 30 minutes. So there is still a lot of room to fail. But I think this will help a lot. |
For those running into this problem, use the Certificate construct instead. It lets you achieve the very same thing without a time limit. Something like this:
|
For those experiencing this issue: Unless you absolutely need cross-region certificate issuance (e.g., requesting a us-east-1 certificate from another region for CloudFront), then converting to use the If you must use |
@njlynch Unfortunately I'm experiencing the same timeout issue, even with the
Also, both ways are unable to delete the failed stack because of DNS record sets created in the same deployment that pointed at a CloudFront alias (probably should be a separate issue).
Ran into this trying to deploy a static site (S3 bucket, CloudFront distribution, Route53 hosted zone, ACM certificate) with a domain already registered with Route53. I have also noticed what @acdoussan mentioned: the name servers for the registered domain do not match the hosted zone NS records made by
Anything obvious that is causing this? My code:
const websiteBucket = new s3.Bucket(this, "WebsiteBucket", {
  autoDeleteObjects: true,
  publicReadAccess: false,
  removalPolicy: cdk.RemovalPolicy.DESTROY,
});
const websiteHostedZone = new route53.PublicHostedZone(this, "WebsiteHostedZone", {
  zoneName: 'domain-name.com',
});
// Have also tried `DnsValidatedCertificate`
const websiteCertificate = new certificateManager.Certificate(this, "WebsiteCertificate", {
  domainName: 'domain-name.com',
  subjectAlternativeNames: ['www.domain-name.com'],
  validation: certificateManager.CertificateValidation.fromDns(websiteHostedZone),
});
const websiteBucketDistribution = new cloudfront.Distribution(this, "WebsiteBucketDistribution", {
  certificate: websiteCertificate,
  defaultBehavior: {
    origin: new origins.S3Origin(websiteBucket),
    viewerProtocolPolicy: cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
  },
  defaultRootObject: "index.html",
  domainNames: ['domain-name.com'],
});
new route53.ARecord(this, "WebsiteARecord", {
  target: route53.RecordTarget.fromAlias(new targets.CloudFrontTarget(websiteBucketDistribution)),
  recordName: 'domain-name.com',
  zone: websiteHostedZone,
});
new route53.AaaaRecord(this, "WebsiteAAAARecord", {
  target: route53.RecordTarget.fromAlias(new targets.CloudFrontTarget(websiteBucketDistribution)),
  recordName: 'domain-name.com',
  zone: websiteHostedZone,
});
Edit:
const websiteHostedZone = new route53.PublicHostedZone(this, "WebsiteHostedZone", {
  zoneName: 'domain-name.com',
});
// Have also tried `DnsValidatedCertificate`
const websiteCertificate = new certificateManager.Certificate(this, "WebsiteCertificate", {
  domainName: 'domain-name.com',
  subjectAlternativeNames: ['www.domain-name.com'],
  validation: certificateManager.CertificateValidation.fromDns(websiteHostedZone),
}); |
I notice that when the zones are for domains that have not been purchased (lack NS registrar records) this happens. I suppose that makes some sort of sense since we're talking about domain ownership. I was doing testing and didn't want to buy a domain just for testing some cdk/cloudformation code. Maybe this note will help someone. Just sayin'. |
@BillyBunn, might be a long shot, but I switched to Certificate and my deploy started hanging as well. I never let it time out but I noticed in my gmail spam folder I had a bunch of emails from AWS re: Certificate Approval with a link that I had to click to approve the certificate. I marked them not as spam and tried again; clicking the approve link seemed to do the trick. I switched back to the DNS validated cert afterward, and that one seems to work if I wait for the hostedZone to get created, then use its name servers to update the |
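The name-server mismatch described in the comments above can be checked before deploying. The following is a minimal sketch of such a comparison. The function names and the sample host names are hypothetical, and how the two lists are obtained (from the registrar and from the hosted zone) is left out.

```typescript
// Normalize a name server host name: trim, lower-case, drop a trailing dot.
function normalizeNs(name: string): string {
  return name.trim().toLowerCase().replace(/\.$/, "");
}

// True when both lists hold the same name servers, ignoring order,
// letter case, and trailing dots.
function nameServersMatch(registrarNs: string[], hostedZoneNs: string[]): boolean {
  const a = registrarNs.map(normalizeNs).sort();
  const b = hostedZoneNs.map(normalizeNs).sort();
  return a.length === b.length && a.every((ns, i) => ns === b[i]);
}

// Hypothetical sample data for illustration only.
const registrar = ["ns-1.example.net.", "NS-2.example.org"];
const hostedZone = ["ns-2.example.org.", "ns-1.example.net"];
console.log(nameServersMatch(registrar, hostedZone)); // true
console.log(nameServersMatch(registrar, ["ns-9.example.com"])); // false
```

If the lists differ, DNS validation records written to the hosted zone are not visible through the registered domain's delegation, which is consistent with the hanging deployments reported here.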
I'm sorry, but I believe this can only be properly fixed by an Amazon internal team. The problem is that DnsValidatedCertificate works by creating a custom resource with a Lambda that adds those records and then waits for validation. But since this is a Lambda, there is a maximum run time of 15 minutes. Yet based on the comments above, validating certificates may take hours in us-east-1. I've currently been waiting on validation for 49 minutes and it's still not validated.

As to why we have to use DnsValidatedCertificate: We are a team in Europe, with our main region being Ireland: eu-west-1. There are many services that require certs placed in N. Virginia: us-east-1. That rules out the regular acm.Certificate class because that class will only deploy to the main region. We also don't want a separate stack that deploys into us-east-1 because then you cannot export the certificate ARN and import it into another stack. Fn::ImportValue only works within the same region.

Workarounds: The only workaround right now is to deploy it in a separate stack into us-east-1, then have a second stack that exports certificate values which are hard-coded as strings (manual step), and then have a third stack which actually uses those values. One other workaround is to retry the stack deployment early in the morning, when it seems to get validated in time, but that is highly unreliable.

Solutions: Ideally you could internally push for making certificate validations faster in that region and guarantee validations in under 15 minutes. Or implement an API to do cross-region certificate creation, so CloudFormation would support this scenario natively (without the Lambda). Or don't force us to deploy certificates to a specific region (us-east-1); then we could all happily use the acm.Certificate class. I've never really used CustomResource, so I don't know much about that. But is there a way to run something other than a Lambda that might run for longer?

If you can't do any of that, you could at least make the stack deployments idempotent. The problem is that the custom resource Lambda fails and triggers a rollback, which orphans the certificate, and a new re-deployment doesn't use the original cert that might already be validated. There would be no problem if I could: deploy a stack, wait for it to fail due to the Lambda timeout, wait until the certificate is validated, re-deploy, and have it pick up the original certificate and successfully complete. Does it really need to fail and trigger a rollback? How come the main acm.Certificate within one region works? At the very least this issue should be documented on the CDK page for the DnsValidatedCertificate construct. |
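The idempotent re-deployment proposed above can be sketched as a lookup that prefers a certificate left behind by an earlier attempt over requesting a new one. This is only an illustration of the idea, not how DnsValidatedCertificate behaves. The `ExistingCertificate` shape and the function name are hypothetical, and the status strings follow the names used by the ACM API.

```typescript
// ACM certificate statuses (names as used by the ACM API).
type CertificateStatus =
  | "PENDING_VALIDATION" | "ISSUED" | "INACTIVE" | "EXPIRED"
  | "VALIDATION_TIMED_OUT" | "REVOKED" | "FAILED";

// Hypothetical, simplified view of a certificate from an earlier deployment.
interface ExistingCertificate {
  arn: string;
  domainName: string;
  status: CertificateStatus;
}

// Prefer an already issued certificate for the domain, then one that is
// still pending validation; return undefined when a new request is needed.
function findReusableCertificate(
  existing: ExistingCertificate[],
  domainName: string,
): ExistingCertificate | undefined {
  const sameDomain = existing.filter((c) => c.domainName === domainName);
  const issued = sameDomain.filter((c) => c.status === "ISSUED");
  if (issued.length > 0) {
    return issued[0];
  }
  const pending = sameDomain.filter((c) => c.status === "PENDING_VALIDATION");
  return pending.length > 0 ? pending[0] : undefined;
}
```

With a lookup like this, a second deployment after a timeout could continue with the certificate that was validated in the meantime instead of starting over.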
I think this should be fixable by using (I may be able to take a stab at this) |
Now that the official CloudFormation resource `AWS::CertificateManager::Certificate` (CDK's `Certificate` construct) supports DNS validation we do not want to recommend using the `DnsValidatedCertificate` construct. The `DnsValidatedCertificate` construct uses CloudFormation custom resources to perform the certificate creation and this creates a lot of maintenance burden on our team (see the list of linked issues).

Currently the primary use case for using `DnsValidatedCertificate` over `Certificate` is for cross region use cases. For this use case I have updated the README to have our suggested solution.

The example in the README is tested in this [integration test](https://github.com/aws/aws-cdk/blob/main/packages/@aws-cdk/aws-cloudfront/test/integ.cloudfront-cross-region-cert.ts)

fixes #8934, #2914, #20698, #17349, #15217, #14519

----

### All Submissions:

* [ ] Have you followed the guidelines in our [Contributing guide?](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md)

### Adding new Unconventional Dependencies:

* [ ] This PR adds new unconventional dependencies following the process described [here](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md/#adding-new-unconventional-dependencies)

### New Features

* [ ] Have you added the new feature to an [integration test](https://github.com/aws/aws-cdk/blob/main/INTEGRATION_TESTS.md)?
* [ ] Did you use `yarn integ` to deploy the infrastructure and generate the snapshot (i.e. `yarn integ` without `--dry-run`)?

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
|
Describe the bug
Creating certificates via Certificate Manager with Route 53 DNS validation fails with a timeout.
Error message:
Expected behavior
The Lambda waiting for the approval should probably wait longer than the 5 minutes that are currently hardcoded.
Version: