Certificate manager order systematically fails the first time after 1 min of provisioning and succeeds the second time #2944

ifs-anthonylecarrer · 2021-08-05T16:03:21Z

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform CLI and Terraform IBM Provider Version

0.12.26 and 0.13.5

Affected Resource(s)

ibm_certificate_manager_order fails or succeeds after exactly 1 minute, even with a create timeout of 15 minutes

Terraform Configuration Files

Please include all Terraform configurations required to reproduce the bug. Bug reports without a functional reproduction may be closed without investigation.

provider "ibm" {
  region           = var.region
  ibmcloud_timeout = 60
  ibmcloud_api_key = var.ibmcloud_api_key
}

provider "ibm" {
  alias            = "sharedcis"
  region           = var.region
  ibmcloud_timeout = 60
  ibmcloud_api_key = var.subdomain_api_key
}

resource "ibm_resource_instance" "cms_instance" {
  name     = "cms-${var.project_short_name}.${var.top_domain}"
  location = var.region
  service  = "cloudcerts"
  plan     = "free"

  timeouts {
    create = "15m"
  }
}

data "ibm_iam_account_settings" "account_settings" {
}

data "ibm_cis" "cis_instance" {
  name     = "sharedcis"
  provider = ibm.sharedcis
}

resource "ibm_iam_authorization_policy" "policy" {
  source_service_account      = data.ibm_iam_account_settings.account_settings.account_id
  source_service_name         = "cloudcerts"
  source_resource_instance_id = ibm_resource_instance.cms_instance.guid
  target_service_name         = "internet-svcs"
  target_resource_instance_id = data.ibm_cis.cis_instance.guid
  roles                       = ["Manager"]
  provider                    = ibm.sharedcis
}

resource "ibm_certificate_manager_order" "certificate" {
  for_each = toset(var.subdomains)

  certificate_manager_instance_id = ibm_resource_instance.cms_instance.id
  name                            = "Certificate for ${each.value}"
  domains                         = ["*.${each.value}"]
  rotate_keys                     = true
  domain_validation_method        = "dns-01"
  auto_renew_enabled              = true
  dns_provider_instance_crn       = data.ibm_cis.cis_instance.id

  depends_on = [ibm_iam_authorization_policy.policy]

  timeouts {
    create = "15m"
  }
}

Debug Output

---------------------
Failure logs 
---------------------

module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Creating...
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [10s elapsed]
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [20s elapsed]
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [30s elapsed]
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [40s elapsed]
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [50s elapsed]
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [1m0s elapsed]

Error: Error waiting for Ordering Certificate (crn:v1:bluemix:public:cloudcerts:eu-de:a/c390be9ff090461d9cbd6ac76667ba0a:dbdc1d6f-053c-4731-b676-a3a95f72009b:certificate:d2dc434cca9ef289689ce5411b4ab93b) to be succeeded: The certificate crn:v1:bluemix:public:cloudcerts:eu-de:a/c390be9ff090461d9cbd6ac76667ba0a:dbdc1d6f-053c-4731-b676-a3a95f72009b:certificate:d2dc434cca9ef289689ce5411b4ab93b failed: <nil>

  on modules/cms/main.tf line 30, in resource "ibm_certificate_manager_order" "certificate":



---------------------
Successful execution
---------------------

module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Creating...
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [10s elapsed]
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [20s elapsed]
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [30s elapsed]
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [40s elapsed]
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [50s elapsed]
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [1m0s elapsed]
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Creation complete after 1m1s [id=crn:v1:bluemix:public:cloudcerts:eu-de:a/c390be9ff090461d9cbd6ac76667ba0a:dbdc1d6f-053c-4731-b676-a3a95f72009b:certificate:576097fb2564275d228a22aa1a5a1f60]

Panic Output

Expected Behavior

Certificate ordered successfully

Actual Behavior

The certificate order fails every first time after 1 minute of provisioning and succeeds the second time, again after 1 minute of provisioning. The certificate order API call, which is asynchronous, always succeeds.

Steps to Reproduce

Prerequisite: the Terraform code above. Note that we use two providers because our CIS instance is shared and hosted on one account, while our CMS instance is on another account.

terraform apply -auto-approve -var-file <varfile>

Important Factoids

CIS is hosted on one account and CMS on another account.
We are experimenting with this way of ordering certificates to replace notification channel + Cloud Function, because we were encountering exactly the same effects (https://cloud.ibm.com/unifiedsupport/cases?number=CS2401764).
The certificate manager order always succeeds when using the API, which is an asynchronous request.

Here is the working asynchronous solution:

curl -X POST https://iam.cloud.ibm.com/acms/v1/policies \
-H 'Accept: application/json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <BEARER>' \
-d '{ "type": "authorization", "subjects": [ { "attributes": [ { "name": "serviceName", "value": "cloudcerts" }, { "name": "accountId", "value": "c390be9ff090461d9cbd6ac76667ba0a" }, { "name": "serviceInstance", "value": "dbdc1d6f-053c-4731-b676-a3a95f72009b" } ] } ], "roles": [ { "role_id": "crn:v1:bluemix:public:iam::::serviceRole:Manager" } ], "resources": [ { "attributes": [ { "name": "serviceName", "value": "internet-svcs" }, { "name": "accountId", "value": "7e1a24d64340427fb0698c570b6a96fe"},{ "name": "serviceInstance", "value": "5bdebf90-b666-4b2c-be6f-6c4ec5a31fe5" } ] } ] }'




curl -v -k -X POST \
-H "Content-Type: application/json" \
-H "authorization: Bearer <BEARER>" \
-d "{ \"name\":\"test\", \"description\":\"Test\", \"domains\":[ \"*.landingzone1-sandbox.ibm.ifsalpha.com\" ], \"domain_validation_method\":\"dns-01\", \"issuer\": \"Let's Encrypt\", \"dns_provider_instance_crn\": \"crn:v1:bluemix:public:internet-svcs:global:a/7e1a24d64340427fb0698c570b6a96fe:5bdebf90-b666-4b2c-be6f-6c4ec5a31fe5::\", \"algorithm\": \"sha256WithRSAEncryption\", \"key_algorithm\": \"rsaEncryption 2048 bit\" }" \
https://eu-de.certificate-manager.cloud.ibm.com/api/v1/crn%3Av1%3Abluemix%3Apublic%3Acloudcerts%3Aeu-de%3Aa%2Fc390be9ff090461d9cbd6ac76667ba0a%3Adbdc1d6f-053c-4731-b676-a3a95f72009b%3A%3A/certificates/order
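
The order endpoint in the curl command above embeds the Certificate Manager instance CRN in the URL path, percent-encoded. A minimal Python sketch of building that URL with the standard library follows. The helper name `order_url` is hypothetical, and the host and path layout are copied from the curl command above, not taken from the API documentation.

```python
from urllib.parse import quote


def order_url(region: str, instance_crn: str) -> str:
    """Build the certificate order URL used in the curl command above.

    The CRN contains ':' and '/', so it is percent-encoded with safe=""
    to keep it a single path segment.
    """
    encoded_crn = quote(instance_crn, safe="")
    return (
        f"https://{region}.certificate-manager.cloud.ibm.com"
        f"/api/v1/{encoded_crn}/certificates/order"
    )


# Values taken from the curl command in this issue.
url = order_url(
    "eu-de",
    "crn:v1:bluemix:public:cloudcerts:eu-de:"
    "a/c390be9ff090461d9cbd6ac76667ba0a:"
    "dbdc1d6f-053c-4731-b676-a3a95f72009b::",
)
print(url)
```

Passing `safe=""` matters here: by default `quote` leaves `/` unescaped, which would split the CRN across two path segments.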

References

kavya498 · 2021-08-06T06:09:35Z

@ifs-anthonylecarrer
It looks like the POST call for the order is successful in Terraform as well.

We do not only order the certificate; we also wait for the certificate order state to become valid. The wait is nothing but continuous GET calls on the certificate, checking for its status to become valid. If it is in a failed state, Terraform will also fail.

Looking at the error, the certificate was ordered, but the status of the ordered certificate is failed, which is why Terraform also failed.
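
The wait described above can be sketched roughly as follows. This is an illustrative Python sketch, not the provider's actual code (the provider is written in Go). The names `wait_for_valid` and `CertificateFailed` are hypothetical, the `"pending"` status is an assumed placeholder for any non-terminal state, and the 10-second poll interval is made up.

```python
import time
from typing import Callable


class CertificateFailed(Exception):
    """Raised when the polled certificate status is 'failed'."""


def wait_for_valid(
    get_status: Callable[[], str],
    timeout: float,
    poll_interval: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_status until it returns 'valid'.

    A 'failed' status ends the wait immediately, regardless of how much
    of the timeout is left. Any other status is treated as still pending.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()  # stands in for the GET on the certificate
        if status == "valid":
            return status
        if status == "failed":
            raise CertificateFailed("certificate order failed")
        if clock() >= deadline:
            raise TimeoutError("timed out waiting for certificate")
        sleep(poll_interval)


# Simulated run: the certificate ends up in the failed state.
statuses = iter(["pending", "pending", "failed"])
try:
    wait_for_valid(lambda: next(statuses), timeout=900, sleep=lambda s: None)
except CertificateFailed as exc:
    print(exc)
```

In this sketch the 15-minute timeout only bounds how long a pending certificate is polled. A terminal `failed` status stops the wait at once.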

Can you please check your instance and let us know if you find a failed certificate?

Also, please route this issue to the Certificate Manager team to find out why the certificate is in a failed state.

ifs-anthonylecarrer · 2021-08-06T07:29:36Z

Hello @kavya498

I cannot understand why the Terraform resource always fails after 1 minute or succeeds after 1 minute while my timeout is 15 minutes. I also cannot understand why calling the API directly is always successful.

I have an issue opened on the IBM Cloud side:

https://cloud.ibm.com/unifiedsupport/cases?number=CS2401764

But the API is working fine!

kavya498 · 2021-08-06T13:34:51Z

@ifs-anthonylecarrer ,

After ordering the certificate, it looks like the certificate ended up in a failed state.
If the status of the certificate is failed, the Terraform provider will not wait any longer. It will exit immediately, no matter what the timeout is.

Error: Error waiting for Ordering Certificate (crn:v1:bluemix:public:cloudcerts:eu-de:a/c390be9ff090461d9cbd6ac76667ba0a:dbdc1d6f-053c-4731-b676-a3a95f72009b:certificate:d2dc434cca9ef289689ce5411b4ab93b) to be succeeded: The certificate crn:v1:bluemix:public:cloudcerts:eu-de:a/c390be9ff090461d9cbd6ac76667ba0a:dbdc1d6f-053c-4731-b676-a3a95f72009b:certificate:d2dc434cca9ef289689ce5411b4ab93b failed: <nil>

When you get this error, can you please check your certificate in the UI?
Is it in a failed state? If yes, then this is expected behavior from the Terraform provider.

Kindly let us know if your certificate goes from the failed state to the valid state after some time. If that is the case, we can add retries in the code.

Thanks..

ifs-anthonylecarrer · 2021-08-06T14:32:00Z

@kavya498

Of course, as you know, my certificate is in failed status. My question is why this certificate always falls into failed status on the first provisioning after 1 minute, why it works the second time, and why the answer always comes after exactly 1 minute.

As my certificate is in failed status, I have to destroy the resource and rerun my CI/CD workflow to provision a new one that works. I need to understand why the resource always fails or succeeds at exactly 1 minute. I don't know if it is related to the different APIs you are calling during provisioning.

As I'm using the Terraform resource, you can understand that my issue is about the Terraform resource, even if it is linked to an API call in the code of the Terraform resource.

Have a good weekend.

kavya498 · 2021-08-09T12:50:28Z

@ifs-anthonylecarrer ,
After the POST, we have wait logic that checks the status of the certificate.
There, we wait 1 minute after the POST before calling GET.
We can reduce it to 10 seconds for better time management.
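
To illustrate why both the failure and the success were reported at exactly 1 minute, here is a small Python sketch of the timing. It is a simplified model, not the provider's code: the function name `first_report_time`, the 10-second poll interval, and the 15-second time at which the certificate reaches a terminal status are all assumptions chosen for the example.

```python
def first_report_time(initial_delay: int, poll_interval: int, terminal_at: int) -> int:
    """Return the time in seconds after the POST at which a poller first
    sees a terminal status.

    The poller does its first GET at initial_delay and then one GET every
    poll_interval seconds. The certificate reaches a terminal status
    ("valid" or "failed") at terminal_at.
    """
    t = initial_delay
    while t < terminal_at:
        t += poll_interval
    return t


# Assumed for illustration: the order settles 15 seconds after the POST.
print(first_report_time(initial_delay=60, poll_interval=10, terminal_at=15))  # 60
print(first_report_time(initial_delay=10, poll_interval=10, terminal_at=15))  # 20
```

Under these assumptions, an initial delay of 1 minute means any outcome that settles earlier is only noticed at the 1-minute mark, whether it is a success or a failure. Reducing the delay changes when the result is reported, not whether the certificate fails.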

However, we do not know why the certificate goes into a failed state the first time. This has to be investigated by the service team (Certificate Manager Service).

kavya498 added the service/Certificate Manager Issues related to Certificate Manager Service label Aug 6, 2021

kavya498 mentioned this issue Aug 10, 2021

Fix: Reduce delay from 1m to 10s on cmr_order resource #2963

Merged

hkantare closed this as completed in #2963 Aug 10, 2021
