Certificate manage order systematically failed first time after 1min provisionning and succeed the second time #2944

Closed
ifs-anthonylecarrer opened this issue Aug 5, 2021 · 5 comments · Fixed by #2963
Labels
service/Certificate Manager Issues related to Certificate Manager Service

Comments

@ifs-anthonylecarrer

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform CLI and Terraform IBM Provider Version

0.12.26 and 0.13.5

Affected Resource(s)

  • ibm_certificate_manager_order fails or succeeds after exactly 1 minute, even with a create timeout of 15 minutes

Terraform Configuration Files

Please include all Terraform configurations required to reproduce the bug. Bug reports without a functional reproduction may be closed without investigation.

provider "ibm" {
  region           = var.region
  ibmcloud_timeout = 60
  ibmcloud_api_key = var.ibmcloud_api_key
}

provider "ibm" {
  alias            = "sharedcis"
  region           = var.region
  ibmcloud_timeout = 60
  ibmcloud_api_key = var.subdomain_api_key
}

resource "ibm_resource_instance" "cms_instance" {
  name     = "cms-${var.project_short_name}.${var.top_domain}"
  location = var.region
  service  = "cloudcerts"
  plan     = "free"

  timeouts {
    create = "15m"
  }
}

data "ibm_iam_account_settings" "account_settings" {
}

data "ibm_cis" "cis_instance" {
  name      = "sharedcis"
  provider  = ibm.sharedcis
}

resource "ibm_iam_authorization_policy" "policy" {
  source_service_account        = data.ibm_iam_account_settings.account_settings.account_id
  source_service_name           = "cloudcerts"
  source_resource_instance_id   = ibm_resource_instance.cms_instance.guid
  target_service_name           = "internet-svcs"
  target_resource_instance_id   = data.ibm_cis.cis_instance.guid
  roles                         = ["Manager"]
  provider                      = ibm.sharedcis
}

resource "ibm_certificate_manager_order" "certificate" {
  for_each = toset(var.subdomains)
  certificate_manager_instance_id = ibm_resource_instance.cms_instance.id
  name                            = "Certificate for ${each.value}"
  domains                         = ["*.${each.value}"]
  rotate_keys                     = true
  domain_validation_method        = "dns-01"
  auto_renew_enabled              = true
  dns_provider_instance_crn       = data.ibm_cis.cis_instance.id

  depends_on = [ibm_iam_authorization_policy.policy]

  timeouts {
    create = "15m"
  }
}

Debug Output

---------------------
Failure logs 
---------------------

module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Creating...
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [10s elapsed]
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [20s elapsed]
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [30s elapsed]
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [40s elapsed]
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [50s elapsed]
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [1m0s elapsed]

Error: Error waiting for Ordering Certificate (crn:v1:bluemix:public:cloudcerts:eu-de:a/c390be9ff090461d9cbd6ac76667ba0a:dbdc1d6f-053c-4731-b676-a3a95f72009b:certificate:d2dc434cca9ef289689ce5411b4ab93b) to be succeeded: The certificate crn:v1:bluemix:public:cloudcerts:eu-de:a/c390be9ff090461d9cbd6ac76667ba0a:dbdc1d6f-053c-4731-b676-a3a95f72009b:certificate:d2dc434cca9ef289689ce5411b4ab93b failed: <nil>

  on modules/cms/main.tf line 30, in resource "ibm_certificate_manager_order" "certificate":



---------------------
Successful execution
---------------------

module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Creating...
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [10s elapsed]
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [20s elapsed]
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [30s elapsed]
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [40s elapsed]
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [50s elapsed]
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Still creating... [1m0s elapsed]
module.cms.ibm_certificate_manager_order.certificate["landingzone1-sandbox.ibm.ifsalpha.com"]: Creation complete after 1m1s [id=crn:v1:bluemix:public:cloudcerts:eu-de:a/c390be9ff090461d9cbd6ac76667ba0a:dbdc1d6f-053c-4731-b676-a3a95f72009b:certificate:576097fb2564275d228a22aa1a5a1f60]

Panic Output

Expected Behavior

Certificate ordered successfully

Actual Behavior

The certificate order fails every first time after 1 minute of provisioning and succeeds the second time, also after 1 minute of provisioning. Ordering the certificate through the API, which is asynchronous, always succeeds.

Steps to Reproduce

Prerequisite: the Terraform code above. Note that we use 2 providers because our CIS is shared and hosted on one account and our CMS is on another account.

  1. terraform apply -auto-approve -var-file <varfile>

Important Factoids

  • CIS is hosted on one account and CMS on another account
  • We are experimenting with this way of ordering certificates to replace notification channel + cloud function, because we were encountering exactly the same effects (https://cloud.ibm.com/unifiedsupport/cases?number=CS2401764)
  • The certificate order always succeeds when using the API, which is an asynchronous request

Here is the working asynchronous solution:

curl -X POST https://iam.cloud.ibm.com/acms/v1/policies \
-H 'Accept: application/json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <BEARER>' \
-d '{ "type": "authorization", "subjects": [ { "attributes": [ { "name": "serviceName", "value": "cloudcerts" }, { "name": "accountId", "value": "c390be9ff090461d9cbd6ac76667ba0a" }, { "name": "serviceInstance", "value": "dbdc1d6f-053c-4731-b676-a3a95f72009b" } ] } ], "roles": [ { "role_id": "crn:v1:bluemix:public:iam::::serviceRole:Manager" } ], "resources": [ { "attributes": [ { "name": "serviceName", "value": "internet-svcs" }, { "name": "accountId", "value": "7e1a24d64340427fb0698c570b6a96fe"},{ "name": "serviceInstance", "value": "5bdebf90-b666-4b2c-be6f-6c4ec5a31fe5" } ] } ] }'




curl -v -k -X POST \
-H "Content-Type: application/json" \
-H "authorization: Bearer <BEARER> " \
-d "{ \"name\":\"test\", \"description\":\"Test\", \"domains\":[ \"*.landingzone1-sandbox.ibm.ifsalpha.com\" ], \"domain_validation_method\":\"dns-01\", \"issuer\": \"Let's Encrypt\", \"dns_provider_instance_crn\": \"crn:v1:bluemix:public:internet-svcs:global:a/7e1a24d64340427fb0698c570b6a96fe:5bdebf90-b666-4b2c-be6f-6c4ec5a31fe5::\", \"algorithm\": \"sha256WithRSAEncryption\", \"key_algorithm\": \"rsaEncryption 2048 bit\" }" \
https://eu-de.certificate-manager.cloud.ibm.com/api/v1/crn%3Av1%3Abluemix%3Apublic%3Acloudcerts%3Aeu-de%3Aa%2Fc390be9ff090461d9cbd6ac76667ba0a%3Adbdc1d6f-053c-4731-b676-a3a95f72009b%3A%3A/certificates/order

References

@kavya498
Collaborator

kavya498 commented Aug 6, 2021

@ifs-anthonylecarrer
It looks like the POST call for the order succeeds in Terraform as well.

The provider does not only order the certificate; it also waits for the certificate's status to become valid. The wait is nothing more than repeated GET calls on the certificate to check whether its status is valid. If the certificate is in a failed state, Terraform fails too.

Judging by the error, the certificate was ordered, but the status of the ordered certificate is failed, which is why Terraform failed as well.

Can you please check your instance and let us know whether you find a failed certificate?

Please also route this issue to the Certificate Manager team to find out why the certificate ends up in a failed state.
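
The wait described above can be sketched roughly as follows. This is a minimal Python illustration, not the provider's actual code (the provider itself is not written in Python). The `get_status` callable stands in for the GET call on the certificate, and the names `wait_for_certificate`, `OrderFailed`, the `"pending"` status and the 10-second interval are all assumptions made for the example; only the `valid` and `failed` states come from this thread.

```python
import time
from typing import Callable


class OrderFailed(Exception):
    """Raised when the polled certificate reports a failed status."""


def wait_for_certificate(
    get_status: Callable[[], str],          # hypothetical stand-in for the GET call
    timeout_s: float,
    interval_s: float = 10.0,               # assumed polling interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the status is "valid"; stop at once if it is "failed"."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status == "valid":
            return status
        if status == "failed":
            # A failed status is terminal here, so the remaining timeout is never used.
            raise OrderFailed("certificate order reported status 'failed'")
        if clock() >= deadline:
            raise TimeoutError("certificate did not become valid before the timeout")
        sleep(interval_s)
```

With this shape, the create timeout only bounds how long a non-terminal status is tolerated. A `failed` status ends the wait on the first poll that observes it.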

@kavya498 kavya498 added the service/Certificate Manager Issues related to Certificate Manager Service label Aug 6, 2021
@ifs-anthonylecarrer
Author

Hello @kavya498

I cannot understand why the Terraform resource always fails or succeeds after exactly 1 minute when my timeout is 15 minutes. I also cannot understand why calling the API directly always succeeds.

I have a case open on the IBM Cloud side:

https://cloud.ibm.com/unifiedsupport/cases?number=CS2401764

But the API works fine!

@kavya498
Collaborator

kavya498 commented Aug 6, 2021

@ifs-anthonylecarrer ,

After the certificate is ordered, it looks like it ends up in a failed state.
If the status of the certificate is failed, the Terraform provider stops waiting and exits immediately, regardless of the timeout.

Error: Error waiting for Ordering Certificate (crn:v1:bluemix:public:cloudcerts:eu-de:a/c390be9ff090461d9cbd6ac76667ba0a:dbdc1d6f-053c-4731-b676-a3a95f72009b:certificate:d2dc434cca9ef289689ce5411b4ab93b) to be succeeded: The certificate crn:v1:bluemix:public:cloudcerts:eu-de:a/c390be9ff090461d9cbd6ac76667ba0a:dbdc1d6f-053c-4731-b676-a3a95f72009b:certificate:d2dc434cca9ef289689ce5411b4ab93b failed: <nil>

When you get this error, can you please check your certificate in the UI?
Is it in a failed state? If so, this is the expected behavior of the Terraform provider.

Please let us know whether your certificate goes from failed to valid after some time. If that is the case, we can add retries in the code.

Thanks.
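
If the certificate did move from failed to valid on its own, one way the proposed retries could look is to treat `failed` as terminal only after several consecutive failed reads. The Python sketch below is purely illustrative and is not what the provider does. The function name, the `max_failed_reads` and `max_polls` parameters, and the `pause` hook (where a real caller would sleep between polls) are all hypothetical.

```python
from typing import Callable


def wait_tolerating_failures(
    get_status: Callable[[], str],             # hypothetical stand-in for the GET call
    max_failed_reads: int = 3,                 # assumed tolerance before giving up
    max_polls: int = 90,                       # assumed upper bound on polls
    pause: Callable[[], None] = lambda: None,  # hook for sleeping between polls
) -> str:
    """Return "valid" once seen; give up only after repeated "failed" reads."""
    failed_reads = 0
    for _ in range(max_polls):
        status = get_status()
        if status == "valid":
            return status
        if status == "failed":
            failed_reads += 1
            if failed_reads >= max_failed_reads:
                raise RuntimeError("certificate stayed in 'failed' state")
        else:
            # Any other status resets the count of consecutive failed reads.
            failed_reads = 0
        pause()
    raise TimeoutError("certificate did not become valid within max_polls")
```

This only helps if the service really does recover a failed order, which is exactly the open question in this comment.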

@ifs-anthonylecarrer
Author

@kavya498

Surely you know that my certificate is in failed status. My question is why this certificate always ends up in failed status on the first provisioning attempt, why it works on the second attempt, and why the answer always comes after exactly 1 minute.

Because my certificate is in failed status, I have to destroy the resource and rerun my CI/CD workflow to provision a new one that works. I need to understand why the resource always fails or succeeds at exactly 1 minute. I don't know whether it is related to the different APIs you call during provisioning.

Since I'm using the Terraform resource, you can understand that my issue is about the Terraform resource, even if it comes down to an API call in the resource's code.

Have a good weekend

@kavya498
Collaborator

kavya498 commented Aug 9, 2021

@ifs-anthonylecarrer ,
After the POST, we have wait logic that checks the status of the certificate.
In that logic, we wait 1 minute after the POST before making the first GET call.
We can reduce that to 10 seconds for better time management.

However, we do not know why the certificate goes into a failed state the first time. This has to be investigated by the service team (Certificate Manager Service).
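
The effect of that initial delay on the "exactly 1 minute" timing can be worked through with a little arithmetic. The Python sketch below assumes that the order reaches a final state (valid or failed) some number of seconds after the POST, and that polls then repeat at a fixed interval. The function name and the 10-second interval are assumptions for illustration, not values taken from the provider.

```python
def first_report_time(
    settles_at_s: float,       # assumed time at which the order becomes valid or failed
    initial_delay_s: float,    # wait between the POST and the first GET
    interval_s: float = 10.0,  # assumed spacing of later GET calls
) -> float:
    """Return the elapsed seconds at which a poller first sees the final state."""
    t = initial_delay_s
    while t < settles_at_s:
        t += interval_s
    return t


# An order that settles after 5 seconds is still only reported at the first poll.
print(first_report_time(5.0, 60.0))   # 60.0
print(first_report_time(5.0, 10.0))   # 10.0
print(first_report_time(25.0, 10.0))  # 30.0
```

Under these assumptions, any order that settles within the first minute is reported at the 1-minute mark whether it succeeded or failed, which matches both log excerpts above. A shorter delay changes when the result is seen, not what the result is.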
