KIM should stop processing any action on Gardener shoot after configured timeout #324

Closed
3 tasks
Tracked by #112
koala7659 opened this issue Aug 4, 2024 · 4 comments
Labels
area/control-plane Related to all activities around Kyma Control Plane kind/feature Categorizes issue or PR as related to a new feature.

Comments

@koala7659
Contributor

koala7659 commented Aug 4, 2024

Description

All operations processed by the KIM service in the Runtime CR controller reconciliation loop should be interrupted once the configured timeout period has elapsed. Each runtime operation kind (provisioning / deprovisioning / upgrade) should have its own timeout duration defined.

If an operation times out:

  • The Runtime CR should be set to the "Failed" state
  • A specific condition should be added, indicating which operation has timed out

AC:

  • KIM deployment should be configured with three timeout values, one for each runtime operation type (provisioning / deprovisioning / upgrade)
  • Runtime operations should be interrupted after the timeout period has passed
  • Runtime CR status should be set correctly when an operation has timed out
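
The per-operation timeout check described above could be sketched roughly as follows. This is only an illustration in Python, not KIM's implementation: the `OPERATION_TIMEOUTS` mapping, the durations in it, and the `has_timed_out` helper are all hypothetical, since the issue does not state actual values or names.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical timeout values, one per operation kind.
# The issue does not specify the real durations.
OPERATION_TIMEOUTS = {
    "provisioning": timedelta(minutes=60),
    "deprovisioning": timedelta(minutes=60),
    "upgrade": timedelta(minutes=30),
}


def has_timed_out(operation, started_at, now=None):
    """Return True once the operation has run longer than its configured timeout.

    `started_at` and `now` are expected to be timezone-aware datetimes.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - started_at > OPERATION_TIMEOUTS[operation]
```

A reconciliation loop could call such a check on every pass and stop processing the shoot once it returns True.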

Relates to:
#193

Recovery
If a timeout occurs during the Update function, the user (KEB) can fix the broken configuration that timed out by manually deleting this annotation from the Runtime instance:
kyma-project.io/runtime-operation-started:
In that case the Runtime reconciler will patch the Gardener shoot with the fixed configuration and start the next upgrade cycle. If the operation succeeds on the Gardener side, the Runtime CR status will switch to the Ready state.
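
As a rough illustration of the recovery step, the snippet below removes the annotation from a Runtime CR represented as a plain Python dict. The helper name `clear_operation_started` and the dict layout are assumptions made for this sketch; only the annotation key comes from the description above.

```python
# Annotation key taken from the recovery description above.
OPERATION_STARTED_ANNOTATION = "kyma-project.io/runtime-operation-started"


def clear_operation_started(runtime):
    """Remove the operation-started annotation from a Runtime CR given as a dict.

    Returns True if the annotation was present and has been removed,
    False if there was nothing to remove.
    """
    annotations = runtime.get("metadata", {}).get("annotations") or {}
    if OPERATION_STARTED_ANNOTATION not in annotations:
        return False
    del annotations[OPERATION_STARTED_ANNOTATION]
    return True
```

On a live cluster the same effect is typically achieved with `kubectl annotate` and a trailing `-` after the annotation key.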

@koala7659 koala7659 added kind/feature Categorizes issue or PR as related to a new feature. area/control-plane Related to all activities around Kyma Control Plane labels Aug 4, 2024
@koala7659 koala7659 self-assigned this Aug 5, 2024
@koala7659
Contributor Author

koala7659 commented Aug 9, 2024

Output examples:


Provisioning timeout:

 status:
    conditions:
    - lastTransitionTime: "2024-08-08T11:17:23Z"
      message: Shoot creation timeout
      reason: ShootCreationTimeout
      status: "False"
      type: Provisioned
    state: Failed

Deprovisioning timeout:

 status:
    conditions:
    - lastTransitionTime: "2024-08-08T11:17:23Z"
      message: Runtime deprovisioning timeout
      reason: ShootDeletionTimeout
      status: "False"
      type: Deprovisioned
    state: Failed

Upgrade timeout:

 status:
    conditions:
    - lastTransitionTime: "2024-08-08T11:17:23Z"
      message: Shoot reconcile timeout
      reason: ShootProcessingTimeout
      status: "False"
      type: Provisioned
    state: Failed
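
The mapping from operation kind to the status shown in these examples could be expressed along the following lines. This is a simplified sketch: the `timeout_status` helper and the `TIMEOUT_CONDITIONS` table are hypothetical names, the field values are copied from the outputs above, and the sketch builds a fresh condition list instead of merging into existing conditions.

```python
from datetime import datetime, timezone

# (condition type, reason, message) per operation kind,
# copied from the example outputs above.
TIMEOUT_CONDITIONS = {
    "provisioning": ("Provisioned", "ShootCreationTimeout", "Shoot creation timeout"),
    "deprovisioning": ("Deprovisioned", "ShootDeletionTimeout", "Runtime deprovisioning timeout"),
    "upgrade": ("Provisioned", "ShootProcessingTimeout", "Shoot reconcile timeout"),
}


def timeout_status(operation, now):
    """Build the Runtime CR status for an operation that has timed out.

    `now` must be a timezone-aware datetime; it is rendered in UTC.
    """
    cond_type, reason, message = TIMEOUT_CONDITIONS[operation]
    timestamp = now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "conditions": [
            {
                "lastTransitionTime": timestamp,
                "message": message,
                "reason": reason,
                "status": "False",
                "type": cond_type,
            }
        ],
        "state": "Failed",
    }
```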

@piotrmiskiewicz
Member

Question: The recovery scenario described above covers the "update" operation. What about deprovisioning: should KEB remove the annotation or not?

@piotrmiskiewicz
Member

Question about the update scenario: on the first update, KEB changes machineType and KIM makes the proper changes in the shoot, but a timeout occurs and KIM sets the status to Failed. Then, say one day later, KEB sends another update (for example, a change to the autoscaler max). Should KEB do something with the annotation mentioned above?

@tobiscr
Contributor

tobiscr commented Aug 13, 2024

Decided to descope this feature as it could cause side-effects we don't want to cope with.

@tobiscr tobiscr closed this as completed Aug 13, 2024