diff --git a/enhancements/machine-api/machine-health-checking.md b/enhancements/machine-api/machine-health-checking.md
index f49bfa1db09..e529dac7d60 100644
--- a/enhancements/machine-api/machine-health-checking.md
+++ b/enhancements/machine-api/machine-health-checking.md
@@ -154,6 +154,8 @@ Out of band:
 - The machine controller provider deletes the instance.
 - The machine controller deletes the machine.
 
+![Machine health check](./mhc.svg)
+
 #### Out of tree remediation controller, e.g baremetal reboot:
 - An external remediation can plug in by setting the `healthchecking.openshift.io/strategy: reboot` on the MHC resource.
 - An external remediation controller remediation could then watch machines annotated with `healthchecking.openshift.io/remediation: reboot` and react as it sees fit.
diff --git a/enhancements/machine-api/mhc.plantuml b/enhancements/machine-api/mhc.plantuml
new file mode 100644
index 00000000000..ab87df51195
--- /dev/null
+++ b/enhancements/machine-api/mhc.plantuml
@@ -0,0 +1,20 @@
+@startuml
+start;
+:Machine Health Check controller;
+repeat
+  repeat
+    :Watch MHCs;
+    :Find unhealthy targets: Need remediation or going towards timeout;
+  repeat while (unhealthyTargets > maxUnhealthy) is (yes)
+  -> no;
+repeat while (API server machine deletion requests for machines that need remediation) is (requeue with minTime to timeout delay)
+
+-[#blue,dashed]-> Out of band;
+#LightBlue:The machine owner controller watches deletion timestamp.
+Reconciles towards desired number of replicas.
+The process to create a new machine/node starts;
+#LightBlue:The machine controller drains the unhealthy node;
+#LightBlue:The machine controller provider deletes the unhealthy instance;
+#LightBlue:The machine controller removes the unhealthy machine finalizer;
+#LightBlue:The API server removes the unhealthy machine resource;
+@enduml
diff --git a/enhancements/machine-api/mhc.png b/enhancements/machine-api/mhc.png
new file mode 100644
index 00000000000..16f4d538d68
Binary files /dev/null and b/enhancements/machine-api/mhc.png differ
diff --git a/enhancements/machine-api/mhc.svg b/enhancements/machine-api/mhc.svg
new file mode 100644
index 00000000000..af28d3c0c3c
--- /dev/null
+++ b/enhancements/machine-api/mhc.svg
@@ -0,0 +1,32 @@
+Machine Health Check controllerWatch MHCsFind unhealthy targets: Need remediation or going towards timeoutunhealthyTargets > maxUnhealthyyesAPI server machine deletion requests for machines that need remediationrequeue with minTime to timeout delayThe machine owner controller watches deletion timestamp.Reconciles towards desired number of replicas.The process to create a new machine/node startsThe machine controller drains the unhealthy nodeThe machine controller provider deletes the unhealthy instanceThe machine controller removes the unhealthy machine finalizerThe API server removes the unhealthy machine resourcenoOut of band
\ No newline at end of file
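
The control flow drawn in `mhc.plantuml` can be read as a single reconcile pass. The Python sketch below is only an illustration of that reading, not the controller's actual code: `Target`, `reconcile`, and the field names are hypothetical, and it assumes that "unhealthy targets" counts both machines that already need remediation and machines still heading towards their timeout.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Target:
    """Hypothetical stand-in for a machine/node pair checked by an MHC."""
    name: str
    needs_remediation: bool = False             # unhealthy and already past its timeout
    seconds_to_timeout: Optional[float] = None  # unhealthy, but timeout not yet reached


def reconcile(targets: List[Target], max_unhealthy: int) -> Tuple[List[str], Optional[float]]:
    """One pass of the diagram: returns (machines to delete, requeue delay in seconds)."""
    # "Find unhealthy targets: Need remediation or going towards timeout"
    unhealthy = [
        t for t in targets
        if t.needs_remediation or t.seconds_to_timeout is not None
    ]

    # "unhealthyTargets > maxUnhealthy" -> yes: skip remediation, keep watching
    if len(unhealthy) > max_unhealthy:
        return [], None

    # "API server machine deletion requests for machines that need remediation"
    to_delete = [t.name for t in unhealthy if t.needs_remediation]

    # "requeue with minTime to timeout delay": wake up for the nearest timeout
    pending = [t.seconds_to_timeout for t in unhealthy if not t.needs_remediation]
    requeue_after = min(pending) if pending else None
    return to_delete, requeue_after


if __name__ == "__main__":
    machines = [
        Target("a", needs_remediation=True),
        Target("b", seconds_to_timeout=30.0),
        Target("c", seconds_to_timeout=12.5),
        Target("d"),  # healthy
    ]
    print(reconcile(machines, max_unhealthy=3))
    print(reconcile(machines, max_unhealthy=2))
```

Everything after the deletion request (drain, instance deletion, finalizer removal) is out of band in the diagram and is left out of the sketch for that reason.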