Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ MS preflight checks to improve cluster stability #8595

Conversation

ykakarap
Copy link
Contributor

@ykakarap ykakarap commented May 3, 2023

What this PR does / why we need it:

This PR adds pre-flight checks before a new Machine is created in the MachineSet controller. These pre-flight checks will help with improving the stability of the cluster.

The following pre-flight checks are performed:

  • Ensure that the control plane is not provisioning
  • Ensure that the control plane is not upgrading
  • Ensure that the MachineSet and ControlPlane versions conform to Kubernetes version skew policy
  • If kubedam bootstrap providers is used, ensure that the MachineSet and ControlPlane versions conform to kubeadm version skew policy

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #8657

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 3, 2023
@ykakarap ykakarap changed the title ⚠️ [DO_NOT_REVIEW][EXPERIMENT] initial commit for MS preflight checks to avoid double rollouts ⚠️ [DO_NOT_REVIEW][EXPERIMENT] initial commit for MS preflight checks to improve cluster stability May 3, 2023
@ykakarap ykakarap changed the title ⚠️ [DO_NOT_REVIEW][EXPERIMENT] initial commit for MS preflight checks to improve cluster stability ⚠️ [DO_NOT_REVIEW][EXPERIMENT] MS preflight checks to improve cluster stability May 3, 2023
@ykakarap
Copy link
Contributor Author

/retest

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 15, 2023
@ykakarap ykakarap force-pushed the pr-double-rollout_ms-preflight-check branch from ec31453 to c58ad0b Compare May 15, 2023 17:31
@ykakarap ykakarap changed the title ⚠️ [DO_NOT_REVIEW][EXPERIMENT] MS preflight checks to improve cluster stability ⚠️ MS preflight checks to improve cluster stability May 15, 2023
Comment on lines 1097 to 1109
// Check the Kubernetes version skew policy.
// => MS minor version cannot be greater than the Control Plane minor version.
// => MS minor version cannot be older than 2 minor version of Control Plane.
// Kubernetes skew policy: https://kubernetes.io/releases/version-skew-policy/#kubelet
floorVersion := semver.Version{
Major: cpSemver.Major,
Minor: cpSemver.Minor - 2,
Patch: 0,
}
if msSemver.GT(cpSemver) || msSemver.LT(floorVersion) {
log.Info(fmt.Sprintf("Pre-flight check failed: MachineSet version (%s) and ControlPlane version (%s) do not conform to the kubernetes version skew policy", msVersion, cpVersion))
return ctrl.Result{RequeueAfter: preflightFailedRequeueAfter}, nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be in a validation webhook?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which one? MD + MS?

At least for the latter check, the webhook is not enough as the CP version can just change and then existing MachineSets "fall" outside of the policy

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be on both sides?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll have to think a bit about it, but sounds reasonable at a first glance

Copy link
Member

@sbueringer sbueringer May 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine to add both checks to the MD + MS validating webhook.

But they should only block on create and on update only if the version is changing. They shouldn't block an update where the version is not changed in my opinion

I would prefer implementing the webhooks in a follow-up PR though.

Hm, but it wouldn't make sense to disable the validation on the webhooks via the skip preflight check annotation. As the webhook is not a preflight check. We could maybe choose to only return warnings in the webhook? (or maybe rather add an unsafe annotation to skip the validation in the webhook)

@vincepri WDYT?

ms.Spec.Template.Spec.Bootstrap.ConfigRef.Kind == "KubeadmConfigTemplate"
if kubeadmBootstrapProviderUsed {
if cpVersion != msVersion {
log.Info(fmt.Sprintf("Pre-flight check failed: MachineSet version (%s) and ControlPlane version (%s) do not conform to kubeadm version skew policy: kubeadm bootstrap provider only supports joining with the same version as the control plane", msVersion, cpVersion))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, shouldn't this allow users to create worker nodes with an older version that's still in the skew policy?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The kubeadm skew policy is pretty strict:

If new nodes are joined to the cluster, the kubeadm binary used for kubeadm join must match the last version of kubeadm used to either create the cluster with kubeadm init or to upgrade the same node with kubeadm upgrade.
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#version-skew-policy

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This skew policy is for the kubeadm binary, not for the kubelet version which joins the cluster (kubeadm even tests joining with an older kubelet binary).

So the preflight check would make the assumption, that the version set at the machineset matches kubeadm version used if I'm reading it correctly? 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like @chrischdi, the above reads like it's mostly for the binary that sets the very minimal version based on the control plane init; which makes sense to me, although I should still be able to scale up/down or rollout Machine Deployments on an older version if the overall Kubernetes skew policy allows

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chrischdi The version set here determines the VM image (or the appropriate infra specific thing) used and the assumption is that VM image uses the matching kubeadm version binary (which I believe is the is the case for the providers).

@vincepri This preflight check is only performed if using the kubeadm bootstrap provider. The preflight check will only fail when trying to create new Machines (scale down still works) using the old version because kubeadm join could fail as the kubeadm version skew is violated.
Along with that, the user can also just skip the kubeadm preflight check if the user has verified that new workers running an old version can join the control plane.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The preflight check will only fail when trying to create new Machines (scale down still works) using the old version because kubeadm join could fail as the kubeadm version skew is violated.

I'm not sure if I'm seeing the kubeadm skew issue here, given we should still allow to scale (up/down) pools of nodes on older versions, if the Kubernetes overall skew policy is respected.

cc @fabriziopandini

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that joining new nodes with an older kubeadm version is not supported by kubeadm upstream. It mostly works but there is no test coverage for it.

An example can be seen here: #7533 (comment). The old kubeadm version wasn't able to parse the ClusterConfiguration written by the newer kubeadm version: (or in the Cluster API case, KCP)

error execution phase preflight: unable to fetch the kubeadm-config ConfigMap: failed to decode cluster configuration data: no kind "ClusterConfiguration" is registered for version "kubeadm.k8s.io/v1beta3" in scheme "k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/scheme/scheme.go:31"
But this is just an example we're aware of, as there is no testing on the kubeadm side there simply is no official support for this.

Our idea was to have a preflight check to ensure we're acting within the official support ranges of kubeadm & Kubernetes. For folks who are sure that other combinations work as well (I assume because they tested it) we wanted to provide a way to opt-out.

There are a few cases where it's better to not act if we're outside of the support range and thus can't tell if the Machine has a chance to come up:

  • MHC: It's better to keep an (at least momentarily) unhealty Machine vs deleting it and then the creation/join fails
  • If autoscaler scales up a MD/MS with an older version and the Machine creations/joins continously fail we can have a lot of Machines around which have no chance for success but can become pretty expensive

internal/controllers/machineset/machineset_controller.go Outdated Show resolved Hide resolved
Copy link
Member

@sbueringer sbueringer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First round of feedback, I'll take a closer look tomorrow

api/v1beta1/common_types.go Outdated Show resolved Hide resolved
api/v1beta1/common_types.go Outdated Show resolved Hide resolved
api/v1beta1/common_types.go Outdated Show resolved Hide resolved
api/v1beta1/common_types.go Outdated Show resolved Hide resolved
internal/contract/types.go Outdated Show resolved Hide resolved
ms.Spec.Template.Spec.Bootstrap.ConfigRef.Kind == "KubeadmConfigTemplate"
if kubeadmBootstrapProviderUsed {
if cpVersion != msVersion {
log.Info(fmt.Sprintf("Pre-flight check failed: MachineSet version (%s) and ControlPlane version (%s) do not conform to kubeadm version skew policy: kubeadm bootstrap provider only supports joining with the same version as the control plane", msVersion, cpVersion))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The kubeadm skew policy is pretty strict:

If new nodes are joined to the cluster, the kubeadm binary used for kubeadm join must match the last version of kubeadm used to either create the cluster with kubeadm init or to upgrade the same node with kubeadm upgrade.
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#version-skew-policy

Comment on lines 1097 to 1109
// Check the Kubernetes version skew policy.
// => MS minor version cannot be greater than the Control Plane minor version.
// => MS minor version cannot be older than 2 minor version of Control Plane.
// Kubernetes skew policy: https://kubernetes.io/releases/version-skew-policy/#kubelet
floorVersion := semver.Version{
Major: cpSemver.Major,
Minor: cpSemver.Minor - 2,
Patch: 0,
}
if msSemver.GT(cpSemver) || msSemver.LT(floorVersion) {
log.Info(fmt.Sprintf("Pre-flight check failed: MachineSet version (%s) and ControlPlane version (%s) do not conform to the kubernetes version skew policy", msVersion, cpVersion))
return ctrl.Result{RequeueAfter: preflightFailedRequeueAfter}, nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which one? MD + MS?

At least for the latter check, the webhook is not enough as the CP version can just change and then existing MachineSets "fall" outside of the policy

internal/controllers/machineset/machineset_controller.go Outdated Show resolved Hide resolved
internal/controllers/machineset/machineset_controller.go Outdated Show resolved Hide resolved
Copy link
Member

@sbueringer sbueringer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First round of feedback, I'll take a closer look tomorrow

@killianmuldoon
Copy link
Contributor

/area machineset

@k8s-ci-robot k8s-ci-robot added the area/machineset Issues or PRs related to machinesets label May 17, 2023
api/v1beta1/common_types.go Outdated Show resolved Hide resolved
api/v1beta1/common_types.go Show resolved Hide resolved
api/v1beta1/common_types.go Outdated Show resolved Hide resolved
api/v1beta1/common_types.go Outdated Show resolved Hide resolved
api/v1beta1/common_types.go Outdated Show resolved Hide resolved
api/v1beta1/common_types.go Outdated Show resolved Hide resolved
internal/controllers/machineset/machineset_controller.go Outdated Show resolved Hide resolved
internal/controllers/machineset/machineset_controller.go Outdated Show resolved Hide resolved
internal/controllers/machineset/machineset_controller.go Outdated Show resolved Hide resolved
internal/controllers/machineset/machineset_controller.go Outdated Show resolved Hide resolved
Comment on lines 1097 to 1109
// Check the Kubernetes version skew policy.
// => MS minor version cannot be greater than the Control Plane minor version.
// => MS minor version cannot be older than 2 minor version of Control Plane.
// Kubernetes skew policy: https://kubernetes.io/releases/version-skew-policy/#kubelet
floorVersion := semver.Version{
Major: cpSemver.Major,
Minor: cpSemver.Minor - 2,
Patch: 0,
}
if msSemver.GT(cpSemver) || msSemver.LT(floorVersion) {
log.Info(fmt.Sprintf("Pre-flight check failed: MachineSet version (%s) and ControlPlane version (%s) do not conform to the kubernetes version skew policy", msVersion, cpVersion))
return ctrl.Result{RequeueAfter: preflightFailedRequeueAfter}, nil
}
Copy link
Member

@sbueringer sbueringer May 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine to add both checks to the MD + MS validating webhook.

But they should only block on create and on update only if the version is changing. They shouldn't block an update where the version is not changed in my opinion

I would prefer implementing the webhooks in a follow-up PR though.

Hm, but it wouldn't make sense to disable the validation on the webhooks via the skip preflight check annotation. As the webhook is not a preflight check. We could maybe choose to only return warnings in the webhook? (or maybe rather add an unsafe annotation to skip the validation in the webhook)

@vincepri WDYT?

internal/controllers/machineset/machineset_controller.go Outdated Show resolved Hide resolved
@ykakarap ykakarap mentioned this pull request May 20, 2023
13 tasks
@ykakarap ykakarap force-pushed the pr-double-rollout_ms-preflight-check branch 3 times, most recently from c21ccfb to e63fa65 Compare May 20, 2023 08:21
api/v1beta1/common_types.go Show resolved Hide resolved
api/v1beta1/common_types.go Outdated Show resolved Hide resolved
config/manager/manager.yaml Outdated Show resolved Hide resolved
feature/feature.go Outdated Show resolved Hide resolved
internal/controllers/machineset/machineset_preflight.go Outdated Show resolved Hide resolved
internal/controllers/machineset/machineset_preflight.go Outdated Show resolved Hide resolved
internal/controllers/machineset/machineset_preflight.go Outdated Show resolved Hide resolved
internal/controllers/machineset/machineset_preflight.go Outdated Show resolved Hide resolved
internal/controllers/machineset/machineset_preflight.go Outdated Show resolved Hide resolved
internal/controllers/machineset/machineset_preflight.go Outdated Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels May 23, 2023
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 24, 2023
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: e3d9cbd57e8207a0a1ddecbee534b716b1cb300b

@ykakarap
Copy link
Contributor Author

Based on the discussion during the May 17th office hours we decided to place the MachineSet preflight checks behind a feature gate (MachineSetPreflightChecks).

The feature gate is alpha will be off by default and therefore there wont be any changes to any of the existing behavior.
Users can enable the feature gate at their own discretion.

This will give us time in the mean time as kubeadm works on its version skew.
Note that the kubeadm preflight check will still be needed for older Kubernetes versions.

@ykakarap ykakarap changed the title ⚠️ MS preflight checks to improve cluster stability ❇️ MS preflight checks to improve cluster stability May 24, 2023
@ykakarap ykakarap changed the title ❇️ MS preflight checks to improve cluster stability ✨ MS preflight checks to improve cluster stability May 24, 2023
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 25, 2023
@ykakarap ykakarap force-pushed the pr-double-rollout_ms-preflight-check branch from 0241aa6 to c128409 Compare May 25, 2023 04:58
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 25, 2023
@k8s-ci-robot k8s-ci-robot requested a review from sbueringer May 25, 2023 04:58
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 25, 2023
@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented May 25, 2023

@ykakarap: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-cluster-api-e2e-informing-dualstack-ipv6-main 397a2d7 link false /test pull-cluster-api-e2e-informing-dualstack-ipv6-main
pull-cluster-api-e2e-full-dualstack-ipv6-main c58ad0b link false /test pull-cluster-api-e2e-full-dualstack-ipv6-main

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@ykakarap
Copy link
Contributor Author

/retest

@sbueringer
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 25, 2023
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 4b4ef83d318a6ebde5c80d7d497d7f808e59a744

api/v1beta1/common_types.go Outdated Show resolved Hide resolved
api/v1beta1/common_types.go Outdated Show resolved Hide resolved
api/v1beta1/common_types.go Outdated Show resolved Hide resolved
api/v1beta1/common_types.go Outdated Show resolved Hide resolved
api/v1beta1/common_types.go Outdated Show resolved Hide resolved
internal/controllers/machineset/machineset_preflight.go Outdated Show resolved Hide resolved
internal/controllers/machineset/machineset_preflight.go Outdated Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 30, 2023
@sbueringer
Copy link
Member

/lgtm

approve pending squash

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 31, 2023
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 06fb9ba8d474f56d9564a9b440bd7963552c4817

@fabriziopandini
Copy link
Member

/lgtm

@ykakarap ykakarap force-pushed the pr-double-rollout_ms-preflight-check branch from fdfd263 to 5b6a0f7 Compare May 31, 2023 17:04
@ykakarap
Copy link
Contributor Author

Squashed.

@sbueringer
Copy link
Member

/lgtm
/approve

Thank you very much!

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sbueringer

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 31, 2023
@k8s-ci-robot k8s-ci-robot merged commit e2bab2f into kubernetes-sigs:main May 31, 2023
@k8s-ci-robot k8s-ci-robot added this to the v1.5 milestone May 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/machineset Issues or PRs related to machinesets cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MachineSet does not consider current state of Cluster before creating new machines
7 participants