⚠️ Set MachinePool feature flag default to true + Beta #10141
Conversation
For some reason this change is preventing the "When following the Cluster API quick-start with ClusterClass" spec from deleting its cluster; it times out in
Sounds really strange. I think the feature flag was already enabled in all e2e tests (https://github.com/kubernetes-sigs/cluster-api/blob/main/test/e2e/config/docker.yaml#L263)
Modified the PR type so this change pops up in the release notes
Any updates here?
No updates, I was out for a week and have had other tasks. I'll dig into this again today. It looks like a latent bug somewhere in the DockerMP controller, when ClusterClass is also enabled. I made one attempt to fix the code but it wasn't sufficient. We thought the feature flag was on in all the tests, but perhaps not.
Force-pushed from f7f9fab to 0e2905a
I'm clearly grasping at straws here, since I can't reproduce this locally (it always passes and cleanup doesn't error out). One thing I noticed is this at the end of every
Edit: never mind, that seems to be true for passing PRs as well.
I'll take a closer look when I get around to it
feature/feature.go
Outdated
@@ -70,7 +70,7 @@ func init() {
 // To add a new feature, define a key for it above and add it here.
 var defaultClusterAPIFeatureGates = map[featuregate.Feature]featuregate.FeatureSpec{
 	// Every feature should be initiated here:
-	MachinePool: {Default: false, PreRelease: featuregate.Alpha},
+	MachinePool: {Default: true, PreRelease: featuregate.Alpha},
I have no clue, but wondering if it should be changed to Beta
(but has nothing to do with the CI failure of course)
+1, in core k8s features are defaulted to true in beta.
Can we just change it to Beta here then?
The graduation criteria for MachinePools in its proposal are still "TBD," and MachinePool Machines have landed and seem stable, so I think this makes sense; I'll go ahead and do so and squash. But please let me know if there is anything else we should consider.
Sounds good to me. After a few years it's fair to call it beta :)
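For reference, a minimal sketch of how the feature/feature.go entry would read once the gate defaults to true and is marked Beta, per the discussion above (other gates in the map are elided; this mirrors the diff rather than quoting the final code):

```go
package feature

import (
	"k8s.io/component-base/featuregate"
)

const (
	// MachinePool is the gate key; in the real file it is declared alongside
	// the other experimental feature keys.
	MachinePool featuregate.Feature = "MachinePool"
)

// defaultClusterAPIFeatureGates with MachinePool flipped to default-on and Beta.
var defaultClusterAPIFeatureGates = map[featuregate.Feature]featuregate.FeatureSpec{
	// Every feature should be initiated here:
	MachinePool: {Default: true, PreRelease: featuregate.Beta},
}
```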
@mboersma Pushed a commit to also enable MachinePools in CAPD per default. But I don't think this causes the issue. Based on the logs I can see that MachinePools is enabled in CAPD anyway (because we enable the feature gate via docker.yaml). This should also mean that changing the manager.yamls shouldn't be relevant
Thanks! I'm surprised I missed this... But I agree, it's probably not related. My guess is that there's some race condition-type bug in deleting a Docker MachinePool this happens to trigger. Locally, I've run e2e many times but only seen a couple of these failures. But clearly changing the
Are we anticipating any users complaining "we don't want MP on by default", and if so, what would be their story?
I'm sure someone will object, and I understand from the point of view of "keep it simple." I don't think there's much cost to turning it on, especially if your clusters don't contain MachinePools. Changing the default here shouldn't be a breaking change (this PR notwithstanding 😛). In terms of this change, they can still set
Force-pushed from 92e5bf2 to 0172398
Last one from my side, otherwise lgtm: #10141 (comment) I think it's fine to enable per default and also to eventually get rid of the feature flag in later releases. I think if folks have a problem with an additional controller being enabled per default they can implement a webhook blocking creations of MachinePools (or of course while we still have the feature flag disable it)
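The webhook idea mentioned above can be sketched as a controller-runtime CustomValidator that denies MachinePool creation. This is only an illustration of the suggestion, not code from this PR; the MachinePoolBlocker type and its registration are hypothetical:

```go
package webhooks

import (
	"context"
	"fmt"

	"k8s.io/apimachinery/pkg/runtime"
	expv1 "sigs.k8s.io/cluster-api/exp/api/v1beta1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/webhook"
	"sigs.k8s.io/controller-runtime/pkg/webhook/admission"
)

// MachinePoolBlocker (hypothetical) denies creation of MachinePool objects,
// keeping a cluster MachinePool-free even with the feature gate enabled.
type MachinePoolBlocker struct{}

var _ webhook.CustomValidator = &MachinePoolBlocker{}

// SetupWebhookWithManager registers the validating webhook with the manager.
func (b *MachinePoolBlocker) SetupWebhookWithManager(mgr ctrl.Manager) error {
	return ctrl.NewWebhookManagedBy(mgr).
		For(&expv1.MachinePool{}).
		WithValidator(b).
		Complete()
}

// ValidateCreate rejects every MachinePool creation request.
func (b *MachinePoolBlocker) ValidateCreate(_ context.Context, obj runtime.Object) (admission.Warnings, error) {
	mp, ok := obj.(*expv1.MachinePool)
	if !ok {
		return nil, fmt.Errorf("expected a MachinePool but got %T", obj)
	}
	return nil, fmt.Errorf("creation of MachinePool %s/%s is blocked by cluster policy", mp.Namespace, mp.Name)
}

// ValidateUpdate allows changes to MachinePools that already exist.
func (b *MachinePoolBlocker) ValidateUpdate(_ context.Context, _, _ runtime.Object) (admission.Warnings, error) {
	return nil, nil
}

// ValidateDelete always allows deletion.
func (b *MachinePoolBlocker) ValidateDelete(_ context.Context, _ runtime.Object) (admission.Warnings, error) {
	return nil, nil
}
```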
Force-pushed from ad34a9f to b13f043
/retitle
Force-pushed from 426ac75 to e961abe
/lgtm
LGTM label has been added. Git tree hash: 09879fd4885b2042d5eb6a2ed29ca2a9dff325d2
Last one
cc @mboersma ^^
Force-pushed from b13369c to 33def78
Thx! /lgtm
LGTM label has been added. Git tree hash: 6b7170d400000b57e047c959fc2e85c64f3704bb
/lgtm
/approve
@mboersma thanks for taking care of this!
When we start thinking about GA, we should figure out how to move the code out of the exp folder, but that's not necessary for this step
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: fabriziopandini. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
What this PR does / why we need it:
Sets the default for EXP_MACHINE_POOL to true, so MachinePools are an "always on" feature (unless the user disables it). This seems like the first step en route to moving MachinePools to beta status based on discussion at last week's community meeting, but please let me know if there are other prerequisites or details I've missed.
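For context on how the default reaches the controllers: the provider manifests pass EXP_MACHINE_POOL into the managers' --feature-gates flag (the manager.yaml and docker.yaml mentioned above), and the MachinePool reconcilers are only wired up when the gate evaluates to true. A rough sketch of that check, assuming this repo's feature package; setupControllers and setupMachinePoolController are hypothetical stand-ins for the real wiring in main.go:

```go
package controllers

import (
	"sigs.k8s.io/cluster-api/feature"
	ctrl "sigs.k8s.io/controller-runtime"
)

// setupMachinePoolController is a hypothetical stand-in for registering the
// experimental MachinePool reconciler with the manager.
func setupMachinePoolController(mgr ctrl.Manager) error {
	_ = mgr // the real code adds the exp MachinePool reconciler here
	return nil
}

// setupControllers shows the gate check: with this change the branch runs by
// default, unless the user sets the gate (via EXP_MACHINE_POOL) to false.
func setupControllers(mgr ctrl.Manager) error {
	if feature.Gates.Enabled(feature.MachinePool) {
		return setupMachinePoolController(mgr)
	}
	return nil
}
```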
Which issue(s) this PR fixes:
Refs #9005
/area machinepool