-
Notifications
You must be signed in to change notification settings - Fork 621
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Message to Suspend #4232
base: main
Are you sure you want to change the base?
Add Message to Suspend #4232
Conversation
accepts a reason for why the resource was suspended. Defines the resource metadata annotation key that stores the reason for the resource suspension. Updates the suspend and resume resource patching to add or remove the annotation holding the suspend reason. Signed-off-by: Travis Mattera <travis@mattera.io>
reason
cli flag to the suspend
command that
cmd/flux/suspend.go
Outdated
} | ||
|
||
var suspendArgs SuspendFlags | ||
|
||
func init() { | ||
suspendCmd.PersistentFlags().BoolVarP(&suspendArgs.all, "all", "", false, | ||
"suspend all resources in that namespace") | ||
suspendCmd.PersistentFlags().StringVarP(&suspendArgs.reason, "reason", "r", "suspended", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to have some clarity about the intention here. This flag has a default value, which is always used to set the annotation on a suspended object. This results in the suspend command to always add an annotation on the object even when no reason is passed.
Name: test-1
Namespace: default
Labels: <none>
Annotations: suspend.fluxcd.io/reason: suspended
API Version: source.toolkit.fluxcd.io/v1
Kind: GitRepository
This makes every suspended object to have this annotation always. It touches a discussion we had a few times before about using annotation to suspend the object instead of a field in the spec. I think in the discussions, we usually conclude that it's too late now to remove suspend from the spec as some APIs are already GA.
Implementing this change such that there's a reason annotation on suspended object always, makes the API a little repetitive/redundant/open for multiple interpretation for everyone forever. Some users may assume that the annotation is how they can detect if the object is suspended. But if an object is resumed without using Flux CLI, the annotation will remain on the object.
We can avoid all these if this is implemented such that only when a reason is provided, the annotation is added, else it behaves as before. Users who know about this feature, will opt-in to have the annotation.
Is this an intentional design choice to always have the annotation with some benefits?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We would add this annotation only if a reason is provided. So we really need to remove the default from the flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I see it beneficial to support the suspend event and optional reason metadata percolate out through flux events, that's outside the scope of this PR.
The default of --reason suspended
tightly coupling flux cli initiated suspensions to a resource annotation was an oversight of mine. I'm glad to have such a proper review @darkowlzz @stefanprodan :) and will change this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
During the community meeting today, I voiced concerns about this detachment and the possibility that the reason annotation would continue to exist when the .spec
would move on. For example, when e.g. the .spec.suspend
boolean is quickly changed in Git without making use of the Flux CLI.
Given this, I am wondering if we could not tackle two issues at once by using suspend.fluxcd.io/reason
as another way to mark the resource as suspended. This would force a user to remove the annotation to continue the reconciliation of the object (avoiding any potential detachment), while also allowing the object to be suspended without creating a bump in the generation of the object (solving a historical regret around including the suspend field in the spec of the object).
This would raise some questions around how we would e.g. display this in a kubectl get
column, but does prepare us for a better future in which we could eventually deprecate the .spec.suspend
in a new major API version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking more about this, it may not even explicitly have to be /reason
. The mechanics could also be that the presence of the key in the annotation map simply marks the resource as suspended, while the value can be arbitrary (including a reason string). To allow an explicit false, we could e.g. recognize false | 0 | False
as "disabled" while any other value would mean "enabled".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
During the community meeting today, I voiced concerns about this detachment and the possibility that the reason annotation would continue to exist when the
.spec
would move on. For example, when e.g. the.spec.suspend
boolean is quickly changed in Git without making use of the Flux CLI.
Given this, I am wondering if we could not tackle two issues at once by using
suspend.fluxcd.io/reason
as another way to mark the resource as suspended.
After discussing a bit with @darkowlzz at yesterday's flux bug scrub, I'd like to start an RFC for implementing a suspend-by-annotation and deprecating .spec.suspend
api. Perhaps we use suspend.toolkit.fluxcd.io/state
or suspend.toolkit.fluxcd.io/suspended
to signal to the controllers that the resource is suspended while allowing an optional reason under suspend.toolkit.fluxcd.io/reason
.
If that path is supported by the maintainers I would ask that we consider this PR again without any control semantics for the suspend reason annotation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think reconcile.fluxcd.io/suspended
would be sufficient to suspend and provide a reason to start with, and fit nicely together with the existing annotation to request a reconciliation (reconcile.fluxcd.io/requestedAt
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think adding a suspend annotation is great and RFC is the most reasonable way to kick off a design around it. One thing to keep in mind is that the RFC should be very clear about co-existence with the existing .spec.suspend
field which we cannot remove in the v1 APIs (e.g. kustomization.v1.kustomize.toolkit.fluxcd.io), e.g. regarding precedence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the suggestion to use reconcile.fluxcd.io/suspend
annotation for suspending a resource with support for a string that could be anything, including a reason-- maybe "message" is more open. However if we do want to have this eventually respected by controllers I'm concerned that adding it here will require some later migration. I'm not in a rush to get this feature PR incorporated and would be happy to get an RFC going to help the community get what they want with minimal churn.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussing the #4271 at the team meeting in September, 2023 @stefanprodan noted some complications with having suspended state set only as an annotation with a new version cli which would not be recognized by an old version controller. Possible long term options to support #4271 were suggested by @darkowlzz which included getting object status to indicate suspended state which would probably warrant a separate RFC.
I'd like to request another review of this PR which omits any controller interaction and simply focuses on metadata for human communication.
Set annotation key string to suspend.toolkit.fluxcd.io/reason Co-authored-by: Stefan Prodan <stefan.prodan@gmail.com> Signed-off-by: Travis Mattera <travis@mattera.io>
Set annotation key string to suspend.toolkit.fluxcd.io/reason Co-authored-by: Stefan Prodan <stefan.prodan@gmail.com> Signed-off-by: Travis Mattera <travis@mattera.io>
as default and changes the flux resource suspend behavior to only apply the suspend reason annotation when a non-empty reason is provided. Signed-off-by: Travis Mattera <travis@mattera.io>
Signed-off-by: Travis Mattera <travis@mattera.io>
62b749c
to
8483b45
Compare
Signed-off-by: Travis Mattera <travis@mattera.io>
7b9a066
to
b8a4a2f
Compare
User Story
Often there is a purpose behind suspending a resource with the
flux
cli, whether it be during incident response, source repository refactoring, and various other scenarios. Theflux diff
command provides a great UX for determining what will change if a suspended resource is resumed, but it doesn't help explain why something is paused or when it would be ok to resume reconciliation. On distributed teams this can become a point of friction as it needs to be communicated among group stakeholders.Flux users should have a way to succinctly signal to other users why a resource is suspended on the resource itself.
Other use cases include, but is not limited to, adding general context such as who suspended the object, the associated work ticket, and consequences of prematurely resuming a object reconciliation.
Changes
This PR adds support for an optional
--message
flag to theflux suspend
command that adds an annotation to the suspended resource. The annotation is removed when the resource is targeted by theflux resume
command.Detailed List of Changes
Adds an optional
message
cli flag to thesuspend
command that accepts a reason for why the resource was suspended.Defines the resource metadata annotation key that stores the message for the resource suspension.
Updates the suspend and resume resource patching to add or remove the annotation holding the suspend message.