Add ClusterOperator Scheduling #650
---
title: clusteroperator-scheduling
authors:
  - "@michaelgugino"
reviewers:
  - TBD
approvers:
  - TBD
creation-date: 2021-02-12
last-updated: 2021-02-12
status: implementable
---

# ClusterOperator Scheduling
## Release Signoff Checklist

- [ ] Enhancement is `implementable`
- [ ] Design details are appropriately documented from clear requirements
- [ ] Test plan is defined
- [ ] Operational readiness criteria are defined
- [ ] Graduation criteria for dev preview, tech preview, GA
- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)
## Summary

This enhancement documents how to ensure that system components, especially
ClusterOperators, carry the correct set of tolerations.
## Motivation

The motivation is to create an easy-to-follow reference for developing and
updating ClusterOperators and similar system components.
### Goals

* List best practices for tolerations
### Non-Goals

* Non-standard cluster profiles, such as single-node
* Capturing every practice today; practices can be added over time
## Proposal

### User Stories
#### Story 1

As a cluster administrator, I want upgrades to proceed safely and without
intervention.
#### Story 2

As an OpenShift developer, I want a guide I can reference to ensure my
ClusterOperators are properly defined for scheduling purposes.
### Implementation Details/Notes/Constraints [optional]

Not all elements of scheduling are defined here. There are several, and we
can update this document as needed.
### Risks and Mitigations

Not all ClusterOperators behave identically. Careful consideration of each
ClusterOperator is paramount.
## Design Details

### NoSchedule Tolerations

#### node.kubernetes.io/unschedulable
Various `NoSchedule` taints are commonly applied to nodes. ClusterOperators
should not tolerate `NoSchedule` taints with the key
`node.kubernetes.io/unschedulable`.

This special taint is applied when a node is cordoned, such as during a
drain. Because its effect is `NoSchedule`, it does not automatically evict
pods already running on the node; it only prevents future scheduling.

During upgrades, we cordon and drain all nodes. Sometimes the scheduler
places pods that tolerate this taint onto a node faster than the drain loop
can complete. This results in hot-looping between the drainer and the
scheduler.
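
The key/effect distinction above can be sketched in code. The helper below is
a hypothetical illustration, not part of any OpenShift tooling: it flags a pod
spec that tolerates the cordon taint. Note that a toleration with an empty key
and operator `Exists` matches every taint, so it must be flagged as well:

```python
# Sketch: detect whether a pod spec tolerates the cordon taint.
# The taint *key* is "node.kubernetes.io/unschedulable"; its *effect*
# is NoSchedule. All names here are illustrative.

UNSCHEDULABLE_KEY = "node.kubernetes.io/unschedulable"

def tolerates_unschedulable(pod_spec: dict) -> bool:
    """Return True if the pod spec tolerates node.kubernetes.io/unschedulable."""
    for tol in pod_spec.get("tolerations", []):
        key = tol.get("key", "")
        operator = tol.get("operator", "Equal")
        effect = tol.get("effect", "")  # an empty effect matches all effects
        # An empty key with operator Exists is a wildcard over all taints.
        key_matches = key == UNSCHEDULABLE_KEY or (key == "" and operator == "Exists")
        effect_matches = effect in ("", "NoSchedule")
        if key_matches and effect_matches:
            return True
    return False

# A toleration like this would let a pod schedule onto a cordoned node:
spec = {
    "tolerations": [
        {"key": "node.kubernetes.io/unschedulable",
         "operator": "Exists",
         "effect": "NoSchedule"},
    ]
}
print(tolerates_unschedulable(spec))  # True
```

A pod tolerating only taints such as `node-role.kubernetes.io/master` would
pass this check, which matches the guidance that other `NoSchedule`
tolerations are acceptable.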

Any other `NoSchedule` toleration is fine.

> **Review discussion:** In the spirit of Chesterton's fence, reviewers asked
> for a survey of the components that set this toleration and why. The CVO
> turned out to be the only non-DaemonSet pod carrying it in a typical CI
> run; DaemonSet pods receive the toleration dynamically, so everything else
> looked correct. The follow-up clarified the precise language: many taints
> have the `NoSchedule` *effect*, but there is one
> `node.kubernetes.io/unschedulable` taint with that effect, and earlier
> discussions had considered only the effect rather than this specific taint.

### Open Questions [optional]
### Test Plan

> **Review discussion:** Reviewers asked whether this guidance would be
> enforced via a test; the sense of the thread was that an existing test
> already covers, or could be extended to cover, this check.
Mostly everything should work identically in CI.
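
As a sketch of what a CI enforcement check might assert (the pod snapshot and
helper below are illustrative, not origin's actual conformance test), a test
could list all pods and fail if any non-DaemonSet pod tolerates the cordon
taint:

```python
# Sketch of a conformance-style check: in a cluster snapshot, only
# DaemonSet-managed pods may tolerate node.kubernetes.io/unschedulable.
# The pod data is hypothetical; a real test would list pods via the API.

UNSCHEDULABLE_KEY = "node.kubernetes.io/unschedulable"

def offending_pods(pods):
    """Return names of non-DaemonSet pods tolerating the cordon taint."""
    bad = []
    for pod in pods:
        if pod["owner_kind"] == "DaemonSet":
            continue  # DaemonSet pods get this toleration added automatically
        for tol in pod["tolerations"]:
            if tol.get("key") == UNSCHEDULABLE_KEY:
                bad.append(pod["name"])
                break
    return bad

snapshot = [
    {"name": "node-exporter-x", "owner_kind": "DaemonSet",
     "tolerations": [{"key": UNSCHEDULABLE_KEY, "operator": "Exists"}]},
    {"name": "cluster-version-operator-y", "owner_kind": "Deployment",
     "tolerations": [{"key": UNSCHEDULABLE_KEY, "operator": "Exists"}]},
]
print(offending_pods(snapshot))  # ['cluster-version-operator-y']
```

A test built on this shape would then assert that the offending list is
empty, surfacing any component that reintroduces the toleration.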

## Implementation History
## Drawbacks

Adjusting scheduling parameters can cause undesired effects in some cases.
## Alternatives

Without these guidelines, we might face difficult situations during automated
operations such as upgrades.

> **Review discussion:** A reviewer suggested amending an existing
> enhancement's content instead of creating a new, parallel enhancement; the
> author noted they had looked in the scheduling folder but had not found it.