-
Notifications
You must be signed in to change notification settings - Fork 14.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update node-selection documentation with information about taints, tolerations, and alpha support for per-pod-configurable behavior when there are node problems #2774
Conversation
@gmarek if you have time to take a look too that would be great, but I know you're going on vacation so don't worry about it |
(either as a preference or a hard requirement). Taints are the opposite -- they allow | ||
a *node* to *repel* a set of pods. | ||
|
||
More conceretely: *taints* go on nodes, and *tolerations* go on pods. A taint on a node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/conceretely/concretely
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
a *node* to *repel* a set of pods. | ||
|
||
More conceretely: *taints* go on nodes, and *tolerations* go on pods. A taint on a node | ||
means "repel all pods that do not toleate the taint." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/toleate/tolerate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Also as a special case, empty `effect` matches all effects. | ||
|
||
The above example used `effect` of `NoSchedule`. Alternatively you can use `effect` of `PreferNoSchedule` | ||
which is a "preference" or "soft" version of `NoSchedule` -- the system will *try* to avoid placing a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about highlight as: preference
or soft
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are not official Kubernetes terminology (e.g. not an object or field name or field value), so I think putting them in quote marks is better than using backtick (which I believe is what you're suggesting?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @davidopp , my thinking for when using backtick is that this may not limited to some Kubenretes terminology
, but also some key words
which we want to highlight something in the document, comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree. I think people interpret monospaced fonts as "computer font" and thus something that would go into a config or come out of the system as output. For highlighting key words I think it's better to use quotes, italics, or bold.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks @davidopp
More conceretely: *taints* go on nodes, and *tolerations* go on pods. A taint on a node | ||
means "repel all pods that do not toleate the taint." | ||
|
||
You add a taint to a node using `kubectl taint`. For example, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add a link to kubectl taint
command here https://kubernetes.io/docs/user-guide/kubectl/kubectl_taint/
I have updated it a bit by adding NoExecute
:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
||
* if there is at least one unignored taint with effect `NoSchedule` then Kubernetes will not schedule | ||
the pod onto that node | ||
* if there is no unignored taint with effect `NoSchedule` but there is at least one unignored taint with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/unignored/un-ignored ?
Here an every other places
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
the special hardware *only* schedule onto the nodes that have the special hardware, you will need some | ||
additional mechanism, e.g. you could represent the special resource using | ||
[opaque integer resources](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#opaque-integer-resources-alpha-feature) | ||
and request it as a resource in the PodSpec, or you could label the nodes that have |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/PodSpec/PodSpec
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everywhere else in the doc I used PodSpec without backticks. I don't have a strong preference either way, but since you only mentioned it once, wasn't sure if you wanted me to change it everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, then I think that we can address this in a separate PR.
support for representing node problems (currently only "node unreachable" and | ||
"node not ready", corresponding to the NodeCondition "Ready" being "Unknown" or | ||
"False" respectively) as taints. When the `TaintBasedEvictions` alpha feature | ||
is enabled (`--feature-gates=TaintBasedEvictions`), pods can request to stay bound to a node that has one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As --feature-gates=FooBar=true,TaintBasedEvictions=true
is also good, so how about update this as:
s/(--feature-gates=TaintBasedEvictions
)/(include TaintBasedEvictions=true
in --feature-gates
, such as --feature-gates=FooBar=true,TaintBasedEvictions=true
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, and reworded the first part of this paragraph a bit.
by the user already has a toleration for `node.alpha.kubernetes.io/notReady`, | ||
and likewise for `node.alpha.kubernetes.io/unknown`. This ensures that | ||
the default pod behavior of remaining bound for 5 minutes after one of these | ||
problems is detected (subject to rate lmiting -- more details are available |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/lmiting/limiting
How about
[subject to rate limiting] (https://kubernetes.io/docs/admin/node/#node-controller)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Fixed part of #2737 |
bound for the specified amount of time | ||
|
||
The above behavior is a beta feature. In addition, Kubernetes 1.6 has alpha | ||
support for representing node problems (currently only "node unreachable" and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about s/"node unreachable"/node unreachable
and others here using ""?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment here as for "preference" and "soft" -- "node unreachable" is not official terminology (not an object or field name or field value), so I think putting them in quote marks is better than using backtick.
tolerationSeconds: 6000 | ||
``` | ||
|
||
(The key for node not ready is `node.alpha.kubernetes.io/notReady`.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about update this a bit as:
For node not ready
case, just update the key
for tolerations
to node.alpha.kubernetes.io/notReady
.
Also do you want to add sth for daemonset (https://kubernetes.io/docs/admin/daemons/ ) here? I'm planning to update the daemonset a bit and perhaps we can add a link here, I can follow up with a PR once this got merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For node not ready case, just update the key for tolerations to node.alpha.kubernetes.io/notReady.
Done
Also do you want to add sth for daemonset (https://kubernetes.io/docs/admin/daemons/ ) here?
Let's wait to see what you write for DaemonSet. I agree pointing from DaemonSet to here makes sense, not sure if we should add anything here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also have the same thinking as you, perhaps adding sth here for Daemonset
maybe better as you already have something for pod
in last paragraph, does it make sense to extend the last paragraph to cover Daemonset
then I can update https://kubernetes.io/docs/admin/daemons/ by adding a reference here same as your proposal for --enable-taint-manager and --use-taint-based-evictions.
that you mentioned in the kubernetes dev list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Write sth for daemonset
here #2776
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done (added a paragraph).
|
||
places a taint on node `node1`. The taint has key `key`, value `value`, and taint effect `NoSchedule`. | ||
This means that no pod will be able to schedule on `node1` unless it has a matching toleration. | ||
The toleration is specified in the pod template. Both of the following tolerations "match" the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that "pod template" is a good expression. Pod templates are only in "controllers", and one can (in theory) create a Pod directly. Tolerations are specified in PodSpec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to PodSpec.
effect: "NoSchedule" | ||
``` | ||
|
||
A toleration "matches" a taint if the `key`s are the same and the `effect`s are the same, and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this should be put before examples in which Exists and Equal appear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that if you do that, then the reader hasn't seen "operator' anywhere yet. By putting an example toleration first, they at least have some idea what operator looks like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider :
before the bullets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
the pod onto that node | ||
* if there is no un-ignored taint with effect `NoSchedule` but there is at least one un-ignored taint with | ||
effect `PreferNoSchedule` then Kubernetes will *try* to not schedule the pod onto the node | ||
* if there is at least one un-ignored taint with effect `NoExecute` then the pod will be evicted from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and won't be scheduled on the Node
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right, you did that in kubernetes/kubernetes#41068. I actually wonder whether we should have done that, since ideally "no schedule" and "no execute" should be orthogonal (we used to call NoExecute "NoScheduleNoExecute" but then changed it to make them orthogonal). Anyway, too late now. Changed the documentation to reflect reality.
|
||
```shell | ||
kubectl taint nodes node1 key1=value1:NoSchedule | ||
kubectl taint nodes node1 key1=value1:NoExecute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
root@k8s001:~/go/src/k8s.io/kubernetes# kubectl taint nodes 127.0.0.1 key1=value1:NoSchedule
node "127.0.0.1" tainted
root@k8s001:~/go/src/k8s.io/kubernetes# kubectl taint nodes 127.0.0.1 key1=value1:NoExecute
node "127.0.0.1" tainted
root@k8s001:~/go/src/k8s.io/kubernetes# kubectl taint nodes 127.0.0.1 key1=value2:NoSchedule
error: Node '127.0.0.1' already has a taint with key (key1) and effect (NoSchedule), and --overwrite is false
Seems we need to update the third command as kubectl taint nodes node1 key2=value2:NoSchedule
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Changed as @gyliu513 suggested.
- key: "key1" | ||
operator: "Equal" | ||
value: "value1" | ||
effect: "NoExecute" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, what's wrong with the current indent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh never mind, fixed it.
`--feature-gates=FooBar=true,TaintBasedEvictions=true`), the taints are automatically | ||
added by the NodeController and the normal logic for evicting pods from nodes | ||
based on the Ready NodeCondition is disabled. | ||
Thus pods can request to stay bound to a node that has one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need at least mention that adding those taints is rate limited, to prevent "massive pod eviction". We rate limit adding taints, so we won't need to maintain special logic in taint controller for handling NC taints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
These two default tolerations are added by the [DefaultTolerationSeconds | ||
admission controller](https://github.com/kubernetes/kubernetes/tree/master/plugin/pkg/admission/defaulttolerationseconds). | ||
|
||
[DaemonSet](https://kubernetes.io/docs/admin/daemons/) pods are created with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about mention the tolerations
are for NoExecute taint
? https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/daemon/daemoncontroller.go#L843-L867
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
||
```shell | ||
kubectl taint nodes node1 key1=value1:NoSchedule | ||
kubectl taint nodes node1 key1=value1:NoExecute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
root@k8s001:~/go/src/k8s.io/kubernetes# kubectl taint nodes 127.0.0.1 key1=value1:NoSchedule
node "127.0.0.1" tainted
root@k8s001:~/go/src/k8s.io/kubernetes# kubectl taint nodes 127.0.0.1 key1=value1:NoExecute
node "127.0.0.1" tainted
root@k8s001:~/go/src/k8s.io/kubernetes# kubectl taint nodes 127.0.0.1 key1=value2:NoSchedule
error: Node '127.0.0.1' already has a taint with key (key1) and effect (NoSchedule), and --overwrite is false
Seems we need to update the third command as kubectl taint nodes node1 key2=value2:NoSchedule
d537425
to
6d0d3fe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to be thorough from a readability perspective. Overall the information is super valuable. There are a few sentences that should be split up more to avoid run-on sentences.
|
||
### Specifying taints and tolerations | ||
|
||
Node affinity, described earlier, allows a pod to "be attracted to" a set of nodes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Described earlier can probably be dropped. It would be best to make node affinity a hyperlink to it's description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This paragraph could use some rework. Further comments below.
effect: "NoSchedule" | ||
``` | ||
|
||
A toleration "matches" a taint if the `key`s are the same and the `effect`s are the same, and: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If keys and effects are either singular or plural than the s after them should be in parens like so (s)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And: should probably be split out into another sentence.
As a special case, an empty `key` with operator `Exists` matches all keys and all values. | ||
Also as a special case, empty `effect` matches all effects. | ||
|
||
The above example used `effect` of `NoSchedule`. Alternatively, you can use `effect` of `PreferNoSchedule` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Alternatively sentence feels like a run-on sentence. It should be split up into several sentences.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making The system will... the start of a new sentence should help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Split into two sentences as suggested.
The above example used `effect` of `NoSchedule`. Alternatively, you can use `effect` of `PreferNoSchedule` | ||
which is a "preference" or "soft" version of `NoSchedule` -- the system will *try* to avoid placing a | ||
pod that does not tolerate the taint on the node, but it is not required. The third kind of `effect` is | ||
`NoExecute`, described later. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove comma and link to the description instead. Ex is described here
pod that does not tolerate the taint on the node, but it is not required. The third kind of `effect` is | ||
`NoExecute`, described later. | ||
|
||
You can put multiple taints and tolerations on the same node. The way Kubernetes processes multiple |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite a long sentence. It should probably be broken up into multiple sentences.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/taints and tolerations/taints
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/taints and tolerations/taints
Changed to "You can put multiple taints on the same node and multiple tolerations on the same pod."
support for representing node problems (currently only "node unreachable" and | ||
"node not ready", corresponding to the NodeCondition "Ready" being "Unknown" or | ||
"False" respectively) as taints. When the `TaintBasedEvictions` alpha feature | ||
is enabled (include `TaintBasedEvictions=true` in `--feature-gates`, such as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sentence in parentheses should be a separate sentence
behavior of pod evictions due to node problems, the system actually adds the taints | ||
in a rate-limited way. This prevents massive pod evictions in scenarios such | ||
as the master becoming partitioned from the nodes.) | ||
Thus pods can request to stay bound to a node that has one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thus doesn't read well in this sentence. Consider removing it and rephrasing the sentence a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rephrased
as the master becoming partitioned from the nodes.) | ||
Thus pods can request to stay bound to a node that has one | ||
or both of these problems, for a per-pod-configurable amount of time. | ||
For example, an application with a lot of local state might want to stay |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like a comma splice. I think it should be rephrased or split into multiple sentences.
|
||
(For the node not ready case, change the key to `node.alpha.kubernetes.io/notReady`.) | ||
|
||
Note that Kubernetes automatically adds a toleration for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Run-on sentence
Note that Kubernetes automatically adds a toleration for | ||
`node.alpha.kubernetes.io/notReady` with `tolerationSeconds=300` | ||
unless the pod configuration provided | ||
by the user already has a toleration for `node.alpha.kubernetes.io/notReady`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider combining this information with the following to mitigate the run-on sentence and increase clarity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made some changes to make it clearer.
FYI: for taint and tolerations @mburke5678 |
* the `operator` is `Equal` and the `value`s are equal | ||
|
||
(`Operator` defaults to `Equal` if not specified.) | ||
As a special case, an empty `key` with operator `Exists` matches all keys and all values. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be helpful to mention about empty key in general, before stating this special case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Empty key is not allowed except in this one special case, so I don't think there is anything else to say about it.
|
||
(`Operator` defaults to `Equal` if not specified.) | ||
As a special case, an empty `key` with operator `Exists` matches all keys and all values. | ||
Also as a special case, empty `effect` matches all effects. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it seems that a pod with a toleration with Exists operator and empty key, value and effect, can be placed on any node in a cluster? I wonder should be disallow this or do we do this already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have any security/access control around tolerations in general, so I'm not sure the case you're describing is any worse than what people could do anyway (e.g. examine all the taints in the cluster and create explicit tolerations for each of them).
toleration to their pods (this would be done most easily by writing a custom | ||
[admission controller](https://kubernetes.io/docs/admin/admission-controllers/)). | ||
The pods with the tolerations will then be allowed to use the tainted (dedicated) nodes as | ||
well as any other nodes in the cluster. If you want to dedicated the nodes to them *and* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding this "...nodes as well as any other nodes in the cluster." I think it would be helpful to specify this somewhere in the beginning to make it more explicit to avoid any confusion related to their placement as in this issue: kubernetes/kubernetes#39818
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: dedicated->dedicate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: dedicated->dedicate
Fixed.
As for your other comment, the first sentence of the taints and tolerations section says
Node affinity, described earlier, allows a pod to "be attracted to" a set of nodes
(either as a preference or a hard requirement). Taints are the opposite -- they allow
a *node* to *repel* a set of pods.
I am worried that trying to explain the confusion in the issue you mentioned, will just confuse people. My hope is that the examples here make it clear that you additionally need to add a node affinity if you want to steer a pod (whether or not it has a toleation) towards particular node(s). But if you want to send a PR with some suggested wording after this PR merges, we can discuss. I just can't think of a way to do it without confusing people.
`node.alpha.kubernetes.io/notReady` with `tolerationSeconds=300` | ||
unless the pod configuration provided | ||
by the user already has a toleration for `node.alpha.kubernetes.io/notReady`, | ||
and likewise for `node.alpha.kubernetes.io/unknown`. This ensures that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
node.alpha.kubernetes.io/unreachable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
some minor comments, otherwise LGTM. |
Also I noticed that in the issue kubernetes/kubernetes#42753, the plan is that kubelet add infinite tolerations to static and mirror pods so that these pods are not evicted by taint controller ever. Should we mention that somewhere in this doc? |
Thanks again everyone. I added LGTM label. |
Thanks @davidopp , lgtm now, hope this can be merged soon ;-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor changes, thanks.
More concretely: *taints* go on nodes, and *tolerations* go on pods. A taint on a node | ||
means "repel all pods that do not tolerate the taint." | ||
|
||
You add a taint to a node using [kubectl taint](https://kubernetes.io/docs/user-guide/kubectl/kubectl_taint/). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/You/You can/
``` | ||
|
||
places a taint on node `node1`. The taint has key `key`, value `value`, and taint effect `NoSchedule`. | ||
This means that no pod will be able to schedule on `node1` unless it has a matching toleration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/schedule on/schedule onto/
ping @kubernetes/sig-docs-maintainers |
ping again |
@kevin-wangzefeng Thanks for the feedback, we can fix those in a followup, I'd like to get this merged. |
|
||
## Taints and tolerations (beta feature) | ||
|
||
### Specifying taints and tolerations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Delete this heading; the paragraphs after explain what Taints and Tolerations are, not how to set them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
### Specifying taints and tolerations | ||
|
||
Node affinity, described earlier, allows a pod to "be attracted to" a set of nodes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This paragraph could use some rework. Further comments below.
|
||
### Specifying taints and tolerations | ||
|
||
Node affinity, described earlier, allows a pod to "be attracted to" a set of nodes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than trying to explain Taints and Tolerations as the opposite of Affinity (which assumes that people have read/want to read the Affinity documentation), I suggest starting like this:
"Taints and tolerations are features that work together to ensure that pods are not scheduled onto inappropriate nodes. Taints are applied to nodes, and are a way of marking that a node should not accept pods that have a given toleration. Tolerations are applied to pods, and specify that those pods should avoid nodes with a given taint.
"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a tough concept to explain. I think that contrasting with affinity is the most straightforward way to do it. Taints and tolerations are a newer-in-Kubernetes and more sophisticated concept than affinity, so I think people reading this section are likely to have some familiarity with affinity. Just to be on the safe side, I explain the salient properties of affinity in the sentence itself, so it should be self-contained even if the reader did not read the section on affinity.
I reworded the first paragraph a bit, and replaced the second paragraph with a modified version of the text you suggested.
(either as a preference or a hard requirement). Taints are the opposite -- they allow | ||
a *node* to *repel* a set of pods. | ||
|
||
More concretely: *taints* go on nodes, and *tolerations* go on pods. A taint on a node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You haven't mentioned "tolerations" yet, so it seems weird to drop the term in here. The whole "more concretely" paragraph can probably be folded into the introduction, as in my suggested comment above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used a modified version of your suggested text as the second paragraph, which I think resolves this issue.
|
||
places a taint on node `node1`. The taint has key `key`, value `value`, and taint effect `NoSchedule`. | ||
This means that no pod will be able to schedule on `node1` unless it has a matching toleration. | ||
The toleration is specified in the PodSpec. Both of the following tolerations "match" the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"You specify a toleration for a pod in the PodSpec
".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
"node not ready", corresponding to the NodeCondition "Ready" being "Unknown" or | ||
"False" respectively) as taints. When the `TaintBasedEvictions` alpha feature | ||
is enabled (you can do this by including `TaintBasedEvictions=true` in `--feature-gates`, such as | ||
`--feature-gates=FooBar=true,TaintBasedEvictions=true`), the taints are automatically |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider making a separate sentence out of the information in parentheses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that makes it clearer.
- key: "key" | ||
operator: "Exists" | ||
effect: "NoSchedule" | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to list/define the values that can be entered here and in the taints?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The doc does it in prose, starting at A toleration "matches" a taint...
For a more computer-oriented description, they can look at the API definition.
effect: "NoExecute" | ||
``` | ||
|
||
In this case, the pod will not be able to shedule onto the node, because there is no |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/shedule/schedule
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
tolerations, and alpha support for per-pod-configurable behavior when there are node problems.
This PR is ready for a final docs review. |
Looks good. Thanks, @davidopp . |
Thanks! |
So this will only be merged to release 1.6 branch? How about master? Does all of the PRs here #2737 should goto release 1.6 branch? @davidopp @devin-donnelly |
@gyliu513 Yeah, the docs PRs are supposed to go in release-1.6 branch. I'm not sure what happens if you submit to master, but @kubernetes/sig-docs-maintainers should know. |
@gyliu513 yes those PRs should be against 1.6 branches. May be you could just change the base branch to 1.6 in docs repo when resubmitting prs? |
I already changed all of @gyliu513 RPs to 1.6 base branch. They are just waiting docs review. |
Yes, all of the PRs are now based on 1.6, but did not get review comments from @kubernetes/sig-docs-maintainers till now. @aveshagarwal |
@kevin-wangzefeng and/or @aveshagarwal could you review this?
cc/ @kubernetes/sig-scheduling-misc
This change is