Make "kubectl drain" use taint instead of Unschedulable #44944

Closed
davidopp opened this issue Apr 26, 2017 · 21 comments
Labels
area/kubectl
kind/feature (Categorizes issue or PR as related to a new feature.)
lifecycle/rotten (Denotes an issue or PR that has aged beyond stale and will be auto-closed.)
sig/cli (Categorizes an issue or PR as relevant to SIG CLI.)
sig/cluster-lifecycle (Categorizes an issue or PR as relevant to SIG Cluster Lifecycle.)
sig/scheduling (Categorizes an issue or PR as relevant to SIG Scheduling.)

Comments

@davidopp
Member

kubectl drain: Place NoSchedule taint, then do evictions as today
kubectl uncordon: Remove the taint
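
A rough sketch of that flow against the core/v1 types (the taint key and helper names below are placeholders for illustration, not the real drain code):

```go
// Sketch only: cordon by appending a NoSchedule taint instead of setting spec.unschedulable.
// "example.com/drain" is a hypothetical key used purely for illustration.
func cordonWithTaint(node *v1.Node) {
	node.Spec.Taints = append(node.Spec.Taints, v1.Taint{
		Key:    "example.com/drain",
		Effect: v1.TaintEffectNoSchedule,
	})
	// ...persist the node, then run the existing eviction path unchanged.
}

// Uncordon is the inverse: remove that same taint again.
func uncordonWithTaint(node *v1.Node) {
	kept := node.Spec.Taints[:0]
	for _, t := range node.Spec.Taints {
		if t.Key != "example.com/drain" || t.Effect != v1.TaintEffectNoSchedule {
			kept = append(kept, t)
		}
	}
	node.Spec.Taints = kept
}
```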

ref/ #25320

@kubernetes/sig-scheduling-feature-requests
cc/ @mml

@davidopp davidopp added the sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. label Apr 26, 2017
@ravisantoshgudimetla
Contributor

@davidopp I can work on this, if no one else has started.

@thockin
Member

thockin commented Apr 27, 2017

Does this deprecate the Unschedulable field?

@davidopp
Member Author

Define "deprecate"...

I don't think we can get rid of Unschedulable until we do a v2 of the API. Too many people are using it. Marking it "deprecated" seems fine though, to try to push people towards using NoSchedule taints instead.

@davidopp
Member Author

@ravisantoshgudimetla I don't have the bandwidth to review it for 1.7, but if @mml has time to review it then it's fine for you to implement it.

@thockin
Member

thockin commented Apr 27, 2017 via email

@davidopp
Member Author

Yes, that's reasonable.

@ravisantoshgudimetla
Contributor

@davidopp OK, I was under the impression that this was needed for 1.7; if not, I can circle back to it later.

@ravisantoshgudimetla
Contributor

@davidopp @mml

So, I have started working on this, and here is how I plan to implement it:

For drain:

  • Check the list of taints on the node; if one of the taints already has the NoSchedule effect, no new taint would be added (see the sketch below).
  • If not, add a taint with the NoSchedule effect.

Uncordon would be the exact opposite operation.

I am a bit concerned about the effect part. Please let me know what you think.
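
For the first bullet, a minimal check could look like this (sketch only, assuming the core/v1 node type):

```go
// Sketch: report whether the node already carries any taint with the NoSchedule effect.
func hasNoScheduleTaint(node *v1.Node) bool {
	for _, t := range node.Spec.Taints {
		if t.Effect == v1.TaintEffectNoSchedule {
			return true
		}
	}
	return false
}
```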

@davidopp
Member Author

davidopp commented Jul 5, 2017

I think just adding a NoSchedule taint (and opposite for uncordon) is sufficient -- no need to check whether there are already taints on the node. (Unless I'm missing some corner case you have in mind?)

The trickier part is making sure this doesn't break the current users of cordon, since Unschedulable is unfortunately not the exact same thing as a NoSchedule taint.

ref/ #42001

cc/ @kubernetes/sig-cluster-lifecycle-feature-requests

@k8s-ci-robot k8s-ci-robot added sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. kind/feature Categorizes issue or PR as related to a new feature. labels Jul 5, 2017
@luxas
Member

luxas commented Jul 5, 2017

The trickier part is making sure this doesn't break the current users of cordon, since Unschedulable is unfortunately not the exact same thing as a NoSchedule taint.

@davidopp I think I have an idea, but can you elaborate on the edge cases?

@ravisantoshgudimetla
Contributor

@davidopp

The trickier part is making sure this doesn't break the current users of cordon, since Unschedulable is unfortunately not the exact same thing as a NoSchedule taint.

So, the way I want to do it: I will update both the Unschedulable field and the NoSchedule taint for cordon and uncordon, with a comment mentioning that Unschedulable will eventually be deprecated. Would that cause any problem? Looking at the code, drain is mostly client-side functionality (there is no server-side code for it), so we should be good.
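
Concretely, something like this during the transition period (sketch only; the taint key is a hypothetical placeholder):

```go
// Sketch: cordon writes both signals while spec.unschedulable is being phased out.
func cordon(node *v1.Node) {
	node.Spec.Unschedulable = true // kept for backward compatibility, eventually deprecated
	node.Spec.Taints = append(node.Spec.Taints, v1.Taint{
		Key:    "example.com/cordon", // hypothetical key, for illustration
		Effect: v1.TaintEffectNoSchedule,
	})
	// Uncordon would clear both: set Unschedulable to false and remove the taint.
}
```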

no need to check whether there are already taints on the node. (Unless I'm missing some corner case you have in mind?)

It's not an edge case as such, but I am wondering why we should add a NoSchedule taint if one already exists.

@ravisantoshgudimetla
Contributor

/sig cli

@k8s-ci-robot k8s-ci-robot added the sig/cli Categorizes an issue or PR as relevant to SIG CLI. label Jul 7, 2017
k8s-github-robot pushed a commit that referenced this issue Jul 14, 2017
…e_conversion

Automatic merge from submit-queue (batch tested with PRs 48082, 48815, 48901, 48824)

Changes for typecasting node in drain

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #48059 

**Special notes for your reviewer**:
Precursor to #44944

**Release note**:

```release-note
kubectl drain now uses PATCH instead of PUT to update the node. The node object is now of type v1 instead of using internal api.
```
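
For context, the PATCH-based node update is roughly the following (a sketch assuming the strategicpatch helper and the older client-go Patch signature without a context argument; details vary by client-go version):

```go
// Sketch: compute a strategic merge patch between the old and modified node and send it,
// instead of replacing the whole object with a PUT.
func patchNode(client kubernetes.Interface, oldNode, newNode *v1.Node) error {
	oldData, err := json.Marshal(oldNode)
	if err != nil {
		return err
	}
	newData, err := json.Marshal(newNode)
	if err != nil {
		return err
	}
	patchBytes, err := strategicpatch.CreateTwoWayMergePatch(oldData, newData, v1.Node{})
	if err != nil {
		return err
	}
	_, err = client.CoreV1().Nodes().Patch(oldNode.Name, types.StrategicMergePatchType, patchBytes)
	return err
}
```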
@aveshagarwal
Member

It's not an edge case as such, but I am wondering why we should add a NoSchedule taint if one already exists.

You cannot use an existing NoSchedule taint on a node, because there may be pods that tolerate it and can still be scheduled. That means it wouldn't work as Unschedulable.

I think we should have a standard NoSchedule taint for such drain cases, and make sure during validation that no pod is allowed to have a toleration for this standard NoSchedule taint, so that it works like Unschedulable.
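
A sketch of what such a validation check could look like (the taint key and function here are hypothetical, purely for illustration):

```go
// Hypothetical standard drain taint key, for illustration only.
const drainTaintKey = "example.com/standard-drain"

// Sketch: reject pods that would tolerate the standard drain taint,
// including wildcard tolerations (empty key with operator Exists).
func validateNoDrainToleration(pod *v1.Pod) error {
	for _, tol := range pod.Spec.Tolerations {
		wildcard := tol.Key == "" && tol.Operator == v1.TolerationOpExists
		if tol.Key == drainTaintKey || wildcard {
			return fmt.Errorf("pod %s/%s tolerates the standard drain taint", pod.Namespace, pod.Name)
		}
	}
	return nil
}
```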

@ravisantoshgudimetla
Contributor

ravisantoshgudimetla commented Jul 18, 2017

You cannot use an existing NoSchedule taint on a node, because there may be pods that tolerate it and can still be scheduled. That means it wouldn't work as Unschedulable.

Thanks for the input @aveshagarwal, but should we care about that scenario? The user explicitly wants their pod to be placed on a node that has a NoSchedule taint. AFAIU, even now the DaemonSet controller doesn't respect the Unschedulable field and as a result could schedule pods onto nodes that are being drained.

I think we should have a standard NoSchedule taint for such drain cases, and make sure during validation that no pod is allowed to have a toleration for this standard NoSchedule taint, so that it works like Unschedulable.

Isn't that against the concept of taints and tolerations? We would be creating a taint that is not allowed to have any toleration.

@aveshagarwal
Member

Thanks for the input @aveshagarwal, but should we care about that scenario? The user explicitly wants their pod to be placed on a node that has a NoSchedule taint. AFAIU, even now the DaemonSet controller doesn't respect the Unschedulable field and as a result could schedule pods onto nodes that are being drained.

I did not say we should not use a NoSchedule taint. I said we should not use an existing NoSchedule taint on a node as the signal for Unschedulable.

Isn't that against the concept of taints and tolerations?

I don't think so.

@kevin-wangzefeng
Member

/cc @kubernetes/huawei

@davidopp
Member Author

At the SIG Scheduling meeting today we realized it is not possible to implement this in a backward-compatible way, because we allow pods to tolerate every taint via a wildcard toleration; controllers that generate pods carrying a wildcard toleration would have their pods no longer be drainable after upgrading the master.

(I think that in the absence of the wildcard toleration, it would be OK to implement this.)

I think we should close this issue. In the v2 API we could theoretically get rid of the wildcard toleration, and then it would be OK to implement this. (Users who want to opt in to being able to schedule onto machines that are draining could still do so explicitly, e.g. for a very short-running batch job.)
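
For reference, this is the wildcard toleration in question; an empty key with operator Exists tolerates every taint, including whatever taint drain would place:

```go
// A wildcard toleration: with no key and operator Exists, the pod tolerates all taints.
wildcard := v1.Toleration{
	Operator: v1.TolerationOpExists,
}
```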

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 2, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Feb 10, 2018
@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 10, 2018
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@wking
Contributor

wking commented Aug 3, 2018

Something similar landed in #61161, although that PR didn't touch kubectl. There's some discussion here about accepting the possibility of pods which tolerate the unschedulable taint.
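
If I recall correctly, the taint added there is keyed node.kubernetes.io/unschedulable with the NoSchedule effect, roughly:

```go
// The taint applied alongside spec.unschedulable by the node lifecycle controller
// (key written from memory; check #61161 for the authoritative constant).
taint := v1.Taint{
	Key:    "node.kubernetes.io/unschedulable",
	Effect: v1.TaintEffectNoSchedule,
}
```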
