Make "kubectl drain" use taint instead of Unschedulable #44944

Closed
davidopp opened this issue Apr 26, 2017 · 21 comments
Labels
area/kubectl
kind/feature (Categorizes issue or PR as related to a new feature.)
lifecycle/rotten (Denotes an issue or PR that has aged beyond stale and will be auto-closed.)
sig/cli (Categorizes an issue or PR as relevant to SIG CLI.)
sig/cluster-lifecycle (Categorizes an issue or PR as relevant to SIG Cluster Lifecycle.)
sig/scheduling (Categorizes an issue or PR as relevant to SIG Scheduling.)

Comments

@davidopp
Member

kubectl drain: Place NoSchedule taint, then do evictions as today
kubectl uncordon: Remove the taint
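
A rough sketch of that flow against the core/v1 types (the taint key and helper names below are placeholders for illustration, not the real drain code):

```go
// Sketch only: cordon by appending a NoSchedule taint instead of setting spec.unschedulable.
// "example.com/drain" is a hypothetical key used purely for illustration.
func cordonWithTaint(node *v1.Node) {
	node.Spec.Taints = append(node.Spec.Taints, v1.Taint{
		Key:    "example.com/drain",
		Effect: v1.TaintEffectNoSchedule,
	})
	// ...persist the node, then run the existing eviction path unchanged.
}

// Uncordon is the inverse: remove that same taint again.
func uncordonWithTaint(node *v1.Node) {
	kept := node.Spec.Taints[:0]
	for _, t := range node.Spec.Taints {
		if t.Key != "example.com/drain" || t.Effect != v1.TaintEffectNoSchedule {
			kept = append(kept, t)
		}
	}
	node.Spec.Taints = kept
}
```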

ref/ #25320

@kubernetes/sig-scheduling-feature-requests
cc/ @mml

@davidopp davidopp added the sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. label Apr 26, 2017
@ravisantoshgudimetla
Contributor

@davidopp I can work on this, if no one else has started.

@thockin
Member

thockin commented Apr 27, 2017

Does this deprecate the Unschedulable field?

@davidopp
Member Author

Define "deprecate"...

I don't think we can get rid of Unschedulable until we do a v2 of the API. Too many people are using it. Marking it "deprecated" seems fine though, to try to push people towards using NoSchedule taints instead.

@davidopp
Member Author

@ravisantoshgudimetla I don't have the bandwidth to review it for 1.7, but if @mml has time to review it then it's fine for you to implement it.

@thockin
Member

thockin commented Apr 27, 2017 via email

@davidopp
Member Author

Yes, that's reasonable.

@ravisantoshgudimetla
Contributor

@davidopp OK, I was under the impression that this was needed for 1.7; if not, I can circle back to it later.

@ravisantoshgudimetla
Contributor

@davidopp @mml

So, I have started working on this, and here is how I plan to implement it:

For drain:

  • Check the list of taints on the node; if one of the taints already has the NoSchedule effect, no new taint would be added (see the sketch below).
  • If not, add a taint with the NoSchedule effect.

Uncordon would be the exact opposite operation.

I am a bit concerned about the effect part. Please let me know what you think.
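
For the first bullet, a minimal check could look like this (sketch only, assuming the core/v1 node type):

```go
// Sketch: report whether the node already carries any taint with the NoSchedule effect.
func hasNoScheduleTaint(node *v1.Node) bool {
	for _, t := range node.Spec.Taints {
		if t.Effect == v1.TaintEffectNoSchedule {
			return true
		}
	}
	return false
}
```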

@davidopp
Member Author

davidopp commented Jul 5, 2017

I think just adding a NoSchedule taint (and opposite for uncordon) is sufficient -- no need to check whether there are already taints on the node. (Unless I'm missing some corner case you have in mind?)

The trickier part is making sure this doesn't break the current users of cordon, since Unschedulable is unfortunately not the exact same thing as a NoSchedule taint.

ref/ #42001

cc/ @kubernetes/sig-cluster-lifecycle-feature-requests

@k8s-ci-robot k8s-ci-robot added sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. kind/feature Categorizes issue or PR as related to a new feature. labels Jul 5, 2017
@luxas
Member

luxas commented Jul 5, 2017

The trickier part is making sure this doesn't break the current users of cordon, since Unschedulable is unfortunately not the exact same thing as a NoSchedule taint.

@davidopp I think I have an idea, but can you elaborate on the edge cases?

@ravisantoshgudimetla
Contributor

@davidopp

The trickier part is making sure this doesn't break the current users of cordon, since Unschedulable is unfortunately not the exact same thing as a NoSchedule taint.

So, the way I want to do it: I will update both the Unschedulable field and the NoSchedule taint for cordon and uncordon, with a comment mentioning that Unschedulable will eventually be deprecated. Would that cause any problem? Looking at the code, drain is mostly client-side functionality (there is no server-side code for it), so we should be good.
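
Concretely, something like this during the transition period (sketch only; the taint key is a hypothetical placeholder):

```go
// Sketch: cordon writes both signals while spec.unschedulable is being phased out.
func cordon(node *v1.Node) {
	node.Spec.Unschedulable = true // kept for backward compatibility, eventually deprecated
	node.Spec.Taints = append(node.Spec.Taints, v1.Taint{
		Key:    "example.com/cordon", // hypothetical key, for illustration
		Effect: v1.TaintEffectNoSchedule,
	})
	// Uncordon would clear both: set Unschedulable to false and remove the taint.
}
```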

no need to check whether there are already taints on the node. (Unless I'm missing some corner case you have in mind?)

It's not an edge case as such, but I am wondering why we should add a NoSchedule taint if one already exists.

@ravisantoshgudimetla
Contributor

/sig cli

@k8s-ci-robot k8s-ci-robot added the sig/cli Categorizes an issue or PR as relevant to SIG CLI. label Jul 7, 2017
k8s-github-robot pushed a commit that referenced this issue Jul 14, 2017
…e_conversion

Automatic merge from submit-queue (batch tested with PRs 48082, 48815, 48901, 48824)

Changes for typecasting node in drain

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #48059 

**Special notes for your reviewer**:
Precursor to #44944

**Release note**:

```release-note
kubectl drain now uses PATCH instead of PUT to update the node. The node object is now of type v1 instead of using internal api.
```
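
For context, the PATCH-based node update is roughly the following (a sketch assuming the strategicpatch helper and the older client-go Patch signature without a context argument; details vary by client-go version):

```go
// Sketch: compute a strategic merge patch between the old and modified node and send it,
// instead of replacing the whole object with a PUT.
func patchNode(client kubernetes.Interface, oldNode, newNode *v1.Node) error {
	oldData, err := json.Marshal(oldNode)
	if err != nil {
		return err
	}
	newData, err := json.Marshal(newNode)
	if err != nil {
		return err
	}
	patchBytes, err := strategicpatch.CreateTwoWayMergePatch(oldData, newData, v1.Node{})
	if err != nil {
		return err
	}
	_, err = client.CoreV1().Nodes().Patch(oldNode.Name, types.StrategicMergePatchType, patchBytes)
	return err
}
```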
@aveshagarwal
Member

It's not an edge case as such, but I am wondering why we should add a NoSchedule taint if one already exists.

You cannot use an existing NoSchedule taint on a node, because there may be pods that tolerate it and can still be scheduled. That means it wouldn't work as Unschedulable.

I think we should have a standard NoSchedule taint for such drain cases, and make sure during validation that no pod is allowed to have a toleration for this standard NoSchedule taint, so that it works like Unschedulable.
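
A sketch of what such a validation check could look like (the taint key and function here are hypothetical, purely for illustration):

```go
// Hypothetical standard drain taint key, for illustration only.
const drainTaintKey = "example.com/standard-drain"

// Sketch: reject pods that would tolerate the standard drain taint,
// including wildcard tolerations (empty key with operator Exists).
func validateNoDrainToleration(pod *v1.Pod) error {
	for _, tol := range pod.Spec.Tolerations {
		wildcard := tol.Key == "" && tol.Operator == v1.TolerationOpExists
		if tol.Key == drainTaintKey || wildcard {
			return fmt.Errorf("pod %s/%s tolerates the standard drain taint", pod.Namespace, pod.Name)
		}
	}
	return nil
}
```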

@ravisantoshgudimetla
Contributor

ravisantoshgudimetla commented Jul 18, 2017

You cannot use an existing NoSchedule taint on a node, because there may be pods that tolerate it and can still be scheduled. That means it wouldn't work as Unschedulable.

Thanks for the input @aveshagarwal, but should we care about that scenario? The user explicitly wants their pod to be placed on a node that has a NoSchedule taint. AFAIU, even now the DaemonSet controller doesn't respect the Unschedulable field and as a result could schedule pods onto nodes that are being drained.

I think we should have a standard NoSchedule taint for such drain cases, and make sure during validation that no pod is allowed to have a toleration for this standard NoSchedule taint, so that it works like Unschedulable.

Isn't that against the concept of taints and tolerations? We would be creating a taint that is not allowed to have any toleration.

@aveshagarwal
Member

Thanks for the input @aveshagarwal, but should we care about that scenario? The user explicitly wants their pod to be placed on a node that has a NoSchedule taint. AFAIU, even now the DaemonSet controller doesn't respect the Unschedulable field and as a result could schedule pods onto nodes that are being drained.

I did not say we should not use a NoSchedule taint. I said we should not use an existing NoSchedule taint on a node as the signal for Unschedulable.

Isn't that against the concept of taints and tolerations?

I don't think so.

@kevin-wangzefeng
Member

/cc @kubernetes/huawei

@davidopp
Member Author

At the SIG Scheduling meeting today we realized it is not possible to implement this in a backward-compatible way, because we allow pods to tolerate every taint via a wildcard toleration; controllers that generate pods carrying a wildcard toleration would have their pods no longer be drainable after upgrading the master.

(I think that in the absence of the wildcard toleration, it would be OK to implement this.)

I think we should close this issue. In the v2 API we could theoretically get rid of the wildcard toleration, and then it would be OK to implement this. (Users who want to opt in to being able to schedule onto machines that are draining could still do so explicitly, e.g. for a very short-running batch job.)
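
For reference, this is the wildcard toleration in question; an empty key with operator Exists tolerates every taint, including whatever taint drain would place:

```go
// A wildcard toleration: with no key and operator Exists, the pod tolerates all taints.
wildcard := v1.Toleration{
	Operator: v1.TolerationOpExists,
}
```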

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 2, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Feb 10, 2018
@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 10, 2018
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@wking
Contributor

wking commented Aug 3, 2018

Something similar landed in #61161, although that PR didn't touch kubectl. There's some discussion here about accepting the possibility of pods which tolerate the unschedulable taint.
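
If I recall correctly, the taint added there is keyed node.kubernetes.io/unschedulable with the NoSchedule effect, roughly:

```go
// The taint applied alongside spec.unschedulable by the node lifecycle controller
// (key written from memory; check #61161 for the authoritative constant).
taint := v1.Taint{
	Key:    "node.kubernetes.io/unschedulable",
	Effect: v1.TaintEffectNoSchedule,
}
```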
