Setting kubernetes.io/role and other restricted node labels #2363

Open
michaelbeaumont opened this issue Jun 23, 2020 · 25 comments
Labels
area/nodegroup kind/feature New feature or request priority/backlog Not staffed at the moment. Help wanted. priority/blocked priority/important-longterm Important over the long term, but may not be currently staffed and/or may require multiple releases

Comments

@michaelbeaumont
Contributor

With Kubernetes version 1.16, it became an error to set certain labels using the --node-labels feature, which is how eksctl sets user-provided labels on nodes. The kubelet can now only give itself a specific set of labels.
See the rationale behind this change in k8s for why the kubelet shouldn't be able to dictate arbitrary node labels.

However, users would still like to be able to set these labels somehow, which does not conflict with the intention behind the change in 1.16.

This is not trivial for eksctl to implement and would likely require a cluster component to maintain the labels without support from EKS; see #2197 (comment), #2197 (comment).

For some use cases, this workaround may suffice (a rough sketch follows the list):

- You can use a different label for your selectors (e.g. under the node.kubernetes.io/… namespace), if you're willing to keep using selectors that are vulnerable to node self-labeling.
- kubectl get nodes can display custom labels as well (see the --show-labels or --label-columns options).
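As a rough sketch of that workaround (the example.com/role label key and the nodegroup name below are placeholders, not an eksctl recommendation):

    nodeGroups:
      - name: workers
        labels:
          example.com/role: gitlab-worker   # unrestricted, custom-namespace label

and then something like kubectl get nodes -L example.com/role to show the label as a column.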

This issue will track this feature.
Related issues: #2296 and #2197

Please post more information about your use cases here.

@TBBle
Contributor

TBBle commented Jun 23, 2020

My use case is that I wanted to be able to designate other node groups with the same labels and taints as are used to designate a master node, and hence take advantage of the 'role' column in kubectl get nodes.

In the meantime, we've gone with the workaround quoted above, of using a label/taint from our own namespace and using -L with kubectl get nodes. I'm not too worried about the risk of someone running a rogue kubelet and stealing work from our nodes, because this is a development/testing cluster, but I'd be much more concerned about it in a production environment.


For reference, I posted a feature request for node-labelling support in the AWS Cloud Provider based on AWS tags, although from a surface investigation it might not have the right hooks to solve this without deeper k8s changes.

I'm hoping that either someone more familiar will spot that the AWS Cloud Provider can easily do this, so we perhaps get it fixed for k8s 1.20, or we'll decide that's the wrong place and discussion will move towards some other solution that could perhaps be deployed sooner.

One possible solution would be a custom controller that applies the labels based on AWS tags, but this risks conflict with the Cluster Autoscaler: in the window between a Node appearing and the labelling controller labelling it, the scheduler may fail to assign it a workload (due to the lack of the appropriate label for nodeSelector), the CA may then notice these pods are still unscheduled and scale up the Auto Scaling Group for a node which you didn't need.

This delay in labelling causing extra scale-up is the same issue described in the Cluster Autoscaler AWS instructions regarding GPU instances.

It's also worth noting that AWS tags might not be a secure source of labels, as someone who can run a rogue EC2 node in your VPC can probably also set their own tags on it. I don't have a better store/data source to suggest, though.
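(Purely for illustration, a tag-driven labeller could be as small as the loop below; the k8s-role tag key, the IMDS lookups, and the assumption that the node name matches the instance's private DNS name are all assumptions here, and whatever runs this needs credentials outside the system:nodes group or NodeRestriction will reject the label.)

    #!/bin/sh
    # Sketch: copy a hypothetical "k8s-role" EC2 tag onto the node as a role label.
    INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
    NODE_NAME=$(curl -s http://169.254.169.254/latest/meta-data/local-hostname)
    while true; do
      ROLE=$(aws ec2 describe-tags \
        --filters "Name=resource-id,Values=${INSTANCE_ID}" "Name=key,Values=k8s-role" \
        --query 'Tags[0].Value' --output text)
      if [ -n "$ROLE" ] && [ "$ROLE" != "None" ]; then
        kubectl label node "$NODE_NAME" "node-role.kubernetes.io/${ROLE}=true" --overwrite
      fi
      sleep 60
    done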

@github-actions
Contributor

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions
Contributor

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the stale label Feb 19, 2021
@michaelbeaumont michaelbeaumont added priority/important-longterm Important over the long term, but may not be currently staffed and/or may require multiple releases and removed priority/backlog Not staffed at the moment. Help wanted. stale labels Feb 19, 2021
@acjohnson

acjohnson commented Apr 27, 2021

It would be nice if we could use the standard-convention names for labels and taints in eksctl
(the example below doesn't work, but should)

    labels:
      node-role.kubernetes.io/gitlab-worker: true
    taints:
      node-role.kubernetes.io/gitlab-worker: "true:NoSchedule"

It just seems strange that this doesn't work, as all the k8s docs say to use labels/taints this way.

@TBBle
Contributor

TBBle commented Apr 27, 2021

The k8s docs about the NodeRestriction Admission Controller explain why the above doesn't work for labels when they are set by the node itself (the current mechanism used by eksctl). This is the problem this ticket is tracking.

On the other hand, the taint should work, as taints are not restricted in this way, so if that's not working, that is probably a different issue.
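(If the taint from that example really isn't applying, it may just be the config schema: newer eksctl versions express nodegroup taints as a list rather than the map form shown above. Roughly, though do check this against your eksctl version's docs:)

    nodeGroups:
      - name: gitlab-workers          # placeholder nodegroup name
        taints:
          - key: node-role.kubernetes.io/gitlab-worker
            value: "true"
            effect: NoSchedule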

@acjohnson

I hit #2296 which says discussion is taking place here

@TBBle
Contributor

TBBle commented Apr 27, 2021

For labels outside the NodeRestriction Admission Controller's supported list, this is the ticket, yes.

There's not a huge amount to discuss about the problem (it's a deliberate change by k8s for clear reasons), but no one has yet tried to implement a solution in eksctl; there's some interest in implementing a solution in kubernetes/cloud-provider-aws#110 (and then eksctl could be updated to use that) but it's not an easy problem to solve.

If you're having problems with the taint in your example not applying, then that's a totally separate problem, as the underlying change this ticket is tracking only affects labels.

@adv4000

adv4000 commented Apr 30, 2021

Seems like the node-role.kubernetes.io prefix is restricted; I just removed it and the rest of the labels applied.

@acjohnson

right, but why is node-role.kubernetes.io restricted? Isn't that the standard convention for how kubectl determines node roles?

@TBBle
Contributor

TBBle commented May 1, 2021

It is the standard convention for determining node roles, and per kubernetes/kubernetes#84912, nodes are not allowed to self-assign roles.

Note that the link in that comment is now incorrect; it's moved to https://github.com/kubernetes/enhancements/tree/master/keps/sig-auth/279-limit-node-access

For example, this change ensures that nodes can't self-assign the node-role "master", and start having control-plane pods assigned to them, bypassing restrictions on Secrets or other resources by having direct access to etcd etc.

@acjohnson

Ah, I think I understand now. So if I want to make use of standard-convention node labels, I'll need to apply something like kubectl label nodes ip-172-27-121-8.ec2.internal node-role.kubernetes.io/worker=true on my EKS cluster nodes as a post-provision step.

@TBBle
Contributor

TBBle commented May 2, 2021

That's correct. At some future point, i.e. when this ticket is resolved, eksctl should be able to apply restricted labels to nodes from the same config that currently works for unrestricted labels. Right now it has no mechanism to do that, so we're blocked until someone (perhaps kubernetes/cloud-provider-aws#110) implements such a mechanism that eksctl can then use.

As noted at #2197 (comment), manually labelling the nodes won't scale if you have the Cluster Autoscaler or similar creating nodes on-demand, as those nodes won't have the labels until you manually add them later.

@Skarlso Skarlso self-assigned this Oct 12, 2021
@Skarlso
Contributor

Skarlso commented Oct 12, 2021

Let's see where this goes.

@Skarlso
Contributor

Skarlso commented Oct 12, 2021

Without proper support, this won't go too far. Just adding labels after start doesn't help because of the persisting thing. Having a controller doesn't help because of timing. We can't add labels during registration either, so I'm struggling to find a good answer for this.

It's on our radar, but I don't think there is too much we can do right now.

@mrballcb

Just adding labels after start doesn't help because of the persisting thing.

I'm missing something obvious. What's the problem with persistence? How does it affect the node labelling itself with a separate shell script after it boots? Or is that the point: that the node should not have a kubeconfig.yaml present with sufficient auth to let it make this restricted modification to itself in the first place?

@Skarlso
Contributor

Skarlso commented Nov 10, 2021

@mrballcb "persisting thing" -> once the node cycles out by the autoscaling group the label is gone and will never be re-applied to a new node.

@TBBle
Contributor

TBBle commented Nov 17, 2021

@mrballcb

the node should not have a kubeconfig.yaml present with sufficient auth that would let it do this restricted modification of itself in the first place?

Yes, that's the core issue. The node should only have sufficient data to authenticate in the system:nodes group, and that group is restricted from setting kubernetes-managed labels, with specific exceptions.
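Roughly speaking (the exact list varies by Kubernetes version, so treat this as a sketch of the rule rather than a definitive reference): the kubelet may self-apply labels with no kubernetes.io/k8s.io prefix, labels under the kubelet.kubernetes.io/ and node.kubernetes.io/ prefixes, and a handful of specific keys like kubernetes.io/hostname and the topology zone/region labels; anything else under kubernetes.io or k8s.io is rejected. For example:

    # accepted: custom prefix, and the node.kubernetes.io/ prefix
    --node-labels=example.com/team=ci,node.kubernetes.io/lifecycle=spot

    # rejected: node-role.kubernetes.io is a reserved kubernetes.io namespace
    --node-labels=node-role.kubernetes.io/worker=true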

@matti
Contributor

matti commented Mar 16, 2022

here's my workaround:

https://github.com/matti/k8s-node-watchdog/blob/cd8733d3a82f8b917685c9d906f74595a7bae97b/app/entrypoint.sh#L56-L71

it makes listing nodes much more usable for example in Lens:

[screenshot: node list in Lens with the applied labels shown as a column]

@TBBle
Contributor

TBBle commented Mar 16, 2022

Interesting workaround. It appears to be running a privileged daemonset on every node with total admin privileges for your k8s cluster?

I'd suggest at the very least trimming that ClusterRole down to the privileges it needs (in this case, editing the node labels) to reduce the risk to your cluster if this pod gets compromised.
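Something along these lines would be a minimal sketch (the name is a placeholder, and the verbs could possibly be trimmed further):

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: node-labeller               # placeholder name
    rules:
      - apiGroups: [""]                 # core API group
        resources: ["nodes"]
        verbs: ["get", "list", "patch"] # enough to read nodes and patch their labels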

@matti
Contributor

matti commented Mar 16, 2022

yes, but I like to live dangerously and get things done.

@Himangini Himangini added priority/backlog Not staffed at the moment. Help wanted. and removed priority/important-longterm Important over the long term, but may not be currently staffed and/or may require multiple releases labels Sep 28, 2022
@github-actions
Contributor

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the stale label Oct 29, 2022
@matti
Contributor

matti commented Oct 29, 2022

asdf

@github-actions github-actions bot removed the stale label Oct 30, 2022
@github-actions
Contributor

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the stale label Nov 30, 2022
@matti
Contributor

matti commented Nov 30, 2022

ghjk

@cPu1 cPu1 added priority/important-longterm Important over the long term, but may not be currently staffed and/or may require multiple releases and removed stale labels Nov 30, 2022
@FreddieMcHeart

Are there any updates?
