runOnControlPlane/runOnMaster: true make the controller unschedulable #861

Closed
janfrederik opened this issue Nov 8, 2024 · 8 comments · Fixed by #882 or #886

Comments

@janfrederik

janfrederik commented Nov 8, 2024

What happened:
Setting controller.runOnControlPlane: true makes the controller unschedulable.
The same applies to controller.runOnMaster.

This is the same problem as kubernetes-csi/csi-driver-nfs#787.

What you expected to happen:
The controller to get scheduled

How to reproduce it:

  • k3s 1.30
  • install csi-driver-smb v1.16.0 with the following values.yaml
controller:
  runOnControlPlane: true

Environment:

  • CSI Driver version: 1.16.0
  • Kubernetes version (use kubectl version): v1.30.6+k3s1
  • OS (e.g. from /etc/os-release): openSUSE MicroOS 20241104
  • Kernel (e.g. uname -a): 6.11.5-2-default
@janfrederik
Author

janfrederik commented Nov 8, 2024

I see the control plane nodes have label node-role.kubernetes.io/control-plane=true.
The nodeSelector is node-role.kubernetes.io/control-plane: ""

These don't match: a nodeSelector entry only matches a node whose label has exactly the same key and value, so the empty string "" does not match the value "true".
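To make the mismatch concrete, here are the two fragments side by side. The label and selector values are the ones reported above; the surrounding `metadata`/`spec` nesting is only a sketch of where they live on the Node and Pod objects.

```yaml
# Node object fragment: the label as found on the k3s control plane node
metadata:
  labels:
    node-role.kubernetes.io/control-plane: "true"
---
# Pod spec fragment: the selector generated for the controller
spec:
  nodeSelector:
    # Requires the label value to equal "", so the node above is rejected
    node-role.kubernetes.io/control-plane: ""
```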

We probably want a simple selector of type "exists", without specifying a value: node-role.kubernetes.io/control-plane. However, this cannot be expressed with nodeSelector; it needs node affinity:

template:
  spec:
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          # Expressions within one term are ANDed; separate terms are ORed.
          - matchExpressions:
            - key: node-role.kubernetes.io/control-plane
              operator: Exists
            - key: node-role.kubernetes.io/master
              operator: Exists


@janfrederik janfrederik changed the title runOnControlPlane: true makes the controller unschedulable runOnControlPlane/runOnMaster: true makes the controller unschedulable Nov 8, 2024
@janfrederik janfrederik changed the title runOnControlPlane/runOnMaster: true makes the controller unschedulable runOnControlPlane/runOnMaster: true make the controller unschedulable Nov 8, 2024
@andyzhangx
Member

andyzhangx commented Nov 10, 2024

There are other k8s clusters where the master node has the label node-role.kubernetes.io/control-plane= (with an empty value), so the single controller.runOnControlPlane setting cannot match both conditions: if we fix one, the other breaks. We also already have tolerations defined here:

  tolerations:
    - key: "node-role.kubernetes.io/master"
      operator: "Exists"
      effect: "NoSchedule"
    - key: "node-role.kubernetes.io/controlplane"
      operator: "Exists"
      effect: "NoSchedule"
    - key: "node-role.kubernetes.io/control-plane"
      operator: "Exists"
      effect: "NoSchedule"
    - key: "CriticalAddonsOnly"
      operator: "Exists"
      effect: "NoSchedule"

If you want this CSI driver controller to run on a master node that matches a specific label, you can define controller.nodeSelector when installing the Helm chart.
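As a minimal sketch of that suggestion for the k3s cluster in this issue, a values.yaml could look like the following. It assumes that controller.nodeSelector is copied verbatim into the controller pod spec, and that runOnControlPlane is left unset so it does not add a conflicting selector.

```yaml
controller:
  # Assumption: passed through unchanged as the pod's nodeSelector
  nodeSelector:
    # Matches the label value found on the k3s control plane nodes
    node-role.kubernetes.io/control-plane: "true"
```

On clusters that set the label with an empty value, the value here would be "" instead.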

@janfrederik
Author

janfrederik commented Nov 11, 2024

Shouldn't we simply remove controller.runOnControlPlane and controller.runOnMaster from values.yaml? The reasons:

  • they don't work reliably anyway (in some cases);
  • the labels node-role.kubernetes.io/control-plane and node-role.kubernetes.io/master are not used in exactly the same way in all k8s cluster implementations (according to @andyzhangx above);
  • values.yaml already provides controller.nodeSelector and controller.affinity for specifying node affinity;
  • it is impossible to blindly merge user-provided affinity clauses with automatically generated ones while respecting the user's intended AND/OR logic.

Instead, we can add some examples to the doc or as comments in values.yaml on how to use nodeSelector or affinity for the cases of runOnControlPlane and runOnMaster.
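One such example, as a sketch only: it assumes controller.affinity is passed through unchanged as the pod's affinity, and it uses two separate terms so that a node carrying either label qualifies, whatever the label's value.

```yaml
controller:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          # Terms are ORed: a node matching either one is eligible
          - matchExpressions:
              - key: node-role.kubernetes.io/control-plane
                operator: Exists
          - matchExpressions:
              - key: node-role.kubernetes.io/master
                operator: Exists
```

Scheduling onto tainted control plane nodes additionally relies on the tolerations quoted earlier in this thread.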

@scipsycho

Either that, or we should mention in the docs that it may not work for some implementations.

@andyzhangx
Member

I have fixed this issue by using:

    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: node-role.kubernetes.io/control-plane
              operator: Exists

@janfrederik
Author

@andyzhangx Thank you!

Note that exactly the same problem exists with runOnMaster. So it is probably a good idea to add the same fix for that one.

@janfrederik
Author

I think it would be useful to mention, both in the docs and as a comment in values.yaml, that runOnControlPlane=true only has an effect if the user doesn't have an affinity block in their values.yaml.

@janfrederik
Author

@andyzhangx Super, thanks!
