Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(kubernetes): taint nodes on cluster upgrade #10705

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion docs/vars.md
Original file line number Diff line number Diff line change
Expand Up @@ -245,7 +245,7 @@ node_labels:
label2_name: label2_value
```
* *node_taints* - Taints applied to nodes via kubelet --register-with-taints parameter.
* *node_taints* - Taints applied to nodes via `kubectl taint node`.
For example, taints can be set in the inventory as variables or more widely in group_vars.
*node_taints* has to be defined as a list of strings in format `key=value:effect`, e.g.:

Expand Down
1 change: 1 addition & 0 deletions playbooks/cluster.yml
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,7 @@
- { role: kubespray-defaults }
- { role: kubernetes/kubeadm, tags: kubeadm}
- { role: kubernetes/node-label, tags: node-label }
- { role: kubernetes/node-taint, tags: node-taint }
- { role: network_plugin, tags: network }
- { role: kubernetes-apps/kubelet-csr-approver, tags: kubelet-csr-approver }

Expand Down
1 change: 1 addition & 0 deletions playbooks/scale.yml
Original file line number Diff line number Diff line change
Expand Up @@ -91,6 +91,7 @@
- { role: kubespray-defaults }
- { role: kubernetes/kubeadm, tags: kubeadm }
- { role: kubernetes/node-label, tags: node-label }
- { role: kubernetes/node-taint, tags: node-taint }
- { role: network_plugin, tags: network }

- name: Apply resolv.conf changes now that cluster DNS is up
Expand Down
2 changes: 2 additions & 0 deletions playbooks/upgrade_cluster.yml
Original file line number Diff line number Diff line change
Expand Up @@ -55,6 +55,7 @@
- { role: kubernetes/control-plane, tags: master, upgrade_cluster_setup: true }
- { role: kubernetes/client, tags: client }
- { role: kubernetes/node-label, tags: node-label }
- { role: kubernetes/node-taint, tags: node-taint }
- { role: kubernetes-apps/cluster_roles, tags: cluster-roles }
- { role: kubernetes-apps, tags: csi-driver }
- { role: upgrade/post-upgrade, tags: post-upgrade }
Expand Down Expand Up @@ -87,6 +88,7 @@
- { role: kubernetes/node, tags: node }
- { role: kubernetes/kubeadm, tags: kubeadm }
- { role: kubernetes/node-label, tags: node-label }
- { role: kubernetes/node-taint, tags: node-taint }
- { role: upgrade/post-upgrade, tags: post-upgrade }

- name: Patch Kubernetes for Windows
Expand Down
35 changes: 35 additions & 0 deletions roles/kubernetes/node-taint/tasks/main.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
---
- name: Set role and inventory node taint to empty list
set_fact:
role_node_taints: []
inventory_node_taints: []

- name: Node taint for nvidia GPU nodes
set_fact:
role_node_taints: "{{ role_node_taints + ['nvidia.com/gpu=:NoSchedule'] }}"
when:
- nvidia_gpu_nodes is defined
- nvidia_accelerator_enabled | bool
- inventory_hostname in nvidia_gpu_nodes

- name: Populate inventory node taint
set_fact:
inventory_node_taints: "{{ inventory_node_taints + ['%s' | format(item)] }}"
loop: "{{ node_taints | d([]) }}"
when:
- node_taints is defined
- node_taints is not string
- node_taints is not mapping
- node_taints is iterable
- debug: # noqa name[missing]
var: role_node_taints
- debug: # noqa name[missing]
var: inventory_node_taints

- name: Set taint to node
command: >-
{{ kubectl }} taint node {{ kube_override_hostname | default(inventory_hostname) }} {{ (role_node_taints + inventory_node_taints) | join(' ') }} --overwrite=true
delegate_to: "{{ groups['kube_control_plane'][0] }}"
changed_when: false
when:
- (role_node_taints + inventory_node_taints) | length > 0
12 changes: 1 addition & 11 deletions roles/kubernetes/node/templates/kubelet.env.v1beta1.j2
Original file line number Diff line number Diff line change
Expand Up @@ -15,17 +15,7 @@ KUBELET_HOSTNAME="--hostname-override={{ kube_override_hostname }}"
--runtime-cgroups={{ kubelet_runtime_cgroups }} \
{% endset %}

{# Kubelet node taints for gpu #}
{% if nvidia_gpu_nodes is defined and nvidia_accelerator_enabled|bool %}
{% if inventory_hostname in nvidia_gpu_nodes and node_taints is defined %}
{% set dummy = node_taints.append('nvidia.com/gpu=:NoSchedule') %}
{% elif inventory_hostname in nvidia_gpu_nodes and node_taints is not defined %}
{% set node_taints = [] %}
{% set dummy = node_taints.append('nvidia.com/gpu=:NoSchedule') %}
{% endif %}
{% endif %}

KUBELET_ARGS="{{ kubelet_args_base }} {% if node_taints|default([]) %}--register-with-taints={{ node_taints | join(',') }} {% endif %} {% if kubelet_custom_flags is string %} {{kubelet_custom_flags}} {% else %}{% for flag in kubelet_custom_flags %} {{flag}} {% endfor %}{% endif %}{% if inventory_hostname in groups['kube_node'] %}{% if kubelet_node_custom_flags is string %} {{kubelet_node_custom_flags}} {% else %}{% for flag in kubelet_node_custom_flags %} {{flag}} {% endfor %}{% endif %}{% endif %}"
KUBELET_ARGS="{{ kubelet_args_base }} {% if kubelet_custom_flags is string %} {{kubelet_custom_flags}} {% else %}{% for flag in kubelet_custom_flags %} {{flag}} {% endfor %}{% endif %}{% if inventory_hostname in groups['kube_node'] %}{% if kubelet_node_custom_flags is string %} {{kubelet_node_custom_flags}} {% else %}{% for flag in kubelet_node_custom_flags %} {{flag}} {% endfor %}{% endif %}{% endif %}"
{% if kubelet_flexvolumes_plugins_dir is defined %}
KUBELET_VOLUME_PLUGIN="--volume-plugin-dir={{ kubelet_flexvolumes_plugins_dir }}"
{% endif %}
Expand Down