
Kubernetes Master Failed : FAILED - RETRYING: Create kubeadm token for joining nodes with 24h expiration #5227

Closed
shrutishete opened this issue Sep 30, 2019 · 11 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.


@shrutishete

Environment:

  • Cloud provider or hardware configuration:

  • OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"): Ubuntu 16.04.3 LTS

  • Version of Ansible (ansible --version): ansible 2.7.12

  • Kubespray version (commit) (git rev-parse --short HEAD): 8712bdd

Network plugin used: calico

Copy of your inventory file:

[all]
master-1 ansible_host=161.92.248.32 ip=161.92.248.32 ansible_user=philips ansible_sudo=yes
worker-1 ansible_host=161.92.248.33 ip=161.92.248.33 ansible_user=philips ansible_sudo=yes

[kube-master]
master-1

[kube-node]
worker-1

[etcd]
master-1

[k8s-cluster:children]
kube-master
kube-node

Command used to invoke ansible:
ansible-playbook -b --ask-become-pass --become-user=root -i inventory/mycluster/inventory.ini cluster.yml

Output of ansible run:

TASK [kubernetes/master : Create kubeadm token for joining nodes with 24h expiration (default)] **********************************************************************************************
Monday 30 September 2019 16:39:05 +0530 (0:00:00.100) 0:03:49.003 ******
FAILED - RETRYING: Create kubeadm token for joining nodes with 24h expiration (default) (5 retries left).
FAILED - RETRYING: Create kubeadm token for joining nodes with 24h expiration (default) (4 retries left).
FAILED - RETRYING: Create kubeadm token for joining nodes with 24h expiration (default) (3 retries left).
FAILED - RETRYING: Create kubeadm token for joining nodes with 24h expiration (default) (2 retries left).
FAILED - RETRYING: Create kubeadm token for joining nodes with 24h expiration (default) (1 retries left).
fatal: [master-1 -> 161.92.248.32]: FAILED! => {"attempts": 5, "changed": true, "cmd": ["/opt/bin/kubeadm", "--kubeconfig", "/etc/kubernetes/admin.conf", "token", "create"], "delta": "0:01:15.204831", "end": "2019-09-30 16:47:09.210040", "msg": "non-zero return code", "rc": 1, "start": "2019-09-30 16:45:54.005209", "stderr": "timed out waiting for the condition", "stderr_lines": ["timed out waiting for the condition"], "stdout": "", "stdout_lines": []}

@shrutishete shrutishete added the kind/bug Categorizes issue or PR as related to a bug. label Sep 30, 2019
@shrutishete

On running this /opt/bin/kubeadm --kubeconfig /etc/kubernetes/admin.conf token create
getting the following output :

I0930 18:00:52.823950 13595 token.go:115] [token] validating mixed arguments
I0930 18:00:52.823994 13595 token.go:122] [token] getting Clientsets from kubeconfig file
I0930 18:00:52.824900 13595 loader.go:359] Config loaded from file: /etc/kubernetes/admin.conf
I0930 18:00:52.825433 13595 token.go:221] [token] loading configurations
I0930 18:00:52.825648 13595 interface.go:384] Looking for default routes with IPv4 addresses
I0930 18:00:52.825658 13595 interface.go:389] Default route transits interface "ens160"
I0930 18:00:52.825850 13595 interface.go:196] Interface ens160 is up
I0930 18:00:52.825895 13595 interface.go:244] Interface "ens160" has 2 addresses :[161.92.248.32/24 fe80::955e:4706:d886:670f/64].
I0930 18:00:52.825913 13595 interface.go:211] Checking addr 161.92.248.32/24.
I0930 18:00:52.825920 13595 interface.go:218] IP found 161.92.248.32
I0930 18:00:52.825926 13595 interface.go:250] Found valid IPv4 address 161.92.248.32 for interface "ens160".
I0930 18:00:52.825931 13595 interface.go:395] Found active IP 161.92.248.32
I0930 18:00:52.826111 13595 feature_gate.go:216] feature gates: &{map[]}
I0930 18:00:52.826129 13595 token.go:233] [token] creating token
I0930 18:00:52.826187 13595 round_trippers.go:419] curl -k -v -XGET -H "Accept: application/json, */*" -H "User-Agent: kubeadm/v1.15.3 (linux/amd64) kubernetes/2d3c76f" 'https://lb-apiserver.kubernetes.local:6443/api/v1/namespaces/kube-system/secrets/bootstrap-token-w27knb'
I0930 18:00:52.843460 13595 round_trippers.go:438] GET https://lb-apiserver.kubernetes.local:6443/api/v1/namespaces/kube-system/secrets/bootstrap-token-w27knb in 17 milliseconds
I0930 18:00:52.843493 13595 round_trippers.go:444] Response Headers:
I0930 18:00:52.843916 13595 request.go:947] Request Body: {"kind":"Secret","apiVersion":"v1","metadata":{"name":"bootstrap-token-w27knb","namespace":"kube-system","creationTimestamp":null},"data":{"auth-extra-groups":"c3lzdGVtOmJvb3RzdHJhcHBlcnM6a3ViZWFkbTpkZWZhdWx0LW5vZGUtdG9rZW4=","expiration":"MjAxOS0xMC0wMVQxODowMDo1MiswNTozMA==","token-id":"dzI3a25i","token-secret":"b2c1eWdtaHAxbHh0aDlzaQ==","usage-bootstrap-authentication":"dHJ1ZQ==","usage-bootstrap-signing":"dHJ1ZQ=="},"type":"bootstrap.kubernetes.io/token"}
I0930 18:00:52.843977 13595 round_trippers.go:419] curl -k -v -XPOST -H "Accept: application/json, */*" -H "Content-Type: application/json" -H "User-Agent: kubeadm/v1.15.3 (linux/amd64) kubernetes/2d3c76f" 'https://lb-apiserver.kubernetes.local:6443/api/v1/namespaces/kube-system/secrets'
I0930 18:00:52.867204 13595 round_trippers.go:438] POST https://lb-apiserver.kubernetes.local:6443/api/v1/namespaces/kube-system/secrets in 23 milliseconds
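For reference, the base64 fields in the request body above decode to the bootstrap token components kubeadm is writing (values copied straight from the log):

```shell
# Decode the base64-encoded fields of the bootstrap-token Secret shown
# in the request body above (values taken from the log output).
printf 'dzI3a25i' | base64 -d && echo                              # token-id: w27knb
printf 'b2c1eWdtaHAxbHh0aDlzaQ==' | base64 -d && echo              # token-secret: og5ygmhp1lxth9si
printf 'MjAxOS0xMC0wMVQxODowMDo1MiswNTozMA==' | base64 -d && echo  # expiration: 2019-10-01T18:00:52+05:30
```

So the POST is creating a secret named bootstrap-token-w27knb with a 24h expiration, matching the failing task's name; the command itself then hangs waiting on the apiserver.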

@atooki

atooki commented Nov 4, 2019

I'm having this problem as well, on Ubuntu version 18.04 LTS.

my host.ini file:
[all]
node01 ansible_host=10.44.16.3 ip=10.44.16.3
node02 ansible_host=10.44.16.1 ip=10.44.16.1
node03 ansible_host=10.44.16.2 ip=10.44.16.2
node04 ansible_host=10.44.16.4 ip=10.44.16.4

[kube-master]
node01

[etcd]
node01
node02
node03

[kube-node]
node02
node03
node04

[calico-rr]

[k8s-cluster:children]
kube-master
kube-node

@RiaanLab

RiaanLab commented Dec 8, 2019

+1

@samchal

samchal commented Feb 11, 2020

I've also seen this problem with the kubeadm token occur after upgrading the underlying OS to Ubuntu 18.04.4 and adding a new node using the scale.yml playbook (using kubespray 2.11.0)

Looking at the logs with journalctl -u kubelet on any of the nodes (including the master), it appears there was a mismatch between the cgroup driver used by docker and the one used by kubelet, preventing kubelet from starting correctly:

Feb 10 22:23:04 lnx-node1 kubelet[4244]: F0210 22:23:04.280596 4244 server.go:273] failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"

I fixed this by setting cgroupDriver: cgroupfs in /etc/kubernetes/kubelet-config.yaml on each cluster node and rebooting; all nodes, including the master, then started up OK. Re-running the scale playbook then works, but you need to re-apply the cgroupDriver: cgroupfs change in /etc/kubernetes/kubelet-config.yaml after running the playbook. An ansible task for this is as follows:

    - name: Modify cgroupdriver
      become: true
      lineinfile:
        dest: /etc/kubernetes/kubelet-config.yaml
        regexp: '^cgroupDriver:'
        line: 'cgroupDriver: cgroupfs'
        state: present

Possible root cause: It appears that the docker info on Ubuntu 18.04.4 appends the message WARNING: no swap limit support, which I suspect may not be handled by kubespray when detecting the cgroup driver (see playbook roles/kubernetes/node/tasks/facts.yml).
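As a hypothetical illustration (this is not kubespray's actual detection code), a parser that assumes the Cgroup Driver value sits on a fixed line of docker info output breaks once the warning is appended, while matching the labelled line explicitly does not:

```shell
# Illustrative only: simulated `docker info` output with the warning
# appended on Ubuntu 18.04.4.
docker_info='Server Version: 18.09.7
Cgroup Driver: cgroupfs
WARNING: no swap limit support'

# Fragile: assuming the value of interest is on the last line picks up
# the warning instead of the driver.
last=$(printf '%s\n' "$docker_info" | tail -n 1)
echo "$last"

# Robust: select the labelled "Cgroup Driver" line explicitly.
driver=$(printf '%s\n' "$docker_info" | awk -F': ' '/Cgroup Driver/ {print $2}')
echo "$driver"
```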

The warning message can be resolved by updating the kernel command-line boot options and updating grub, as described at the end of the article at https://docs.docker.com/install/linux/linux-postinstall/ under the section "Your kernel does not support cgroup swap limit capabilities".

I haven't re-run the playbook to confirm whether this fixes the issue, since manually setting cgroupDriver in /etc/kubernetes/kubelet-config.yaml as described above worked for me.

@KouriR

KouriR commented Feb 25, 2020

We also ran into the kubelet cgroup driver issue on Ubuntu 18 and Kubespray 2.11.0, and the fix samchal mentioned did work for us, but that was unrelated to the kubeadm token issue.

In our case, we were trying to reconfigure the cluster with an apiserver_loadbalancer_domain_name, and kubeadm --kubeconfig /etc/kubernetes/admin.conf was timing out because admin.conf was trying to use the new apiserver_loadbalancer_domain_name value, which was not yet in the apiserver cert's SAN list, so the validation failed.
The fix for this issue is documented here: kubernetes/kubeadm#1447 (comment)
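One way to confirm this kind of SAN mismatch is to inspect the certificate's Subject Alternative Name list with openssl. The sketch below generates a throwaway self-signed certificate so the commands are self-contained (it assumes OpenSSL 1.1.1+ for -addext/-ext); on a real cluster you would point openssl x509 at the actual apiserver certificate instead, whose path varies by setup.

```shell
# Generate a throwaway self-signed cert with an illustrative SAN entry,
# then inspect its SAN list the same way you would for the apiserver cert.
dir=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout "$dir/key.pem" -out "$dir/cert.pem" \
  -subj '/CN=kube-apiserver' \
  -addext 'subjectAltName=DNS:lb-apiserver.kubernetes.local,DNS:localhost'

# Print the SAN extension; the load balancer domain must appear here,
# or TLS validation against that name will fail.
sans=$(openssl x509 -in "$dir/cert.pem" -noout -ext subjectAltName)
echo "$sans"
```

If the domain configured as apiserver_loadbalancer_domain_name is missing from this output on the real apiserver cert, any kubeconfig pointing at that name will fail validation exactly as described.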

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 25, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 24, 2020
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@GSalah

GSalah commented Oct 15, 2020

Don't run ansible-playbook from a master node; use a separate VM to manage your cluster.

@ognjen-it

I resolved it when I did:

ansible-playbook -i inventory/mycluster/inventory.ini --become --user=root --become-user=root reset.yml -e ansible_python_interpreter=/usr/bin/python3
and then:
ansible-playbook -i inventory/mycluster/inventory.ini --become --user=root --become-user=root cluster.yml -e ansible_python_interpreter=/usr/bin/python3
