
Cluster upgrade to v1.30.2 fails on "Upgrade first Control Plane" #11350

Closed
bogd opened this issue Jul 3, 2024 · 12 comments · Fixed by #11352
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments


bogd commented Jul 3, 2024

What happened?

Attempted to upgrade a cluster from v1.29.3 to v1.30.2. The upgrade playbook fails on kubeadm upgrade apply with the error can not mix '--config' with arguments [allow-experimental-upgrades certificate-renewal etcd-upgrade force yes], in this task:

TASK [kubernetes/control-plane : Kubeadm | Upgrade first master] ************************************************
Wednesday 03 July 2024  17:18:34 +0000 (0:00:01.906)       0:31:56.562 ******** 
FAILED - RETRYING: [k8s-staging-01-master]: Kubeadm | Upgrade first master (3 retries left).
FAILED - RETRYING: [k8s-staging-01-master]: Kubeadm | Upgrade first master (2 retries left).
FAILED - RETRYING: [k8s-staging-01-master]: Kubeadm | Upgrade first master (1 retries left).
fatal: [k8s-staging-01-master]: FAILED! => {"attempts": 3, "changed": true, "cmd": ["timeout", "-k", "600s", "600s", "/usr/local/bin/kubeadm", "upgrade", "apply", "-y", "v1.30.2", "--certificate-renewal=True", "--config=/etc/kubernetes/kubeadm-config.yaml", "--ignore-preflight-errors=all", "--allow-experimental-upgrades", "--etcd-upgrade=false", "--force"], "delta": "0:00:00.083731", "end": "2024-07-03 17:18:55.750605", "failed_when_result": true, "msg": "non-zero return code", "rc": 1, "start": "2024-07-03 17:18:55.666874", "stderr": "can not mix '--config' with arguments [allow-experimental-upgrades certificate-renewal etcd-upgrade force yes]\nTo see the stack trace of this error execute with --v=5 or higher", "stderr_lines": ["can not mix '--config' with arguments [allow-experimental-upgrades certificate-renewal etcd-upgrade force yes]", "To see the stack trace of this error execute with --v=5 or higher"], "stdout": "", "stdout_lines": []}

What did you expect to happen?

Successful upgrade of the cluster

How can we reproduce it (as minimally and precisely as possible)?

Attempt to upgrade a cluster from v1.29 to v1.30.

OS

Linux 5.15.0-113-generic x86_64
PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

Version of Ansible

ansible [core 2.16.8]
  config file = None
  configured module search path = ['/root/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python3.12/dist-packages/ansible
  ansible collection location = /root/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/local/bin/ansible
  python version = 3.12.3 (main, Apr 10 2024, 05:33:47) [GCC 13.2.0] (/usr/bin/python3)
  jinja version = 3.1.4
  libyaml = True

Version of Python

python version = 3.12.3

Version of Kubespray (commit)

474b259

Network plugin used

calico

Full inventory with variables

[ Removed, since it was huge and was making the issue difficult to read. Will provide a gist on request, if needed ]

Command used to invoke ansible

ansible-playbook on a custom playbook that imports kubespray/playbooks/upgrade_cluster.yml

Output of ansible run

TASK [kubernetes/control-plane : Kubeadm | Upgrade first master] ************************************************
Wednesday 03 July 2024  17:18:34 +0000 (0:00:01.906)       0:31:56.562 ******** 
FAILED - RETRYING: [k8s-staging-01-master]: Kubeadm | Upgrade first master (3 retries left).
FAILED - RETRYING: [k8s-staging-01-master]: Kubeadm | Upgrade first master (2 retries left).
FAILED - RETRYING: [k8s-staging-01-master]: Kubeadm | Upgrade first master (1 retries left).
fatal: [k8s-staging-01-master]: FAILED! => {"attempts": 3, "changed": true, "cmd": ["timeout", "-k", "600s", "600s", "/usr/local/bin/kubeadm", "upgrade", "apply", "-y", "v1.30.2", "--certificate-renewal=True", "--config=/etc/kubernetes/kubeadm-config.yaml", "--ignore-preflight-errors=all", "--allow-experimental-upgrades", "--etcd-upgrade=false", "--force"], "delta": "0:00:00.083731", "end": "2024-07-03 17:18:55.750605", "failed_when_result": true, "msg": "non-zero return code", "rc": 1, "start": "2024-07-03 17:18:55.666874", "stderr": "can not mix '--config' with arguments [allow-experimental-upgrades certificate-renewal etcd-upgrade force yes]\nTo see the stack trace of this error execute with --v=5 or higher", "stderr_lines": ["can not mix '--config' with arguments [allow-experimental-upgrades certificate-renewal etcd-upgrade force yes]", "To see the stack trace of this error execute with --v=5 or higher"], "stdout": "", "stdout_lines": []}

Anything else we need to know

No response

@bogd bogd added the kind/bug Categorizes issue or PR as related to a bug. label Jul 3, 2024

tmurakam commented Jul 3, 2024

Hmm... It seems the following error is the root cause:
can not mix '--config' with arguments [allow-experimental-upgrades certificate-renewal etcd-upgrade force yes]
I think we need to fix kubeadm-upgrade.yml.


bogd commented Jul 4, 2024

This seems to be a recent change (possibly as recent as K8s v1.30?) - not allowing any configuration-changing flags on upgrade.

I cannot find it in the release notes, but see for example here and here (the second is specifically related to --yes).


tmurakam commented Jul 4, 2024

I think we need to upgrade the kubeadm configuration from v1beta3 to v1beta4, and configure UpgradeApplyConfiguration instead of command-line arguments.
https://kubernetes.io/docs/reference/config-api/kubeadm-config.v1beta4/

  • UpgradeApplyConfiguration
    • allowExperimentalUpgrades
    • certificateRenewal
    • etcdUpgrade
    • forceUpgrade

But it seems that v1beta4 is not supported yet.
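
For reference, with v1beta4 the upgrade-specific settings would presumably move from flags into an UpgradeConfiguration document, roughly along these lines (file path and values are illustrative only; as noted above, kubeadm does not accept this on upgrade yet):

```sh
# Sketch only: with v1beta4, the upgrade options would live in an
# UpgradeConfiguration document instead of command-line flags.
# File path and values are illustrative.
cat <<'EOF' > /etc/kubernetes/kubeadm-upgrade-config.yaml
apiVersion: kubeadm.k8s.io/v1beta4
kind: UpgradeConfiguration
apply:
  allowExperimentalUpgrades: true
  certificateRenewal: true
  etcdUpgrade: false
  forceUpgrade: true
EOF

# The flags would then no longer be passed on the command line:
kubeadm upgrade apply v1.30.2 --config /etc/kubernetes/kubeadm-upgrade-config.yaml
```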


tmurakam commented Jul 4, 2024

I asked a question at kubernetes/kubeadm#3084 (comment)


tmurakam commented Jul 4, 2024

I got an answer at kubernetes/kubeadm#3084 (comment).

I think we need to remove the --config option from kubeadm upgrade.
Does anyone have any concerns about removing the option?
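
With --config dropped, the invocation from the failing task above would presumably reduce to plain flags only, roughly:

```sh
# The same upgrade invocation from the log, with --config removed; the
# remaining flags are plain kubeadm flags and are not rejected on their own.
/usr/local/bin/kubeadm upgrade apply -y v1.30.2 \
  --certificate-renewal=True \
  --ignore-preflight-errors=all \
  --allow-experimental-upgrades \
  --etcd-upgrade=false \
  --force
```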

tmurakam added a commit to tmurakam/kubespray that referenced this issue Jul 4, 2024
We can't mix some options with --config for kubeadm upgrade.
The --config on upgrade is deprecated, and should be removed.

tmurakam commented Jul 5, 2024

I opened a PR to fix this.


zzvara commented Jul 29, 2024

Kubespray master is broken because of this issue.


ledroide commented Aug 5, 2024

I confirm the same issue with master at commit dd51ef6.

The fix from @tmurakam worked fine for me.


ccureau commented Aug 10, 2024

I can also confirm that the PR mentioned above works. I created a new cluster this morning and then upgraded it afterwards.

@yankay yankay changed the title Cluster upgrade to v1.30.2 fails on "Upgrade first master" Cluster upgrade to v1.30.2 fails on "Upgrade first Control Plane" Aug 16, 2024
cheetahfox added a commit to cheetahfox/kubespray that referenced this issue Aug 22, 2024

ArnCo commented Aug 28, 2024

The referenced PR has the side effect that variables modified in the kubeadm-config file are no longer reflected in the manifests.
Example: modify the kube_scheduler_bind variable in the playbook. The variable is correctly set in the kubeadm-config.yaml file, but the corresponding kube-scheduler.yaml manifest is not modified, so the configuration is not applied.

This is, in my opinion, a regression.


tmurakam commented Aug 28, 2024

@ArnCo
I think we can't change the configuration during an upgrade anymore, because kubeadm no longer accepts the kubeadm-config.yaml file on upgrade.
If we want to change the configuration, I think we need to run Kubespray with the new configuration first, without upgrading, and then upgrade the cluster without configuration changes.
Please let me know if there is a better way.
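
For example, the two-step flow would look roughly like this (the inventory path is a placeholder; the playbook paths follow the layout referenced in this issue, and kube_version is assumed to be set in the inventory group_vars):

```sh
# Rough sketch of the suggested two-step flow.
# 1) Re-run the cluster playbook at the current version so the new
#    configuration is rendered and applied:
ansible-playbook -i inventory/mycluster/hosts.yaml -b playbooks/cluster.yml
# 2) Then bump kube_version and run the upgrade without any other
#    configuration changes:
ansible-playbook -i inventory/mycluster/hosts.yaml -b playbooks/upgrade_cluster.yml
```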


ArnCo commented Aug 28, 2024

@tmurakam Well, I'm fiddling with our cluster right now. It seems that the kubeadm upgrade command was not meant to reconfigure the cluster, my bad.
To apply the changes to our cluster, I backed up the /etc/kubernetes folder and ran
kubeadm init phase control-plane scheduler --config /etc/kubernetes/kubeadm-config.yaml

This updated the manifests and consequently applied my changes. As far as I can tell, Kubespray does not execute kubeadm init if the manifest files already exist.
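
The same approach can presumably be extended to regenerate all control-plane manifests in one go rather than only the scheduler, something like the sketch below (run on a control plane node, after backing up /etc/kubernetes as above):

```sh
# Hedged sketch: regenerate all control-plane static pod manifests (apiserver,
# controller-manager, scheduler) from the rendered kubeadm config.
kubeadm init phase control-plane all --config /etc/kubernetes/kubeadm-config.yaml
```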

k8s-ci-robot pushed a commit that referenced this issue Aug 29, 2024
We can't mix some options with --config for kubeadm upgrade.
The --config on upgrade is deprecated, and should be removed.