Calico config breaks if you use etcd #10721

Closed
bsiagrac opened this issue Dec 14, 2023 · 2 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@bsiagrac
Environment:

  • Cloud provider or hardware configuration:
    bare metal

  • OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):
    Ubuntu 20.04.6 LTS

  • Version of Ansible (ansible --version):
    ansible [core 2.14.6]

  • Version of Python (python --version):
    Python 3.8.10

Kubespray version (commit) (git rev-parse --short HEAD):
tag v2.23.0

Network plugin used:
calico with etcd

Description:
When Calico is configured to use etcd, the Calico configuration breaks during the upgrade to tag v2.23.0, and we ran into massive network problems. After the upgrade, Kubernetes-internal network communication between the nodes was broken; for example, we saw i/o timeouts and "no route to host" errors.

The control-plane node's actual name was written into the calico-config ConfigMap, so the DaemonSet pods on all nodes wrote that same nodename into their config.

The fix was to replace the actual node name of the control plane with the variable __KUBERNETES_NODE_NAME__.
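
To illustrate why a literal node name in the ConfigMap breaks every node at once, here is a minimal Python sketch. It assumes, as we understand the behaviour, that the placeholder is replaced with each node's own name when the CNI config is installed on that node. The function `render_cni_config` and the node names `cp-1`, `worker-1`, `worker-2` are hypothetical, and this is a simplified model, not Calico's actual code.

```python
PLACEHOLDER = "__KUBERNETES_NODE_NAME__"


def render_cni_config(template: str, node_name: str) -> str:
    """Simplified model: substitute the placeholder with this node's name."""
    return template.replace(PLACEHOLDER, node_name)


# Template as it should look: the placeholder is still present.
good_template = '{"nodename": "__KUBERNETES_NODE_NAME__", "type": "calico"}'
# Broken template: the control-plane name (hypothetical "cp-1") is baked in.
bad_template = '{"nodename": "cp-1", "type": "calico"}'

nodes = ["cp-1", "worker-1", "worker-2"]  # hypothetical node names
good = {n: render_cni_config(good_template, n) for n in nodes}
bad = {n: render_cni_config(bad_template, n) for n in nodes}

# With the placeholder, every node ends up with its own nodename.
print(len(set(good.values())))  # 3
# With the baked-in name, there is nothing to substitute, so all nodes
# end up with the identical config claiming to be "cp-1".
print(len(set(bad.values())))  # 1
```

In the broken case every node identifies itself as the control-plane node, which matches the symptoms described above.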

This was our fix:

...
"plugins":[
    {
        "nodename": "__KUBERNETES_NODE_NAME__",
        "type": "calico",
        "log_level": "info",
        "log_file_path": "/var/log/calico/cni/cni.log",
        "etcd_endpoints": "https://[ETCD-IP]:2379",
        "etcd_cert_file": "/etc/calico/certs/cert.crt",
        "etcd_key_file": "/etc/calico/certs/key.pem",
        "etcd_ca_cert_file": "/etc/calico/certs/ca_cert.crt",
...
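
A quick way to check whether a cluster is affected is to scan the rendered CNI config text for a nodename that is not the placeholder. The following is only a sketch: the helper `hardcoded_nodenames` is hypothetical, and it uses a regular expression rather than a JSON parser because the config may be a fragment or contain other unrendered placeholders.

```python
import re
from typing import List

PLACEHOLDER = "__KUBERNETES_NODE_NAME__"
# Matches "nodename": "<value>" with arbitrary whitespace around the colon.
NODENAME_RE = re.compile(r'"nodename"\s*:\s*"([^"]*)"')


def hardcoded_nodenames(cni_config_text: str) -> List[str]:
    """Return every nodename value that is not the expected placeholder."""
    return [v for v in NODENAME_RE.findall(cni_config_text) if v != PLACEHOLDER]


# Hypothetical inputs: the fixed fragment and a broken one with "cp-1" baked in.
fixed = '"plugins":[{"nodename": "__KUBERNETES_NODE_NAME__", "type": "calico"}]'
broken = '"plugins":[{"nodename": "cp-1", "type": "calico"}]'

print(hardcoded_nodenames(fixed))   # []
print(hardcoded_nodenames(broken))  # ['cp-1']
```

The input text would be the CNI network config stored in the calico-config ConfigMap; how you export it (namespace, key name) depends on your deployment and is not covered by this sketch.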

We suspect the error is in roles/network_plugin/calico/templates/calico-config.yml.j2. If you use etcd, you probably need the same node configuration as with kdd, but without the datastore variable.

This commit might have broken the setting when the config was changed to a ConfigMap:
62f30a3

bsiagrac added the kind/bug label Dec 14, 2023
@VannTen
Contributor

VannTen commented Dec 14, 2023

duplicate of #10436
/close

@k8s-ci-robot
Contributor

@VannTen: Closing this issue.

