Kubelet/etcd uses wrong IPv6 Address #9725
Comments
A potential solution could be to sort the IPs by their preference flags here: https://github.com/siderolabs/talos/blob/e26d0043e022eccf5ea9c9d9b4a57e4bff1f80cc/internal/app/machined/pkg/controllers/network/node_address.go#L154C1-L155C1. However, this would mean that addresses in NodeAddress objects are sorted by preference rather than alphabetically. If this is a valid solution, I could draft a PR. |
I agree it might be better for IPv6, but you can also use https://www.talos.dev/v1.8/introduction/prodnotes/#multihoming |
I am not sure how this helps here. Both addresses are from the same subnet. KubeVirt's Passt network binding (currently the only fully functional IPv6 option that supports the primary pod network) announces the Pod Subnet of the hosting cluster as a prefix via RA. Talos derives a SLAAC/Temp address from it and follows up with DHCPv6, so the SLAAC and DHCPv6 assigned IPs are in the same subnet. I don't see a reasonable subnet filter to specify. The SLAAC address itself is not reachable by the underlying pod network of the KubeVirt hosting cluster, so using it for etcd or kubelet will break connectivity. This stops Talos from scaling beyond a single node in an IPv6 KubeVirt environment, because an unreachable IP will be advertised. Is sorting the IPs alphanumerically and by preference based on flags a suitable solution (on top of the existing filtering)? If so, the changes required should be minimal and I might be able to draft a PR. |
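The ordering proposed above (flag-based preference first, address order as the tie-breaker) can be sketched roughly as follows. This is a minimal illustration, not Talos code: the `candidate` type, the `newCandidate` helper, the `rank` function, and the example addresses are all hypothetical, and the only flags considered are the two named in the bug report.

```go
package main

import (
	"fmt"
	"net/netip"
	"sort"
)

// candidate is a hypothetical stand-in for an address and the two flags
// named in the bug report. It is not a Talos type.
type candidate struct {
	Prefix         netip.Prefix
	Permanent      bool // "permanent" flag
	ManageTempAddr bool // "mngmtmpaddr" flag
}

// newCandidate parses a CIDR string and panics on malformed input,
// which is acceptable only because this is an example.
func newCandidate(cidr string, permanent, manageTempAddr bool) candidate {
	return candidate{netip.MustParsePrefix(cidr), permanent, manageTempAddr}
}

// rank returns a lower number for addresses that should be preferred.
// The ranking itself is an assumption made for this sketch.
func rank(c candidate) int {
	switch {
	case c.Permanent && !c.ManageTempAddr:
		return 0
	case c.Permanent:
		return 1
	default:
		return 2
	}
}

// sortByPreference orders by rank first and by address value second,
// so equally ranked addresses keep a deterministic order.
func sortByPreference(cs []candidate) {
	sort.SliceStable(cs, func(i, j int) bool {
		if ri, rj := rank(cs[i]), rank(cs[j]); ri != rj {
			return ri < rj
		}
		return cs[i].Prefix.Addr().Compare(cs[j].Prefix.Addr()) < 0
	})
}

func main() {
	addrs := []candidate{
		newCandidate("fd01:cafe::dcad:ff:fe00:beaf/64", false, true), // example RA-derived address
		newCandidate("fd01:cafe::f00d:0:0:1/128", true, false),       // example DHCPv6 address (made up)
	}
	sortByPreference(addrs)
	for _, a := range addrs {
		fmt.Println(a.Prefix)
	}
}
```

With a purely address-based sort, the first example address would come first. With the rank applied, the permanent /128 moves ahead of it.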
It might be worth mentioning that the kubelet will choose the correct IP if no node IP is specified. This is the case with a kubeadm setup based on KubeVirt. From my understanding the Kubelet is using https://github.com/kubernetes/apimachinery/blob/v0.31.2/pkg/util/net/interface.go#L468 under the hood to choose the address. |
I understand the issue, but I'd like to make sure we have a proper solution ground up for IPv6, so I don't want to rush into fixing this until we have a proper testbed for IPv6 we can use to ensure proper operations going forward. I know it doesn't sound too much fun, but the proper IP can be selected with |
Unfortunately the VM's IP is a Pod IP, so for KubeVirt IPv6 ( |
I think it does make sense to prefer IPv6 addresses based on flags (not sure if we can omit |
By the way, passt does this because you can't "turn off SLAAC" while sending router advertisements (the But passt also does this because it works with Linux, as addresses with the longest prefixes are preferred as source addresses, see __ipv6_dev_get_saddr() and ipv6_get_saddr_eval() (rule #8) in net/ipv6/addrconf.c for details. Now, without making this as generic as the Linux kernel, I guess it would be anyway reasonable to pick the longest matching prefix as preferred address. |
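As a rough illustration of that last suggestion, the sketch below picks the candidate with the longest configured prefix length. It is a deliberate simplification and an assumption of this example: it compares only prefix lengths and ignores the destination address, so it is not a reimplementation of the kernel's source address selection. The function name `preferLongestPrefix` and the example addresses are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
)

// preferLongestPrefix parses the given CIDR strings and returns the one
// with the largest prefix length. On a tie, the earlier entry wins.
func preferLongestPrefix(cidrs ...string) (netip.Prefix, error) {
	var best netip.Prefix
	found := false
	for _, c := range cidrs {
		p, err := netip.ParsePrefix(c)
		if err != nil {
			return netip.Prefix{}, err
		}
		if !found || p.Bits() > best.Bits() {
			best, found = p, true
		}
	}
	if !found {
		return netip.Prefix{}, errors.New("no candidate addresses")
	}
	return best, nil
}

func main() {
	// Example values only: a /64 as derived from an RA prefix and a /128
	// as handed out by DHCPv6 in the same subnet.
	best, err := preferLongestPrefix(
		"fd01:cafe::dcad:ff:fe00:beaf/64",
		"fd01:cafe::f00d:0:0:1/128",
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("preferred:", best)
}
```

In the KubeVirt case described in this thread, this rule would select the DHCPv6 /128 over the SLAAC /64 without needing to inspect address flags.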
Funny enough in our bare-metal Talos setup we do not use DHCPv6 so the SLAAC address is used. A preference based on longest matching prefix sounds like a reasonable approach. |
The main reason why passt implements a (minimalistic) DHCPv6 server is that, I've been told, having the same exact address inside and outside the guest is convenient for integration with some container-oriented service meshes that assume "host networking" (hence, addressing). |
Yes, and it is also a necessity to run Kubernetes Clusters in KubeVirt either through CAPI or Omni/Talos. @smira Does Talos have a "feature gate" functionality allowing us to hide the changed behaviour behind a feature gate? |
Yes, we do have feature gates, if you could open a proposed PR, we can make a feature gate, and even enable it by default for new clusters on 1.9. |
Over the weekend I figured there might be a (dirty) workaround for the Kubevirt use-case (will not help for bare-metal IPv6 use-cases involving DHCPv6): This will leave the node in a non-functional state:
# cat fdae:41e4:649b:9303:9cd5:e54b:8120:4adb/resources/nodeipconfigs.kubernetes.talos.dev.yaml
metadata:
    namespace: k8s
    type: NodeIPConfigs.kubernetes.talos.dev
    id: kubelet
    version: 1
    owner: k8s.NodeIPConfigController
    phase: running
    created: 2024-11-18T11:29:36Z
    updated: 2024-11-18T11:29:36Z
spec:
    validSubnets:
        - '!fd01:cafe::dcad:ff:fe00:beaf/128'
    excludeSubnets:
        - fd90:cafe::/64
        - fd95:cafe::/108
This might be either outdated documentation or another bug report. I'll start working on a PR to establish a preference for IPv6 IPs ASAP. |
This is because there are no positive matches, you need to include |
Fixes siderolabs#9725 See siderolabs#9749 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
Thanks @smira, including a range before adding an ignore statement fixes it. |
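The behaviour discussed in the last few comments can be modelled with a small sketch. It assumes, based only on what this thread says, that an address is accepted when it matches at least one positive entry and no entry prefixed with `!`. The `allowed` function is hypothetical and is not the Talos implementation, and the positive range `fd01:cafe::/64` is an illustrative value chosen for this example, not the value suggested in the reply above.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// allowed reports whether addr passes a subnet list in which entries
// prefixed with "!" are negations. An address needs at least one positive
// match and no negated match (assumed semantics for this sketch).
func allowed(addr string, subnets ...string) (bool, error) {
	ip, err := netip.ParseAddr(addr)
	if err != nil {
		return false, err
	}
	positive := false
	for _, s := range subnets {
		negated := strings.HasPrefix(s, "!")
		p, err := netip.ParsePrefix(strings.TrimPrefix(s, "!"))
		if err != nil {
			return false, err
		}
		if !p.Contains(ip) {
			continue
		}
		if negated {
			return false, nil // an explicit exclusion always wins
		}
		positive = true
	}
	return positive, nil
}

func main() {
	const other = "fd01:cafe::f00d:0:0:1" // made-up address in the same subnet
	const negation = "!fd01:cafe::dcad:ff:fe00:beaf/128"

	onlyNegation, _ := allowed(other, negation)
	withRange, _ := allowed(other, "fd01:cafe::/64", negation)
	fmt.Println("negation only:", onlyNegation)
	fmt.Println("range plus negation:", withRange)
}
```

Under these assumed semantics, a list that contains only a negation accepts nothing, which matches the non-functional state reported above. Adding a positive range ahead of the negation lets every other address in that range through.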
Fixes siderolabs#9725 See siderolabs#9749 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com> (cherry picked from commit 7d65071)
Bug Report
Description
When Talos runs in an IPv6 single-stack environment and is assigned multiple IPs by DHCP and RA, the Kubelet will use the wrong address (this will most likely apply to dual-stack as well).
In our case Talos is running in KubeVirt with the Passt network binding plugin and gets an IP via RA, followed by a /128 IP from DHCPv6. Only the latter has full bi-directional connectivity.
The preferred /128 address has the flag permanent while the RA address has the flag mngmtmpaddr. The permanent address should be preferred.
Logs
Relevant excerpts from omnictl support:
AddressStatuses:
NodeAddresses:
NodeIPs:
Environment
[talosctl version --nodes <problematic nodes>] v1.8.2
[kubectl version --short] v1.30.1