Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explicitly Disable Duplicate Address Detection For Container Side Veth #695

Merged

Conversation

MikeZappa87
Copy link
Contributor

@MikeZappa87 MikeZappa87 commented Jan 26, 2022

Related Issue: #680

Signed-off-by: Michael Zappa Michael.Zappa@stateless.net

@squeed
Copy link
Member

squeed commented Jan 26, 2022

hey @EdDev , if we disable IPv6 DAD, is this going to potentially cause issues for you?

@EdDev
Copy link
Contributor

EdDev commented Jan 26, 2022

hey @EdDev , if we disable IPv6 DAD, is this going to potentially cause issues for you?

Thank you for exposing this to me!
I am missing context and reasoning here. Can this be added to the commit with a summary of the ref issue?

Specifically, I am interested in understanding why dad is not relevant for veths that connect to a bridge.

@MikeZappa87
Copy link
Contributor Author

MikeZappa87 commented Jan 26, 2022

hey @EdDev , if we disable IPv6 DAD, is this going to potentially cause issues for you?

Thank you for exposing this to me! I am missing context and reasoning here. Can this be added to the commit with a summary of the ref issue?

Specifically, I am interested in understanding why dad is not relevant for veths that connect to a bridge.

@EdDev I added this link to the description: #680

What we are trying to do is speed up pod creation and the SettleAddresses method was taking 1800-2200 ms to complete. Originally the idea (and the PR I just closed) made it optionally set via the runtime configuration or network configuration. Using the runtime configuration the end-user could apply an annotation to the pod spec and optionally set accept_dad = 0.

A few discussions have happened regarding this. One thing that was talked about how a duplicate address could be assigned and that led the discussion to talk about the IPAM erroneously assigning the same IP twice which is outside the scope of dad in that case.

@EdDev
Copy link
Contributor

EdDev commented Jan 26, 2022

@EdDev I added this link to the description: #680

I did follow up on it, but I was looking for an end result to that conversation.

I also prefer the summary posted on the commit. I do not trust github to be always available to me (yea, I still dream of the times where I was coding on the plane with no internet connection).

What we are trying to do is speed up pod creation and the SettleAddresses method was taking 1800-2200 ms to complete. Originally the idea (and the PR I just closed) made it optionally set via the runtime configuration or network configuration. Using the runtime configuration the end-user could apply an annotation to the pod spec and optionally set accept_dad = 0.

This part was the most clearer portion I manage to understand.

What I am not clear about is:

  • Why dad is not relevant for veths connected to bridges? Or maybe why it is less relevant?
  • The end result seems to create a backward compatibility with existing usages. I am unclear what makes it ok to break it. I guess the prev point may clarify this.

We use this bridge plugin at kubevirt for secondary networks extensively. As we take this veth and connect it to the VM through a bridge, I think we do not really need this dad at all. So kubevirt should not be affected at all.

@EdDev
Copy link
Contributor

EdDev commented Jan 27, 2022

One thing that was talked about how a duplicate address could be assigned and that led the discussion to talk about the IPAM erroneously assigning the same IP twice which is outside the scope of dad in that case.

So DaD is not treating cases when static IPv6 is set or when the address is assigned using a DHCP(v6) client? I'm sure missing something here.

@MikeZappa87
Copy link
Contributor Author

MikeZappa87 commented Jan 28, 2022

One thing that was talked about how a duplicate address could be assigned and that led the discussion to talk about the IPAM erroneously assigning the same IP twice which is outside the scope of dad in that case.

So DaD is not treating cases when static IPv6 is set or when the address is assigned using a DHCP(v6) client? I'm sure missing something here.

I wasn't very clear. When it comes to IP address assignment from whatever IPAM, I would say if it's assigning duplicate IP addresses that is the fault of that system, dad is still used here.

I am not certain how to answer your questions though. Since the accept_dad=0 flag is being set only in the container side veth what is the likelihood of the dad failing? We don't disable it for the host side. Since this is for the container network, the impact of a failure is isolated to the specific host and layer 2 domain as well. I have also tried finding other examples of people doing something similar to the SettleAddresses method and haven't been successful. That was actually my first question if that was implemented to fill in a hole with the Linux Kernel/Netlink. If you want, you can pull down my branch, build it, drop the bridge binary in /opt/cni/bin, and let it simmer. I haven't seen any issues in our environment however I don't want to go breaking other people's environments.

The original motivation to make it configurable was to allow the users to apply this via the pod spec so they could enable/disable with the tradeoff of speed and the small likelihood of some failure. If this didn't answer your questions well enough, we can hop on a Google Meet sometime? I think Casey/MCC had some additional thoughts on this as well

I did dig more into Calico and found this: https://github.com/projectcalico/calico/blob/f23a8c7c286cf49aadc07c74a52e8778b0c27c27/cni-plugin/pkg/dataplane/linux/dataplane_linux.go#L205

I was able to keep accept_dad=1, and use the modified code above which hit under 100 ms. I think the original motivation for the SettleAddresses is to prevent binding issues? It's checking for two different conditions however I wonder if it accomplishes the same goal in the end. Just thought I would mention that since its the closest thing I could find to the SettleAddress

@squeed
Copy link
Member

squeed commented Jan 31, 2022

@EdDev I pinged you on this because I know you have specific layer-2 usages for bridges.

My concern is that there could be ipv6 link-local address collisions, especially in layer-2 scenarios. Do you think it is safe to disable ipv6 DAD in your use-case?

This is all a bit funny, since anyone who is actually using ipv6 will surely never have collisions. So, instead, the risk is for v4 and layer-2 networks that incidentally have v6 LL addresses. If there's a collision, the consequences will basically be some annoying dmesg entries and nothing else.

So maybe we just don't care, ever.

@EdDev
Copy link
Contributor

EdDev commented Jan 31, 2022

@EdDev I pinged you on this because I know you have specific layer-2 usages for bridges.

My concern is that there could be ipv6 link-local address collisions, especially in layer-2 scenarios. Do you think it is safe to disable ipv6 DAD in your use-case?

Kubevirt has two main scenarios when using the Bridge CNI:

  • Masquerade mode: The veth in the pod is kept as is, with an IP address on it, which is used for NAT.
  • Bridge mode: The veth in the pod is connected to a bridge in the (same) pod. As it becomes a bridge port, it's IP addresses and any other related configuration is irrelevant.

Therefore, for the bridge mode, I do not think IPv6 DAD has any affect, as the L3 part is activated inside the VM and anything in the middle is L2 only with no IP configuration at all (most being connected to bridges as ports).

For the masquerade case, it behaves the same as any other pod. And I mainly focused my questions to this scenario where setting an IP does matter.
It is worth noting that the usage if the bridge CNI in this scenario is less common, as it serves only the pod network in a k8s setup (not secondary networks). And these are usually served by other CNI plugins.
But still, someone may use the bridge CNI.

This is all a bit funny, since anyone who is actually using ipv6 will surely never have collisions.

I think this is the part that I am unclear about. Why is this the case?
On a k8s cluster for example, someone may use an IPAM plugin that applies static addresses or let the whole task up to the application that runs in the pod.
Even someone that uses router-advertisement to set the IPv6 addresses is open to duplication if the mac address is set by mistake as a duplicate of another.

So, instead, the risk is for v4 and layer-2 networks that incidentally have v6 LL addresses. If there's a collision, the consequences will basically be some annoying dmesg entries and nothing else.

So maybe we just don't care, ever.

For such a scenario, we would not care probably indeed.

My concern is with the things I do not know or understand. I was under the impression, that DAD will protect a pod at creation so it will not collide with some other IPv6 on the network. If we had some level of protection before, this change removes it, assuming someone else will take care of it. I am unsure if we can assume that.

@EdDev
Copy link
Contributor

EdDev commented Jan 31, 2022

If this didn't answer your questions well enough, we can hop on a Google Meet sometime?

We can do that, sure.

@EdDev
Copy link
Contributor

EdDev commented Jan 31, 2022

@phoracek, is it safe to say that kubevirt has little to no interest to support primary networks with masquerade binding that uses a bridge CNI plugin?

@phoracek
Copy link

phoracek commented Feb 1, 2022

@phoracek, is it safe to say that kubevirt has little to no interest to support primary networks with masquerade binding that uses a bridge CNI plugin?

Yes, I think so

@EdDev
Copy link
Contributor

EdDev commented Feb 1, 2022

@phoracek, is it safe to say that kubevirt has little to no interest to support primary networks with masquerade binding that uses a bridge CNI plugin?

Yes, I think so

Great.
@squeed , this means, per my understanding, that IPv6 DAD is not an issue for kubevirt when applying the plugin on secondary networks.

If you think it is not an issue in general (my prev message), all is good from our side.

@squeed
Copy link
Member

squeed commented Feb 2, 2022

Great. I think we can merge this. Anyone else have any concerns?

@MikeZappa87 MikeZappa87 force-pushed the issue/680/explicitdaddisable branch 2 times, most recently from f6e117d to 825d75d Compare February 9, 2022 17:19
Michael Zappa added 3 commits February 9, 2022 10:29
Signed-off-by: Michael Zappa <Michael.Zappa@stateless.net>
Signed-off-by: Michael Zappa <Michael.Zappa@stateless.net>
Signed-off-by: Michael Zappa <Michael.Zappa@stateless.net>
@MikeZappa87 MikeZappa87 force-pushed the issue/680/explicitdaddisable branch from 5a284b7 to ba47b49 Compare February 9, 2022 17:30
@squeed
Copy link
Member

squeed commented Feb 9, 2022

lgtm!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants