Handle delays tied to V6 interfaces #1631

achevuru · 2021-09-22T00:00:59Z

What type of PR is this?
bug

What does this PR do / Why do we need it:
IPv6 addresses assigned to an interface can take a while to move from the tentative state to a stable state, because every address must first pass Duplicate Address Detection (DAD). This PR adds a check that the address is in a stable state before the CNI returns.

Testing done on this change:
Verified that no packet loss is observed right after pod boot-up as a result of the issue described above.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

srini-ram · 2021-09-22T01:12:18Z

cmd/routed-eni-cni-plugin/driver/driver.go

@@ -43,6 +44,8 @@ const (
 	fromContainerRulePriority = 1536
 	// Main routing table number
 	mainRouteTable = unix.RT_TABLE_MAIN
+
+	WAIT_INTERVAL = 50 * time.Millisecond


Good find! Not sure if we plan to support anything beyond AL2 for the initial cut, but it might be worth checking whether this delay holds on other distributions as well.

Side note: Going forward we will have multiple eth attachments on a single pod with 5G to separate out different flows. Making this delay a configurable option would help until we characterize the actual number for different use cases.

So, the total wait time is actually 10s. WAIT_INTERVAL is just how long we wait before checking the status again. I'm assuming 10s is long enough, and the function in the ip package that most CNI plugins rely on caps it at 10s as well. In my testing it usually takes 1-2s, but if we do run into a specific requirement or use case, we can definitely consider making the upper bound configurable.
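For anyone following along, the relationship between the 50ms interval and the 10s cap can be sketched roughly as below. This is a minimal illustration and not the code merged in this PR: `waitForStable`, `waitInterval`, `maxWait`, `errTimeout`, and the fake check in `main` are hypothetical names, and a real check would query the address state over netlink instead of counting polls.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

const (
	// Poll interval, mirroring the 50ms WAIT_INTERVAL from the diff.
	waitInterval = 50 * time.Millisecond
	// Overall upper bound discussed in the thread (assumed name).
	maxWait = 10 * time.Second
)

// errTimeout is returned when the address never becomes stable in time.
var errTimeout = errors.New("timed out waiting for address to become stable")

// waitForStable calls isStable every interval until it reports true,
// returns an error, or the timeout would be exceeded by another sleep.
func waitForStable(isStable func() (bool, error), interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		ok, err := isStable()
		if err != nil {
			return err
		}
		if ok {
			return nil
		}
		// Give up if sleeping once more would take us past the deadline.
		if time.Now().Add(interval).After(deadline) {
			return errTimeout
		}
		time.Sleep(interval)
	}
}

func main() {
	// Hypothetical stand-in for a netlink query: stable on the third poll.
	polls := 0
	err := waitForStable(func() (bool, error) {
		polls++
		return polls >= 3, nil
	}, waitInterval, maxWait)
	fmt.Println("polls:", polls, "err:", err)
}
```

With these numbers the common 1-2s case costs a few dozen cheap polls, and the 10s cap only matters when something is genuinely wrong.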

anguslees · 2021-09-22T08:00:27Z

Fwiw, I don't think it needs to be configurable - we should be able to just 'wait long enough' for every use case, and there's no benefit from timing out and aborting aggressively.

Re other distros: It would be odd for a distro to pick something wildly different from the current Linux kernel default values, and I expect the delay will always be around the few-seconds mark. One alternative here is 'optimistic DAD', which allows userspace to use the address for some purposes while it is still tentative. A better alternative is to just disable DAD altogether on veth interfaces, because we control both ends anyway so there are no surprises here. Meh, at best it gains 1-2s, and we can come back to this later. Even if we disable DAD on veth, we're still going to want this function at some point for "real" network interfaces (e.g. trunk, EFA, ENI+ipvlan).

We could also remove the above timer by using netlink events rather than polling (see AddrSubscribe). Again, meh, we can come back to this if this 50ms poll ever becomes an issue.
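A rough sketch of what the event-driven variant might look like, assuming address updates arrive on a channel. To keep it dependency-free, `addrEvent` is a hypothetical, trimmed-down stand-in for the update type a netlink library would deliver, and `waitForDAD`, `dadTimeout`, and the error values are illustrative names. The two flag values are the IFA_F_DADFAILED and IFA_F_TENTATIVE bits from linux/if_addr.h.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Address flag bits as defined in linux/if_addr.h.
const (
	ifaFDadFailed = 0x08 // IFA_F_DADFAILED
	ifaFTentative = 0x40 // IFA_F_TENTATIVE
)

// dadTimeout is an assumed upper bound, matching the 10s cap in the thread.
const dadTimeout = 10 * time.Second

// addrEvent is a hypothetical stand-in for a netlink address update.
type addrEvent struct {
	Addr  string // address in CIDR form
	Flags int    // IFA_F_* bits reported with the update
}

var (
	errDadFailed  = errors.New("duplicate address detected")
	errDadTimeout = errors.New("timed out waiting for DAD to finish")
	errClosed     = errors.New("address event channel closed")
)

// waitForDAD blocks until an event shows addr without the tentative flag,
// DAD fails for addr, the channel closes, or the timeout expires.
func waitForDAD(events <-chan addrEvent, addr string, timeout time.Duration) error {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	for {
		select {
		case ev, ok := <-events:
			if !ok {
				return errClosed
			}
			if ev.Addr != addr {
				continue // update for some other address
			}
			if ev.Flags&ifaFDadFailed != 0 {
				return errDadFailed
			}
			if ev.Flags&ifaFTentative == 0 {
				return nil // address is no longer tentative
			}
		case <-timer.C:
			return errDadTimeout
		}
	}
}

func main() {
	// Simulated event stream: tentative first, then stable.
	events := make(chan addrEvent, 2)
	events <- addrEvent{Addr: "2001:db8::1/128", Flags: ifaFTentative}
	events <- addrEvent{Addr: "2001:db8::1/128", Flags: 0}
	fmt.Println("err:", waitForDAD(events, "2001:db8::1/128", dadTimeout))
}
```

One caveat with this approach: the subscription has to be in place before the address is added (or the current state checked once after subscribing), otherwise the transition out of tentative can be missed and the wait only ends at the timeout.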

anguslees · 2021-09-22T08:28:03Z

(nice, code style comments only)

Address delays tied to V6 interfaces

9f82527

achevuru requested a review from anguslees September 22, 2021 00:00

Formatting changes

b987867

achevuru requested a review from jayanthvn September 22, 2021 00:28

srini-ram reviewed Sep 22, 2021

srini-ram self-requested a review September 22, 2021 01:13

anguslees reviewed Sep 22, 2021

srini-ram requested a review from anguslees September 22, 2021 16:18

achevuru and others added 3 commits September 22, 2021 10:19

Update cmd/routed-eni-cni-plugin/driver/driver.go

eca06a1

Co-authored-by: Angus Lees <gus@inodes.org>

Address CR comments

9b39cd7

Merge branch 'v6_dad' of github.com:achevuru/amazon-vpc-cni-k8s into …

378ff42

…v6_dad

anguslees approved these changes Sep 22, 2021

achevuru merged commit 5bb9811 into aws:master Sep 22, 2021

woehrl01 mentioned this pull request Apr 26, 2024

How to disable IPv6 DAD to reduce startup delay of pods on IPv6 cluster bottlerocket-os/bottlerocket#3878

Open
