-
Notifications
You must be signed in to change notification settings - Fork 148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wait for carrier before announcing IPs via GARP/NA #301
Wait for carrier before announcing IPs via GARP/NA #301
Conversation
37ac7e3
to
d4e80b6
Compare
@wizhaoredhat previous version had a bug. Should be fixed now: diff --git a/pkg/utils/packet.go b/pkg/utils/packet.go
index 809b4ded399f..88a52bc78cc1 100644
--- a/pkg/utils/packet.go
+++ b/pkg/utils/packet.go
@@ -221,7 +221,12 @@ func WaitForCarrier(ifName string, waitTime time.Duration) bool {
return false
}
- if linkObj.Attrs().RawFlags&unix.IFF_NO_CARRIER != 0 {
+ /* Wait for carrier, i.e. IFF_UP|IFF_RUNNING. Note that there is also
+ * IFF_LOWER_UP, but we follow iproute2 ([1]).
+ *
+ * [1] https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/tree/ip/ipaddress.c?id=f9601b10c21145f76c3d46c163bac39515ed2061#n86
+ */
+ if linkObj.Attrs().RawFlags&(unix.IFF_UP|unix.IFF_RUNNING) == (unix.IFF_UP | unix.IFF_RUNNING) {
return true
}
} |
Pull Request Test Coverage Report for Build 9346881405Details
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, consider adding a unit test for WaitForCarrier
(see f168bc2)
d4e80b6
to
966a219
Compare
I took your patch and integrated it in the branch. How about this? (I don't think the test is yet correct. WIP). |
966a219
to
bd25374
Compare
Unit test should work now... IMO it's ready. |
Hi @thom311 the CI is still not happy can you please take a look? |
It says "race detected during execution of test". Is this due to the assignment of |
@adrianchiris Could you also help PTAL? |
Hi @thom311 yes that is the reason please take a look
@wizhaoredhat please first fix the unit-test so we can continue with the reviews on the PR. |
bd25374
to
4779c1b
Compare
Race in unit test should be fixed now. How is that. |
@thom311 could you sort the PR commits? i see it now got 167 commits (openshift fork?) |
Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Thomas Haller <thaller@redhat.com>
Previously, we called AnnounceIPs() right after configuring the interface. At that point, the interface was just recently set IFF_UP, and it might not yet have carrier. In that case, the messages will be lost. We will need to wait a bit for carrier. Since AnnounceIPs() is an optional performance improvement, let's first do all the important things. Move the non-critical call to the end. This will be interesting next, when we will do some additional waiting for the device to have carrier. Let's not do the waiting in the middle of the critical operations, but only at the end. Signed-off-by: Thomas Haller <thaller@redhat.com>
4779c1b
to
e9611a2
Compare
sorry, that was a wrong rebase. Fixed, and addressed other comments too. |
After setting up the interface, it might take a bit for kernel to detect carrier. If we then already send the GARP/NA packets, they are lost. Instead, wait for up to 200 msec for the interface to get carrier. This time is chosen somewhat arbitrarily. We don't want to block the process too long, but we also need to wait long enough, that kernel and driver has time to detect carrier. Also, while busy waiting, sleep with an exponential back-off time (growth factor 1.5). Fixes: c241dcb ('Send IPv4 GARP and IPv6 Unsolicited NA in "cmdAdd"') See-also: https://issues.redhat.com/browse/OCPBUGS-30549 Signed-off-by: Thomas Haller <thaller@redhat.com>
Co-authored-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Thomas Haller <thaller@redhat.com>
e9611a2
to
fc25698
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This is a follow up to c241dcb.
what happens is that after sriov-cni sets up the interface, initially the interface does not have carrier. Sent GARP/IPv6 NA at that point are lost.
Instead, wait a bit until the interface has carrier.
See-also: https://issues.redhat.com/browse/OCPBUGS-30549