-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix for bz1400609 #12107
fix for bz1400609 #12107
Conversation
cc @knobunc @danwinship @openshift/networking |
FWIW, a later comment on the support case claimed that connections reliably worked when the HostSubnet record pointed to one IP, and failed when they pointed to the other. So presumably it wasn't actually the OVS reloading causing the problem, it was that the (OpenShift) router only had an (IP) route to one of the two node addresses, or something like that. |
In which case this patch would mean that if it picked the wrong IP the first time, it would stick with it forever? We need to make sure the HostSubnet record gets created with the "right" IP.. should we be looking at the NodeAddress.Type field to decide which IP to use rather than just always using Addresses[0]? |
So should we use the type 'internal' when available? Probably. This PR however is irrespective of what is chosen the first time around. |
@rajatchopra we probably do want to use "Internal" since presumably that's how the nodes would reach each other. I looked at all the cloud providers, and they all currently seem to return only one internal address. But nothing seems to preclude returning more than one... If there's no internal address, we need to fall back to Legacy. |
Yes, we can use NodeInternalIP if available otherwise fallback to first one in the node address list. Some providers may not support NodeLegacyHostIP. For example, openstack provider has only NodeInternalIP or NodeExternalIP. |
Yes we definitely want this patch, but I don't think it really fixes the customer's bug without the other patch too. (Also, you need to run gofmt) |
…ses (when there are multiple NICs to report), do not let the SDN chase it
3b5c1f4
to
a5e26ff
Compare
[test] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
[test] eveything flaked yesterday... retry |
Evaluated for origin test up to a5e26ff |
continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/12330/) (Base Commit: 34b4f58) |
[merge] |
Evaluated for origin merge up to a5e26ff |
continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/12377/) (Base Commit: 4c96ea4) (Image: devenv-rhel7_5545) |
Check for validity of a recorded ip address of the node among all the addresses provided in the status. It is possible that the order is not maintained between status reports from the node. Picking the first address would cause flip-flopping of SDN destinations which trouble traffic going to the pods because of OVS reload.
https://bugzilla.redhat.com/show_bug.cgi?id=1400609