
Duplicate IP Addresses for pods w/o host network #711

Closed
uruddarraju opened this issue Nov 8, 2019 · 12 comments

@uruddarraju
Contributor

uruddarraju commented Nov 8, 2019

We are running a Kubernetes 1.12.6 cluster provisioned with kops.

Networking setup: aws-k8s-cni with calico for network policies
Using: 602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni:v1.5.2

We saw networking issues for some pods created in our integration testing cluster. This cluster experiences a lot of churn: we consistently spin up hundreds of pods and delete them periodically. Upon digging into the affected pods a little more, we found the following:

admin@ip-10-20-89-225:~$ kubectl get pods --all-namespaces -o wide | grep Running | awk '{print $7}' |  sort | uniq -c | grep "   2"
      2 10.20.106.106
      2 10.20.126.85
      2 10.20.33.133
      2 10.20.35.126
....<10s of more entries>

Checking whether these pods are using the host network:

admin@ip-10-20-89-225:~$ kubectl get pods --all-namespaces -o wide | grep 10.20.72.196
uday                       test-data-ui-6644c8f79-k8g5v                                               0/1     Running             1          9d      10.20.72.196    ip-10-20-75-71.us-west-2.compute.internal     <none>
marco                       mock-server                                                              1/1     Running             0          25m     10.20.72.196    ip-10-20-75-71.us-west-2.compute.internal     <none>

You can see that both of those pods are using the IP 10.20.72.196.
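
As a programmatic alternative to the kubectl pipeline above, here is a minimal client-go sketch (assuming a recent client-go with context-aware List calls and a kubeconfig in the default location) that groups running, non-hostNetwork pods by pod IP and prints any address claimed by more than one pod. The program is illustrative only and is not part of the CNI.

```go
// dupips.go: report non-hostNetwork pods that share the same pod IP.
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load kubeconfig from the default location ($HOME/.kube/config).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	pods, err := clientset.CoreV1().Pods("").List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}

	// Group non-hostNetwork pods that have an IP by that IP.
	byIP := map[string][]string{}
	for _, p := range pods.Items {
		if p.Spec.HostNetwork || p.Status.PodIP == "" {
			continue
		}
		byIP[p.Status.PodIP] = append(byIP[p.Status.PodIP], p.Namespace+"/"+p.Name)
	}

	// Any IP backed by more than one pod is a duplicate assignment.
	for ip, names := range byIP {
		if len(names) > 1 {
			fmt.Printf("%s is used by %d pods: %v\n", ip, len(names), names)
		}
	}
}
```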

We cleaned up the pods experiencing this duplicate IP issue a few times, but that does not seem to solve the problem (which makes sense, as we still don't know the root cause).

One common pattern we observe in all of the cases above: at least one of the pods has a restart count > 0, or the pods are in CrashLoopBackOff/Error/Completed states.

Ideally, the CNI should only reclaim IPs from non-running pods that have a restartPolicy of Never. I am not sure if that is the behavior today.

@mogren
Contributor

mogren commented Nov 8, 2019

Hi @uruddarraju, the CNI will not reuse IPs directly, but after a one-minute cooldown they will be available to be assigned to pods again.

Do you use the default configuration? What kind of nodes are you running on, and how many pods per node? If you have a lot of churn, it helps to pre-allocate the IPs.

Also, how come you use v1.5.2? I'd recommend upgrading to v1.5.3.
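
To illustrate the cooldown behavior described above, here is a hypothetical Go sketch of a free-IP pool that only hands an address back out once a cooldown has elapsed since it was released. The type and function names are invented for illustration and do not mirror the actual ipamd datastore.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// cooldownPool is a hypothetical free-IP pool: a released address only
// becomes eligible for reassignment after the cooldown period has elapsed.
type cooldownPool struct {
	cooldown time.Duration
	released map[string]time.Time // IP -> time it was released
	assigned map[string]bool      // IP -> currently assigned
}

func newCooldownPool(ips []string, cooldown time.Duration) *cooldownPool {
	p := &cooldownPool{
		cooldown: cooldown,
		released: make(map[string]time.Time),
		assigned: make(map[string]bool),
	}
	for _, ip := range ips {
		p.released[ip] = time.Time{} // zero time: immediately eligible
	}
	return p
}

// Assign returns a free IP whose cooldown has expired, or an error.
func (p *cooldownPool) Assign(now time.Time) (string, error) {
	for ip, releasedAt := range p.released {
		if now.Sub(releasedAt) >= p.cooldown {
			delete(p.released, ip)
			p.assigned[ip] = true
			return ip, nil
		}
	}
	return "", errors.New("no IP available: all free IPs are cooling down")
}

// Release returns an assigned IP to the pool and starts its cooldown.
func (p *cooldownPool) Release(ip string, now time.Time) {
	delete(p.assigned, ip)
	p.released[ip] = now
}

func main() {
	pool := newCooldownPool([]string{"10.20.115.153"}, time.Minute)
	ip, _ := pool.Assign(time.Now())
	pool.Release(ip, time.Now())
	// Immediately re-assigning fails: the address is still cooling down.
	if _, err := pool.Assign(time.Now()); err != nil {
		fmt.Println(err)
	}
}
```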

@uruddarraju
Contributor Author

> Hi @uruddarraju, the CNI will not reuse IPs directly, but after a one-minute cooldown they will be available to be assigned to pods again.

Thanks @mogren. I am not sure that is the problem, though. As you can see in the two examples above, the pods were spun up 9d ago and 25m ago, so I don't think reconciliation races are the problem here. And yes, we run the default configuration; are there any best-practices guides you'd like us to refer to for our deployment?

@jaypipes
Contributor

@uruddarraju sorry for the delay in getting back to you on this! As @mogren mentioned, it would be good to upgrade to at least 1.5.3 and see if the issue with duplicate IPs goes away.

@ataranto

I'm working with @uruddarraju on this problem. Over the past few days we have been able to consistently observe pods receiving duplicate IP addresses, and the duplicate assignment is usually correlated with the startup time of the L-IPAMD process. Here is the subset of log lines that outlines the problem scenario:

On node ip-10-20-118-72.us-west-2.compute.internal, 2 pods (fluentd-es-v2.4.0-bnvqs and pod-a) have received the same IP address (10.20.115.153).

Timeline:

Tue, 19 Nov 2019 15:24:47 -0800: fluentd-es-v2.4.0-bnvqs started
Fri, 29 Nov 2019 10:28:59 -0800: aws-node restarted
Fri, 29 Nov 2019 10:30:34 -0800: pod-a started

Relevant subset of log:

2019-11-29T18:28:59.808Z [INFO] 	Starting L-IPAMD v1.6.0-rc4  ...

2019-11-29T18:28:59.842Z [INFO] 	Waiting for controller cache sync
2019-11-29T18:29:01.342Z [INFO] 	Synced successfully with APIServer

2019-11-29T18:29:01.343Z [DEBUG] 	GetLocalPods start ...
2019-11-29T18:29:01.343Z [DEBUG] 	getLocalPodsWithRetry() found 0 local pods

2019-11-29T18:29:01.348Z [DEBUG] 	Found pod fluentd-es-v2.4.0-bnvqs with container ID: docker://b8c192e5b00b4ce5c19722fc09254f64b0649b6e621c67adbcb6dc454d229a07
2019-11-29T18:29:01.348Z [INFO] 	Add/Update for Pod fluentd-es-v2.4.0-bnvqs on my node, namespace = kube-system, IP = 10.20.115.153

2019-11-29T18:30:34.058Z [DEBUG] 	No container ID found for pod-a
2019-11-29T18:30:34.058Z [INFO] 	Add/Update for Pod pod-a on my node, namespace = namespace-1, IP = 
2019-11-29T18:30:34.066Z [DEBUG] 	No container ID found for pod-a
2019-11-29T18:30:34.066Z [INFO] 	Add/Update for Pod pod-a on my node, namespace = namespace-1, IP = 
2019-11-29T18:30:34.712Z [INFO] 	Received AddNetwork for NS /proc/30864/ns/net, Pod pod-a, NameSpace namespace-1, Container e8c88083893cf91d26843ae506dd6d3816481deb14c1a42343d98acb51d7f90f, ifname eth0
2019-11-29T18:30:34.712Z [DEBUG] 	AssignIPv4Address: IP address pool stats: total: 180, assigned 0
2019-11-29T18:30:34.712Z [INFO] 	AssignPodIPv4Address: Assign IP 10.20.115.153 to pod (name pod-a, namespace namespace-1 container e8c88083893cf91d26843ae506dd6d3816481deb14c1a42343d98acb51d7f90f)
2019-11-29T18:30:34.712Z [INFO] 	Send AddNetworkReply: IPv4Addr 10.20.115.153, DeviceNumber: 0, err: <nil>
2019-11-29T18:30:34.957Z [DEBUG] 	AssignIPv4Address: IP address pool stats: total: 180, assigned 1
2019-11-29T18:30:35.368Z [DEBUG] 	Found pod pod-a with container ID: docker://b6993fffa5bc874d16eec739d440faeb0cd1d5b0bd73169eda72b25b1f764dd0
2019-11-29T18:30:35.368Z [INFO] 	Add/Update for Pod pod-a on my node, namespace = namespace-1, IP = 10.20.115.153

First, it's unclear why getLocalPodsWithRetry() found 0 local pods, as there were many pods (including fluentd-es-v2.4.0-bnvqs) running at the time L-IPAMD was restarted.

Second, even though the code progresses to the point of Found pod fluentd-es-v2.4.0-bnvqs, the discovery mechanism doesn't seem to update the datastore to reflect that 10.20.115.153 is in use. We can see that when the AddNetwork request arrives for the newly scheduled pod-a, the datastore considers zero of our 180 addresses to be assigned.
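
To make the failure mode concrete, here is a hypothetical Go sketch of the invariant that appears to be violated: on restart, every IP already held by a local pod should be marked as assigned in the datastore before any new AddNetwork request is served. The types and function names below are invented for illustration and are not the actual ipamd code.

```go
package main

import (
	"errors"
	"fmt"
)

// datastore is a hypothetical, simplified view of the node-local IP pool.
type datastore struct {
	inUse map[string]string // IP -> pod that holds it
	free  []string          // all IPs attached to the node's ENIs
}

// restore marks the IPs of already-running local pods as assigned. If
// discovery returns an empty list (as in the log above, where
// getLocalPodsWithRetry() found 0 local pods), nothing is marked and the
// pool believes every address is free.
func (d *datastore) restore(localPodIPs map[string]string) {
	for pod, ip := range localPodIPs {
		d.inUse[ip] = pod
	}
}

// assign hands out the first address not recorded as in use.
func (d *datastore) assign(pod string) (string, error) {
	for _, ip := range d.free {
		if _, taken := d.inUse[ip]; !taken {
			d.inUse[ip] = pod
			return ip, nil
		}
	}
	return "", errors.New("pool exhausted")
}

func main() {
	// Correct startup: fluentd's IP is restored, so pod-a gets a different one.
	d := &datastore{inUse: map[string]string{}, free: []string{"10.20.115.153", "10.20.115.154"}}
	d.restore(map[string]string{"fluentd-es-v2.4.0-bnvqs": "10.20.115.153"})
	ip, _ := d.assign("pod-a")
	fmt.Println("pod-a:", ip) // 10.20.115.154

	// Buggy startup: discovery returned no pods, so nothing is restored and
	// pod-a is handed the address fluentd already holds.
	d2 := &datastore{inUse: map[string]string{}, free: []string{"10.20.115.153", "10.20.115.154"}}
	d2.restore(map[string]string{})
	ip2, _ := d2.assign("pod-a")
	fmt.Println("pod-a (after empty discovery):", ip2) // 10.20.115.153 — duplicate
}
```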

@uruddarraju
Contributor Author

Took a stab at this here: #738. @jaypipes, can you take a look, please? It's difficult to add a test case for this use case.

@mogren
Contributor

mogren commented Jan 29, 2020

The related PR needs to be rebased, and still has some required changes.

@uthark
Contributor

uthark commented May 6, 2020

We've seen this too. It happened after an API server outage; the CNI then assigned an IP address that was already in use by another pod running on the same node.

@mogren
Contributor

mogren commented May 7, 2020

Thanks for confirming, @uthark. I'll try to make an updated version of #738 and test this.

@mogren
Contributor

mogren commented Jun 7, 2020

Since #972, we read the locally running pods directly from the CRI socket instead of using the watcher.
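
For context, here is a minimal sketch of what listing locally running pod sandboxes over the CRI socket looks like, assuming the k8s.io/cri-api v1alpha2 client and a dockershim/containerd socket path; it is illustrative only and is not the actual change in #972.

```go
package main

import (
	"context"
	"fmt"
	"net"

	"google.golang.org/grpc"
	runtimeapi "k8s.io/cri-api/pkg/apis/runtime/v1alpha2"
)

func main() {
	// Socket path is an assumption: /var/run/dockershim.sock for the Docker
	// shim, /run/containerd/containerd.sock for containerd.
	const socket = "/var/run/dockershim.sock"

	// Dial the CRI runtime service over the unix socket.
	conn, err := grpc.Dial(socket,
		grpc.WithInsecure(),
		grpc.WithContextDialer(func(ctx context.Context, addr string) (net.Conn, error) {
			return (&net.Dialer{}).DialContext(ctx, "unix", addr)
		}))
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	client := runtimeapi.NewRuntimeServiceClient(conn)

	// List only sandboxes in the READY state, i.e. pods that are actually
	// running on this node right now.
	resp, err := client.ListPodSandbox(context.TODO(), &runtimeapi.ListPodSandboxRequest{
		Filter: &runtimeapi.PodSandboxFilter{
			State: &runtimeapi.PodSandboxStateValue{State: runtimeapi.PodSandboxState_SANDBOX_READY},
		},
	})
	if err != nil {
		panic(err)
	}
	for _, sb := range resp.Items {
		fmt.Printf("sandbox %s: pod %s/%s\n", sb.Id, sb.Metadata.Namespace, sb.Metadata.Name)
	}
}
```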

@mogren
Contributor

mogren commented Aug 23, 2020

@uthark @uruddarraju Late update, but is this still an issue with v1.6.4 or v1.7.0?

On Aug 23, 2020, @mogren removed the priority/P0 label (Highest priority. Someone needs to actively work on this.).
@mogren
Contributor

mogren commented Sep 4, 2020

v1.7.2-rc1 is a release candidate that includes a lot of fixes to resolve this type of issue.

@mogren
Contributor

mogren commented Sep 10, 2020

@uruddarraju, @ataranto Since v1.5.x, we have moved away from using the informer and instead query the CRI socket directly. These changes should prevent this issue from happening again.
