Pod Connectivity is broken randomly #721
Comments
@spikewang Hi, what version of the CNI are you using? v1.5.4 had an issue with
Hi @mogren, thanks for the quick reply. Yes, I am aware of that issue with v1.5.4, and we already downgraded the CNI from 1.5.4 to 1.5.3 on all our clusters last week. However, those pods were created a while back...
@spikewang Yes, that is the
To add the IP for one of your pods that were created with v1.5.4, do:
I see. Cool, I appreciate the clarification. I will try it out!
Pod connectivity is broken on EKS in region us-west-2 (Oregon).
Connectivity between pods is broken for one etcd pod. To isolate the problem further, I removed the etcd service and am pinging the etcd pods directly from the source pod.
Source pods:
orchestrator-us-west-8-5db22211e2e90e0db2d1f856-orchestratcsl76 1/1 Running 0 16h 172.16.0.137 ip-172-16-0-216
Destination pods:
etcd-cluster-5db355dbee30e565b6e1459d-69hdw2gqxr 1/1 Running 0 172.16.0.85 ip-172-16-0-111.us-west-2.compute.internal
etcd-cluster-5db355dbee30e565b6e1459d-fpr4h7g547 1/1 Running 0 172.16.0.71 ip-172-16-0-56.us-west-2.compute.internal
etcd-cluster-5db355dbee30e565b6e1459d-pft5tsbd4k 1/1 Running 0 172.16.0.176 ip-172-16-0-216.us-west-2.compute.internal
Ping from source pods:
ping 172.16.0.85 (works)
ping 172.16.0.71 (works)
ping 172.16.0.176 (fails)
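The manual ping checks above could be scripted so they are easy to repeat after each change. This is only a minimal sketch: `ping_once` and `sweep` are hypothetical helper names, and the `-c`/`-W` flags assume a Linux `ping` is available inside the source pod.

```python
import subprocess


def ping_once(ip, timeout_s=2):
    """Send one ICMP echo to `ip`; True if ping exits with status 0.

    Assumes a Linux ping binary that accepts -c (count) and -W (timeout, seconds).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def sweep(ips, probe=ping_once):
    """Probe every IP and return a {ip: reachable} mapping.

    `probe` is injectable so the logic can be exercised without a network.
    """
    return {ip: probe(ip) for ip in ips}


if __name__ == "__main__":
    # Pod IPs taken from the destination pod listing in this issue.
    etcd_pod_ips = ["172.16.0.85", "172.16.0.71", "172.16.0.176"]
    for ip, ok in sweep(etcd_pod_ips).items():
        print(f"{ip}: {'works' if ok else 'fails'}")
```

Run from the source pod, this would print one line per etcd pod IP, which makes it simple to compare results before and after any fix.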
A packet capture on the node shows an ICMP time exceeded error:
ip-172-16-0-216.us-west-2.compute.internal Ready 21d v1.12.7 172.16.0.216
sh-4.2# tcpdump -ni eni1daa9b475a7 icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eni1daa9b475a7, link-type EN10MB (Ethernet), capture size 262144 bytes
--> WORKING case:
17:54:40.882449 IP 172.16.0.137 > 172.16.0.85: ICMP echo request, id 56411, seq 0, length 64
17:54:40.887135 IP 172.16.0.85 > 172.16.0.137: ICMP echo reply, id 56411, seq 0, length 64
17:54:41.887705 IP 172.16.0.137 > 172.16.0.85: ICMP echo request, id 56411, seq 1, length 64
17:54:41.888421 IP 172.16.0.85 > 172.16.0.137: ICMP echo reply, id 56411, seq 1, length 64
17:54:45.300603 IP 172.16.0.137 > 172.16.0.71: ICMP echo request, id 56667, seq 0, length 64
17:54:45.301375 IP 172.16.0.71 > 172.16.0.137: ICMP echo reply, id 56667, seq 0, length 64
17:54:46.301119 IP 172.16.0.137 > 172.16.0.71: ICMP echo request, id 56667, seq 1, length 64
17:54:46.301925 IP 172.16.0.71 > 172.16.0.137: ICMP echo reply, id 56667, seq 1, length 64
--> FAILED case:
17:54:50.225198 IP 172.16.0.137 > 172.16.0.176: ICMP echo request, id 56923, seq 0, length 64
17:54:50.232979 IP 172.16.0.216 > 172.16.0.137: ICMP time exceeded in-transit, length 92
17:54:51.225334 IP 172.16.0.137 > 172.16.0.176: ICMP echo request, id 56923, seq 1, length 64
17:54:51.237460 IP 172.16.0.216 > 172.16.0.137: ICMP time exceeded in-transit, length 92
17:54:52.225519 IP 172.16.0.137 > 172.16.0.176: ICMP echo request, id 56923, seq 2, length 64
17:54:52.234741 IP 172.16.0.216 > 172.16.0.137: ICMP time exceeded in-transit, length 92
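For longer captures, the same comparison can be done mechanically. The sketch below is an illustration, not part of any CNI tooling: `summarize_icmp` and `unanswered` are hypothetical names, and the regular expression assumes the default `tcpdump -n` line format shown above. It counts echo requests and replies per destination and records which hosts returned "time exceeded" to the source pod. In this capture those errors come from 172.16.0.216, the node that hosts both the source pod and the failing etcd pod.

```python
import re
from collections import defaultdict

# Matches tcpdump -n ICMP lines such as:
# 17:54:40.882449 IP 172.16.0.137 > 172.16.0.85: ICMP echo request, id 56411, seq 0, length 64
ICMP_LINE = re.compile(
    r"^\S+ IP (?P<src>\d+\.\d+\.\d+\.\d+) > (?P<dst>\d+\.\d+\.\d+\.\d+): "
    r"ICMP (?P<kind>.+?), "
)


def summarize_icmp(lines, source_ip):
    """Return (per-destination request/reply counts, time-exceeded senders)."""
    stats = defaultdict(lambda: {"requests": 0, "replies": 0})
    error_senders = defaultdict(int)
    for line in lines:
        match = ICMP_LINE.match(line.strip())
        if not match:
            continue  # skip banners and non-ICMP lines
        src, dst, kind = match.group("src", "dst", "kind")
        if kind == "echo request" and src == source_ip:
            stats[dst]["requests"] += 1
        elif kind == "echo reply" and dst == source_ip:
            stats[src]["replies"] += 1
        elif kind.startswith("time exceeded") and dst == source_ip:
            error_senders[src] += 1
    return dict(stats), dict(error_senders)


def unanswered(stats):
    """Destinations that received requests but never replied."""
    return sorted(
        ip for ip, s in stats.items() if s["requests"] > 0 and s["replies"] == 0
    )


# A few lines copied from the capture in this issue.
SAMPLE = """\
17:54:40.882449 IP 172.16.0.137 > 172.16.0.85: ICMP echo request, id 56411, seq 0, length 64
17:54:40.887135 IP 172.16.0.85 > 172.16.0.137: ICMP echo reply, id 56411, seq 0, length 64
17:54:50.225198 IP 172.16.0.137 > 172.16.0.176: ICMP echo request, id 56923, seq 0, length 64
17:54:50.232979 IP 172.16.0.216 > 172.16.0.137: ICMP time exceeded in-transit, length 92
17:54:51.225334 IP 172.16.0.137 > 172.16.0.176: ICMP echo request, id 56923, seq 1, length 64
17:54:51.237460 IP 172.16.0.216 > 172.16.0.137: ICMP time exceeded in-transit, length 92
""".splitlines()

if __name__ == "__main__":
    stats, errors = summarize_icmp(SAMPLE, source_ip="172.16.0.137")
    print("unanswered destinations:", unanswered(stats))
    print("time-exceeded senders:", errors)
```

The time-exceeded messages are attributed to their sender rather than to a destination, because the one-line tcpdump output does not show which original packet each error refers to.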
Any hints here? Should I collect a CNI admin tech-support dump?