What happened:
On 1.19 clusters, pods using security groups see high latency when performing DNS resolution.
Attach logs
[ec2-user@ip-10-10-35-247 ~]$ sudo tcpdump -i eth2
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 262144 bytes
00:38:52.883343 IP ip-10-10-42-165.eu-west-1.compute.internal.55758 > ip-10-10-57-24.eu-west-1.compute.internal.domain: 8261+ A? app1.test-dns.svc.cluster.local.test-dns.svc.cluster.local. (76)
00:38:52.883362 IP ip-10-10-42-165.eu-west-1.compute.internal.55758 > ip-10-10-57-24.eu-west-1.compute.internal.domain: 8609+ AAAA? app1.test-dns.svc.cluster.local.test-dns.svc.cluster.local. (76)
00:38:52.884027 IP ip-10-10-57-24.eu-west-1.compute.internal.domain > ip-10-10-42-165.eu-west-1.compute.internal.55758: 8261 NXDomain*- 0/1/0 (169)
00:38:55.385621 IP ip-10-10-42-165.eu-west-1.compute.internal.55758 > ip-10-10-57-24.eu-west-1.compute.internal.domain: 8609+ AAAA? app1.test-dns.svc.cluster.local.test-dns.svc.cluster.local. (76)
00:38:55.385977 IP ip-10-10-57-24.eu-west-1.compute.internal.domain > ip-10-10-42-165.eu-west-1.compute.internal.55758: 8609 NXDomain*- 0/1/0 (169)
00:38:55.386073 IP ip-10-10-42-165.eu-west-1.compute.internal.46259 > ip-10-10-57-24.eu-west-1.compute.internal.domain: 32938+ A? app1.test-dns.svc.cluster.local.svc.cluster.local. (67)
00:38:55.386085 IP ip-10-10-42-165.eu-west-1.compute.internal.46259 > ip-10-10-57-24.eu-west-1.compute.internal.domain: 33304+ AAAA? app1.test-dns.svc.cluster.local.svc.cluster.local. (67)
00:38:55.386274 IP ip-10-10-57-24.eu-west-1.compute.internal.domain > ip-10-10-42-165.eu-west-1.compute.internal.46259: 32938 NXDomain*- 0/1/0 (160)
00:38:57.888876 IP ip-10-10-42-165.eu-west-1.compute.internal.46259 > ip-10-10-57-24.eu-west-1.compute.internal.domain: 33304+ AAAA? app1.test-dns.svc.cluster.local.svc.cluster.local. (67)
00:38:57.889175 IP ip-10-10-57-24.eu-west-1.compute.internal.domain > ip-10-10-42-165.eu-west-1.compute.internal.46259: 33304 NXDomain*- 0/1/0 (160)
00:38:57.889262 IP ip-10-10-42-165.eu-west-1.compute.internal.33329 > ip-10-10-57-24.eu-west-1.compute.internal.domain: 45991+ A? app1.test-dns.svc.cluster.local.cluster.local. (63)
00:38:57.889274 IP ip-10-10-42-165.eu-west-1.compute.internal.33329 > ip-10-10-57-24.eu-west-1.compute.internal.domain: 46373+ AAAA? app1.test-dns.svc.cluster.local.cluster.local. (63)
00:38:57.889455 IP ip-10-10-57-24.eu-west-1.compute.internal.domain > ip-10-10-42-165.eu-west-1.compute.internal.33329: 46373 NXDomain*- 0/1/0 (156)
00:38:57.889556 IP ip-10-10-57-24.eu-west-1.compute.internal.domain > ip-10-10-42-165.eu-west-1.compute.internal.33329: 45991 NXDomain*- 0/1/0 (156)
00:38:57.889608 IP ip-10-10-42-165.eu-west-1.compute.internal.56650 > ip-10-10-57-24.eu-west-1.compute.internal.domain: 10901+ A? app1.test-dns.svc.cluster.local.eu-west-1.compute.internal. (76)
00:38:57.889619 IP ip-10-10-42-165.eu-west-1.compute.internal.56650 > ip-10-10-57-24.eu-west-1.compute.internal.domain: 11481+ AAAA? app1.test-dns.svc.cluster.local.eu-west-1.compute.internal. (76)
00:38:57.889752 IP ip-10-10-57-24.eu-west-1.compute.internal.domain > ip-10-10-42-165.eu-west-1.compute.internal.56650: 10901 NXDomain* 0/1/0 (189)
00:39:00.391787 IP ip-10-10-42-165.eu-west-1.compute.internal.56650 > ip-10-10-57-24.eu-west-1.compute.internal.domain: 11481+ AAAA? app1.test-dns.svc.cluster.local.eu-west-1.compute.internal. (76)
00:39:00.392055 IP ip-10-10-57-24.eu-west-1.compute.internal.domain > ip-10-10-42-165.eu-west-1.compute.internal.56650: 11481 NXDomain* 0/1/0 (189)
00:39:00.392145 IP ip-10-10-42-165.eu-west-1.compute.internal.54742 > ip-10-10-57-24.eu-west-1.compute.internal.domain: 6742+ A? app1.test-dns.svc.cluster.local. (49)
00:39:00.392152 IP ip-10-10-42-165.eu-west-1.compute.internal.54742 > ip-10-10-57-24.eu-west-1.compute.internal.domain: 6991+ AAAA? app1.test-dns.svc.cluster.local. (49)
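In the capture above, each stall of roughly 2.5 seconds follows an AAAA query for which no answer appears, after which the same query ID is resent from the same source port. A minimal Python sketch for pulling those resends out of a saved capture, assuming the default tcpdump text format shown here. The names `find_retransmits` and `capture` are hypothetical, and timestamps that cross midnight are not handled.

```python
import re
from datetime import datetime

# Matches outgoing query lines such as:
#   00:38:52.883362 IP host.55758 > host.domain: 8609+ AAAA? name. (76)
# Response lines (source ".domain", no "+" after the ID) do not match.
QUERY_RE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d{6}) IP \S+\.(?P<port>\d+) > \S+\.domain: "
    r"(?P<id>\d+)\+ (?P<qtype>A|AAAA)\? (?P<name>\S+)"
)


def find_retransmits(lines):
    """Return (port, query_id, qtype, name, delay_seconds) for every resent query."""
    first_seen = {}
    retransmits = []
    for line in lines:
        m = QUERY_RE.match(line.strip())
        if not m:
            continue  # responses and unrelated lines are skipped
        key = (m["port"], m["id"])
        ts = datetime.strptime(m["ts"], "%H:%M:%S.%f")
        if key in first_seen:
            # Same source port and query ID seen again: the resolver retried.
            delay = (ts - first_seen[key]).total_seconds()
            retransmits.append(
                (int(m["port"]), int(m["id"]), m["qtype"], m["name"], delay)
            )
        else:
            first_seen[key] = ts
    return retransmits


# First five packets of the capture in this report.
capture = """\
00:38:52.883343 IP ip-10-10-42-165.eu-west-1.compute.internal.55758 > ip-10-10-57-24.eu-west-1.compute.internal.domain: 8261+ A? app1.test-dns.svc.cluster.local.test-dns.svc.cluster.local. (76)
00:38:52.883362 IP ip-10-10-42-165.eu-west-1.compute.internal.55758 > ip-10-10-57-24.eu-west-1.compute.internal.domain: 8609+ AAAA? app1.test-dns.svc.cluster.local.test-dns.svc.cluster.local. (76)
00:38:52.884027 IP ip-10-10-57-24.eu-west-1.compute.internal.domain > ip-10-10-42-165.eu-west-1.compute.internal.55758: 8261 NXDomain*- 0/1/0 (169)
00:38:55.385621 IP ip-10-10-42-165.eu-west-1.compute.internal.55758 > ip-10-10-57-24.eu-west-1.compute.internal.domain: 8609+ AAAA? app1.test-dns.svc.cluster.local.test-dns.svc.cluster.local. (76)
00:38:55.385977 IP ip-10-10-57-24.eu-west-1.compute.internal.domain > ip-10-10-42-165.eu-west-1.compute.internal.55758: 8609 NXDomain*- 0/1/0 (169)
"""

for port, qid, qtype, name, delay in find_retransmits(capture.splitlines()):
    print(f"{qtype} id={qid} port={port} resent after {delay:.3f}s")
```

Run against the full capture, the same function should flag the AAAA queries with IDs 8609, 33304 and 11481, which together account for about 7.5 seconds before the final lookup of `app1.test-dns.svc.cluster.local.` is even sent.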
What you expected to happen:
DNS resolution should complete within a few milliseconds.
How to reproduce it (as minimally and precisely as possible):
Run CoreDNS and a pod using a security group on the same node, then ping any Kubernetes service from the pod.
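To put a number on the latency instead of eyeballing `ping`, the lookup can be timed from inside the pod. This is a rough sketch, assuming the pod image ships Python 3; `time_lookup` and `measure` are hypothetical helper names, and the `resolver` parameter exists only so the timing logic can be exercised without a cluster.

```python
import socket
import time


def time_lookup(hostname, resolver=socket.getaddrinfo):
    """Return (elapsed_seconds, error_or_None) for a single lookup of hostname."""
    start = time.monotonic()
    try:
        # With no address family given, getaddrinfo asks for both IPv4 and IPv6.
        resolver(hostname, None)
        error = None
    except socket.gaierror as exc:
        error = exc
    return time.monotonic() - start, error


def measure(hostname, attempts=5, resolver=socket.getaddrinfo):
    """Time several lookups in a row and return the elapsed seconds of each."""
    return [time_lookup(hostname, resolver)[0] for _ in range(attempts)]


# Example call from inside the affected pod (service name taken from this report):
#   for seconds in measure("app1.test-dns.svc.cluster.local"):
#       print(f"{seconds * 1000:.1f} ms")
```

On a healthy node each attempt would be expected to finish in a few milliseconds. Attempts that take multiples of roughly 2.5 seconds would be consistent with the behaviour described in this report.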
Anything else we need to know?:
This happens only when CoreDNS and the pods using security groups run on the same worker node, and only on 1.19 clusters.
Environment:
[ec2-user@ip-10-10-35-247 ~]$ uname -a
Linux ip-10-10-35-247.eu-west-1.compute.internal 5.4.95-42.163.amzn2.x86_64 #1 SMP Thu Feb 4 12:50:05 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
SaranBalaji90 changed the title from "DNS resolution from pods using security group is high on 1.19 clusters" to "High latency for DNS resolution on pods using security group on 1.19 clusters" (Mar 10, 2021)
SaranBalaji90 changed the title from "High latency for DNS resolution on pods using security group on 1.19 clusters" to "High latency for DNS resolution on pods using security groups on 1.19 clusters" (Mar 10, 2021)
SaranBalaji90 changed the title from "High latency for DNS resolution on pods using security groups on 1.19 clusters" to "Intermittent DNS resolution issue when using security groups for Pods and EKS AMI 1.19 which has 5.4 kernel version" (Mar 10, 2021)
Conntrack stats
Any workarounds?
Adding the following to the pod spec worked around the issue.
References:
https://medium.com/techmindtickle/intermittent-delays-in-kubernetes-e9de8239e2fa
https://blog.quentin-machu.fr/2018/06/24/5-15s-dns-lookups-on-kubernetes/
https://tech.xing.com/a-reason-for-unexplained-connection-timeouts-on-kubernetes-docker-abd041cf7e02