KubePrism generates a lot of DNS queries #7690
Comments
ruifung changed the title from "Talos 1.5.1 controlplane endpoint DNS query flood" to "KubePrism generates a lot of DNS queries" on Aug 31, 2023
With KubePrism enabled, Talos runs health checks on all controlplane endpoints. Talos doesn't use the local DNS cache, but I see the problem: the checks run too aggressively (too fast), and that needs to be fixed.
smira added a commit to smira/talos that referenced this issue on Sep 5, 2023
The default timeouts are very aggressive, and we should use explicit timeouts so that health checks don't run that often. Fixes siderolabs#7690 Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com> (cherry picked from commit 79bbdf4)
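To illustrate what the commit message above means by explicit timeouts, here is a minimal Python sketch of a health check loop. It is not the Talos implementation (Talos is written in Go and the commit itself is not shown here). `tcp_check` and `run_checks` are hypothetical names, and the 30-second interval, 5-second timeout, and port 6443 are assumed values for illustration only.

```python
import socket
import time
from typing import Callable, List


def tcp_check(host: str, port: int, timeout: float) -> bool:
    """Return True if a TCP connection succeeds within `timeout` seconds.

    Connecting by hostname resolves the name first, so each call can
    trigger a DNS query when no cache sits in between.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failures, refused connections, timeouts
        return False


def run_checks(
    check: Callable[[], bool],
    interval: float,
    rounds: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[bool]:
    """Run `check` `rounds` times, pausing `interval` seconds between runs."""
    results = []
    for i in range(rounds):
        results.append(check())
        if i < rounds - 1:
            sleep(interval)  # the explicit interval bounds the check rate
    return results


# Example call (assumed values, performs real network I/O):
# run_checks(
#     lambda: tcp_check("controlplane.cluster.home.arpa", 6443, timeout=5.0),
#     interval=30.0,
#     rounds=3,
# )
```

Under these assumptions, the interval rather than a library default decides how often each endpoint is resolved and contacted, which is the property the commit message asks for.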
Discussed in #7689
Originally posted by ruifung August 31, 2023
Is it just me, or does Talos 1.5.1 with KubePrism enabled seem to constantly issue DNS queries for the controlplane endpoint?
Since I updated to 1.5.1, the most-queried domain is my controlplane endpoint's DNS name, at over 40k queries in the last hour from 6 nodes (3 control, 3 worker).
I just noticed this when looking at the stats on my local DNS server (Technitium DNS).
Addendum:
I just tested, and KubePrism does appear to be the cause. Compared to everything else on my network, it generates an excessive number of queries for the controlplane DNS name (i.e. controlplane.cluster.home.arpa): 6 nodes (3 control, 3 worker) generated in excess of 40k queries for it per hour. Disabling KubePrism seems to resolve it.
On average, it appears to be generating 2 queries per second per node.
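The per-node rate follows from the figures reported above. A quick check in Python, using only the 40k-per-hour total and the 6-node count from this post (the variable names are illustrative):

```python
# Figures reported in the post.
total_queries_per_hour = 40_000  # "in excess of 40k queries ... per hour"
node_count = 6                   # 3 control + 3 worker

per_node_per_hour = total_queries_per_hour / node_count
per_node_per_second = per_node_per_hour / 3600  # 3600 seconds in an hour

print(f"{per_node_per_hour:.0f} queries/hour/node")      # 6667 queries/hour/node
print(f"{per_node_per_second:.2f} queries/second/node")  # 1.85 queries/second/node
```

Since 40k is a lower bound, roughly 2 queries per second per node is consistent with the reported totals.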
Is something not respecting the TTL set on the DNS records?
I'll leave KubePrism disabled for now because it's been filling the query logs and query stats.
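For readers wondering what "respecting the TTL" would look like in practice, here is a minimal sketch of a TTL-aware lookup cache in Python. It is purely illustrative and says nothing about how Talos or KubePrism resolve names. `TTLCache` and the resolver signature (name in, address and TTL in seconds out) are hypothetical.

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical resolver signature: name -> (address, ttl_seconds).
Resolver = Callable[[str], Tuple[str, float]]


class TTLCache:
    """Cache lookups until the TTL reported by the resolver expires."""

    def __init__(self, resolve: Resolver,
                 clock: Callable[[], float] = time.monotonic):
        self._resolve = resolve
        self._clock = clock
        # name -> (address, expiry time on the clock)
        self._entries: Dict[str, Tuple[str, float]] = {}
        self.upstream_queries = 0  # how many lookups reached the resolver

    def lookup(self, name: str) -> str:
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]  # still fresh: no upstream query
        address, ttl = self._resolve(name)
        self.upstream_queries += 1
        self._entries[name] = (address, now + ttl)
        return address
```

With a cache like this in front of the resolver, repeated lookups of the same name reach the DNS server at most once per TTL window, however often the caller asks.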