dnsmasq Pod locks on infra-ops server #503

Closed
Zarquan opened this issue Jun 8, 2021 · 9 comments · Fixed by #572
Labels: bug, infrastructure

Comments

Zarquan commented Jun 8, 2021

After some time, dnsmasq becomes unresponsive on the infra-ops server.

Direct requests to the DNS service time out:

dig '@infra-ops.aglais.uk' 'zeppelin.gaia-prod.aglais.uk'
; <<>> DiG 9.11.28-RedHat-9.11.28-1.fc32 <<>> @infra-ops.aglais.uk zeppelin.gaia-prod.aglais.uk
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached
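
The check above could be wrapped in a small probe so the timeout is easy to spot from a script. This is only a sketch: the function name `probe_dns` and the 2 second timeout are assumptions, not part of the deployment. It relies on `dig` exiting with status 9 when no server replies; a status of 0 means a reply arrived, which includes answers such as NXDOMAIN.

```shell
#!/usr/bin/env bash
# Sketch: probe a DNS server directly and report whether it replied.
# probe_dns is a hypothetical helper; the timeout value is illustrative.

probe_dns() {
    local server="$1"
    local name="$2"
    local status=0

    # +time=2 +tries=1 keeps the probe short.
    # dig exits with 9 when no reply was received from the server.
    dig "@${server}" "${name}" +time=2 +tries=1 +short > /dev/null || status=$?

    case "${status}" in
        0) echo "ok" ;;
        9) echo "timeout" ;;
        *) echo "error (${status})" ;;
    esac
    return "${status}"
}
```

For example, `probe_dns 'infra-ops.aglais.uk' 'zeppelin.gaia-prod.aglais.uk'` prints `timeout` when the service is in the locked state described here.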

Sending dnsmasq a SIGHUP signal doesn't solve the problem.

podman kill --signal SIGHUP dnsmasq

Restarting the Pod on the infra-ops server clears the problem.

podman stop dnsmasq
sleep 5
podman start dnsmasq
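
Until the root cause is found, the manual restart could be automated as a stopgap. The sketch below is an assumption about how that might look, not something that is deployed: the function name `restart_if_unresponsive` and the 2 second timeout are hypothetical, and it only restarts the container when `dig` reports no reply (exit status 9).

```shell
#!/usr/bin/env bash
# Workaround sketch: restart the dnsmasq container when a direct query gets no reply.
# restart_if_unresponsive is a hypothetical helper; the timeout value is illustrative.

restart_if_unresponsive() {
    local container="$1"
    local server="$2"
    local name="$3"
    local status=0

    # dig exits with 9 when no reply was received from the server.
    dig "@${server}" "${name}" +time=2 +tries=1 +short > /dev/null || status=$?

    if [ "${status}" -ne 9 ]
    then
        echo "no timeout (dig exit ${status})"
        return 0
    fi

    # Same stop, pause, start sequence as the manual fix.
    echo "restarting"
    podman stop "${container}" > /dev/null
    sleep 5
    podman start "${container}" > /dev/null
}
```

It could be called as `restart_if_unresponsive dnsmasq 'infra-ops.aglais.uk' 'zeppelin.gaia-prod.aglais.uk'`, for example from a periodic job. This hides the symptom rather than fixing the lock.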

After the restart, direct requests to the DNS service work again:

dig '@infra-ops.aglais.uk' 'zeppelin.gaia-prod.aglais.uk'

; <<>> DiG 9.11.28-RedHat-9.11.28-1.fc32 <<>> @infra-ops.aglais.uk zeppelin.gaia-prod.aglais.uk
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 887
;; flags: qr aa rd ad; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1

;; QUESTION SECTION:
;zeppelin.gaia-prod.aglais.uk.	IN	A

;; ANSWER SECTION:
zeppelin.gaia-prod.aglais.uk. 300 IN	A	128.232.227.188

;; AUTHORITY SECTION:
gaia-prod.aglais.uk.	300	IN	NS	infra-ops.aglais.uk.

;; Query time: 14 msec
;; SERVER: 46.101.32.198#53(46.101.32.198)
;; WHEN: Tue Jun 08 12:58:16 BST 2021
;; MSG SIZE  rcvd: 125
Zarquan added the bug label Jun 8, 2021

Zarquan commented Jun 16, 2021

It happened again.

Zarquan commented Jul 8, 2021

Happened again ~ 7th July

Zarquan commented Jul 23, 2021

Happened again Friday 23rd July

Zarquan commented Aug 9, 2021

Happened again, between 7th and 9th Aug

Zarquan commented Aug 17, 2021

Happened again ~17th Aug

Zarquan commented Sep 11, 2021

Happened again ~11th Sept

Zarquan commented Sep 14, 2021

Happened again ~14th Sept

Zarquan commented Sep 20, 2021

Happened ~ 20th Sept

Zarquan linked a pull request on Sep 21, 2021 that will close this issue