Add NGINX reload counters #1049
Do we want to report a reload error at every stage where a reload could fail (e.g., no main process found, can't get child process file, HUP signal fails, no new processes, no new config version), or something different?
I think that would be too much granularity.
👍
* Add NGINX reload counters
Proposed changes
Problem: As an operator of an environment running NKG
I want to track the total number of NGINX reloads and failures for NGINX processes across my environment
So that I can correlate availability issues with excessive NGINX reloads or failures
And so that I can let the NKG know when reloads become a problem.
Solution: Total NGINX reloads and failed reloads are counted and reported via a Prometheus endpoint as a counter.
Also included are a reload duration histogram and a stale config gauge.
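For readers unfamiliar with the metric types involved, here is a rough, stdlib-only Python sketch of how two counters, a duration histogram, and a stale config gauge could fit together. It is not the code in this PR: the class name, metric names, `nginx` prefix, bucket bounds, and the choice to count only successful reloads in the first counter are all assumptions made for illustration.

```python
# Illustrative sketch only; every name and bucket bound here is hypothetical.

class ReloadMetrics:
    """Tracks reload counters, a duration histogram, and a stale config gauge."""

    def __init__(self, buckets_ms=(500, 1000, 5000, 10000, 30000)):
        self.reloads_total = 0        # counter: successful reloads (assumption)
        self.reload_errors_total = 0  # counter: failed reloads
        self.config_stale = 0         # gauge: 1 if the most recent reload failed
        self.buckets_ms = tuple(buckets_ms)
        self.bucket_counts = [0] * len(self.buckets_ms)
        self.duration_sum_ms = 0.0
        self.duration_count = 0

    def observe_reload(self, duration_ms, ok):
        """Record the outcome of one reload attempt."""
        if not ok:
            self.reload_errors_total += 1
            self.config_stale = 1
            return
        self.reloads_total += 1
        self.config_stale = 0
        self.duration_sum_ms += duration_ms
        self.duration_count += 1
        # Prometheus histogram buckets are cumulative: a sample is counted in
        # every bucket whose upper bound is >= the observed value.
        for i, bound in enumerate(self.buckets_ms):
            if duration_ms <= bound:
                self.bucket_counts[i] += 1

    def render(self, prefix="nginx"):
        """Return the metrics in the Prometheus text exposition format."""
        hist = f"{prefix}_reload_duration_ms"
        lines = [
            f"# TYPE {prefix}_reloads_total counter",
            f"{prefix}_reloads_total {self.reloads_total}",
            f"# TYPE {prefix}_reload_errors_total counter",
            f"{prefix}_reload_errors_total {self.reload_errors_total}",
            f"# TYPE {prefix}_config_stale gauge",
            f"{prefix}_config_stale {self.config_stale}",
            f"# TYPE {hist} histogram",
        ]
        for bound, count in zip(self.buckets_ms, self.bucket_counts):
            lines.append(f'{hist}_bucket{{le="{bound}"}} {count}')
        lines.append(f'{hist}_bucket{{le="+Inf"}} {self.duration_count}')
        lines.append(f"{hist}_sum {self.duration_sum_ms}")
        lines.append(f"{hist}_count {self.duration_count}")
        return "\n".join(lines) + "\n"
```

In this sketch a failed reload increments only the error counter and sets the gauge to 1, and the next successful reload resets the gauge to 0. Whether failed reloads should also contribute to the duration histogram is a design choice this sketch does not try to settle.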
Testing: Manual testing with metrics enabled and disabled. Confirmed that reloads reported = HUP signals observed in the NGINX logs = config version reported in the version endpoint. Example output:
Closes #887
Checklist
Before creating a PR, run through this checklist and mark each as complete.