-
Notifications
You must be signed in to change notification settings - Fork 98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add health endpoint #73
Comments
Pretty sure this is done, yeah? @iffyio |
It hasn't been worked on yet. coming to think of it not sure what a health check of the proxy would look like, but at least the server is still a metrics server in the code rather than a generic admin server |
Weird, I swore we did. Clearly I'm working on too many things at the same time. |
Thought I'd take a look at this one too - even if was just something super basic we could expand upon later, and had a couple of thoughts: If we make the metric port configurable as per #101 - that will come under Or, should the health endpoint run on it's own port? And be configured through something like That is easier config, and also easier code, because from review - passing around the hyper server may end up being more tricky than it's worth - otherwise it may be worth moving back to warp. I'm not sure if you would even want to separate access to a health check endpoint from metrics -- but maybe someone will? Thoughts? |
I think we can have them all under the same admin server/port. it'll likely be simpler and we don't want to create a server for each use case either |
We can use: To track if a panic has occurred. If it has, we should probably (a) log it 😄 but also (b) respond on the health point that we are unhealthy, since we don't know what part of the system has been broken. Sound good? Down the line, we can extra specific checks to it, but I think it's a good start. |
I don't see panicking being a proper use case for health check, I think if we panic we should let the program crash as usual and restart rather than handle it specially |
Ah that's a good point. I'm showing my Kubernetes bias 😄 in which case, we should definitely add a In which case, I'll just setup a simple Probably makes sense to have the panic hook in the same I've got the admin server split pretty much working for #101 over in https://github.com/googleforgames/quilkin/tree/mm/admin-server -- will probably see how this parts fits into it, and then start taking it apart and submitting PRs. |
Ah that makes sense, we can have set_hook e.g log the stack trace and mark the proxy as unhealthy or something like that to cover panicking from other threads yeah |
Create a `Health` module which tracks if a panic occurs anywhere in the code base (which may or may not be on the main thread), and moves the system to unhealthy. In the future we could add extra checks to this module as we discover more things that impact proxy health. Closes #73
Create a `Health` module which tracks if a panic occurs anywhere in the code base (which may or may not be on the main thread), and moves the system to unhealthy. In the future we could add extra checks to this module as we discover more things that impact proxy health. Closes #73
#65 Includes a http server with endpoint for serving metrics. We want to reuse this server to include an endpoint for health checks
The text was updated successfully, but these errors were encountered: