(pre-)Stop/Kill action/command #9872
Hi! We would love to have the ability to configure a pre-stop script. It would help us implement smoother upgrades that require load-balancer reconfiguration. Right now our load balancer, coupled with Consul services and consul-template, detects that a service is no longer running on a node and forwards traffic to a different node. But this happens after a small delay, and only after the job has been terminated. If we had a pre-stop script, we could switch the traffic on the load balancer before the job starts to shut down. This way we wouldn't have to wait for Consul to detect and propagate the change. Also, prestop is the only hook missing from the family: prestart, poststart, and poststop are already there. Personally, I would add it for the sake of symmetry.
This is impacting our ability to kick off connection draining for our HAProxy containers running in Nomad, similar to @shcherbachev's use case. I'll take a look at the code for
Hi @tgross, I forked and set up a Nomad dev environment (very smooth onboarding; the contrib guide was excellent). After reviewing how For reference here are the docs for blog: https://www.hashicorp.com/blog/hashicorp-nomad-task-dependencies
1. Should we support This section of the structs code points to sidecar support for based off of https://www.nomadproject.io/docs/job-specification/lifecycle#init-task-pattern
A use case I could think of would be making sure any buffered logs or other crucial data have been shipped off-box to the telemetry service of choice.
2. Are there any UI components we need to update? Pre-start / post-stop task hooks have this UX, which is quite nice when there are several lifecycle tasks. Could this PR encompass just the Go-side changes?
3. How should task kill timeouts be handled? Example from
In the
Related issues: Task Lifecycle PostStart Hook: #8366
I'll keep digging into the code but figured I'd pose these higher-level questions to get the 🤔 going.
Our use case is exactly the same as shcherbachev's. Is there any progress on this?
Hi @liemlhdbeatvn and others on this issue; this is unfortunately not currently on our near-term roadmap. The team will provide updates as soon as there are any.
I just wanted to drop in and say a prestop feature would be very useful for my use case as well. Due to the architecture of the system I'm working on, it takes about 10 minutes for traffic to stop flowing to a task once it's removed from our load balancer. It would be great if I could have a prestop job that removes it from the load balancer, then sleeps for 10 minutes before allowing the main task to be stopped.
Of course, as I continue reading the docs... it appears that shutdown_delay will actually meet my needs. @shcherbachev and @liemlhdbeatvn, would this work for you as well? https://www.nomadproject.io/docs/job-specification/group#shutdown_delay With that in mind, I'd still vote that prestop be added for completeness.
Hello, It is a complicated song-and-dance. So far I've run into two problems:
So far I am considering all kinds of crutches to work around the problems above. Thank you,
My use case for this is very simple. I would like to issue the same API calls through curl to gracefully shut down Reason: I'm using the ephemeral disk to store the index, cache, WAL, etc. on a very fast NVMe drive in the server. Calling these endpoints will flush all the log chunks to S3 storage before the container shuts down, and in the case there is an error on migrating the POST /flush
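For illustration only, the flush call described above could look roughly like this in Python instead of curl. This is a sketch under assumptions: the only detail taken from the comment is the POST /flush endpoint; the function name flush_before_stop, the base URL, and the timeout are hypothetical, and nothing here is a Nomad feature.

```python
import urllib.request


def flush_before_stop(base_url, timeout=30.0):
    """POST to <base_url>/flush and report whether the service accepted it.

    base_url is an assumption (e.g. the address of the service running in
    the main task); only the /flush path comes from the comment above.
    """
    request = urllib.request.Request(
        base_url.rstrip("/") + "/flush",
        data=b"",          # empty body; the call is made for its side effect
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers HTTP error statuses, refused connections and timeouts.
        return False
```

A caller would run this before letting the main process stop, and could decide to abort or retry the shutdown when it returns False.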
@jrasell Any update on this?
Currently Nomad supports defining a kill signal, and it would be pretty useful to also be able to define pre-stop and stop/kill actions or commands (we can already do post-stop via tasks with lifecycle > hook > poststop).
The main use case I see for this is shutting down complex software or tasks that need actions performed on them for a graceful shutdown. For example, ScyllaDB recommends running a command (nodetool drain) and then shutting down gracefully before killing the Docker container. It could also be useful for more graceful drains, for example when doing rolling upgrades (e.g. failing the health check to make the instance inaccessible from Consul/LB before actually shutting it down).
In theory it could be achieved with an additional hook (prestop), but that might cause some issues. In ScyllaDB's case, for example, the prestop task would need to contain all the tools and configuration required to run commands against the ScyllaDB instance running in the main task, and it still wouldn't work for this specific case, since they recommend shutting down gracefully via supervisord after draining, and I don't think one can call supervisorctl remotely.
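To make the requested behaviour concrete, here is a minimal sketch of a wrapper entrypoint that runs inside the main task's container. It is not a Nomad feature and not how a built-in hook would necessarily work. On the kill signal it first runs a pre-stop command and only then forwards the signal to the real process. The name run_with_prestop and both command arguments are hypothetical. The sketch assumes a POSIX system, and that the task's kill_timeout is long enough for the pre-stop command to finish before Nomad force-kills the task.

```python
import signal
import subprocess


def run_with_prestop(main_cmd, prestop_cmd, stop_signal=signal.SIGTERM):
    """Run main_cmd; when stop_signal arrives, run prestop_cmd before stopping it.

    Returns the exit code of main_cmd (negative if it was killed by a signal).
    """
    child = subprocess.Popen(main_cmd)
    stopping = False

    def handle_stop(signum, frame):
        nonlocal stopping
        if stopping:
            return  # ignore repeated signals while already shutting down
        stopping = True
        # Hypothetical pre-stop step, e.g. ["nodetool", "drain"].
        subprocess.run(prestop_cmd, check=False)
        # Only now pass the stop request on to the main process.
        if child.poll() is None:
            child.send_signal(signum)

    previous = signal.signal(stop_signal, handle_stop)
    try:
        return child.wait()
    finally:
        # Restore the earlier handler if Python knows what it was.
        if previous is not None:
            signal.signal(stop_signal, previous)
```

Because the wrapper lives in the same container as the main process, it has the same tools and configuration available, which is the limitation of a separate prestop task described above. The trade-off is that every image needs the wrapper baked in.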
Adrian