Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use process group-based killing to make sure frequently starting processes don't remain running after shutdown #49

Closed
thenonameguy opened this issue Mar 20, 2023 · 7 comments

Comments

@thenonameguy
Copy link
Contributor

Description

I only have anecdotal evidence for this, but I'm still sometimes getting the effects of #43:
After stopping process-compose some of the processes in the DAG still keep running.

Proposal

Inspiration: watchexec
One sure-fire way to guarantee that ALL processes belonging to process-compose receive the signal is to let the kernel handle it, by passing in the process group ID (= PID of process-compose) to kill. This could be added at the end of ShutDownProject with a SIGKILL, as at that point all custom signaling should have happened.
This would make sure that any subprocesses are killed, even if they were started between the process receiving the shutDown configured signal (let's say an incorrectly implemented bash script, that does not forward signals to the subprocesses).

@F1bonacc1
Copy link
Owner

F1bonacc1 commented Mar 20, 2023

@thenonameguy good to hear from you again :)

Is it possible for me to reproduce this behavior on my end as well?

Following #43 I have performed multiple tests with docker and docker-compose (oh the irony!) to ensure everything works as it should. Could you please provide me with an example of a scenario in which I could test whether or not the code change worked?

Thanks

@thenonameguy
Copy link
Contributor Author

Hey @F1bonacc1 :)

I was a bit lazy to come up with an example, so here is the answer of the ChatGPT overlords:

Me: Write a bash shell script that continuously spawn long-running, observable sub-processes. This would be used for testing whether a given software implements POSIX process group based killing, or not. Also include a test code whether there are any leftover processes when the parent process exits.

Here's an example bash script that continuously spawns long-running sub-processes and tests whether they are properly terminated when the parent process is killed.

#!/usr/bin/env bash

# Define function to spawn a sub-process
spawn_subprocess() {
  sleep 3600 &
  echo "Spawned subprocess with PID $!"
}

# Trap signals to handle graceful termination
trap 'echo "Exiting..."; kill $(jobs -p); exit' SIGINT SIGTERM

# Continuously spawn sub-processes
while true; do
  spawn_subprocess
  sleep 1
done

# Test for leftover processes
if [[ $(jobs -p) ]]; then
  echo "WARNING: There are leftover processes!"
  echo "Process IDs: $(jobs -p)"
fi

The spawn_subprocess function simply sleeps for an hour before exiting, simulating a long-running process. The trap command sets up a signal handler to gracefully terminate the script and all of its child processes when the user sends an interrupt signal (e.g. Ctrl-C).

The while loop continuously spawns new sub-processes and sleeps for a second between each spawn. The script will run until it is terminated by the user.

Finally, the script checks for any leftover processes using the jobs command, which lists all background jobs spawned by the script. If there are any jobs still running, the script outputs a warning message along with their process IDs.

Note: To test whether a given software implements POSIX process group based killing, you can run this script and try to kill the parent process using kill -TERM or kill -INT commands. If the parent process properly terminates and all of its child processes are terminated as well, then the software is likely implementing POSIX process group based killing. If any child processes are left running, then the software may not be implementing the feature correctly.

@thenonameguy
Copy link
Contributor Author

thenonameguy commented Mar 21, 2023

Using the above, if you create a shell script like this bad_script.sh:

#!/usr/bin/env bash

# Define function to spawn a sub-process
spawn_subprocess() {
  sleep 3600 &
  echo "Spawned subprocess with PID $!"
}

# Continuously spawn sub-processes
while true; do
  spawn_subprocess
  sleep 1
done

And set it up using process-compose, with the default SIGTERM kill signal, my understanding is that that since the trap related signal handling was removed, the bad_script.sh will be terminated but the sleep sub-processes remain running.

By implementing the requested feature, these would be killed since they belong to process-compose's process group.

@F1bonacc1
Copy link
Owner

Thank you @thenonameguy!
Now I have something to work with.
Will be back from vacation and will tend right to it.

@thenonameguy
Copy link
Contributor Author

I hope you have/had a nice vacation :)

I found a Golang blogpost detailing the technique:
https://medium.com/@felixge/killing-a-child-process-and-all-of-its-children-in-go-54079af94773

@F1bonacc1
Copy link
Owner

Hi @thenonameguy,

Good news and bad news.
Good first:
Using your (and ChatGPT's) example I managed to reconstruct the orphan grandchildren example when process-compose is terminated.
Bad news:
I managed to do it only when process-compose was killed with -9 signal.

The technique described in the medium blog post was part of process-compose for 9 months already:
setProcArgs
So I don't think that this happened with an older version.

I tested it on a Linux system, and the same behavior (theoretically) should be on a Darwin based system.
I tested SIGTERM and SIGHUP, is it possible that some other signal was used to terminate process-compose? Other OS?

F1bonacc1 added a commit that referenced this issue May 12, 2023
@thenonameguy
Copy link
Contributor Author

Hey @F1bonacc1,

I can't seem to reproduce the problems either (Darwin), so will close this ticket now.
Thanks for the investigation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants