Reap runner and sub-slave processes in slaves.

When running a command, the runner process correctly waits for that command to terminate, but the slave must also wait for the runner process. This change adds a set of child PIDs that are checked with waitpid (using WNOHANG) every time a command is read, so children that have already exited are reaped without blocking the slave.
burke · Mar 3, 2013 · fed0652 · metcalf · Aug 1, 2016
1 parent 7a0de42
commit fed0652
Showing 1 changed file with 13 additions and 2 deletions.
diff --git a/rubygem/lib/zeus.rb b/rubygem/lib/zeus.rb
@@ -58,15 +58,26 @@ def go(identifier=:boot)
       Thread.new { notify_features(feature_pipe_w, features) }
 
       # We are now 'connected'. From this point, we may receive requests to fork.
+      children = Set.new
       loop do
         messages = local.recv(2**16)
+
+        # Reap any child runners or slaves that might have exited in
+        # the meantime. Note that reaping them like this can leave <=1
+        # zombie process per slave around while the slave waits for a
+        # new command.
+        children.each do |pid|
+          children.delete(pid) if Process.waitpid(pid, Process::WNOHANG)
+        end
+
         messages.split("\0").each do |new_identifier|
           new_identifier =~ /^(.):(.*)/
           code, ident = $1, $2
+          pid = nil
           if code == "S"
-            fork { go(ident.to_sym) }
+            children << fork { go(ident.to_sym) }
           else
-            fork { command(ident.to_sym, local) }
+            children << fork { command(ident.to_sym, local) }
           end
         end
       end
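
The reaping pattern in this change can be tried in isolation. The following is a minimal sketch, not code from zeus: `reap_exited`, `quick`, and `slow` are hypothetical names introduced for illustration. It assumes a platform where `fork` is available, uses `Set#delete_if` in place of deleting inside `each`, and rescues `Errno::ECHILD` as an extra precaution that the commit itself does not take.

```ruby
require 'set'

# Hypothetical helper (not part of zeus): reap every pid in `children`
# that has already exited, without blocking on those still running.
def reap_exited(children)
  children.delete_if do |pid|
    begin
      # With WNOHANG, waitpid returns nil while the child is still running
      # and returns the pid once the child has exited and been reaped.
      Process.waitpid(pid, Process::WNOHANG)
    rescue Errno::ECHILD
      # Assumption: a pid that is no longer our child can stop being tracked.
      true
    end
  end
end

children = Set.new
quick = fork { exit 0 }   # exits almost immediately
slow  = fork { sleep 5 }  # still running when we reap
children << quick << slow

sleep 0.5                 # give the quick child time to exit
reap_exited(children)
# `children` should now contain only the pid of the sleeping child.
```

Because the reap only happens when the loop comes back around, an exited child stays a zombie until the next message arrives, which is the trade-off the comment in the diff describes.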