std.Thread.Pool: process tree cooperation #20274
Comments
Alternative POSIX strategy based on advisory record locks: the root process opens a new file with one byte per thread token and passes that fd down the process tree; a process takes a token by acquiring an exclusive record lock on a free byte, and returns it by unlocking. Advisory record locks are automatically released by the kernel when a process dies.

This unfortunately would mean that Zig could not play nicely with other applications such as make and cargo. But I think it's worth it. The whole point of the Zig project is to improve the status quo; otherwise we could all just keep using C. The make jobserver protocol is fundamentally broken, so what Zig will do is step up and create a better protocol that has similar low overhead but not this glaring problem. Then it will be up to make and cargo to upgrade to the better standard.

As for the strategy I outlined above, I have not evaluated its efficacy yet. I'll report back with some performance stats when I do.
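For concreteness, a minimal C sketch of one way the token file could work, assuming a one-byte-per-token layout; `acquire_token`/`release_token` are illustrative names, not an existing API:

```c
/* Sketch of the record-lock idea: one byte per thread token in a shared
 * token file whose fd (opened O_RDWR) is inherited down the process
 * tree. All names here are illustrative. */
#include <fcntl.h>
#include <unistd.h>

/* Try to acquire a token by exclusively locking one of `ntokens` bytes.
 * Returns the locked byte's offset, or -1 if all tokens are taken. */
static int acquire_token(int fd, int ntokens) {
    for (int i = 0; i < ntokens; i++) {
        struct flock fl = {
            .l_type = F_WRLCK,
            .l_whence = SEEK_SET,
            .l_start = i,
            .l_len = 1,
        };
        if (fcntl(fd, F_SETLK, &fl) == 0)
            return i; /* byte i is now our token */
    }
    return -1; /* pool exhausted; caller should block or retry */
}

/* Return a token by unlocking its byte. If the process dies instead,
 * the kernel releases the lock automatically -- no leaked tokens. */
static void release_token(int fd, int token) {
    struct flock fl = {
        .l_type = F_UNLCK,
        .l_whence = SEEK_SET,
        .l_start = token,
        .l_len = 1,
    };
    fcntl(fd, F_SETLK, &fl);
}
```

One caveat: classic `fcntl` record locks are per-process (locks taken by threads of the same process never conflict with each other), so a multithreaded pool would also need internal bookkeeping of which bytes it holds, or Linux's open-file-description locks (`F_OFD_SETLK`).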
Alternative POSIX strategy based on UNIX domain sockets: the root process listens on a UNIX domain socket. That fd is passed down the process tree. To get a thread token, a child process connects to the socket; to return the token, it disconnects. The root process only accepts N concurrent connections and stops accepting when it's maxed out. When a child process dies, the OS causes the connection to end, allowing the root process to accept a new connection. Two upsides compared to the other proposed idea:
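A minimal C sketch of the child side of this scheme. One assumption beyond the description above: the root writes a one-byte grant after accepting, because `connect()` alone can succeed while the connection is still sitting in the listen backlog. The socket path and names are illustrative:

```c
/* Sketch of the socket idea: holding an open connection is holding a
 * token. */
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Child side: connect to request a token; keep the fd while working. */
static int take_token(const char *path) {
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    struct sockaddr_un addr = { .sun_family = AF_UNIX };
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
        close(fd);
        return -1;
    }
    /* connect() may succeed while we are still queued in the listen
     * backlog, so block until the root sends an explicit grant byte. */
    char grant;
    if (read(fd, &grant, 1) != 1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Returning the token is just closing the connection. If the child
 * crashes, the kernel closes the fd anyway and the root sees EOF, so
 * the token cannot leak. */
static void return_token(int fd) {
    close(fd);
}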
Alternative pipes proposal (also incompatible with the jobserver protocol): this protocol overcomes the issue of the make jobserver protocol where it's impossible for the server to tell when a child is taking a job (i.e. reading one byte from the shared pipe). We overcome this by having the child notify the server before taking a job, so the server knows it needs to write one byte to a waiting child on that child's private pipe (see the sketch after this outline).

pros:

cons:

PIPE OVERVIEW:

IPC setup

job flow

cleanup
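A minimal C sketch of the child side of such a protocol, under the assumption that the request pipe carries a one-byte child id so the server knows whose private pipe to write; the fds, the id scheme, and the completion signal are all illustrative:

```c
/* Sketch of the private-pipe proposal from the child's point of view.
 * req_fd: shared child->server request pipe, inherited from the parent.
 * my_fd:  read end of this child's private pipe (server -> child). */
#include <unistd.h>

static int wait_for_job_token(int req_fd, int my_fd, unsigned char my_id) {
    /* 1. Ask for a token. One-byte pipe writes are atomic, so requests
     *    from concurrent children never interleave, and unlike make's
     *    jobserver the server knows exactly who is waiting. */
    if (write(req_fd, &my_id, 1) != 1)
        return -1;
    /* 2. Block until the server writes one byte to our private pipe. */
    char token;
    if (read(my_fd, &token, 1) != 1)
        return -1;
    return 0; /* run the job, then signal completion (e.g. another
                 request-pipe message) so the server can grant the
                 token to the next waiter */
}
```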
IPC and POSIX jobserver interaction for coordinating threaded work seem inefficient. Could it all be avoided by making zig commands (i.e.
Problem statement: a process creates a `std.Thread.Pool` of size `number_of_logical_cores`, and then spawns a child process which also creates a `std.Thread.Pool` of size `number_of_logical_cores`. Now there are `2*number_of_logical_cores` threads active, cache thrashing, which harms performance.

On POSIX systems, `std.Thread.Pool` should integrate by default with the POSIX jobserver protocol, meaning that if the jobserver environment variable is detected, it should coordinate with it in order to make the entire process tree share the same number of concurrently running threads.

On macOS, maybe we should use libdispatch instead. It's bundled with libSystem so it's always available and accomplishes the same goal.
I'm not sure what to do on Windows.
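For reference, a rough C sketch of what cooperating with the existing GNU make jobserver looks like from the client side. This is simplified: real `MAKEFLAGS` parsing has more forms (e.g. the newer `fifo:PATH` style), and the helper names are illustrative:

```c
/* Sketch of a client of the GNU make jobserver protocol, the scheme
 * std.Thread.Pool would detect and cooperate with. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static int js_read_fd = -1, js_write_fd = -1;

/* Find --jobserver-auth=R,W (or the older --jobserver-fds=R,W). */
static int jobserver_init(void) {
    const char *mf = getenv("MAKEFLAGS");
    if (!mf) return -1;
    const char *p = strstr(mf, "--jobserver-auth=");
    if (!p) p = strstr(mf, "--jobserver-fds=");
    if (!p) return -1;
    p = strchr(p, '=') + 1;
    if (sscanf(p, "%d,%d", &js_read_fd, &js_write_fd) != 2) return -1;
    return 0;
}

/* Each process implicitly owns one token; every extra thread must take
 * a token from the pipe first and put it back when the thread idles. */
static int acquire_token(char *out) {
    return read(js_read_fd, out, 1) == 1 ? 0 : -1; /* blocks when empty */
}

static int release_token(char token) {
    /* If the process crashes before this write, the token is gone for
     * the whole process tree -- the "thread leak" described below. */
    return write(js_write_fd, &token, 1) == 1 ? 0 : -1;
}
```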
This is primarily a standard library enhancement, however the build system and compiler both make heavy usage of `std.Thread.Pool`, so they would observe behavior changes.

In particular, running `zig build` as a child process from `make` would start cooperating and not create too many threads. Similarly, running the zig compiler from the zig build system would do the same. The other way is true too - running `make` as a child process from the zig build system would start cooperating. And then there are other third-party tools that have standardized on the POSIX jobserver protocol, such as cargo.

There is one concern, however, which is that the protocol leaves room for "thread leaks" to occur if child processes crash. I'm not sure of the best way to mitigate this. The problem happens when a child process has obtained a thread token from the pipe and then crashes before writing the token back to the pipe. In such a case the thread pool permanently has one less thread active than before, which is suboptimal, and it would cause a deadlock if it happened a number of times exceeding the thread pool size.
Related: