Eliminate blocking on the event loop when possible #8341
Comments
This builds on #8340 somewhat, as it also changes how idle callbacks work in the scheduler.
As far as I can tell, the structure you suggest isn't about eliminating 'blocking' per se (if there are events to be run then the loop doesn't block) but is a proposal to improve performance by using polling instead of events. From the description, I believe the performance win you expect here comes from not using the idle callback. I'm skeptical that calling the idle callback is a significant amount of the work the event loop is doing, though. The work an idle callback imposes on each turn of the loop is small:
that is, traversing an element of a linked list and calling a function pointer. This structure also has downsides: it would impose more requirements on our generic event loop, and it is more complicated. I'm not inclined to make this change without evidence that the current callback-based structure is a problem.
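To make the claimed per-turn cost concrete, here is a minimal sketch (the names are ours, not the scheduler's) of what invoking registered idle callbacks amounts to: walking one linked-list node and making one indirect call per callback.

```rust
// Sketch: the per-turn cost of idle callbacks is walking a linked list
// and making one function-pointer call per node.
struct IdleCallback {
    cb: fn(&mut u32),
    next: Option<Box<IdleCallback>>,
}

fn run_idle_callbacks(head: &mut Option<Box<IdleCallback>>, state: &mut u32) {
    let mut cur = head.as_deref_mut();
    while let Some(node) = cur {
        (node.cb)(state); // one indirect call per node
        cur = node.next.as_deref_mut();
    }
}

fn bump(s: &mut u32) {
    *s += 1;
}

// Build a list of `n` callbacks and count how many ran.
fn count_after(n: usize) -> u32 {
    let mut head: Option<Box<IdleCallback>> = None;
    for _ in 0..n {
        head = Some(Box::new(IdleCallback { cb: bump, next: head }));
    }
    let mut state = 0;
    run_idle_callbacks(&mut head, &mut state);
    state
}

fn main() {
    assert_eq!(count_after(3), 3);
    println!("three callbacks, three calls");
}
```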
Currently, a scheduler will hit epoll() or kqueue() at the end of *every task*. The reason is that the scheduler will context switch back to the scheduler task, terminate the previous task, and then return from run_sched_once. In doing so, the scheduler will poll for any active I/O. This shows up painfully in benchmarks that have no I/O at all. For example, this benchmark:

```rust
for _ in range(0, 1000000) {
    spawn(proc() {});
}
```

In this benchmark, the scheduler is currently wasting a good chunk of its time hitting epoll() when there's always active work to be done (run with RUST_THREADS=1).

This patch uses the previous two commits to alter the scheduler's behavior to only return from run_sched_once if no work could be found after trying very hard. If there is active I/O, this commit will perform the same as before, falling back to epoll() to check for I/O completion (so as not to starve I/O tasks).

In the benchmark above, I got the following numbers:

- 12.554s on today's master
- 3.861s with rust-lang#12172 applied
- 2.261s with both this and rust-lang#12172 applied

cc rust-lang#8341
No longer relevant to the standard libraries (#17325).
In the current design the scheduler takes one scheduling action, registers a callback to wake up in the run_sched_once function, and then blocks on the event loop. This "block" is implemented as just finishing the execution associated with the run_sched_once callback. As there is a callback registered it immediately wakes up and repeats the process.
This should be unnecessary. The drop back to the event loop is a very long process relative to a task context swap, so this is likely very bad for performance. A better solution would be to keep the scheduler active until it runs out of stuff to do, and then block on the event loop.
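The "keep running until out of work, then block" shape described above can be sketched as follows. This is a simplified model, not the scheduler's actual code: `local` stands in for the scheduler's work queue, and the channel stands in for events arriving from the I/O loop (a real scheduler would block there rather than `try_recv`).

```rust
use std::sync::mpsc::{channel, Receiver, TryRecvError};

// Sketch: stay active until out of local work, only then consult the
// event loop, instead of dropping back to it after every task.
fn run(mut local: Vec<&'static str>, wakeups: &Receiver<&'static str>) -> Vec<&'static str> {
    let mut log = Vec::new();
    loop {
        // Stay live: drain every locally available task first.
        while let Some(task) = local.pop() {
            log.push(task);
        }
        // Only with no local work left do we fall back to the event loop.
        match wakeups.try_recv() {
            Ok(task) => local.push(task),
            Err(TryRecvError::Empty) | Err(TryRecvError::Disconnected) => break,
        }
    }
    log
}

fn demo() -> Vec<&'static str> {
    let (tx, rx) = channel();
    tx.send("io-task").unwrap();
    drop(tx); // no further events: the loop can wind down
    run(vec!["task-a", "task-b"], &rx)
}

fn main() {
    // Local work runs to exhaustion before the event-loop task is seen.
    assert_eq!(demo(), vec!["task-b", "task-a", "io-task"]);
    println!("{:?}", demo());
}
```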
To do this we should have a "live" flag for the scheduler. If the flag is set, the scheduler is guaranteed to check for messages and likely work before going to sleep. While active, the scheduler algorithm becomes:
The libuv "uv_run" function has three modes. UV_RUN_DEFAULT is what we have been using, a blocking "do stuff until the program is done" use of the event loop. UV_RUN_NOWAIT polls for new events and then does not block if there are no events ready. This is the style that we want for the non-blocking run case. UV_RUN_ONCE is how we would perform blocking runs. It will block until it handles a single event, and then return how many pending events there are.
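As a quick reference, the three modes can be mirrored as a Rust enum (the discriminant values match `uv_run_mode` in libuv's `uv.h`), together with a hypothetical policy function illustrating the proposal's intent: poll while there is local work, block for a single event only when the scheduler has run dry.

```rust
// The three libuv run modes, mirrored as a Rust enum; discriminants
// match uv_run_mode in libuv's uv.h.
#[derive(Clone, Copy, PartialEq, Debug)]
enum RunMode {
    Default = 0, // run until the loop has nothing left (previous behavior)
    Once = 1,    // block until one event is handled, then return
    NoWait = 2,  // poll for ready events, never block
}

// Hypothetical policy matching the proposal: poll while local work
// remains, block for one event only once the scheduler has run dry.
fn pick_mode(has_local_work: bool) -> RunMode {
    if has_local_work { RunMode::NoWait } else { RunMode::Once }
}

fn main() {
    assert_eq!(pick_mode(true), RunMode::NoWait);
    assert_eq!(pick_mode(false), RunMode::Once);
    assert_eq!(RunMode::Default as i32, 0);
    println!("busy -> {:?}, idle -> {:?}", pick_mode(true), pick_mode(false));
}
```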
To wake up the scheduler while it is "sleeping", the old idle callback is used, except now the provided function simply sets the live flag to "awake", and the scheduler loop logic takes over. In any case where an idle callback would be registered and the live flag is already set, the callback can be skipped.
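The wakeup mechanism reduces to a flag flip. A minimal sketch (the type and method names are ours, assuming an atomic flag for the "live" state):

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Sketch: the idle callback's only job is flipping the scheduler's
// "live" flag to awake; registration is skipped while the flag is set.
struct Scheduler {
    live: AtomicBool,
}

impl Scheduler {
    fn new() -> Scheduler {
        Scheduler { live: AtomicBool::new(false) }
    }

    // Body of the hypothetical idle callback: just mark us awake.
    fn wake(&self) {
        self.live.store(true, Ordering::SeqCst);
    }

    // An idle callback only needs registering when we are not live.
    fn needs_idle_callback(&self) -> bool {
        !self.live.load(Ordering::SeqCst)
    }
}

fn main() {
    let sched = Scheduler::new();
    assert!(sched.needs_idle_callback()); // sleeping: register the callback
    sched.wake();
    assert!(!sched.needs_idle_callback()); // live: skip registration
    println!("wakeup handled via the live flag");
}
```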