Use a single thread pool for all libraries #411

Closed
torkleyy opened this issue Mar 10, 2017 · 13 comments

@torkleyy

It's very impractical to work with multiple threads if every library uses a different thread pool. Please consider using one thread pool that can be shared between frameworks.

Here are some examples:

@alexcrichton
Member

Seems reasonable to me! The CPU pool here, though, is largely for demonstration purposes, and it shouldn't be too hard to create your own thread pool locally (or reuse one of these existing ones with futures).

@carllerche
Member

One thing of note: as with all things, there is no one-size-fits-all solution. Rayon has vastly different semantics than futures-cpupool. You probably wouldn't want to use Rayon to drive I/O tasks (due to fairness).

That said, something like http://github.com/carllerche/futures-spawn will hopefully help with writing code that is abstract over the executor.

@leoyvens
Contributor

@carllerche I'm curious, how do futures-cpupool and Rayon's futures support differ? When should one be recommended over the other?

@alexcrichton
Member

I think @nikomatsakis may actually be the best person to comment on that.

@liranringel
Contributor

By "a single thread pool", do you mean a single instance or just the same code?

@carllerche
Member

@leodasvacas The main difference is scheduling heuristics. Rayon has no fairness guarantees (in fact, it is the opposite of fair), because it is geared towards parallel computation of a single result. In other words, you should use it when you only care about the final result and not about how each individual task gets scheduled.

futures-cpupool is 100% fair: futures are executed strictly in the order in which they are submitted.

There are also other variants of scheduling logic with different trade-offs.
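
To make the contrast concrete, here is a minimal sketch of the same two-part computation submitted to each pool. It assumes the futures 0.1-era APIs of the futures-cpupool and rayon crates; the fib workload is made up for illustration:

```rust
extern crate futures;
extern crate futures_cpupool;
extern crate rayon;

use futures::Future;
use futures_cpupool::CpuPool;

fn fib(n: u64) -> u64 {
    if n < 2 { n } else { fib(n - 1) + fib(n - 2) }
}

fn main() {
    // futures-cpupool: fair scheduling; spawned futures run in
    // submission order.
    let pool = CpuPool::new_num_cpus();
    let a = pool.spawn_fn(|| Ok::<_, ()>(fib(30)));
    let b = pool.spawn_fn(|| Ok::<_, ()>(fib(31)));
    let (ra, rb) = a.join(b).wait().unwrap();

    // Rayon: unfair, locality-driven scheduling geared toward
    // producing a single final result as fast as possible.
    let (rc, rd) = rayon::join(|| fib(30), || fib(31));

    assert_eq!((ra, rb), (rc, rd));
}
```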

@nikomatsakis
Contributor

I think @carllerche's summary is reasonable; Rayon certainly doesn't try to guarantee fairness (and, at least for some use cases, fairness is not particularly desirable). But I agree with the overall gist of this issue: there should be a way to use "the default" CPU pool, and ideally end-users could change it without requiring their dependencies to be updated.
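
One hypothetical shape for such a "default pool that the end application can override" is a process-wide slot that libraries read lazily. This sketch uses std's OnceLock (which postdates this thread); all names are made up for illustration:

```rust
use std::sync::OnceLock;

// Stand-in for a real pool type.
pub struct ThreadPool {
    pub threads: usize,
}

static DEFAULT_POOL: OnceLock<ThreadPool> = OnceLock::new();

/// Optionally called once by the application, before any library
/// touches the default pool. Fails if the pool is already set.
pub fn install_default_pool(pool: ThreadPool) -> Result<(), ThreadPool> {
    DEFAULT_POOL.set(pool)
}

/// Called by libraries: returns the installed pool, or lazily
/// builds a fallback.
pub fn default_pool() -> &'static ThreadPool {
    DEFAULT_POOL.get_or_init(|| ThreadPool { threads: 4 })
}

fn main() {
    install_default_pool(ThreadPool { threads: 8 }).ok();
    println!("default pool has {} threads", default_pool().threads);
}
```

The key property: dependencies only ever call default_pool(), so the application can swap the pool without any of them being updated.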

@HadrienG2

HadrienG2 commented Nov 3, 2017

Just wondering: assuming availability of Rust language features which make this easy (such as const generics), would it be possible and desirable to create a generic thread pool crate which can fit all current use cases in the Rust ecosystem, given only a little bit of compile-time configuration?

I'm asking because writing a good thread pool takes a sizeable amount of effort, and as far as I can see there are really only so many ways to do it. Consider the design constraints under which a modern thread pool must operate:

  • Modern CPUs can have a lot of hardware threads, easily more than 100: check out Intel's Xeon Phi (up to 72 cores, 4-way SMT) or IBM's POWER9 (up to 24 cores, 8-way SMT) if you are not convinced. If you want to scale that far (and future CPUs will likely go further), a design based on a single shared work queue will fail to achieve good performance; one work queue per CPU thread, with a load-balancing algorithm like work stealing, becomes necessary.
  • Unless some form of prioritization is applied (which, AFAIK, no current Rust thread pool crate provides), a worker thread reaching for more work has only one choice to make: which end of the queue should work be fetched from? Fetching in FIFO order is better for fairness, whereas fetching in LIFO order is better for cache locality (and thus performance). There is a real design choice here, but with const generics it could be a compile-time parameter of the implementation. Even without const generics, we can make this work today using zero-sized marker types (see the sketch after this list).
  • When work is submitted to the thread pool, the caller must have a way to synchronize with its completion. This is exactly what Rust futures were designed for, and they were built to be very lightweight, so except for very specific use cases that require coarser-grained synchronization (e.g. flushing the work queue on application termination), I think everyone would be happy with a future-based synchronization interface.
  • Being able to submit extra work from inside the thread pool (i.e. recursive parallelism) is very convenient for many recursive algorithms, and it is also a way to reduce work-queue pressure by dividing work lazily rather than eagerly. I think we can agree that any performant thread pool should support this use case, and that it can be implemented without harming the others.
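
As an illustration of the FIFO/LIFO bullet above, here is a minimal sketch (all names hypothetical) of selecting the dequeue order at compile time with zero-sized marker types, so that the choice is monomorphized away instead of branching at runtime:

```rust
use std::collections::VecDeque;
use std::marker::PhantomData;

trait PopOrder {
    fn pop<T>(queue: &mut VecDeque<T>) -> Option<T>;
}

/// FIFO: pop the oldest task; better fairness.
struct Fifo;
impl PopOrder for Fifo {
    fn pop<T>(queue: &mut VecDeque<T>) -> Option<T> {
        queue.pop_front()
    }
}

/// LIFO: pop the newest task; better cache locality.
struct Lifo;
impl PopOrder for Lifo {
    fn pop<T>(queue: &mut VecDeque<T>) -> Option<T> {
        queue.pop_back()
    }
}

struct WorkQueue<T, O: PopOrder> {
    tasks: VecDeque<T>,
    _order: PhantomData<O>,
}

impl<T, O: PopOrder> WorkQueue<T, O> {
    fn new() -> Self {
        WorkQueue { tasks: VecDeque::new(), _order: PhantomData }
    }
    fn push(&mut self, task: T) {
        self.tasks.push_back(task);
    }
    fn pop(&mut self) -> Option<T> {
        O::pop(&mut self.tasks) // resolved at compile time
    }
}

fn main() {
    let mut fair: WorkQueue<u32, Fifo> = WorkQueue::new();
    fair.push(1);
    fair.push(2);
    assert_eq!(fair.pop(), Some(1)); // oldest first

    let mut local: WorkQueue<u32, Lifo> = WorkQueue::new();
    local.push(1);
    local.push(2);
    assert_eq!(local.pop(), Some(2)); // newest first
}
```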

Is there any design constraint which I am overlooking that would prevent the creation of a common generic thread pool which fulfills everyone's needs without being excessively over-engineered and hard to use? Otherwise, I might be interested in exploring this path further.

@carllerche
Member

I have been working in my spare time on futures-pool, which is meant to be a general-purpose, shareable pool for futures. It is not ready yet, though. It is being designed for fairness, which IMO is what you usually want when working with networking-related scenarios.

@HadrienG2

HadrienG2 commented Nov 4, 2017

@carllerche This would be part of what I'm thinking about. The other part, which may be crazier and intractable in practice, would be to make the thread pool library generic over its scheduling policy, in the sense that you could configure it at compile time to use FIFO, LIFO, or maybe even priority-driven scheduling algorithms like EDF.

In this way, people who care most about fairness can configure it in FIFO mode, people who care most about throughput can configure it in LIFO mode, people who have real-time latency constraints can configure it in EDF mode... you get the idea. Any algorithm from the OS community's abundant literature on non-preemptive scheduling is potentially applicable, since a thread pool is basically a parallel batch scheduler.
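
To make the EDF case concrete, the worker-local ready queue for that mode could be as simple as a deadline-ordered heap. A hypothetical sketch, not taken from any existing crate:

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;
use std::time::{Duration, Instant};

struct Task {
    deadline: Instant,
    name: &'static str,
}

// BinaryHeap is a max-heap, so invert the comparison to pop the
// earliest deadline first.
impl Ord for Task {
    fn cmp(&self, other: &Task) -> Ordering {
        other.deadline.cmp(&self.deadline)
    }
}
impl PartialOrd for Task {
    fn partial_cmp(&self, other: &Task) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl PartialEq for Task {
    fn eq(&self, other: &Task) -> bool {
        self.deadline == other.deadline
    }
}
impl Eq for Task {}

fn main() {
    let now = Instant::now();
    let mut ready = BinaryHeap::new();
    ready.push(Task { deadline: now + Duration::from_millis(50), name: "b" });
    ready.push(Task { deadline: now + Duration::from_millis(10), name: "a" });
    ready.push(Task { deadline: now + Duration::from_millis(99), name: "c" });

    // Pops in deadline order: a, b, c.
    while let Some(task) = ready.pop() {
        println!("running {}", task.name);
    }
}
```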

I'm currently doodling some code to see how crazy that idea is. But if it's workable and usable, it would be a way to achieve the OP's goal of having one unified thread pool library for all Rust multithreaded programming environments.

@carllerche
Member

I'm not sure what the value is of making a thread pool implementation generic over the scheduler, vs. making user code generic over T: future::Executor.

I'm also skeptical that one could write a single thread pool that handles all cases as efficiently as specialized implementations do.

@HadrienG2

HadrienG2 commented Nov 7, 2017

A well-implemented executor is actually quite a nontrivial object. Among other things, it needs to handle:

  • Worker thread initialization and CPU pinning (mildly OS-specific)
  • Distribution of incoming work across workers
  • Load balancing when a worker is starving
  • Putting workers to sleep when there has been no work for a while (CPU time is expensive, especially on battery-powered devices), and waking them up when new work arrives (see the sketch after this list)
  • Correct recursive parallelism (which should avoid unnecessary synchronization with other workers when they are already busy, and should not be prioritized like incoming work, since it is effectively part of the executing task)
  • Termination of workers at the end of the program
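
For the sleep/wake bullet, here is a minimal std-only sketch (no particular crate assumed) of workers parking on a condition variable while the shared queue is empty:

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

type Job = Box<dyn FnOnce() + Send>;

struct Shared {
    // (pending jobs, shutting down)
    state: Mutex<(VecDeque<Job>, bool)>,
    available: Condvar,
}

fn worker(shared: Arc<Shared>) {
    loop {
        let job = {
            let mut state = shared.state.lock().unwrap();
            // Sleep (consuming no CPU) until a job arrives or we
            // are told to shut down.
            while state.0.is_empty() && !state.1 {
                state = shared.available.wait(state).unwrap();
            }
            match state.0.pop_front() {
                Some(job) => job,
                None => return, // shutdown requested, queue drained
            }
        };
        job(); // run outside the lock
    }
}

fn main() {
    let shared = Arc::new(Shared {
        state: Mutex::new((VecDeque::new(), false)),
        available: Condvar::new(),
    });

    let workers: Vec<_> = (0..4)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || worker(shared))
        })
        .collect();

    for i in 0..8 {
        shared.state.lock().unwrap().0
            .push_back(Box::new(move || println!("job {}", i)));
        shared.available.notify_one(); // wake one sleeping worker
    }

    // Signal termination and wake everyone so they can exit.
    shared.state.lock().unwrap().1 = true;
    shared.available.notify_all();
    for w in workers {
        w.join().unwrap();
    }
}
```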

My point is that there is a good default for almost all of these operations; the only thing that truly needs to remain customizable is the worker-local task scheduling policy, because in the end that is the only fundamental difference between a thread pool designed for network packet processing (like Tokio's) and one designed for maximal compute performance (like Rayon's).

So far, my early experiments with an executor that is generic over its scheduling policy do not match your conclusion that it must be inefficient. But once I have a reasonably complete prototype to show (the development of which may reveal further issues), we can discuss this matter more.

@aturon
Member

aturon commented Mar 20, 2018

I'm going to close out this issue. With 0.2's default executors, sharing a thread pool should be much easier.
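
For illustration, here is how sharing a single pool looks with the later futures 0.3 API (the 0.2 API referenced above was since superseded); this requires the crate's "thread-pool" feature:

```rust
use futures::executor::{block_on, ThreadPool};
use futures::task::SpawnExt;

fn main() {
    // One pool, shared by everything that needs to spawn work.
    let pool = ThreadPool::new().unwrap();
    let handle = pool.spawn_with_handle(async { 40 + 2 }).unwrap();
    assert_eq!(block_on(handle), 42);
}
```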

@aturon aturon closed this as completed Mar 20, 2018