This repository has been archived by the owner on Nov 5, 2018. It is now read-only.

FIFO and LIFO strategies #13

Merged
3 commits merged into crossbeam-rs:master on Jul 6, 2018

@ghost ghost commented Jun 28, 2018

This PR adds two different ordering strategies to the deque: FIFO and LIFO. The strategy is chosen at construction time. There are two deque constructors:

fn fifo<T>() -> (Worker<T>, Stealer<T>);
fn lifo<T>() -> (Worker<T>, Stealer<T>);

FIFO is intended to be used by Tokio and LIFO is for Rayon. The FIFO variant doesn't execute any fences.
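To make the difference between the two strategies concrete, here is a minimal single-threaded sketch. It is a model only: `ModelDeque` and `Flavor` are hypothetical names, and a `VecDeque` stands in for the lock-free buffer, so nothing here reflects the actual implementation in this PR. It assumes the usual work-stealing convention that the worker pushes at the back and stealers take from the front.

```rust
use std::collections::VecDeque;

/// Which end the owning worker pops from (hypothetical model type).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Flavor {
    Fifo,
    Lifo,
}

/// Single-threaded stand-in for the deque; NOT the lock-free implementation.
struct ModelDeque<T> {
    flavor: Flavor,
    items: VecDeque<T>,
}

impl<T> ModelDeque<T> {
    fn new(flavor: Flavor) -> Self {
        ModelDeque { flavor, items: VecDeque::new() }
    }

    /// The worker always pushes at the back.
    fn push(&mut self, value: T) {
        self.items.push_back(value);
    }

    /// The worker pops the oldest task under FIFO and the newest under LIFO.
    fn pop(&mut self) -> Option<T> {
        match self.flavor {
            Flavor::Fifo => self.items.pop_front(),
            Flavor::Lifo => self.items.pop_back(),
        }
    }

    /// Stealers take the oldest task under either strategy (assumed convention).
    fn steal(&mut self) -> Option<T> {
        self.items.pop_front()
    }
}
```

In this model, pushing 1, 2, 3 and then popping yields 1 for a FIFO deque and 3 for a LIFO deque, while a steal yields the oldest remaining task in both cases.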

I've tried running benchmarks on Rayon's test suite and didn't find any significant differences between this PR and the current master branch.

The LIFO variant is the same as our current Chase-Lev implementation, while the FIFO variant is a slightly tweaked CircBuf. I've fixed a few bugs in CircBuf, cleaned up the code, and conservatively strengthened the memory orderings wherever I found them dubious. This time correctness is the primary concern, but we can revisit those orderings later to squeeze out more performance.

In a future PR we can also add methods for batched stealing (steal half), as explained in #11.
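As a rough illustration of what "steal half" could mean, the sketch below moves the older half of a victim's queue to the thief in one call. `steal_half` is a hypothetical helper operating on plain `VecDeque`s, not a proposed signature for this crate, and rounding up (so that a single remaining task can still be stolen) is an assumption.

```rust
use std::collections::VecDeque;

/// Moves the older half of `victim` (rounded up) into `thief`.
/// Returns how many tasks were moved. Hypothetical model, not the crate's API.
fn steal_half<T>(victim: &mut VecDeque<T>, thief: &mut VecDeque<T>) -> usize {
    let count = (victim.len() + 1) / 2;
    // `drain(..count)` removes the front `count` items in their original order.
    thief.extend(victim.drain(..count));
    count
}
```

A real lock-free version would have to claim the whole range atomically; this model only shows the intended effect on the two queues.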

Another thing we should consider adding later is workers that can be shared among multiple threads. In addition to thread-local deques, schedulers usually also have a special global deque that receives tasks without an associated worker thread (rayon::spawn pushes tasks into the global deque). Rayon currently uses Mutex<Worker<T>> and Stealer<T> for the global deque. The mutex around Worker<T> is unfortunate, so we should ideally add a deque variant that supports push and pop from multiple threads without mutexes. Such a deque would probably be implemented as a Treiber stack for the LIFO strategy and as a Michael-Scott queue for the FIFO strategy.
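For reference, the mutex-based arrangement described above can be modelled roughly as follows. This is a simplified sketch, not Rayon's code: `GlobalQueue` and `fill_from_threads` are hypothetical names, and a single `Mutex<VecDeque<T>>` guards both ends, whereas in the setup described above only the `Worker<T>` side sits behind the mutex.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical global queue: every operation takes the same lock.
struct GlobalQueue<T> {
    inner: Arc<Mutex<VecDeque<T>>>,
}

// Manual impl so that cloning the handle does not require `T: Clone`.
impl<T> Clone for GlobalQueue<T> {
    fn clone(&self) -> Self {
        GlobalQueue { inner: Arc::clone(&self.inner) }
    }
}

impl<T> GlobalQueue<T> {
    fn new() -> Self {
        GlobalQueue { inner: Arc::new(Mutex::new(VecDeque::new())) }
    }

    /// Any thread may push; all pushers contend on the mutex.
    fn push(&self, task: T) {
        self.inner.lock().unwrap().push_back(task);
    }

    /// Takes the oldest task, if any.
    fn take(&self) -> Option<T> {
        self.inner.lock().unwrap().pop_front()
    }
}

/// Spawns `threads` producers that each push `per_thread` distinct tasks,
/// then waits for all of them to finish.
fn fill_from_threads(queue: &GlobalQueue<usize>, threads: usize, per_thread: usize) {
    let handles: Vec<_> = (0..threads)
        .map(|t| {
            let q = queue.clone();
            thread::spawn(move || {
                for i in 0..per_thread {
                    q.push(t * per_thread + i);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
}
```

The lock serializes every push, which is the cost a Treiber stack or Michael-Scott queue would aim to avoid.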

Closes #11
cc @carllerche

@ghost ghost requested a review from jeehoonkang June 28, 2018 20:35
@ghost ghost force-pushed the master branch 3 times, most recently from 44d755b to df1ac52 Compare June 28, 2018 21:21

ghost commented Jul 3, 2018

What do you think, @jeehoonkang? :)

I don't think this PR needs a thorough review - it's just refactoring plus CircBuf with slightly stronger orderings (for now).


ghost commented Jul 6, 2018

As suggested in tokio-rs/tokio#426, I've moved some worker-specific data from Inner into Worker. This means the buffer pointer, the buffer capacity, and the back index get inlined directly into worker::Entry.

Comparison between the Tokio master branch and my local branch that uses this PR:

 name                    before ns/iter  after ns/iter  diff ns/iter  diff %  speedup
 threadpool::spawn_many  4,659,536       4,199,011          -460,525  -9.88%   x 1.11
 threadpool::yield_many  11,949,901      12,170,009          220,108   1.84%   x 0.98
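The diff % and speedup columns can be re-derived from the two ns/iter columns. The helpers below are a small sketch for checking the arithmetic; `diff_percent` and `speedup` are illustrative names, not part of any benchmark tool.

```rust
/// Relative change in percent; a negative value means `after` is faster.
fn diff_percent(before_ns: u64, after_ns: u64) -> f64 {
    (after_ns as f64 - before_ns as f64) / before_ns as f64 * 100.0
}

/// Ratio of old time to new time; a value above 1.0 means `after` is faster.
fn speedup(before_ns: u64, after_ns: u64) -> f64 {
    before_ns as f64 / after_ns as f64
}
```

For spawn_many, (4,199,011 - 4,659,536) / 4,659,536 is about -9.88% and 4,659,536 / 4,199,011 is about 1.11, matching the table.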

The yield_many benchmark is slightly slower now - my guess is that the steal operation is a bit slower when the deque is empty (because we have to load the stamp inside the buffer instead of loading the back index). But I believe this is overall for the better.

I'm also hopeful that batched stealing will give us further performance wins.

(Just a reminder: this is blocked on Tokio upgrading to Rust 1.25).

@carllerche

I merged tokio-rs/tokio#465, so Tokio should be on 1.25


ghost commented Jul 6, 2018

Ok, let's merge this.

@jeehoonkang I'm going to skip the review - if you have any comments on this PR, let me know anytime. :)

@ghost ghost merged commit 43c9376 into crossbeam-rs:master Jul 6, 2018
@ghost ghost deleted the new-interface branch July 6, 2018 21:54
This pull request was closed.

Successfully merging this pull request may close these issues.

An option for choosing between LIFO and FIFO ordering