Attempt to optimize LocalExecutor #144

james7132 · 2025-07-24T04:23:16Z

This PR makes LocalExecutor it's own separate implementation instead of wrapping an Executor, forking it's own State type utilizing unsynchronized !Send types where possible: Arc -> Rc, Mutex -> RefCell, AtomicPtr -> Cell<*mut T>, ConcurrentQueue -> VecDeque. This implementation also removes any extra operations that assumes there are other concurrent Runners/Tickers (i.e. local queues, extra notifications).

For testing, I've duplicated most of the single-threaded compatible executor tests to ensure there's equivalent coverage on the new independent LocalExecutor.

I've also made an attempt to write this with UnsafeCell instead of RefCell, but this litters a huge amount of unsafe that might be too much for this crate. The gains here might not be worth it.

I previously wrote some additional benchmarks but lost the changes to a stray git reset --hard when benchmarking.
The gains here are substantial. Here are the results:

single_thread/local_executor::spawn_one
                        time:   [130.05 ns 130.98 ns 132.16 ns]
                        change: [-80.586% -80.413% -80.214%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
  4 (4.00%) high mild
  10 (10.00%) high severe
single_thread/local_executor::spawn_batch
                        time:   [17.195 µs 22.167 µs 32.281 µs]
                        change: [-30.007% +0.4082% +41.137%] (p = 0.98 > 0.05)
                        No change in performance detected.
Found 17 outliers among 100 measurements (17.00%)
  7 (7.00%) high mild
  10 (10.00%) high severe
Benchmarking single_thread/local_executor::spawn_many_local: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 856.5s, or reduce sample count to 10.
single_thread/local_executor::spawn_many_local
                        time:   [2.7766 ms 2.8091 ms 2.8453 ms]
                        change: [-21.653% -8.5454% +6.4249%] (p = 0.30 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) high mild
  4 (4.00%) high severe
single_thread/local_executor::spawn_recursively
                        time:   [19.618 ms 20.071 ms 20.558 ms]
                        change: [-24.237% -22.461% -20.762%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  6 (6.00%) high mild
Benchmarking single_thread/local_executor::yield_now: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.7s, enable flat sampling, or reduce sample count to 50.
single_thread/local_executor::yield_now
                        time:   [1.8382 ms 1.8421 ms 1.8470 ms]
                        change: [-52.888% -52.744% -52.570%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  1 (1.00%) high mild
  9 (9.00%) high severe
single_thread/local_executor::channels
                        time:   [9.9259 ms 9.9355 ms 9.9452 ms]
                        change: [-15.797% -15.656% -15.533%] (p = 0.00 < 0.05)
                        Performance has improved.
single_thread/local_executor::web_server
                        time:   [54.839 µs 56.776 µs 59.062 µs]
                        change: [-23.926% -18.725% -13.009%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe

TODO: Recreate these benchmarks.

james7132 · 2025-07-24T15:04:30Z

Hmmm, VecDeque::new is not const until Rust 1.68. It could be trivially replaced with LinkedList which has a const constructor in 1.63, but it would be a major regression in performance.

taiki-e · 2025-07-25T13:35:42Z

There is no need to be worried about the MSRV of const VecDeque::new, because we will be able to raise the MSRV after about two weeks: smol-rs/smol#244 (comment)

james7132 added 3 commits July 22, 2025 18:18

First attempt at a more optimized LocalExecutor

bbcdbf0

Remove remaining atomics

4e7d516

Update the safety comments

f6fbaf9

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Attempt to optimize LocalExecutor #144

Attempt to optimize LocalExecutor #144

james7132 commented Jul 24, 2025 •

edited

Loading

Uh oh!

james7132 commented Jul 24, 2025

Uh oh!

taiki-e commented Jul 25, 2025

Uh oh!

Uh oh!

Attempt to optimize LocalExecutor #144

Are you sure you want to change the base?

Attempt to optimize LocalExecutor #144

Conversation

james7132 commented Jul 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

james7132 commented Jul 24, 2025

Uh oh!

taiki-e commented Jul 25, 2025

Uh oh!

Uh oh!

james7132 commented Jul 24, 2025 •

edited

Loading