
ObserveOn throughput enhancements #2804

Merged 1 commit into ReactiveX:1.x on Mar 6, 2015
Conversation

@akarnokd (Member) commented on Mar 5, 2015

Squashed commits of #2773.


Further optimizations to observeOn.

  • Using SpscArrayQueue directly in observeOn instead of RingBuffer, avoiding the synchronization block (see the first sketch after this list).
  • Splitting the tracking structure in EventLoopsScheduler into a serial part (SubscriptionList) and a timed part (CompositeSubscription); this improves sequential scheduling performance because a completing task's subscription is most likely the first item in the underlying LinkedList (see the second sketch after this list).

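A minimal sketch of the first point, assuming a JCTools-style SpscArrayQueue (offer/poll) plus an atomic work-in-progress counter; the class and field names below are illustrative, not RxJava's actual OperatorObserveOn:

```java
// Illustrative queue-drain only, not the real operator: the upstream thread
// offers into a lock-free single-producer/single-consumer array queue and the
// scheduler worker drains it, so neither side needs a synchronized block.
import java.util.Queue;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;
import org.jctools.queues.SpscArrayQueue; // assumed JCTools dependency

final class ObserveOnSketch<T> {
    static final int BUFFER_SIZE = 128;       // bounded, like the ring buffer it replaces
    final Queue<T> queue = new SpscArrayQueue<>(BUFFER_SIZE);
    final AtomicLong wip = new AtomicLong();  // serializes drain passes without locks

    /** Producer side: must be called from the upstream thread only. */
    boolean enqueue(T value) {
        return queue.offer(value);            // false means the buffer is full
    }

    /** Consumer side: must be called from the scheduler worker thread only. */
    void drain(Consumer<T> onNext) {
        if (wip.getAndIncrement() != 0) {
            return;                           // a drain pass is already in progress
        }
        long missed = 1;
        for (;;) {
            T v;
            while ((v = queue.poll()) != null) {
                onNext.accept(v);
            }
            missed = wip.addAndGet(-missed);
            if (missed == 0) {
                break;
            }
        }
    }
}
```

And a rough model of the second point, using plain JDK collections as stand-ins for SubscriptionList and CompositeSubscription; it only shows why splitting serial from timed tracking makes the common removal cheap, not how EventLoopsScheduler actually implements it:

```java
// Serial tasks finish roughly in submission order, so removing a completed
// task from a linked list usually touches only the head node; timed tasks
// finish in arbitrary order, so they go into a set-backed container instead.
import java.util.LinkedList;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class SplitTaskTracking {
    // Stand-in for SubscriptionList: completion order ~ insertion order,
    // so LinkedList.remove(Object) typically scans only the first element.
    private final LinkedList<Runnable> serialTasks = new LinkedList<>();
    // Stand-in for CompositeSubscription: out-of-order completion, O(1) removal.
    private final Set<Runnable> timedTasks = ConcurrentHashMap.newKeySet();

    synchronized void trackSerial(Runnable task)    { serialTasks.add(task); }
    synchronized void completeSerial(Runnable task) { serialTasks.remove(task); }

    void trackTimed(Runnable task)                  { timedTasks.add(task); }
    void completeTimed(Runnable task)               { timedTasks.remove(task); }

    /** Drop everything still outstanding, e.g. when the worker is unsubscribed. */
    synchronized void cancelAll() {
        serialTasks.clear();
        timedTasks.clear();
    }
}
```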
Benchmark: (i7 920, Windows 7 x64, Java 1.8u31, 5x1s warmup, 5x5s iteration)

Benchmark      (size)         1.x    1.x error      this PR     PR error
observeOn           1  162326,012     2458,085   166536,559     3154,174
observeOn          10  132471,205     1857,434   142517,407     3734,424 ++
observeOn         100   43282,527     2145,910   112238,179     2270,103 ++
observeOn        1000   11779,482      173,370    25726,564      309,193 ++
observeOn        2000    6756,211       89,196    12123,276      276,470 ++
observeOn        3000    4736,893      253,796     9342,673      263,667 ++
observeOn        4000    3661,874       51,359     7346,015      123,049 ++
observeOn       10000    1519,282      108,503     1546,547       21,885
observeOn      100000     151,193        2,569      156,160        1,974
observeOn     1000000      15,373        1,310       15,660        0,153
subscribeOn         1  161290,037     2867,882   164952,259      797,408
subscribeOn        10  151842,821     2448,734   147906,491     4373,682
subscribeOn       100  136418,065     1773,558   136889,052     2362,203
subscribeOn      1000   58389,066     4559,030    59482,225     1372,692
subscribeOn      2000   34089,152     9318,205    36581,203     1264,100
subscribeOn      3000   26712,331     1265,442    26519,320     1319,293
subscribeOn      4000   20118,326     2018,439    20163,395      839,709
subscribeOn     10000    8914,213      677,164     9059,934      200,158
subscribeOn    100000     958,038       43,349      965,663       60,708
subscribeOn   1000000      91,849        2,148       92,706        1,202

Notes:

  • At size = 1, the throughput varies in a +/- 3000 range on each run, and since the changes don't touch the scalar optimization, there is no real improvement there.
  • At size = 10,000 my system reaches either the cache capacity or the OS scheduler's time resolution, so there is no improvement from that point on.
  • At size = 100,000 and size = 1,000,000 the throughput doubles if I introduce some extra delay (e.g., via sleep(1) or some extra work).
  • The benchmark generates a lot of garbage due to boxing: switching to a constant emitter increases the subscribeOn(1,000,000) throughput from 91 to 136 (see the sketch after these notes).

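As an illustration of the boxing note above (my own setup, not the benchmark code): Observable.range allocates a fresh Integer per element beyond the -128..127 cache, while a source that re-emits a single pre-boxed value produces no per-element garbage.

```java
// Hypothetical comparison of a boxing-heavy source and a "constant emitter";
// a real measurement would use JMH rather than running these ad hoc.
import java.util.Collections;
import rx.Observable;
import rx.schedulers.Schedulers;

public final class ConstantEmitterExample {
    public static void main(String[] args) {
        int size = 1_000_000;

        // Boxing-heavy source: one new Integer per element beyond the cache.
        Observable<Integer> boxedRange = Observable.range(1, size);

        // "Constant emitter": every emission is the same boxed Integer instance.
        Observable<Integer> constant =
                Observable.from(Collections.nCopies(size, 1));

        // Same pipeline shape as the subscribeOn benchmark rows above.
        boxedRange.subscribeOn(Schedulers.computation()).toBlocking().last();
        constant.subscribeOn(Schedulers.computation()).toBlocking().last();
    }
}
```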
Since it conflicts with #2772 anyway, this PR is meant to let others verify that the optimizations actually work on other OSes, because on my Windows machine I sometimes see significant variance in throughput across iterations. Increased iteration time may be required as well.

@akarnokd akarnokd added this to the 1.1 milestone Mar 5, 2015
benjchristensen added a commit that referenced this pull request Mar 6, 2015
ObserveOn throughput enhancements
@benjchristensen benjchristensen merged commit ecbd27d into ReactiveX:1.x Mar 6, 2015
@benjchristensen benjchristensen mentioned this pull request Mar 6, 2015
@akarnokd akarnokd deleted the Perf0225 branch March 11, 2015 13:50