
Show that when benchmarking properly there's no difference #1

Merged · 6 commits · Apr 9, 2015

Conversation

@domenic (Contributor) commented Apr 9, 2015

I was trying to see what micro-optimizations I could make (first few commits). The results were really inconsistent as I reloaded my browser. I added the ratio statistic (fourth commit) to confirm this. It indeed jumped between 0.8x and 100x pretty much at random. So I thought it was time to bring out the big guns and actually use a benchmarking library to get some better statistics.

The fifth commit introduces benchmark.js, and also introduces a check on the .value property (which should avoid JITs optimizing away most of the loop). My results:

Firefox index.html
"sync x 234 ops/sec ±4.25% (11 runs sampled)" test.js:14:2
"promise x 4,217 ops/sec ±19.94% (57 runs sampled)"

Firefox bluebird.html
"sync x 230 ops/sec ±4.00% (11 runs sampled)" test.js:14:2
"promise x 245 ops/sec ±2.73% (79 runs sampled)"

Chrome index.html
sync x 243 ops/sec ±0.76% (8 runs sampled)
test.js:14 promise x 228 ops/sec ±0.70% (33 runs sampled)

Chrome bluebird.html
sync x 242 ops/sec ±0.46% (9 runs sampled)
test.js:14 promise x 243 ops/sec ±0.50% (20 runs sampled)

So if anything promises are faster.

Let me know if you see anything obviously stupid... /cc @petkaantonov
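The `.value` check described above can be sketched as follows. This is an illustrative reconstruction, not the repo's actual code; `makeSyncStream`, `drain`, and `CHUNKS` are hypothetical names. The point is that the benchmark body produces a value the harness actually consumes, so the JIT cannot prove the read loop is dead code and eliminate it.

```javascript
// Hypothetical sketch of a sync "stream" whose reads must be observed.
const CHUNKS = 10;

function makeSyncStream(n) {
  let i = 0;
  return {
    read() {
      // Each read returns a chunk until the stream is exhausted.
      return i < n ? { done: false, value: i++ }
                   : { done: true, value: undefined };
    }
  };
}

function drain(stream) {
  let sum = 0;
  for (;;) {
    const { done, value } = stream.read();
    if (done) break;
    // Using the value keeps the loop observable to the optimizer.
    sum += value;
  }
  return sum;
}

const result = drain(makeSyncStream(CHUNKS));
// Consuming the result prevents the whole benchmark body from being
// optimized away. 0 + 1 + ... + 9 = 45.
if (result !== 45) throw new Error("unexpected value: " + result);
console.log(result);
```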

Commits:
- Delta: minor (0.01-ish in all configs)
- Delta: -0.2 in Firefox for both Bluebird and native
- It fluctuates wildly, which is not a good sign for the reliability of this benchmark
- This is not really making a difference
- With proper benchmarking there's no difference between the two.
@domenic (Contributor, Author) commented Apr 9, 2015

Note that I think the reason benchmark.js is helpful here is because IIUC it runs until the results start settling down. So the fact that we have e.g. 57 runs sampled vs. 11 runs sampled just means that the JIT needed a few more runs to optimize the promise code than it did the sync code.

@petkaantonov commented:
Why is the sync test async? How much does the result change when you change it to sync?

@domenic (Contributor, Author) commented Apr 9, 2015

I made the test async to put each of them on an equal footing. This is supposed to be a test of how long it takes to do 10 reads in each paradigm, not how many 10-reads you can do in a second. Using different benchmark settings for each variant would give an unfair advantage to the sync one: if benchmark.js uses setTimeout(..., 0) in its deferred implementation, for example, that would obviously dominate the results. (Maybe that is what we are seeing here, given that they are so close...)
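The equal-footing idea can be sketched without benchmark.js (`timeDeferred`, `syncTenReads`, and `promiseTenReads` below are hypothetical names, not the repo's code): both variants report completion through the same async callback, so whatever deferral overhead the harness adds is paid by both sides.

```javascript
// Time a function that signals completion via a callback.
function timeDeferred(run) {
  return new Promise(resolve => {
    const start = process.hrtime.bigint();
    run(() => {
      const ns = process.hrtime.bigint() - start;
      resolve(Number(ns) / 1e6); // milliseconds
    });
  });
}

// The sync variant still signals completion asynchronously,
// matching the promise variant's deferral cost.
function syncTenReads(done) {
  let sum = 0;
  for (let i = 0; i < 10; i++) sum += i;
  setImmediate(done);
}

function promiseTenReads(done) {
  let p = Promise.resolve(0);
  for (let i = 1; i < 10; i++) p = p.then(sum => sum + i);
  p.then(() => done());
}

(async () => {
  console.log("sync   :", await timeDeferred(syncTenReads), "ms");
  console.log("promise:", await timeDeferred(promiseTenReads), "ms");
})();
```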

@domenic (Contributor, Author) commented Apr 9, 2015

If we are suspicious of benchmark.js we can try evolving https://github.com/domenic/streams-promise-read/tree/no-benchmarkjs. It currently shows a ~3x slowdown for promises in Chrome, ~18x in Firefox, but I still suspect we haven't properly warmed up the JIT.

@petkaantonov commented:
Is it possible to benchmark inside a fresh web worker? That would be similar to how the node and bluebird benchmarks do it: fresh process -> warmup -> do benchmark -> shutdown -> rinse and repeat for the next benchmark.

@wanderview (Owner) commented:
I haven't looked at the code yet, but this promise result looks completely bogus to me:

Firefox index.html
"sync x 234 ops/sec ±4.25% (11 runs sampled)" test.js:14:2
"promise x 4,217 ops/sec ±19.94% (57 runs sampled)"

@wanderview (Owner) commented:

I think the async callback at the end of the sync test is dominating the results with these low numbers of chunks. If you increase the number of chunks to 1000+ you can see the sync case pulling away from promises.

1000 chunks:
"sync x 146 ops/sec ±16.29% (36 runs sampled)" test.js:14:3
"promise x 96.67 ops/sec ±4.96% (68 runs sampled)" test.js:14:3

10000 chunks:
"sync x 40.54 ops/sec ±21.98% (32 runs sampled)" test.js:14:3
"promise x 21.55 ops/sec ±5.12% (53 runs sampled)" test.js:14:3

I guess we can debate the reality of being able to read that many chunks synchronously. (Or wisdom of doing so given frame deadlines, etc.)

I do want to re-run these tests on some mobile devices, though.

@wanderview (Owner) commented:

The results in the previous comment were for bluebird. For default Firefox promises... they are much worse:

1000 chunks:
"sync x 119 ops/sec ±27.87% (23 runs sampled)" test.js:14:3
"promise x 26.71 ops/sec ±2.97% (60 runs sampled)" test.js:14:3

10000 chunks:
"sync x 35.56 ops/sec ±20.20% (21 runs sampled)" test.js:14:3
"promise x 2.80 ops/sec ±8.60% (17 runs sampled)" test.js:14:3

wanderview pushed a commit that referenced this pull request Apr 9, 2015
Merge @domenic's changes to use benchmark.js, avoid the sync loop from being optimized out, etc.
@wanderview wanderview merged commit 923497d into wanderview:gh-pages Apr 9, 2015
@wanderview (Owner) commented:
FYI, I opened a bug for firefox's poor performance here: https://bugzilla.mozilla.org/show_bug.cgi?id=1152875

@wanderview (Owner) commented:
> I haven't looked at the code yet, but this promise result looks completely bogus to me:
>
> Firefox index.html
> "sync x 234 ops/sec ±4.25% (11 runs sampled)" test.js:14:2
> "promise x 4,217 ops/sec ±19.94% (57 runs sampled)"

I can reproduce this in FF release, but I can't explain the insane number of results we get with the promises here.
