feat(stream): make Readable.from performance better #37609

wwwzbwcom · 2021-03-05T11:56:41Z

prevent unnecessary await
convert recursive to loop

benjamingr · 2021-03-05T12:20:47Z

Thanks. I'm not sure I entirely understand this change and the for loop, so I will comment inline. Also cc @nodejs/streams

benjamingr · 2021-03-05T12:22:33Z

lib/internal/streams/from.js

+        if (isAysnc) {
+          r = await iterator.next();
+        } else {
+          r = iterator.next();


The change here is that iterator.next() with a sync iterator is not awaited and if there are multiple next calls it is pushed directly without waiting for the next next?

If the iterator is not async, it's not necessary to await it.

lib/internal/streams/from.js

ronag · 2021-03-05T12:29:43Z

Any benchmark?

ronag · 2021-03-05T12:30:26Z

I think it would be more readable to split this into two separate methods, i.e. syncNext and asyncNext.
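
A minimal sketch of what such a split could look like, assuming a hypothetical `push` callback that stands in for `readable.push()` and returns `false` on backpressure. The names `syncNext`, `asyncNext`, and `drain` are illustrative only, and this is not the code that landed in lib/internal/streams/from.js. It also ignores the case where a sync iterator yields promises.

```javascript
'use strict';

// Hypothetical helper: pull from a sync iterator without awaiting.
// `push` returns false to signal backpressure, like readable.push().
function syncNext(iterator, push) {
  for (;;) {
    const { value, done } = iterator.next();
    if (done) return true;          // iterator exhausted
    if (!push(value)) return false; // consumer asked us to pause
  }
}

// Hypothetical helper: the same loop, but each step is awaited.
async function asyncNext(iterator, push) {
  for (;;) {
    const { value, done } = await iterator.next();
    if (done) return true;
    if (!push(value)) return false;
  }
}

// Pick the variant once, based on which protocol the iterable implements.
function drain(iterable, push) {
  if (typeof iterable[Symbol.asyncIterator] === 'function') {
    return asyncNext(iterable[Symbol.asyncIterator](), push);
  }
  return syncNext(iterable[Symbol.iterator](), push);
}

// Usage: collect chunks from a plain array (sync path, no await involved).
const chunks = [];
drain([1, 2, 3], (chunk) => { chunks.push(chunk); return true; });
console.log(chunks);
```

With a sync iterable, `drain` returns a boolean directly; with an async iterable it returns a promise for that boolean, so a caller would need to handle both.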

benjamingr · 2021-03-05T12:44:21Z

@ronag pun intended 😅 ?

lib/internal/streams/from.js

wwwzbwcom · 2021-03-08T03:08:04Z

Thanks for all the help.
I only opened this as a draft PR to save my work early; I haven't tested or checked it carefully yet, sorry about that.

wwwzbwcom · 2021-03-08T05:14:57Z

Any benchmark?

@ronag

Here is a simple benchmark showing that avoiding an unnecessary await can make the loop more than 5x faster:

Code:

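'use strict';

async function bench() {
  // Baseline: await every value, even though none of them is a promise.
  console.time('await');
  for (let i = 0; i < 1e8; i++) {
    let a = await i;
  }
  console.timeEnd('await');

  // Lower bound: plain assignment with no await at all.
  console.time('no await');
  for (let i = 0; i < 1e8; i++) {
    let a = i;
  }
  console.timeEnd('no await');

  // Await only when the value is a native promise.
  // x === Promise.resolve(x) holds only for native promises,
  // so the await branch is never taken for these numbers.
  console.time('check await');
  for (let i = 0; i < 1e8; i++) {
    if (i === Promise.resolve(i)) {
      let a = await i;
    } else {
      let a = i;
    }
  }
  console.timeEnd('check await');
}

bench();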

Result:

await: 6.409s
no await: 77.315ms
check await: 1.266s

Also, a loop generally gives better time and space performance than recursion.

lib/internal/streams/from.js

ronag

LGTM if @benjamingr also approves.

lib/internal/streams/from.js

benjamingr

I am generally fine with this - thanks :)

The PromiseResolve bit probably needs to be amended to work with third-party thenables.
Since this is a performance PR, I want to see a benchmark so we know this actually improves performance.
This definitely needs a CITGM run IMO, since it changes the timings of Readable.from in non-trivial ways.

Once there is a benchmark (let me know), I'll start a CITGM run. If you want help setting the benchmark up, feel free to ask any questions or ask for assistance here :)
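
To illustrate the timing concern in general terms: `await` on a plain value still suspends the function until a later microtask, so removing it changes when the following code runs. The sketch below is standalone and does not use Readable.from; the function names are made up for the example.

```javascript
'use strict';

// Records the order in which steps run.
const order = [];

async function withAwait() {
  order.push('withAwait: before');
  await 1; // not a promise, but the function still resumes in a later microtask
  order.push('withAwait: after');
}

function withoutAwait() {
  order.push('withoutAwait: before');
  order.push('withoutAwait: after');
}

withAwait();
withoutAwait();
order.push('sync code done');

// Let the pending microtask run, then print the recorded order.
queueMicrotask(() => console.log(order));
```

Here 'withAwait: after' is recorded last, after all the synchronous code has finished, while both steps of `withoutAwait` run back to back.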

wwwzbwcom · 2021-03-08T09:23:41Z

@benjamingr Thanks for your advice

I think I know how to write a benchmark, but I don't know how to make the benchmark compare the current code against the earlier version.

Should I benchmark the earlier version manually and post the result, or is there a better way?

ronag · 2021-03-08T09:34:28Z

The PromiseResolve bit probably needs to be amended to work with third-party thenables

What does await do in this case? I think there should be a spec for this somewhere? Do we have/need a util.isPromise/util.isThenable helper?

Linkgoron · 2021-03-08T09:49:02Z

There's a guide on how to create a benchmark here:
https://github.com/nodejs/node/blob/master/doc/guides/writing-and-running-benchmarks.md#basics-of-a-benchmark

How to compare different node versions here:
https://github.com/nodejs/node/blob/master/doc/guides/writing-and-running-benchmarks.md#comparing-nodejs-versions

There are also some Readable benchmarks in benchmark/streams which might be relevant
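
For a quick local check before setting up the real benchmark, a standalone timing harness along these lines may be enough. This is only a sketch: `timeReadableFrom` is a hypothetical name, and Node's own benchmarks are written against benchmark/common.js as described in the guide above, not like this.

```javascript
'use strict';

const { Readable } = require('stream');

// Hypothetical standalone harness: times how long Readable.from takes
// to deliver n chunks from a sync generator.
function timeReadableFrom(n) {
  function* source() {
    for (let i = 0; i < n; i++) yield `${i}`;
  }
  return new Promise((resolve, reject) => {
    let count = 0;
    const start = process.hrtime.bigint();
    Readable.from(source())
      .on('data', () => { count++; })
      .on('error', reject)
      .on('end', () => {
        const ms = Number(process.hrtime.bigint() - start) / 1e6;
        resolve({ count, ms });
      });
  });
}

timeReadableFrom(1e5).then(({ count, ms }) => {
  console.log(`${count} chunks in ${ms.toFixed(1)} ms`);
});
```

A single run like this is noisy, which is one reason the compare.js workflow with repeated runs and a statistical test is preferable for a PR.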

benjamingr · 2021-03-08T11:44:07Z

@ronag

What does await do in this case? I think there should be a spec for this somewhere? Do we have/need a util.isPromise/util.isThenable helper?

await implicitly calls Promise.resolve in this case (with the native promise resolve), but the explicit goal of this PR is to optimize that await away in certain cases. While I am usually not a fan of a util.isThenable, here it seems appropriate.
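
A small sketch of the difference being discussed, assuming a hypothetical `isThenable` helper (no such helper is confirmed to exist in Node's util module): an identity check against `Promise.resolve` recognizes only native promises, while a `then` check also catches third-party thenables.

```javascript
'use strict';

// Hypothetical helper, along the lines of the util.isThenable idea above.
function isThenable(value) {
  return value !== null &&
    (typeof value === 'object' || typeof value === 'function') &&
    typeof value.then === 'function';
}

const nativePromise = Promise.resolve(1);
// A minimal third-party-style thenable (not a native Promise).
const thenable = { then(onFulfilled) { onFulfilled(2); } };

// Promise.resolve returns a native promise unchanged, so identity works for it...
console.log(nativePromise === Promise.resolve(nativePromise)); // true
// ...but it wraps any other thenable in a new promise, so identity misses it.
console.log(thenable === Promise.resolve(thenable)); // false

console.log(isThenable(nativePromise), isThenable(thenable), isThenable(3));
// true true false

// `await` does unwrap the thenable, so skipping the await would change the result.
(async () => console.log(await thenable))(); // 2
```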

Co-authored-by: Robert Nagy <ronagy@icloud.com>

wwwzbwcom · 2021-03-12T10:25:56Z

@benjamingr Hello, I added the benchmark. It looks like there is a 30% improvement; here's the result:

# Run benchmark
node benchmark/compare.js --old ./node-master --new ./node-dev --filter "readable-from" streams > compare-pr-dev.csv
cat compare-pr-dev.csv | Rscript benchmark/compare.R

# Result
Warning message:
In as.POSIXlt.POSIXct(Sys.time()) :
  unknown timezone 'default/Asia/Shanghai'
                                   confidence improvement accuracy (*)   (**)  (***)
streams/readable-from.js n=10000000       ***     30.15 %       ±4.32% ±5.76% ±7.52%

Be aware that when doing many comparisons the risk of a false-positive
result increases. In this case, there are 1 comparisons, you can thus
expect the following amount of false-positive results:
  0.05 false positives, when considering a   5% risk acceptance (*, **, ***),
  0.01 false positives, when considering a   1% risk acceptance (**, ***),
  0.00 false positives, when considering a 0.1% risk acceptance (***)

benjamingr · 2021-03-12T21:51:00Z

One last comment

wwwzbwcom · 2021-03-14T02:30:47Z

One last comment

@benjamingr Where is the comment? I can't find it.

bricss · 2021-03-17T14:56:47Z

@wwwzbwcom here #37609 (comment)

mcollina

lgtm

benchmark/streams/readable-from.js

wwwzbwcom · 2021-03-18T07:11:07Z

@benjamingr the check for thenables has been fixed ;)

nodejs-github-bot · 2021-03-18T14:15:18Z

CI: https://ci.nodejs.org/job/node-test-pull-request/36777/

benjamingr · 2021-03-18T14:15:48Z

CITGM: https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/2647/

wwwzbwcom · 2021-03-18T17:37:40Z

CITGM: https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/2647/

I can't find the relation between the failing checks and my changes. Can anyone help me figure it out? Thanks :)

nodejs-github-bot · 2021-03-19T20:53:44Z

CI: https://ci.nodejs.org/job/node-test-pull-request/36825/

bricss · 2021-03-21T23:31:26Z

@wwwzbwcom it looks like the CITGM pipeline has been red for months already, so it might be nothing to worry about 😬

benjamingr · 2021-03-22T09:21:08Z

Thank you for your contribution and patience. Let's hope this doesn't break too much :)

Please note the naming convention for the future :) (Which is subsystem: change and not semantic commits)

Landed in 2c251ff 🎉

PR-URL: #37609
Reviewed-By: Robert Nagy <ronagy@icloud.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>

Yakima-Teng · 2021-08-26T14:29:02Z

benchmark/streams/readable-from.js

+  bench.start();
+  s.on('data', (data) => {
+    // eslint-disable-next-line no-unused-expressions
+    data;


Why is "data" here?

nodejs-github-bot added the needs-ci label (PRs that need a full CI run) Mar 5, 2021

benjamingr added the stream label (Issues and PRs related to the stream subsystem) Mar 5, 2021

benjamingr reviewed Mar 5, 2021

ronag reviewed Mar 5, 2021
