Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

child_process: only stop readable side of stream passed to process #27373

Closed
wants to merge 4 commits into from

Conversation

addaleax
Copy link
Member

Closing the underlying resource completely has the unwanted side effect
that the stream can no longer be used at all, including passing it
to other child processes.

What we want to avoid is accidentally reading from the stream;
accordingly, it should be sufficient to stop its readable side
manually, and otherwise leave the underlying resource intact.

Fixes: #27097
Refs: #21209

I’d love to get reviews from @gireeshpunathil, @arcanis, @elibarzilay and maybe @mcollina (esp. regarding the pause() vs push(null) choice?).

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines

@nodejs-github-bot

This comment has been minimized.

@nodejs-github-bot nodejs-github-bot added the child_process Issues and PRs related to the child_process subsystem. label Apr 23, 2019
@addaleax addaleax added the stream Issues and PRs related to the stream subsystem. label Apr 23, 2019
stream._stdio.emit('close');
}
stream.handle.readStop();
stream._stdio.pause();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of this, we could also use stream._stdio.push(null) and disable reading permanently, instead of until it’s explicitly started again. Any thoughts on that? Generally, the use case of “let another process read from this pipe, once that process is done we’ll continue reading ourselves” is not absurd or anything – shell scripts do that a lot.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I never saw it myself but I can understand why just pausing the stream might be useful for piping logs or similar if the process crashes. If there's no technical drawback it sounds strictly better in term of enabled use cases, compared to closing it for good.

A few questions to better understand how it works:

  • What does readStop do, particularly on writable streams? Is it just a noop?
  • Maybe related, what does pause exactly do? Keep the fd open but stop polling it for data for readable streams, and buffer what's written into it for writable streams?
  • Since you call both pause and readStop, how is it meant to be resumed? Is it just resume(), or is there another function that should be called to counteract readStop?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I never saw it myself but I can understand why just pausing the stream might be useful for piping logs or similar if the process crashes. If there's no technical drawback it sounds strictly better in term of enabled use cases, compared to closing it for good.

@arcanis Okay, thanks! I don’t think there are any technical drawbacks to this.

A few questions to better understand how it works:

  • What does readStop do, particularly on writable streams? Is it just a noop?

It stops libuv from listening for readable events on the fd until readStart() is called again. In partiuclar, no read() system call will occur during that time.

  • Maybe related, what does pause exactly do? Keep the fd open but stop polling it for data for readable streams, and buffer what's written into it for writable streams?

pause() makes the stream stop emitting 'data' events. According to the documentation, it does not stop emitting readable events, but I guess in that case it’s up to the user to take care of not calling .read() anyway.

  • Since you call both pause and readStop, how is it meant to be resumed? Is it just resume(), or is there another function that should be called to counteract readStop?

Thanks for pointing this out – yes, one thing more is necessary here: Adding stream.handle.reading = false;, so that ._read() will call readStart() again:

node/lib/net.js

Line 522 in eac8f50

} else if (!this._handle.reading) {
, as well as resetting the stream state so that _read() actually gets called once we resume the stream. (None of these things are signs of a great streams implementation… we should probably get rid of handle.reading altogether, but that’s not really a concern for this PR.)

(Both calling .resume() explicitly and using .on('data') to start reading again should work.)

I’ve added those things, and also added a test file for this scenario.

Copy link
Contributor

@arcanis arcanis Apr 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a lot of flags, eheh! 😄

Btw, why isn't readStop enough? If the libuv doesn't listen for readable events anymore, I guess that the data events won't be triggered, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arcanis The issue that I’m seeing is, that at the time that this code is executed, there might be an existing 'data' listener or similar that reads data from the stream. When the readable stream buffer drains, the streams implementation calls _read(), and for sockets that leads to a handle.readStart() call, undoing the effects of the handle.readStop() call here.

I mean, sure, it’s somewhat questionable to just ignore existing listeners that read data from the stream, but at this point I feel like that’s a reasonable solution?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, that makes sense, thanks for the explanation!

@addaleax addaleax changed the title child_process: only end readable side of stream passed to process child_process: only stop readable side of stream passed to process Apr 23, 2019
Closing the underlying resource completely has the unwanted side effect
that the stream can no longer be used at all, including passing it
to other child processes.

What we want to avoid is accidentally reading from the stream;
accordingly, it should be sufficient to stop its readable side
manually, and otherwise leave the underlying resource intact.

Fixes: nodejs#27097
Refs: nodejs#21209
@nodejs-github-bot

This comment has been minimized.

@nodejs-github-bot

This comment has been minimized.

@nodejs-github-bot
Copy link
Collaborator

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nodejs-github-bot
Copy link
Collaborator

@sam-github
Copy link
Contributor

Recall that if uv isn't epolling a fd to read it, then it won't detect when it closes, either. I didn't look closely, but from the commentary, this PR might have the side effect of causing the 'close' event not to be emitted. I'm not saying that's a problem, but if it is worrying, maybe something to look into.

@gireeshpunathil
Copy link
Member

@sam-github - is your comment an objection or a request for change?

Does anyone have objections on this one landing? It has necessary approvals, and if it lands today by noon UTC, it goes into v12.1.0 and helps a field reported issue, but checking to make sure we are not rushing it through, or there are no concerns etc.

@gireeshpunathil gireeshpunathil added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Apr 29, 2019
@gireeshpunathil
Copy link
Member

the only failure known_issues/test-vm-timeout-escape-nexttick is known issue; so landing..

@sam-github
Copy link
Contributor

It's a suggestion that you look carefully at close events, which may no longer get emitted if you stop reading. Feel free to disregard, I haven't had time to look at this.

@gireeshpunathil
Copy link
Member

landed as cc7b3fb

gireeshpunathil pushed a commit that referenced this pull request Apr 29, 2019
Closing the underlying resource completely has the unwanted side effect
that the stream can no longer be used at all, including passing it
to other child processes.

What we want to avoid is accidentally reading from the stream;
accordingly, it should be sufficient to stop its readable side
manually, and otherwise leave the underlying resource intact.

Fixes: #27097
Refs: #21209

PR-URL: #27373
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
targos pushed a commit that referenced this pull request Apr 29, 2019
Closing the underlying resource completely has the unwanted side effect
that the stream can no longer be used at all, including passing it
to other child processes.

What we want to avoid is accidentally reading from the stream;
accordingly, it should be sufficient to stop its readable side
manually, and otherwise leave the underlying resource intact.

Fixes: #27097
Refs: #21209

PR-URL: #27373
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
@addaleax addaleax deleted the spawn-pipe branch April 29, 2019 22:21
@targos targos mentioned this pull request May 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
author ready PRs that have at least one approval, no pending requests for changes, and a CI started. child_process Issues and PRs related to the child_process subsystem. stream Issues and PRs related to the stream subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Piping w/ spawn is broken in Node 11 / 12
6 participants