-
Notifications
You must be signed in to change notification settings - Fork 30.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Piping w/ spawn is broken in Node 11 / 12 #27097
Comments
@arcanis - thanks for reporting this, I will have a look |
This comment has been minimized.
This comment has been minimized.
Git bisect between 197efb7f846e14bdb3002f5ff7528d4070880328 is the first bad commit
commit 197efb7f846e14bdb3002f5ff7528d4070880328
Author: Gireesh Punathil <gpunathi@in.ibm.com>
Date: Fri Jun 8 08:33:37 2018 -0400
child_process: close pipe ends that are re-piped
when t0 and t1 are spawned with t0's outputstream [1, 2] is piped into
t1's input, a new pipe is created which uses a copy of the t0's fd.
This leaves the original copy in Node parent, unattended. Net result is
that when t0 produces data, it gets bifurcated into both the copies
Detect the passed handle to be of 'wrap' type and close after the
native spawn invocation by which time piping would have been over.
Fixes: https://github.com/nodejs/node/issues/9413
Fixes: https://github.com/nodejs/node/issues/18016
PR-URL: https://github.com/nodejs/node/pull/21209
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
-bash-4.2$ git bisect log
git bisect start
# good: [8e24451439a9abbc1cd99b42066eec65473e781c] 2019-01-30, Version 11.9.0 (Current)
git bisect good 8e24451439a9abbc1cd99b42066eec65473e781c
# bad: [6e56771f2a9707ddf769358a4338224296a6b5fe] 2018-02-14, Version 11.10.0 (Current)
git bisect bad 6e56771f2a9707ddf769358a4338224296a6b5fe
# bad: [2b1f88185fca2bc73ef8f6d09422626e9e237636] benchmark: remove unreachable return
git bisect bad 2b1f88185fca2bc73ef8f6d09422626e9e237636
# good: [e28d891788069732d6e37a00459812bee02910fc] src: fix race condition in `~NodeTraceBuffer`
git bisect good e28d891788069732d6e37a00459812bee02910fc
# bad: [353de0f7520bce50b490f367048ee1e58894a784] doc: fix err_synthetic issue on v11.x
git bisect bad 353de0f7520bce50b490f367048ee1e58894a784
# good: [b5a8376ffe2f380103ebe7f7314d5c7a12500084] src: organize TLSWrap declarations by parent
git bisect good b5a8376ffe2f380103ebe7f7314d5c7a12500084
# good: [1c6fadea3135e5f95f3042cf06cd71de49a33e8f] meta: clarify EoL platform support
git bisect good 1c6fadea3135e5f95f3042cf06cd71de49a33e8f
# good: [2c737a89d59a0c3fa471d36031c64550607fc367] repl: remove obsolete buffer clearing
git bisect good 2c737a89d59a0c3fa471d36031c64550607fc367
# bad: [197efb7f846e14bdb3002f5ff7528d4070880328] child_process: close pipe ends that are re-piped
git bisect bad 197efb7f846e14bdb3002f5ff7528d4070880328
# good: [c866b52942b326674e9ab2adf52467d5f27cdf53] test: remove obsolete code
git bisect good c866b52942b326674e9ab2adf52467d5f27cdf53
# first bad commit: [197efb7f846e14bdb3002f5ff7528d4070880328] child_process: close pipe ends that are re-piped
-bash-4.2$ |
thanks @richardlau - I dont know what mistake I was doing yesterday, but now I confirm that the behavior difference is caused by 197efb7 will analyze the test and and the failure to see what this means. |
ok, here is a deadlock situation: $ cat true.js const {spawn} = require(`child_process`);
const p3 = spawn('cat', {stdio: ['pipe', process.stdout, process.stderr]});
const p1 = spawn('./foo.sh', {stdio: ['pipe', p3.stdin, process.stderr]});
const p2 = spawn('./bar.sh', {stdio: ['pipe', p3.stdin, process.stderr]}); $ cat false.js const { spawn } = require("child_process");
let p3 = spawn('wc', ['-l'], {stdio: ['pipe', process.stdout, process.stderr]})
let p2 = spawn('grep', ['spawn'], {stdio: ['pipe', p3.stdin, process.stderr]})
let p1 = spawn('cat', [__filename], {stdio: ['pipe', p2.stdin, process.stderr]}) without 197efb7 true.js passes while false.js fails the rationale for 197efb7 is that pipe ends when re-piped were loosing data at either of the destination becaue of the bifurcation, making re-piping a meaningless use case. At the same time, looks like that is breaking the accumulation scenario. I am not sure what is wrong with 197efb7 . The piping is good, as the data is correctly delivered (in true.js) but looks like something went wrong that led to the error. pinging @nodejs/child_process to get an advice, while investigating further. |
Have you checked my comment regarding |
yes, that is exactly if your question is whether the said commit has considered this use case, no, this was not considered. |
Ah indeed, I didn't check correctly!
Yep no worry I just wanted to be sure it was accounted now 😊 Given that, do you think there's a way to make both the original case and this new one work? Or should 197efb7 be reverted? If I understand correctly, the original issue (#18016) referenced in option 3 what seemed to be a satisfying behavior: require the pipes to be manually closed by the user. Any improvement can then be implemented in userland via tools like |
@arcanis, are you saying that the problem in this comment should be considered satisfying? Which pipes should have been closed to make it work? |
I believe the problem in your post is due to an incorrect execution order (I mentioned it in my comment from the other thread but the exact ramifications didn't occur to me until I got to experience the bug first hand). Here is your example with two alternative fixes (I've put comments to detail my thoughts for each of them): "use strict";
switch (process.argv[2]) {
default: {
console.log("Usage: ./x.js <0|1>");
} break;
case "0": {
// The case that doesn't work; the pipes are left open in the current process, so some
// data can be lost between the time "p1" spawns and the time "p2" spawns
const { spawn } = require("child_process");
let p1 = spawn("./b", ["10000"], {stdio: [process.stdin, "pipe", process.stderr]});
let p2 = spawn("grep", ["7"], {stdio: [p1.stdio[1], "pipe", process.stdout]});
let p3 = spawn("wc", ["-l"], {stdio: [p2.stdio[1], process.stdout, process.stdout]});
} break;
case "1": {
// This is the same thing, except that we now close the pipes on our hand after spawning
// the processes. It seems that calling "p.stdout.end()" doesn't work for some reason
// (it throws a ENOTCONN exception), so we need to close the handle itself (basically
// manually do what 197efb7 automatically does). That makes me think that we're not
// supposed to do this.
const { spawn } = require("child_process");
let p1 = spawn("./b", ["10000"], {stdio: [process.stdin, "pipe", process.stderr]});
let p2 = spawn("grep", ["7"], {stdio: [p1.stdio[1], "pipe", process.stdout]});
p1.stdout._handle.close();
let p3 = spawn("wc", ["-l"], {stdio: [p2.stdio[1], process.stdout, process.stdout]});
p2.stdout._handle.close();
} break;
case "2": {
// This is imo the best solution. Instead of spawning the data producers first, we start
// by spawning the consumers first (right-to-left). After we spawned the process, we
// close our side of the pipe so that it becomes clear that no data is to be expected
// from us. Curiously this time we can use "p.stdin.end()", even though the previous
// example showed that we couldn't call "p.stdout.end()".
//
// This approach also makes it possible to configure multiple subprocesses to write in
// the same pipe, something which isn't possible with left-to-right execution (since you
// don't know what pipe the left process should write into until after you've spawn the
// right process).
const { spawn } = require("child_process");
let p3 = spawn("wc", ["-l"], {stdio: ["pipe", process.stdout, process.stdout]});
let p2 = spawn("grep", ["7"], {stdio: ["pipe", p3.stdin, process.stdout]});
p3.stdin.end();
let p1 = spawn("./b", ["10000"], {stdio: [process.stdin, p2.stdin, process.stderr]});
p2.stdin.end();
} break;
} |
Thanks for the code, which I'll use if the commit is reverted, but I First, as I also replied in the other thread, the order should not Now, looking at your two solutions I get a vague feeling that the One more comment is that things are still broken (ping let p3 = spawn("wc.exe", ["-l"], {stdio: ["pipe", process.stdout, process.stdout]});
let p2 = spawn("grep.exe", ["7"], {stdio: ["pipe", p3.stdin, process.stdout]});
// p3.stdin.end();
let p1 = spawn("bash.exe", ["./b", "10000"], {stdio: [process.stdin, p2.stdin, process.stderr]});
// p2.stdin.end(); fails now, and works on linux. It looks like a read is initiated which If my guess is correct, then I see three conclusions:
(Regardless, something should be documented, especially if the commit |
My testsuite pass on Windows; the only different thing I'm doing is that I actually call the p2.on('exit', () => {
p3.stdin.end();
}); When I tried your example without the exit listener it worked on my machine so I didn't look further, but maybe the listener is required. In any case I agree there's a behavior that should be documented, since it's not clear what works by design and what's an undefined behavior. My main concern at the moment is that Node 12 is meant to be cut on April 26, and I hope this BC-breaking bug won't be part of it ... otherwise reverting it would become BC-breaking in its own right 😕 |
@arcanis - thanks for the detailed explanation! agreeing that case 2 is the best among all, there are still few concerns here:
I believe what we are missing here is a specification around piping: specifically answer to these questions:
|
I think it depends on whether the Overall I tend to think that the In my perfect world Node would pretty much expose
I'm not a shell expert (my knowledge mostly come from an old high school project where we had to implement a shell from scratch) so I can't say whether left-to-right execution is used in some of them; what I can say however is that right-to-left is common (and makes more sense to me personally).
I hope so! To give some context about my use case: I'm implementing a portable shell for Yarn 2+. It will allow to use a common shell language (posix-inspired) across all systems, fixing one of the major portability pain points we currently have. But that requires pipes to actually work 😆 |
(Predlude: I have a lot of experience with shells, and I have implemented much of this functionality for a different language together with the core developer, in a way that works on both unix and windows. As a side note, I ran into this writing some code that makes shell-like functionality available in node.) @gireeshpunathil: I agree with your points, and specifically
is a particular problem. Not only is it unintuitive, it's also hard to explain, and I believe that as such it is not a good functionality.
I also agree that too much magic is problematic, and exactly because of that I suggested adding an option that avoids creating a pipe altogether. It's not as low-level as exposing any form of FD duplication (which IMO would be an overkill and an over-complication), but it's low-level enough to ensure that no surprises happen and to allow cross-platform code. Specifically, any explanation that talks about doing stuff (starting processes, closing pipes) in the same tick or not are fundamentally complicated since there will be cases where people won't be able to guarantee that, or will be surprised when things break.
The original problem that surprised me is that I ended with code where two things were competing for data, leading to loss. Any such behavior is extremely dangerous and I strongly believe that it should not be allowed to happen. (I.e., fixing the documentation would be necessary, but leaving the ability for people to shoot their foot is very bad.) Again, I believe that a new mode that ensures that no pipe is created is therefore needed -- an alternative would be some special treatment once a stream is connected to a process which leads to the above problems.
I very strongly believe that a design that requires an order is similarly flawed. I have never heard of any such requirement, and I don't think that there's any "popular" choice. Searching the web brings up nothing, so I resorted to just trying it out:
I tried that using (Again, see my earlier explanations: it is better now to start processes right-to-left to avoid potential data loss, only because of the design, and because it is generally more dangerous than redundant reading. A good design would not have any potential loss anyway, leaving things to the OS which deals fine with blocking a process on either side when needed.)
It was certainly surprising to be (even in the |
@arcanis , @elibarzilay - thanks for the detailed analysis and explanation! I guess all are valid points. At this point I want to restate this - guess we all know, but just restating for check-pointing this discussion:
Hope you all are with me up to this point? Right now the internal pipes are created un-conditionally in a much earlier phase in Pipe creations that consider the option set at the process creation site would solve all the said issues IMO; though not sure how much effort / design changes that would be required. I will study the code around a bit more before coming up with anything concrete; feel free to share opinions! |
@gireeshpunathil: yes, I agree with that, including the point about internal piping needed. The reason I mentioned the other language I worked on is that it is a very similar situation, where you need some piping to interact with the stream from the language. So in addition to plain in-language streams, there are "primitive streams" that can be used as an optimization when you want to spawn to subprocesses and let them talk directly to each other -- and the choice there is to explicitly ask for a pipe whenever you need to interact with the process from the language. It sounds to me that node is doing the pipe regardless, and the problems could be resolved if you always use If this is done well, then there would also be some facility to wrap such values in a way that creates a pipe that makes it possible to talk to the process (I'm using "pipe" in the OS sense here, not |
What's the plan going forward, especially regarding Node 12? |
So no revert is planned, even if this diff clearly breaks existing usages? 😕 I'm all for a better option, but I'm concerned it will take a lot of time. In the meantime the bug will stay, will start to be expected by some users, and changing this will be even more painful than it currently. |
@arcanis - thanks. I haven't made up my mind on reverting the change. As stated earlier, there is one class of issues this fix resolves, while it looks like introduces another class, and these two types being mutually exclusive. As I haven't heard any opinions from @nodejs/child_process , I will add it to the tsc-agenda to get things moving. |
Sounds good - to explain a bit why I'm pushing for a revert, when faced with two equally imperfect options I believe preserving the status quo should prime. |
/ping @nodejs/tsc |
Well, it wasn't April 26 after all ... 😞 https://medium.com/@nodejs/introducing-node-js-12-76c41a1b3f3f |
Given some other parts of core, I think all stdio should never be closed at all. Otherwise @gireeshpunathil @addaleax what do you think? |
I was out of office for a while, will try to come up with something solid by tomorrow. |
@mcollina This issue is about handling child process stdio streams in the parent process, so I think that’s an unrelated concern. @arcanis I think the lack of movement here is partially due to the issue being very specific, and the length of the existing discussion here – I have a hard time catching up with everything, tbh, although the issue in itself seems limited in scope. It sounds like the gist of the issue fixed by 197efb7 is that the Node.js parent process might attempt to read data from an end to the same pipe as the child process received, which is obviously bad. It seems like closing the stream, is, however, not a good solution, because that prevents us from letting other child processes also write to that stream. Intuitively, what I’d suggest as an alternative to 197efb7, is to disable reading from the pipe in the parent process either completely or until reading is re-initiated explicitly. But that also seems like it’s a very simple solution, and a simple solution just seems inherently unlikely after 15 screen pages of discussion and multiple previous issues? (I’ll try it out and report back, just in case.) (The error shown in the original issue comment here seems to stem from the fact that the script is trying to shutdown the writable side of an already-closed stream, and that the native method that does that returns @elibarzilay’s suggestion of a second kind of way to represent os-level streams also sounds reasonable in a way, but if that’s what we want to do, it’s also something that we should have done from the beginning and doesn’t seem like a answer to this problem in the sense that everyone, for now, is stuck with the current system. @arcanis Since you sound (rightfully) concerned about the process around this: If you don’t see a real solution to an issue coming up in an overseeable and acceptable time frame, I think you can generally feel free to open a PR with a revert yourself or ask somebody else to, since this seems like a real regression and that’s what we usually do for regressions. The Node.js 12 release shouldn’t be a huge issue when it comes to a revert, we’ve landed reverts of breaking changes in the next-release-after-a-semver-major release before. |
It looks like the “simple” solution, i.e. just stopping the readable side, might just work? I’ve opened #27373 with that. |
Closing the underlying resource completely has the unwanted side effect that the stream can no longer be used at all, including passing it to other child processes. What we want to avoid is accidentally reading from the stream; accordingly, it should be sufficient to stop its readable side manually, and otherwise leave the underlying resource intact. Fixes: nodejs#27097 Refs: nodejs#21209
@addaleax - thanks. Looking back, closing the stream was not the optimal solution, and I agree with your fix as comprehensive, and addressing both the use cases without side effects. |
just wondering about the |
Closing the underlying resource completely has the unwanted side effect that the stream can no longer be used at all, including passing it to other child processes. What we want to avoid is accidentally reading from the stream; accordingly, it should be sufficient to stop its readable side manually, and otherwise leave the underlying resource intact. Fixes: #27097 Refs: #21209 PR-URL: #27373 Reviewed-By: Matteo Collina <matteo.collina@gmail.com> Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
I updated to 12.3.1 from node 10 and my code which detects stdin 'end' event on child process stopped working. is this related? |
@ninja- - can't say for sure, do you have a simple test case that shows up the issue? |
Ref #18016, ping @elibarzilay and @gireeshpunathil
It seems that Node 11's behavior changed compared to Node 10, and pipes are now automatically closed after being used as output stream from a process. I think this is a bug, because it makes it impossible to use the same pipe as output from two different processes (which would be the case if I was to implement
(foo; bar) | cat
- bothfoo
andbar
would write into thecat
process).Additionally, it causes previously working code to "randomly" throw internal exceptions. The random part is likely caused by a race condition, since
perl
throws consistently whilerev
doesn't cause problems. The exception is as such:As you can see the code executed fine (
HELLO WORLD
got printed), but during cleanup an internal assertion failed and Node crashed.The text was updated successfully, but these errors were encountered: