Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stdio regression: process.stdin getter drops pending stdin data #36251

Closed
isaacs opened this issue Nov 24, 2020 · 8 comments
Closed

stdio regression: process.stdin getter drops pending stdin data #36251

isaacs opened this issue Nov 24, 2020 · 8 comments
Labels
process Issues and PRs related to the process subsystem.

Comments

@isaacs
Copy link
Contributor

isaacs commented Nov 24, 2020

  • Version: v15.3.0
  • Platform: macOS Darwin
  • Subsystem: process (or maybe streams?)

What steps will reproduce the bug?

In v15, if process.stdin is referenced, and is not a TTY or Pipe, then its data will be not be seen when passed to a child process. It's almost like it's already consumed or something.

$ node -v
v15.3.0

$ cat shell.js
const { spawn } = require('child_process')
// just reference the object, don't do anything with it
if (process.env.LOAD_STDIN === '1')
  process.stdin
const child = spawn(process.env.SHELL, [], { stdio: 'inherit' })
child.on('close', (code, signal) => console.error({ code, signal }))

$ cat cmd.sh
#!/bin/bash
echo "hello from node's child shell"

$ node shell.js < cmd.sh
hello from node's child shell
{ code: 0, signal: null }

## This is the error:

$ LOAD_STDIN=1 node shell.js < cmd.sh
{ code: 0, signal: null }

## but this works, which is interesting?

$ cat cmd.sh | LOAD_STDIN=1 node shell.js
hello from node's child shell
{ code: 0, signal: null }

$ nave use 14

$ node -v
v14.15.1

$ node shell.js < cmd.sh
hello from node's child shell
{ code: 0, signal: null }

$ LOAD_STDIN=1 node shell.js < cmd.sh
hello from node's child shell
{ code: 0, signal: null }

$ cat cmd.sh | LOAD_STDIN=1 node shell.js
hello from node's child shell
{ code: 0, signal: null }

How often does it reproduce? Is there a required condition?

100% of the time on node 15. Not in any prior version tested (14, 12, and 10).

What is the expected behavior?

Stdin data should not be consumed when process.stdin is merely referenced. This causes problems when a script checks process.stdin.isTTY to print a helpful message prior to opening a shell in a child process.

What do you see instead?

Stdin data is not passed to child process if stdin is referenced, and not a Pipe or TTY.

Additional information

This kind of is a pain for npm exec. It means you can't do something like npx some-interactive-thing < list-of-commands.

@mmomtchev
Copy link
Contributor

@isaacs, there is a relevent discussion here #35997
The creation of the implicit stdio ReadStream is in the process.stdio getter

@gireeshpunathil
Copy link
Member

gireeshpunathil commented Nov 26, 2020

able to recreate with the said version, but not consistently.

  • If I run ~10 times, ~2 runs go fine.
  • If I run it under dtruss, it passes consistently

so I believe there is a race condition happening here.

@mmomtchev
Copy link
Contributor

@gireeshpunathil Yes there is. The ReadingStream starts reading immediately as well as the clone/exec path. This is the race condition. If libuv was able to read from the fd before the clone/exec, that the data is lost.
On Linux, at least on my machine, strace allows you to observe both situations.
@joyeecheung maybe create the ReadStream in stopped state?

@gireeshpunathil
Copy link
Member

@mmomtchev - thanks, I too got a linux box which recreates pretty easily.

passing case:

[pid 661072] execve("/bin/bash", ["/bin/bash"], 0x55dd87501770 /* 22 vars */ <unfinished ...>
...
[pid 661072] read(0, "#!/bin/bash\necho \"hello from nod"..., 49) = 49
[pid 661072] fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0), ...}) = 0
[pid 661072] write(1, "hello from node's child shell\n", 30) = 30
[pid 661072] read(0, "", 49)            = 0
[pid 661072] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
[pid 661072] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid 661072] exit_group(0)              = ?

the process 661072 is exec'ing into bash and reads from 0.

failing case:

489 [pid 661087] execve("/bin/bash", ["/bin/bash"], 0x557edd43a770 /* 22 vars */ <unfinished ...>
...
615 [pid 661088] <... read resumed>"#!/bin/bash\necho \"hello from nod"..., 65536) = 49
..
701 [pid 661087] read(0, "", 49)            = 0
702 [pid 661087] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
703 [pid 661087] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
704 [pid 661087] exit_group(0)              = ?

The process 661087 is exec'ing into bash but does ot get any data, due to the previous read by 661088 - but what it is? (it is cloned from 661080 which is the parent process)

@mmomtchev
Copy link
Contributor

@gireeshpunathil @isaacs I solved it by implementing a manualStart for ReadStream then using it in the process.stdin getter, just a sec to clean my debug printfs and if the unit tests pass, I will submit the PR

@mmomtchev
Copy link
Contributor

@gireeshpunathil All other stdio ReadStream start in paused mode - except file - which didn't have a manualStart option

@mmomtchev
Copy link
Contributor

@isaacs @gireeshpunathil This will be a platform-dependent unit test, exclusive to darwin and linux unless you have an idea?

mmomtchev added a commit to mmomtchev/node that referenced this issue Nov 26, 2020
All stdio ReadStream's use manual start to avoid
consuming data for example when a process
execs/spawns

Refs: nodejs#36251
@mmomtchev
Copy link
Contributor

The unit test is awful, I admit, if you have any ideas, fell free to comment

@PoojaDurgad PoojaDurgad added the process Issues and PRs related to the process subsystem. label Dec 16, 2020
mmomtchev added a commit to mmomtchev/node that referenced this issue Jan 5, 2021
All stdio ReadStream's use manual start to avoid
consuming data for example when a process
execs/spawns

Refs: nodejs#36251
ronag pushed a commit to nxtedition/node that referenced this issue Jan 6, 2021
All stdio ReadStream's use manual start to avoid
consuming data for example when a process
execs/spawns.

Using stream._construct would cause the Readable
to incorrectly greedily start reading.

Refs: nodejs#36251
@ronag ronag closed this as completed in 053abac Jan 10, 2021
danielleadams pushed a commit that referenced this issue Jan 12, 2021
Using stream._construct would cause the Readable
to incorrectly greedily start reading.

Fixes: #36251

PR-URL: #36823
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
process Issues and PRs related to the process subsystem.
Projects
None yet
Development

No branches or pull requests

4 participants