Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fs.closeSync and fs.close crash node process when fd is a random integer #37874

Open
zyscoder opened this issue Mar 23, 2021 · 20 comments
Open
Labels
fs Issues and PRs related to the fs subsystem / file system. libuv Issues and PRs related to the libuv dependency or the uv binding.

Comments

@zyscoder
Copy link

What steps will reproduce the bug?

Setup a node instance,

» node

and run the following javascript code.

fs.closeSync(16)

Then an abort occurs.

How often does it reproduce? Is there a required condition?

This abort can always be triggered following the steps above.

What is the expected behavior?

If any error occurs, an exception or other similar error-reporting stuff should be thrown. There is no reason to abort the whole node process.

What do you see instead?

 » node                                                                                                                                                                                                       134 ↵ zys@zys-X299-UD4-Pro
Welcome to Node.js v14.15.1.
Type ".help" for more information.
> fs.closeSync(16);
undefined
> [1]    125036 abort (core dumped)  node

Additional information

@targos targos added fs Issues and PRs related to the fs subsystem / file system. confirmed-bug Issues with confirmed bugs. labels Mar 23, 2021
@himself65
Copy link
Member

This bug does not appear on Windows. But appears in v14.16.0 and v15.12.0 under Linux.

@targos
Copy link
Member

targos commented Mar 23, 2021

Debug backtrace:

(gdb) bt
#0  0x00007ffff6cf683f in raise () from /lib64/libc.so.6
#1  0x00007ffff6ce0c95 in abort () from /lib64/libc.so.6
#2  0x0000000001f6a8b8 in uv__async_send (loop=0x5dafca0 <default_loop_struct>) at ../../deps/uv/src/unix/async.c:198
#3  0x0000000001f6a469 in uv_async_send (handle=0x5dafd50 <default_loop_struct+176>) at ../../deps/uv/src/unix/async.c:73
#4  0x0000000001f64702 in worker (arg=0x0) at ../../deps/uv/src/threadpool.c:128
#5  0x00007ffff708914a in start_thread () from /lib64/libpthread.so.0
#6  0x00007ffff6dbb7e3 in clone () from /lib64/libc.so.6

It seems to be a bug in libuv.. @nodejs/libuv

@targos targos added the libuv Issues and PRs related to the libuv dependency or the uv binding. label Mar 23, 2021
@targos
Copy link
Member

targos commented Mar 23, 2021

Abort is here:

abort();

@himself65
Copy link
Member

$ gdb -c core node main.js
Excess command line arguments ignored. (main.js)
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from node...
[New LWP 5663]
[New LWP 5664]
[New LWP 5665]
[New LWP 5666]
[New LWP 5667]
[New LWP 5668]
[New LWP 5669]

warning: Unexpected size of section `.reg-xstate/5663' in core file.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `node main.js'.
Program terminated with signal SIGABRT, Aborted.

warning: Unexpected size of section `.reg-xstate/5663' in core file.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7f04bb431780 (LWP 5663))]
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f04bb45b859 in __GI_abort () at abort.c:79
#2  0x0000000000962331 in uv__io_poll (loop=loop@entry=0x446c7c0 <default_loop_struct>, timeout=<optimized out>)
    at ../deps/uv/src/unix/linux-core.c:254
#3  0x000000000137c438 in uv_run (loop=0x446c7c0 <default_loop_struct>, mode=UV_RUN_ONCE)
    at ../deps/uv/src/unix/core.c:385
#4  0x00000000009a4b28 in node::Environment::CleanupHandles() ()
#5  0x00000000009a9dc6 in node::Environment::RunCleanup() ()
#6  0x000000000096d547 in node::FreeEnvironment(node::Environment*) ()
#7  0x0000000000a4483f in node::NodeMainInstance::Run() ()
#8  0x00000000009d1e15 in node::Start(int, char**) ()
#9  0x00007f04bb45d0b3 in __libc_start_main (main=0x967f50 <main>, argc=2, argv=0x7ffe66776b08, init=<optimized out>,
    fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffe66776af8) at ../csu/libc-start.c:308
#10 0x00000000009694cc in _start ()
(gdb)

@targos
Copy link
Member

targos commented Mar 23, 2021

Values of some of the variables:

errno: 9
fd: 16
r: -1

@RaisinTen
Copy link
Contributor

RaisinTen commented Mar 23, 2021

The docs for fs.closeSync says this:

Calling fs.closeSync() on any file descriptor (fd) that is currently in use through any other fs operation may lead to undefined behavior.

I tried to log the file path of the file descriptor:

> fs.readlinkSync('/proc/self/fd/16')
'anon_inode:[eventfd]'

It seems eventfd is in use already by the node process. So, isn't this more of an undefined behaviour than a bug?

@targos
Copy link
Member

targos commented Mar 23, 2021

I don't know what "undefined behavior" means exactly in the Node.js documentation, but we usually treat any kind of non-catchable error as a bug.

@RaisinTen
Copy link
Contributor

PR to ignore EBADF to prevent the abort: libuv/libuv#3138

@jasnell
Copy link
Member

jasnell commented Mar 23, 2021

There have been a few discussions around this (cc @ronag) ... Specifically, we really need to be ref counting fds that have active operations and deferring the close until current operations complete. With closeSync that can get a bit hairy of course, so perhaps with that we just throw and say we can't close while there are pending operations. With async close, we can just defer closing without throwing. That's a significant change in behavior so we need to be careful. We do need the ability to force an as-soon-as-possible close, in which case those pending operations that we can cancel should be, and those that are already in progress would still be allowed to finish. The implementation for this will need to be at the C++ level given the complexities of worker threads and the fact that FileHandles are shareable. I don't think the implementation is that complicated.

@addaleax addaleax added wontfix Issues that will not be fixed. and removed confirmed-bug Issues with confirmed bugs. wontfix Issues that will not be fixed. labels Mar 23, 2021
@addaleax
Copy link
Member

So ... in this specific case, I'm kind of on the fence for just closing it as wontfix -- but I certainly wouldn't consider it a bug. The program is doing something it shouldn't do, which is closing a fd it doesn't own, and as Ben said in the libuv issue -- "it gets what it deserves".

Yes, providing an integer-fd-based API for fs operations was a big API design mistake, but it's not like we could remove that API, and FileHandle is a much better replacement already.

What we could do is to:

  • Enable kTrackUnmanagedFds in the main thread by default, like we do in Workers
  • Throw an error if fs.close() or fs.closeSync() is called on an fd that is not in the set of the fds owned by the current environment, i.e. fds opened through the fs.open()/fs.openSync() methods

I think that would give a reasonable balance between letting users do whatever they think is right, and keeping them from shooting themselves in the foot.

@aduh95 aduh95 changed the title "fs.closeSync" results in an abort fs.closeSync and fs.close crash node process when fd is a random integer Mar 25, 2021
@XadillaX
Copy link
Contributor

So ... in this specific case, I'm kind of on the fence for just closing it as wontfix -- but I certainly wouldn't consider it a bug. The program is doing something it shouldn't do, which is closing a fd it doesn't own, and as Ben said in the libuv issue -- "it gets what it deserves".

Yes, providing an integer-fd-based API for fs operations was a big API design mistake, but it's not like we could remove that API, and FileHandle is a much better replacement already.

What we could do is to:

  • Enable kTrackUnmanagedFds in the main thread by default, like we do in Workers
  • Throw an error if fs.close() or fs.closeSync() is called on an fd that is not in the set of the fds owned by the current environment, i.e. fds opened through the fs.open()/fs.openSync() methods

I think that would give a reasonable balance between letting users do whatever they think is right, and keeping them from shooting themselves in the foot.

I mean we should trace the fd like epoll eventfd / async handle fd, etc. And throw errors when we try to use those fds in Node.js userland manually.

@addaleax
Copy link
Member

@XadillaX As @jasnell mentioned in #38365 (comment), that’s probably not the best approach here (for example, we are not able to track libuv-internal fds easily, and this would also not prevent users from interacting with fds created by other threads).

If we do want to do a more radical approach, we could just make the fd-based fs API a wrapper around FileHandle, i.e. keep a global “fd” → FileHandle map (effectively doing ourselves what the OS does for us right now with fds).

@jasnell
Copy link
Member

jasnell commented Apr 25, 2021

we could just make the fd-based fs API a wrapper around FileHandle

Certainly possible and a good approach but I'd be very concerned about performance regression -- tho if we do anything here there's going to be a performance hit so we can't entirely avoid it.

@XadillaX
Copy link
Contributor

XadillaX commented Apr 25, 2021

@XadillaX As @jasnell mentioned in #38365 (comment), that’s probably not the best approach here (for example, we are not able to track libuv-internal fds easily, and this would also not prevent users from interacting with fds created by other threads).

If we do want to do a more radical approach, we could just make the fd-based fs API a wrapper around FileHandle, i.e. keep a global “fd” → FileHandle map (effectively doing ourselves what the OS does for us right now with fds).

The samplest way is just tracing eventfds. That is enough to protect process.exit() core dump.

e.g. call uv_backend_fd after each uv_loop_init.

@addaleax
Copy link
Member

@XadillaX I would be -1 on that for the reasons mentioned above.

@XadillaX
Copy link
Contributor

@XadillaX I would be -1 on that for the reasons mentioned above.

But what if an online server already used code like:

fs.closeSync(<userInput>);

That's a dangerous code. We may protect them.

@targos
Copy link
Member

targos commented Apr 26, 2021

There is no correct use case where a file descriptor comes from user input, is there?

@jasnell
Copy link
Member

jasnell commented Apr 26, 2021

There is no correct use case where a file descriptor comes from user input, is there?

Not really. Any anyone that allows calling a fs operation on unsanitized and unverified user input is just asking for trouble far beyond simply crashing the process with an abort.

@addaleax
Copy link
Member

@XadillaX Yeah, but partial “protection” instead of a proper solution is just not going to be very useful, because you can always pick another libuv-internal or Node.js-internal or owned-by-another-thread fd and mess with it.

@XadillaX
Copy link
Contributor

XadillaX commented May 5, 2021

@XadillaX Yeah, but partial “protection” instead of a proper solution is just not going to be very useful, because you can always pick another libuv-internal or Node.js-internal or owned-by-another-thread fd and mess with it.

So a new solution is needed. But how do we to deal with old APIs like fs.close?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fs Issues and PRs related to the fs subsystem / file system. libuv Issues and PRs related to the libuv dependency or the uv binding.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants