Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Worker thread throws error when started from --require script #36531

Closed
cspotcode opened this issue Dec 15, 2020 · 12 comments
Closed

Worker thread throws error when started from --require script #36531

cspotcode opened this issue Dec 15, 2020 · 12 comments
Labels
confirmed-bug Issues with confirmed bugs. worker Issues and PRs related to Worker support.

Comments

@cspotcode
Copy link

  • Version: 15.4.0
  • Platform: Linux
  • Subsystem: worker_threads

What steps will reproduce the bug?

Run the following at a bash shell:

touch empty.js
echo "new (require('worker_threads').Worker)('./empty.js')" > create-worker.js
node -r ./create-worker.js ./empty.js

How often does it reproduce? Is there a required condition?

It reproduces every time. It happens when the worker is created by a --require script.

What is the expected behavior?

No errors from the worker_thread.

What do you see instead?

Logs the following:

node:events:353
      throw er; // Unhandled 'error' event
      ^
TypeError [Error]: undefined is not an integer typed array.
    at Atomics.load (<anonymous>)
    at process.cwd (node:internal/main/worker_thread:140:38)
    at MessagePort.<anonymous> (node:internal/main/worker_thread:153:48)
    at MessagePort.[nodejs.internal.kHybridDispatch] (node:internal/event_target:422:41)
    at MessagePort.exports.emitMessage (node:internal/per_context/messageport:18:26)
Emitted 'error' event on process instance at:
    at emitUnhandledRejectionOrErr (node:internal/event_target:602:11)
    at MessagePort.[nodejs.internal.kHybridDispatch] (node:internal/event_target:426:9)
    at MessagePort.exports.emitMessage (node:internal/per_context/messageport:18:26)

Additional information

The line number referenced in the error is here: https://github.com/nodejs/node/blob/master/lib/internal/main/worker_thread.js#L140

Based on a brief look at the code, I guess node is assuming that the worker receives a bootstrapping message with a bunch of values, one of them being cwdCounter. But when created via --require, that message is not sent or has not arrived yet.

@cspotcode cspotcode changed the title Worker throws error when started from --require script Worker thread throws error when started from --require script Dec 15, 2020
@jasnell jasnell added confirmed-bug Issues with confirmed bugs. worker Issues and PRs related to Worker support. labels Dec 15, 2020
@jasnell
Copy link
Member

jasnell commented Dec 15, 2020

/cc @nodejs/workers

@addaleax
Copy link
Member

Just switching the initialization order here should be fine, I think:

diff --git a/lib/internal/main/worker_thread.js b/lib/internal/main/worker_thread.js
index 7734aaa2c201..6c7c4acb3759 100644
--- a/lib/internal/main/worker_thread.js
+++ b/lib/internal/main/worker_thread.js
@@ -120,10 +120,6 @@ port.on('message', (message) => {
     initializeCJSLoader();
     initializeESMLoader();
 
-    const CJSLoader = require('internal/modules/cjs/loader');
-    assert(!CJSLoader.hasLoadedAnyUserCJSModule);
-    loadPreloadModules();
-    initializeFrozenIntrinsics();
     if (argv !== undefined) {
       process.argv = ArrayPrototypeConcat(process.argv, argv);
     }
@@ -146,6 +142,11 @@ port.on('message', (message) => {
     };
     workerIo.sharedCwdCounter = cwdCounter;
 
+    const CJSLoader = require('internal/modules/cjs/loader');
+    assert(!CJSLoader.hasLoadedAnyUserCJSModule);
+    loadPreloadModules();
+    initializeFrozenIntrinsics();
+
     if (!hasStdin)
       process.stdin.push(null);
 

Just note that your program still won’t work as it’s written right now, because Workers inherit their parent thread’s execArgv by default, and so they will also see -r create-worker.js, and spawn new Workers recursively.

@cspotcode
Copy link
Author

When I catch the error with process.on('error', preventing the worker from terminating, I only see the one worker. It does not recursively spawn workers, even though the worker has stayed alive.

Are you saying that the fix you're proposing will cause it to start spawning workers recursively?

@addaleax
Copy link
Member

Are you saying that the fix you're proposing will cause it to start spawning workers recursively?

Yes, that would be the expected (and documented) behavior for the code you shared here.

When I catch the error with process.on('error', preventing the worker from terminating, I only see the one worker. It does not recursively spawn workers, even though the worker has stayed alive.

That’s because the Worker never gets around to spawning its main script, I assume.

@cspotcode
Copy link
Author

I can put setInterval(logAMessage, 1000) into the worker's JS file and see those messages logging from the worker. (When doing this, I'm not using empty.js in 2 different places) So the worker's main script does execute. But perhaps node's bootstrapping logic which handles --require gets terminated early by the error.

What is the recommended approach to avoid recursive thread spawning? Read workerData and check for a flag passed from the parent thread?

@targos
Copy link
Member

targos commented Dec 15, 2020

require('worker_threads').isMainThread is false in worker threads

@addaleax
Copy link
Member

@cspotcode Maybe to clarify: It’s not the first worker, the one you want to spawn, that throws the exception. It’s the inner, second worker, that’s already being spawned by the first one as a preload module, that fails.

So it’s not that Workers can’t be spawned from preload scripts in the main thread, it’s that Workers can’t be spawned from the preload scripts of other Worker threads.

@cspotcode
Copy link
Author

Thanks for the clarification, I see what you mean. This is tricky.

If I'm writing a library that internally uses worker_threads to do useful things, and I want the library's implementation to be properly opaque, then I should always be passing an empty execArgv array to each worker thread, right? Otherwise users utilizing --require will accidentally have their code executed multiple times and possibly break my threads.

@cspotcode
Copy link
Author

cspotcode commented Dec 16, 2020

Is a preload script supposed to have access to workerData and argv synchronously, or does it need to wait asynchronously? I'm not sure if I should file that as a separate bug or if it's intended.

@addaleax
Copy link
Member

If I'm writing a library that internally uses worker_threads to do useful things, and I want the library's implementation to be properly opaque, then I should always be passing an empty execArgv array to each worker thread, right? Otherwise users utilizing --require will accidentally have their code executed multiple times and possibly break my threads.

I mean, that really depends … it’s hard to know what the correct behavior would be without knowing the concrete circumstances. Sometimes a user using --require might want their script also loaded in Workers, sometimes not, I guess.

Is a preload script supposed to have access to workerData and argv synchronously, or does it need to wait asynchronously?

I think it should have that, yes.

I'm not sure if I should file that as a separate bug or if it's intended.

🤷‍♀️ The patch that I suggested above would also resolve this problem as well. You can open a PR with it, if you like, it just needs a test to come with, I think.

@cspotcode
Copy link
Author

Ok thanks, I did not fully understand that --require preload scripts are always expected to run in all child processes and worker_threads. Better not to override execArgv and NODE_OPTIONS.

One note on #36531 (comment), checking isMainThread isn't necessarily the best option if you expect a library's code to be invoked from within a consumer's worker_thread. But we can always use a different entry-point .js for the thread, and we can pass it workerData

jasnell added a commit to jasnell/node that referenced this issue Feb 22, 2021
Fix spawning nested worker threads from preload scripts and
warn about doing so.

Signed-off-by: James M Snell <jasnell@gmail.com>
Fixes: nodejs#36531
@jasnell
Copy link
Member

jasnell commented Feb 22, 2021

PR: #37481

jasnell added a commit to jasnell/node that referenced this issue Feb 23, 2021
Fix spawning nested worker threads from preload scripts and
warn about doing so.

Signed-off-by: James M Snell <jasnell@gmail.com>
Fixes: nodejs#36531
@jasnell jasnell closed this as completed in 360e8c8 Mar 1, 2021
lpinca pushed a commit to lpinca/node that referenced this issue Mar 2, 2021
Fix spawning nested worker threads from preload scripts and
warn about doing so.

Signed-off-by: James M Snell <jasnell@gmail.com>
Fixes: nodejs#36531

PR-URL: nodejs#37481
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
targos pushed a commit that referenced this issue Mar 2, 2021
Fix spawning nested worker threads from preload scripts and
warn about doing so.

Signed-off-by: James M Snell <jasnell@gmail.com>
Fixes: #36531

PR-URL: #37481
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
targos pushed a commit that referenced this issue May 1, 2021
Fix spawning nested worker threads from preload scripts and
warn about doing so.

Signed-off-by: James M Snell <jasnell@gmail.com>
Fixes: #36531

PR-URL: #37481
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
confirmed-bug Issues with confirmed bugs. worker Issues and PRs related to Worker support.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants