-
Notifications
You must be signed in to change notification settings - Fork 30.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ESM Hook deadlocks since 21.2.0 a3e09b3 #50948
Comments
Also from the slack convo (me):
I can definitely see this being surprising. As it currently is, I think we can't protect against this in node because that part of node is sleeping. A potential mitigation is switching from I can repro the issue. I'm not sure if this is strictly a bug or just an oversight/limitation. I added the |
Does |
I don’t think so, though we’re hoping to avoid any more breaking changes at this point before we go stable. Please don’t forget to ping @nodejs/loaders. Just adding the label doesn’t do anything. |
There was a reason to make it sync. IIR, it was I who advocated for it because there was no scenario we could think of where it being async was desirable. |
Hooray for running code! It has a nasty way of finding edge cases that even very smart people can't think of ;) Idk if it's desirable for it to be async, per se, but it seems like the machinery required to prevent deadlocks here would be quite a bit more elaborate. Maybe there's some clever approach I'm not thinking of, but it feels like it might be expensive. And certainly, not deadlocking is preferable. I guess the question is whether taking on that change (making register async) is more prudent than combing through the edge cases and supporting that machinery long term. |
I suppose another way to approach this would be to inform the hook methods that the main thread is blocked, some flag on context or something, so at least they can avoid doing any main-thread comms for that request. |
In the meeting on Tuesday @JakobJingleheimer thought that the potential for deadlock might be eliminated if we changed the first argument of |
|
Is there no equivalent? When I was suggesting import.meta.resolve as a potential solution, I was mentioning there must be a reason because it's too easy. |
|
|
Yeah. And if that’s the case for a particular package, then the user just needs to create an ESM file for this, or define the URL explicitly without relying on Also a hooks file must be ESM, so it wouldn’t make much sense for there to be a |
I don't think that's reasonable, and it wouldn't solve the problem anyway (the main thread still need to be blocked while the |
So having the communication channel is just impossible to support? |
First I think we need to see if it solves the issue. If it doesn’t, there’s no point in debating whether it’s desirable UX. The thinking was that we don’t block the main thread for |
Not possible during sync operations ( |
Can we error on sending a cross thread message while the main thread is blocked? So that we throw rather than deadlock. |
Maaaybe. For |
Is this still an issue, or has it been fixed by #52706? (Try on latest |
Version
21.2.0
Bisected to a3e09b3
Platform
Darwin moxy.lan 23.1.0 Darwin Kernel Version 23.1.0: Mon Oct 9 21:27:24 PDT 2023; root:xnu-10002.41.9~6/RELEASE_ARM64_T6000 arm64
Subsystem
esm hooks
What steps will reproduce the bug?
How often does it reproduce? Is there a required condition?
Always reproduces
What is the expected behavior? Why is that the expected behavior?
Expect that it loads the two import arguments, and then runs the file.
What do you see instead?
Deadlocks at the
Module.register()
call in the second import script.Additional information
Discussion from slack:
isaacs
7 hours ago
@aduh95
Since a3e09b3 landed, node-tap hangs whenever I run it. It seems like it's getting stuck at:
I haven't been able to narrow it down to a satisfyingly minimal reproduction, and for extra 😬 it's order-dependent, but this will reproduce it in any project that has the latest tap installed:
(where x.js is just console.log('hello'))
5 replies
Jacob Smith
4 hours ago
Oof. These are very difficult to troubleshoot.
I think i'll have time on Wednesday to take a look (edited)
aduh95
3 hours ago
For reference: a3e09b3
🙏
1
isaacs
9 minutes ago
It seems like what's going on is that the @tapjs/mock needs to make a request on its MessageChannel port in order to finish its resolve hook.
Then it loads @tapjs/processinfo/import, which calls Module.register. This triggers a sync request to the worker thread for 'register', which then triggers the mock's resolve hook to resolve the loader.mjs file, but because it's already awaiting a sync message (the register), it'll never come, because that sync message is waiting for another sync message (the resolve).
isaacs
4 minutes ago
My working theory at the moment is that it started at the "run imports sequentially" because prior to that, it wasn't blocking on Module.register() so any blocking async-made-sync actions in the process of the register call weren't a problem.
isaacs
1 minute ago
Also: seems like it's related to the use of the MessageChannel specifically. Just a timeout doesn't trigger it, but the busywait message processing seems to keep the other message channel from proceeding.
The text was updated successfully, but these errors were encountered: