src: don't call into VM from AsyncWrap destructor #9467
It is not allowed anymore to call JS code when collecting weakly persistent handles, it hits the assertion below:

    # Fatal error in ../deps/v8/src/execution.cc, line 103
    # Check failed: AllowJavascriptExecution::IsAllowed(isolate).

Remove the call into the VM from the AsyncWrap destructor. This commit breaks the destroy hook but that cannot be helped.

Fixes: nodejs#8216
Shouldn't the other destroy hook-related code be removed also (e.g. setting of the destroy hook in async-wrap.cc and the relevant persistent string in env.h)? |
The Given that /cc @nodejs/diagnostics |
I can do that, I'll update the PR. EDIT: https://ci.nodejs.org/job/node-test-pull-request/4787/
Yeah, I don't buy that. People have filed bug reports about it twice now and in both cases it was a module somewhere in their dependency chain, not something they were using directly. |
I welcome a different perspective on the policy regarding undocumented API. But
I reported one of the bugs. We use cls-hooked, which in turn uses async-hook, which then uses AsyncWrap. Maybe I didn't read the readmes very carefully, but as far as I can see there is nothing there warning about it being unstable or not fit for production use. We use it to track transactions through requests, and it's not in any way optional for us. We haven't observed any issues in production on 6.x. In my opinion it's not a good idea to have unstable APIs enabled by default, even if they are not documented. Similar to V8 experimental features, I would think a flag should be set, so users are clearly aware they are using features not recommended for production use.
#8216 has been open for quite a while with no sign of movement, and as it stands, the number of people willing to work on the async_hooks parts of the codebase seems to be rather limited. So, yeah, it sucks, but right now I don’t see a better way than this.
LGTM with a green CI
Could you maybe explain how this fixes the issue? The modules that uses edit: I have published a new version of
I'm a little sad that @nodejs/diagnostics wasn't cc'ed on this. This makes it difficult for us to function as a working group. |
Sorry, yeah. So far the only people I associated with async_hooks were Trevor (and later you); here’s a PR to tell people to @mention the diagnostics team: #9471
From an APM perspective, this is probably fine. The destructor is nice for tracking handle lifetimes, but doesn't really matter for transaction tracing. I feel like there definitely should've been more care put into async_wrap in regard to runtime warnings and/or flag gating though. If a feature can be used, it will be used, even when not documented. All the people running into issues using the async/await flag or generators in the early koa days are plenty of proof of that.
Does this remove the destroy hook? Will this impact the async_hooks PR?
In either case, removing the hook renders async_wrap/async_hooks virtually useless in many cases and should be considered a last resort.
I would strongly prefer not to merge this until @trevnorris reviews it. He's out atm I think but I'll try to make sure he's on this at least first thing Monday.
The problem is that it's categorically unsafe to call into the VM during GC, which is what the AsyncWrap destructor does when it calls the destroy hook. It's never been safe but it slipped through review before V8 started enforcing it. |
Triggering the destroy hook in the next tick using uv_idle_t seems like a reasonable approach to me. As long as the handle itself is not made accessible in any way within the destroy hook, I can't see any edge cases to worry about. Does that sound reasonable? The queued destroy hook would only be aware of the id to notify the JS side about. |
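A minimal sketch of the deferral idea above, written in plain C++ so it stands alone. It assumes nothing about Node's real internals: `Env`, `Wrap`, `QueueDestroy` and `DrainDestroyQueue` are hypothetical names, and `DrainDestroyQueue` only stands in for whatever callback a `uv_idle_t` handle would run on a later loop turn. The point it illustrates is that the destructor records just the id and never calls the hook itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for per-environment state; not Node's Environment.
class Env {
 public:
  using DestroyHook = std::function<void(int64_t)>;

  void set_destroy_hook(DestroyHook hook) { hook_ = std::move(hook); }

  // Safe to call while the GC is running: it only records the id and
  // never invokes the hook.
  void QueueDestroy(int64_t id) { pending_.push_back(id); }

  // Stand-in for the idle callback that would run on a later loop turn,
  // outside of GC, where calling into the VM is allowed.
  void DrainDestroyQueue() {
    std::vector<int64_t> ids;
    ids.swap(pending_);  // a hook may queue further ids while we iterate
    if (!hook_) return;
    for (int64_t id : ids) hook_(id);
  }

  std::size_t pending_count() const { return pending_.size(); }

 private:
  DestroyHook hook_;
  std::vector<int64_t> pending_;
};

// Hypothetical wrapper object; only its id is ever exposed to the hook.
class Wrap {
 public:
  Wrap(Env* env, int64_t id) : env_(env), id_(id) { assert(env != nullptr); }
  ~Wrap() { env_->QueueDestroy(id_); }  // no call into the hook here

 private:
  Env* env_;
  int64_t id_;
};
```

Because the queued entry is only an integer, the hook cannot reach the handle that was destroyed, which is the property the comment above relies on.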
+1 to waiting to land until @trevnorris can have an opportunity to review. |
Back. I'll look into this more today, but the short answer is I've already been working on removing destroy on GC for a different issue. Though at that point it's more like "done" or "complete". Which I'm fine changing the name to. @bnoordhuis for a short term solution we should be able to use the same uv_idle_t that setImmediate uses. The destructor will check if the handle is weak. If so then place on the list, if not execute the callback immediately. Sound conceptually sane? |
I don't think "non-weak == can call into the VM" is a safe assumption. Hanging it off a uv_idle_t: seems okay but what if |
Is there a meaningful difference from the way things are right now? I mean, my impression is that V8 can basically run GC whenever it wants to, so there would be no real way to tell a delayed execution of the hook from a delayed invocation of the GC?
Yeah, I’d just queue up all |
Fair enough. All destroy callbacks can be placed in the
Using some flag magic. Each active set of hooks is assigned an id. If a hook is added/removed then increment the id. Likewise have a flag that indicates whether a destructor ran and depends on the current state of hooks. If this flag is set that a destructor has been called when a hook is added/removed then make a clone of the array of hooks. The only pairing needed is the id of the handle calling destroy, and the id of the hook's state. On the next loop, run through the array of hooks and call them for any associated id. Cost of this if no hooks are active is zero, and not noticeable even if there are active hooks.
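One possible reading of the bookkeeping described above, as a standalone C++ sketch. All names (`HookRegistry`, `AddHook`, `QueueDestroy`, `Drain`) are hypothetical and the data layout is an assumption; the comment does not specify one. Each queued destroy is paired with the id of the hook state that was current when it was queued, and the hook array is cloned only if a destroy was queued before the hooks changed.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical registry; a sketch of the versioning scheme, not Node's code.
class HookRegistry {
 public:
  using Hook = std::function<void(int64_t)>;

  void AddHook(Hook hook) {
    assert(static_cast<bool>(hook));
    SnapshotIfNeeded();  // clone the old set only if a destroy depends on it
    hooks_.push_back(std::move(hook));
    ++state_id_;         // the set of hooks changed, so bump its id
  }

  // Called from a destructor: records (handle id, hook state id) only.
  void QueueDestroy(int64_t handle_id) {
    if (hooks_.empty()) return;  // nothing to do when no hooks are active
    pending_.push_back({handle_id, state_id_});
    destroyed_in_state_ = true;
  }

  // Called on the next loop turn: runs the hooks that were active when
  // each destroy was queued.
  void Drain() {
    std::vector<Pending> batch;
    batch.swap(pending_);
    std::map<uint64_t, std::vector<Hook>> snapshots;
    snapshots.swap(snapshots_);
    const uint64_t live_state = state_id_;
    const std::vector<Hook> live = hooks_;  // copy: hooks may mutate hooks_
    destroyed_in_state_ = false;
    for (const Pending& p : batch) {
      const std::vector<Hook>* hooks = nullptr;
      if (p.state_id == live_state) {
        hooks = &live;
      } else {
        auto it = snapshots.find(p.state_id);
        if (it != snapshots.end()) hooks = &it->second;
      }
      if (hooks == nullptr) continue;
      for (const Hook& hook : *hooks) hook(p.handle_id);
    }
  }

 private:
  struct Pending {
    int64_t handle_id;
    uint64_t state_id;
  };

  void SnapshotIfNeeded() {
    if (!destroyed_in_state_) return;
    snapshots_[state_id_] = hooks_;
    destroyed_in_state_ = false;
  }

  std::vector<Hook> hooks_;
  std::vector<Pending> pending_;
  std::map<uint64_t, std::vector<Hook>> snapshots_;
  uint64_t state_id_ = 0;
  bool destroyed_in_state_ = false;
};
```

In this sketch a handle destroyed before a second hook was added is reported only to the first hook, while a handle destroyed afterwards is reported to both.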
Can you elaborate on this? |
Mh, basically: If we decided to delay the invocation of the |
I don't think so but I figured I'd bring it up anyway. |
Actually, @bnoordhuis, just realized I combined two angles of approach. Thing is that GC can no longer be allowed to trigger With this in mind, what measurable circumstance would there be for when the destructor can't call into JS? I'd like to set up a test and begin tracing when it may not be appropriate for the destructor to call
If we need to go the route of delaying
We got another report, #9599. If there isn't any progress on an alternative pull request in the next few days, I'm going to go ahead and land this. I'd really like to see this fixed before the next release. |
@bnoordhuis Not sure if this is the best place to ask, but is it possible that the changes to V8 that caused this bug (a change in how/when v8 is collecting weakly persistent handles) might also affect the correctness of the node-weak module (https://github.com/TooTallNate/node-weak) for newer versions of v8/node? We are running into segfaults in the garbage collector in our stressful node application that uses node-weak when we run on node 6.9.1, but we didn't have any such issue when running on node v0.12. This is true even if we get rid of any use of node-weak callback functions. We may also be getting such segfaults in node 4.5.0, but if so, they are much, much rarer. I realize that the gc test code in the node distribution actually uses node-weak, but that is a very simple, unstressful test case. |
@bnoordhuis I didn't consider my last comment as an approved approach. I'd still like to know when the JS callback shouldn't be run when the destructor is manually triggered? After review it seems like we should be able to manually delete classes that are now weak. Because of the safety mechanisms in place, detaching the C++ class won't cause JS to segfault. Thus, we could completely remove weak handles/requests. Thoughts? EDIT: Couldn't we also attach the call to JS through |
@danscales Yes, node-weak does the same thing. @trevnorris Like we discussed in today's meeting, making everything non-weak sounds great. |
@trevnorris @bnoordhuis ... I just want to make sure I understand the "making everything non-weak" part of this as it would impact some work that I'm doing: does this mean entirely avoiding the use of |
@jasnell Correct. |
@bnoordhuis sorry for the delay. i'm able to remove the weak handles for everything (even non asyncwrap inheriting classes) for everything except for |
Alternate PR at #9753 |
Add a group of people to the “Who to CC in issues” list as the maintainers of `async_hooks`. Ref: nodejs#9467 (comment)
Add a group of people to the “Who to CC in issues” list as the maintainers of `async_hooks`. Ref: #9467 (comment) PR-URL: #9471 Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com> Reviewed-By: Colin Ihrig <cjihrig@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Sam Roberts <vieuxtech@gmail.com> Reviewed-By: Stephen Belanger <admin@stephenbelanger.com> Reviewed-By: Josh Gavant <josh.gavant@outlook.com>
This should no longer be necessary since #9753 landed, so I’m closing this |
R=@trevnorris?
See #8216 and #9465, it's making node crash.
It might be possible to retain the destroy hook by maintaining a per-environment list + a uv_idle_t handle or something like that but it's a lot of work and there might be edge cases so I'm opting for simply removing the hook. Whoever disagrees volunteers to do the hard work. :-)
CI: https://ci.nodejs.org/job/node-test-pull-request/4782/