Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

async_hooks: add getActiveResources (prototype) #21453

Closed
wants to merge 1 commit into from

Conversation

Fishrock123
Copy link
Contributor

@Fishrock123 Fishrock123 commented Jun 21, 2018

Prototype of a new getActiveResources() API for async_hooks.

Returns an object map of { <id>: <resource> }, and differs slightly from getActiveHandles(). (The old API used to return the handle owner property for some reason.)

Currently, it works for all async_hooks resources except nextTick-s, which are probably not useful to expose in this way.

Reason being that nextTick-s almost always have no interesting information attached, and exist in async_hooks almost exclusively for continuation purposes. The same thing can pretty much be said for Immediate-s too, and I wouldn't be opposed to removing them from this.

As an alternative to not producing nextTicks, I think something along the lines of getActiveResources(async_type_mask) could be acceptable.


There are two primary reasons for why this might be useful compared to async_hooks, and why _getActiveHandles() is still useful today.

  1. Handle access. This bypasses the tricky problem of somehow storing a weak reference to handles so that you can access them up to destroy() but also not prevent GC.
  2. Performance. It is unnecessary to do tracking on each resource allocation, which requires boundary crossing for C++ resources on each call and also dealing with the way-too-plentiful nextTicks.

The root why of this is that some tools (such as N|Solid) like to have access to this kind of thing so we can inspect what kind of resources are active in a live application at any arbitrary point in time.


@nodejs/diagnostics @mike-kaufman @gpotter2

Supersedes #21102 & probably also #21313

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines

@Fishrock123 Fishrock123 added wip Issues and PRs that are still a work in progress. discuss Issues opened for discussions and feedbacks. async_wrap async_hooks Issues and PRs related to the async hooks subsystem. labels Jun 21, 2018
@nodejs-github-bot nodejs-github-bot added async_hooks Issues and PRs related to the async hooks subsystem. c++ Issues and PRs that require attention from people who are familiar with C++. timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout. labels Jun 21, 2018
@@ -216,6 +246,7 @@ module.exports = {
createHook,
executionAsyncId,
triggerAsyncId,
GetActiveResources,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be getActiveResources

@nodejs nodejs deleted a comment Jun 22, 2018
@AndreasMadsen
Copy link
Member

AndreasMadsen commented Jun 22, 2018

I would like to see a benchmark of this, that compares with the naive approach that uses asyn_hooks.createHook. Preferably, one that involves promises and one that doesn't. See my comment in #21313 on how to do that.

@Fishrock123
Copy link
Contributor Author

Fishrock123 commented Jun 22, 2018

@AndreasMadsen It's not actually possible to my understanding. I forgot to add this in (and so I will add it to the original post) but it is also a matter of handle access.

The async_hooks approach breaks down because you cannot (or at least not without significant native addon trickery) to my knowledge keep handles around for access at any time. Doing so requires strong references because JS does not have full weakrefs and so the handles will never be GC'd. This solves that by bypassing the need to store in JS and just doing a full C++ resource fetch at request.

Prototype of a new getActiveResources() API for async_hooks.

Returns an object map of { <id>: <resource> }, and differs slightly from getActiveHandles().

Currently, it works for all async_hooks resources except nextTick-s, which are probably not useful to expose in this way.
@Fishrock123 Fishrock123 force-pushed the get-active-resources branch from 7c63030 to 148ee61 Compare June 22, 2018 15:01
@AndreasMadsen
Copy link
Member

The async_hooks approach breaks down because you cannot (or at least not without significant native addon trickery) to my knowledge keep handles around for access at any time. Doing so requires strong references because JS does not have full weakrefs and so the handles will never be GC'd. This solves that by bypassing the need to store in JS and just doing a full C++ resource fetch at request.

Makes sense. Thanks for reminding me 👍

@@ -123,6 +124,35 @@ function createHook(fns) {
}


function getActiveResources() {
const handles = process._getActiveHandles();
const reqs = process._getActiveRequests();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we... perhaps... move the implementations of these to an internal/* so we can more easily deprecate the process._getActive* variants later on?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'll do that once it is a little more settled if people are ok this idea

Copy link
Member

@AndreasMadsen AndreasMadsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really against this, especially considering it is not easily implemented in userland. However, I still want to raise some concerns. And I'm wondering if there aren't better alternatives.

  • The resource objects are essentially completely undocumented. And we properly don't want to ever document them. So we really have to ask the question, will this really advance the story of making _getActiveHandles() and _getActiveRequests() a public API that much. The only thing that will be documented and supported is Object.keys(getActiveResources()).length, the rest remains undocumented! Yes, we also expose the resources in createHook({init}), I don't like that either and I would really like to find a better solution.

  • This gives the illusion that getActiveResources directly corresponds the resources from async_hooks which is not always the case. I do for example not believe Promise resources will be included and for sure user-land resources will not appear in getActiveResources(). That is directly misleading the user!

I likely missed some comments, but I feel there is some secrecy about what this is really for. For example, if it is for inspecting active timers, could we then not just add an API for that under require('timers')? Similarly for any other resource type.

That way we can fully document the content and we don't mislead the user about how these resources correspond to those in createHook({init}).

@jasnell
Copy link
Member

jasnell commented Jun 22, 2018

I definitely share the same concerns around the undocumented resource objects. We should take a step back and look at the use cases in detail. I'd rather not expose undocumented objects so if we do need this API, we should likely wrap the resource objects in something we can document.

@AndreasMadsen
Copy link
Member

AndreasMadsen commented Jun 22, 2018

I definitely share the same concerns around the undocumented resource objects. We should take a step back and look at the use cases in detail. I'd rather not expose undocumented objects so if we do need this API, we should likely wrap the resource objects in something we can document.

The solution for createHook({init}) that @addaleax suggested to me a long time ago, was to expose the public API objects, that the handles typically belong to, instead. For example, instead of exposing the TCPServer handle, then expose the corresponding require('tcp').Server object instead.

@Fishrock123
Copy link
Contributor Author

Fishrock123 commented Jun 22, 2018

The resource objects are essentially completely undocumented

This concession was already made for async_hooks callbacks, this doesn't seem any different. Also, the same things are also exposed to any code that has access to the objects already (i.e. the code the made a net.Socket).

This gives the illusion that getActiveResources directly corresponds the resources from async_hooks which is not always the case.

That is a bit of a problem, maybe documentation notes would solve it enough? I'm not really sure.

I likely missed some comments, but I feel there is some secrecy about what this is really for.

I put this in the OP after my last comment; It's nothing more special than what people have been using _getActiveHandles for, for a long time.

"The root why of this is that some tools (such as N|Solid) like to have access to this kind of thing so we can inspect what kind of resources are active in a live application at any arbitrary point in time."


I'd rather not expose undocumented objects so if we do need this API, we should likely wrap the resource objects in something we can document.

The rest of the async_hooks api should then also be subject to that. I'm not sure that would really suit all use-cases though. Certainly some API stability would be nice.


Regarding exposing handle owners: What do we do about requests (req_wraps)? As far as I'm aware file & crypto requests do not have any sort of owner object.

@AndreasMadsen
Copy link
Member

This concession was already made for async_hooks callbacks, this doesn't seem any different.

I agree, let's fix it all. I don't buy the argument, that just because a mistake was made once we might as well repeat it. That is not how progress is made. And for async_hooks we have an opportunity to solve it.

I put this in the OP after my last comment; It's nothing more special than what people have been using _getActiveHandles for, for a long time.

"The root why of this is that some tools (such as N|Solid) like to have access to this kind of thing so we can inspect what kind of resources are active in a live application at any arbitrary point in time."

Yeah, I was hoping for some more details than that. I don't really know what "people" use _getActiveHandles() for. Or what "access to this kind of thing" actually refers to. I'm not denying there is value here, I'm would just like to understand what exactly that value is.

Regarding exposing handle owners: What do we do about requests (req_wraps)? As far as I'm aware file & crypto requests do not have any sort of owner object.

I'm sure we can come up with a reasonable solution for those cases. Maybe we can just cleanup the API for those objects and document them.

@Fishrock123
Copy link
Contributor Author

I agree, let's fix it all.

I'm not sure how much time I can dedicate to that though, it's a pretty big issue. 😉

Just like async_hooks as it stands is necessary to some people, so is this.

Yeah, I was hoping for some more details than that. I don't really know what "people" use _getActiveHandles() for. Or what "access to this kind of thing" actually refers to. I'm not denying there is value here, I'm would just like to understand what exactly that value is.

One example could be: "find what ports are open on network sockets, if any".

I haven't poked around on npm what all uses _getActiveHandles(), but I've certainly seen stuff in the past.

@Fishrock123
Copy link
Contributor Author

Also, the original original (x2) PR for this from @cjihrig read:

process._getActiveHandles() and process._getActiveRequests() have been around for a long time, and are utilized to a large enough degree that it makes sense to make them official
APIs.

And seemed to garner a generally positive response.

@devsnek
Copy link
Member

devsnek commented Jun 24, 2018

would it be semver major to change an internal system such that the "expected" output of this api changes? (e.g. i expect a bunch of timers but there's only one)

if yes, i feel like this is exposing too much. otherwise it is fine.

@TimothyGu
Copy link
Member

@devsnek It is one thing for us to say it's not semver-major, another for users to start depending on it permanently so that it's effectively so.

@AndreasMadsen Would it be acceptable for this to land in its current shape, behind a runtime flag? This way embedders with resources to maintain possible breakages due to internals change can still have such a functionality, while we won't have to worry about backward compatibility.

@devsnek
Copy link
Member

devsnek commented Jun 25, 2018

@TimothyGu i bring that up so that we can document if it should/shouldn't be dependable behavior. the difference between "use this api to see handles" and "use this api to see all X because each X has its own handle" are two different usages that could arise but we might not want people to think that the second will always be a thing.

@AndreasMadsen
Copy link
Member

AndreasMadsen commented Jun 25, 2018

@devsnek That is a big part of the problem. With this, we are moving one step further on making users depend on all internal Wrap resources. That is not something we want. Experience also tells us that documentation isn't really a solution. Users will still depend on behavior that the documentation tells them not to and we will therefore still break modules in the future.

@AndreasMadsen Would it be acceptable for this to land in its current shape, behind a runtime flag? This way embedders with resources to maintain possible breakages due to internals change can still have such a functionality, while we won't have to worry about backward compatibility.

Sure. I think it should be put on the process object then. I know some are against that but in its current state this has very little to do with async_hooks.

@Fishrock123 Fishrock123 requested a review from cjihrig June 26, 2018 00:28
@Fishrock123
Copy link
Contributor Author

Since it maps by the handle's async_id's, I'd say it is more related to async_hooks than it is to anything else. Putting it on process also risks ending up with an unintended level of support.

@TimothyGu
Copy link
Member

TimothyGu commented Jun 26, 2018

@Fishrock123

Putting it on process also risks ending up with an unintended level of support.

Not if we put it behind a flag 😄

@jasnell
Copy link
Member

jasnell commented Jun 26, 2018

We can't put every new thing behind a flag, just as we can't put every new thing on process. The uses of this are most closely aligned with async_hooks, even if there are differences in approach. Otherwise, it is at best a utility function that can go into util.

@rvagg
Copy link
Member

rvagg commented Jul 28, 2018

I'd love to see some progress on this. Our (NodeSource's) downstream users keep on giving us the signal that the "what is Node doing right now?" question that we help answer via use of _getActiveHandles() and _getActiveRequests() is more critical than we originally anticipated when we started using them, and I know we're not alone in this. This use-case is significantly simpler than justifies heavy-weight use of async_hooks, which you have to be proactively using to get the kinds of answers users are after. There's nothing secret about what we're doing—we're simply representing a felt need. A lot of pain is expressed through our support services, which support users not even using N|Solid, so we just want to have good answers. What is new is that because _getActiveHandles() and _getActiveRequests() are _ methods they're fair game for adjustments, both internally and in what they expose and this keeps on making it difficult for us and our users (and again, I've heard this from others). Interst and development around async_hooks and ongoing V8 changes around Promises seems to have made _getActiveHandles() and _getActiveRequests() quite unstable.

I share concern expressed in this thread about the kinds of objects being exposed here and I don't like that we have all of these different shapes of internal objects being exposed in a seemingly ad-hoc manner and it's something we need to figure out. But I can see that discussion extending for a significant period of time unless someone has a fully baked proposal they want to make right now.

Aside from that, I quite like this proposal as it is now but would just like to see docs.

So, here's a proposal, let me know if there are objections to this or suggested amendments:

  • Document it
  • Mark this experimental for now, pending finalisation of object types & shapes coming out of here and async_hooks (this also gives us the flexibility to potentially move it to process or util if we end up deciding the objects diverge too far from async_hooks)
  • Move _getActiveHandles() and _getActiveRequests() implementations to internal/ and mark them as doc-deprecated, pending runtime-deprecation which will sync with this new API moving out of experimental.

@AndreasMadsen
Copy link
Member

@rvagg To be fair, I think #22002 is a pretty big step in making an official API.

@jasnell
Copy link
Member

jasnell commented Oct 17, 2018

Ping... where are we at on this? Would love to see progress.

@jasnell jasnell added the stalled Issues and PRs that are stalled. label Oct 17, 2018
@szmarczak
Copy link
Member

@Fishrock123 Any updates on this?

@Fishrock123
Copy link
Contributor Author

Fishrock123 commented Sep 15, 2019

Hey if anyone wants to pick this back up go ahead. I'm probably not going to, although I believe this is the correct approach.

(I am no longer at an employer who needs such a thing and have real no need of it myself.)

@Flarna
Copy link
Member

Flarna commented Sep 18, 2019

Is this somehow related to #21313?

Even that one is closed it's still somehow on agenda to be done at some time.

@Fishrock123
Copy link
Contributor Author

@Flarna Yes, you can effectively do the same thing with this. This is more flexible.

@addaleax
Copy link
Member

I think the two are unrelated besides being both about the async hooks API.

@Fishrock123
Copy link
Contributor Author

I believe the following would be roughly equivalent to that PR:

async_hooks.getActiveResources()[async_hooks.executionAsyncId()]

@Trott Trott added the help wanted Issues that need assistance from volunteers or PRs that need help to proceed. label Oct 23, 2019
@Trott
Copy link
Member

Trott commented Oct 23, 2019

@nodejs/async_hooks Any chance someone wants to pick this up?

@iffy
Copy link

iffy commented Oct 24, 2019

I likely missed some comments, but I feel there is some secrecy about what this is really for. For example, if it is for inspecting active timers, could we then not just add an API for that under require('timers')? Similarly for any other resource type.

My fly-by 2 cents describing a use-case for this: I want to start a long-running child process, then track every asynchronous thing that is started/stopped after the child process. Once every asynchronous thing is done, kill the child process: https://stackoverflow.com/questions/58515254/how-do-i-make-nodejs-terminate-even-if-i-have-non-essential-things-in-the-even

@jasnell
Copy link
Member

jasnell commented Jun 19, 2020

This has been stalled for a while and has seen no further activity. Closing

@jasnell jasnell closed this Jun 19, 2020
RaisinTen added a commit to RaisinTen/node that referenced this pull request Apr 19, 2021
This change picks up the changes from the
referenced PR and moves the implementation of
`getActiveResources` to `util` to return a summary
of the resources.

Refs: nodejs#21453
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
async_hooks Issues and PRs related to the async hooks subsystem. c++ Issues and PRs that require attention from people who are familiar with C++. discuss Issues opened for discussions and feedbacks. help wanted Issues that need assistance from volunteers or PRs that need help to proceed. stalled Issues and PRs that are stalled. timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout. wip Issues and PRs that are still a work in progress.
Projects
None yet
Development

Successfully merging this pull request may close these issues.