Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

Allow passing specific PROVIDERs to AsyncListener callbacks #7145

Closed
wants to merge 5 commits into from
Closed

Allow passing specific PROVIDERs to AsyncListener callbacks #7145

wants to merge 5 commits into from

Conversation

trevnorris
Copy link

This allows user to pass specific "providers" to {add,create}AsyncListener() which will allow them to track only specific operations.

While the functionality is working now, I'm sure a lot of fine tuning will be done as users get their hands on it.

@trevnorris trevnorris added this to the v0.12 milestone Feb 18, 2014
@trevnorris trevnorris self-assigned this Feb 18, 2014
The `NEXTTICK` `ASYNC_PROVIDER` will always fire regardless, because of the
implicit usage of [`process.nextTick()`][] within core code. This makes it
difficult to determine the true origin of the call. We hope to correct this
in the future.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Including these last two lines is not necessary, just state the fact and leave it at that.

@groundwater
Copy link

I have come across an inconsistency between the async polyfill and the native async-listener interface. This currently breaks some of the CLS tests.

It appears that some create callbacks occur outside the scope of a before/after context. For example, here is what continuation-local-storage is expecting from async-listener:

(scenario: three asynchronous calls in a single continuation chain)

  create[1]
      |
      |
  before[1]
      |      create[2]
   after[1]      |
                 |
             before[2]
                 |      create[3]
              after[2]      |
                            |
                        before[3]
                            |
                         after[3]

In order for the contexts to be set up correctly to pass state through the continuation chain, each subsequent continuation has to be able to get at the context set by the previous ones. Therefore, each create callback has to be run after the previous before callback, and before the previous after callback.

The native async-listener breaks this contract, as illustrated in a trace below:

--> MAIN
  * SET STATE
  + CREATE 1
  + CREATE 2
<-- MAIN

--> BEFORE 2
  + CREATE 3
  + CREATE 4
<-- AFTER  2

  + CREATE 5

--> BEFORE 1
  * ASSERT STATE PRESERVED
  + CREATE 6
<-- AFTER  1

The above trace was generated while debugging the CLS problem.

A more detailed trace, along with a code example is available here https://gist.github.com/groundwater/e791f503e21e7c44f2f1

The way the current pollyfill works, the above CREATE 5 event would occur between BEFORE/AFTER 1. This lets CLS link any events in the onConnect handler with events in the onData handler.

@othiym23 and I have tried several variations on solving this, but there seems to always be a corner case that can break it. I am skeptical about moving CLS to support this API, as there seem to be a lot of corner cases to handle.

Given that we are predicating a lot of work on the ability of our (or any) tracing module to consume ALs, I just wanted to make sure it's clear that this solution is going to be difficult for us to consume.

Is the create statement supposed to happen outside of a before/after, or is that a bug which we need new tests for? I don't want to hold up the 0.12 release, but this is a critical bug.

cc/ @tjfontaine @trevnorris

@tjfontaine
Copy link

Ok this is a serious issue --

Coming in from the libuv event loop instantiating a handle_wrap results in a create callback that can't be wrapped in a before/after context load because those only happen for transitions from MakeCallback.

This basically makes use case for CLS impossible.

I think we will have to make this API private.

@trevnorris
Copy link
Author

I'm aware of the following issue:

var l = tracing.addAsyncListener(callbacks);
net.createServer(function() {
  // This connection and any incoming data requests won't
  // be linked to the AL that wrapped the server.
}).listen(8000);
tracing.removeAsyncListener(l);

It's something I've been working on solving.

@trevnorris
Copy link
Author

@tjfontaine please refrain from making sweeping statements like "Ok this is a serious issue" without speaking with the developer that's actually spent his life in this patch.

@trevnorris
Copy link
Author

The issue brought up by @groundwater has been resolved. I need to double check and see if there are any other cases where this will occur in source, and also do some cleanup of how the operation is performed, but it's working.

The last "bug" for me is an edge case when a listener is removed in the middle of an asynchronous call stack.

@@ -41,30 +41,63 @@ inline AsyncWrap::AsyncWrap(Environment* env,
: BaseObject(env, object),
async_flags_(NO_OPTIONS),
provider_type_(provider) {
if (!env->has_async_listener())
AsyncWrap* parent = env->async_wrap_parent_class();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is nothing initializing async_wrap_parent_ to NULL or reset_async_wrap_parent_class() is not called every time. details

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What system are you running on? I can't reproduce the issue you're having.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tonistiigi can you run your tests again? it should be fixed by 3f812ac.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, seems to work fine now. This was on OSX.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great. thanks for reporting this. :)

* Generalize the ASYNC_PROVIDER being passed by then category to which
  it belongs. Instead of having specialized providers for classes like
  ConnectWrap.
* When some classes are instantiated from a libuv callback there is no
  activeContext, so manually set the parent class that contains the
  context that should be used for class instantiation. This will allow
  proper propagation.
* Force before/after callbacks to run when a class is instantiated, not
  only when a callback is about to be called from MakeCallback.
* Add _removeAsyncQueue() method to handles so when there are no longer
  any items in the queue async_flags_ can be reset.
* Manually track additional states in Environment. For example: the
  length of the activeContext queue, the provider type of the
  activeContext or whether any AsyncListener callbacks are being
  processed.
* Pass if the error was handled as the fourth argument to the error
  callback handler.
* Move all properties to single _async object.
* Only run callbacks for watched providers.
* Each asynchronous call chain is independent and the listener can be
  removed at any time for that specific branch in the call stack.
* Each AsyncListener tracks its own watched providers. These are
  accumulated on _async.watchedProviders for quick checks when the
  run/load/unload functions are called.
* When an AsyncListener is removed, it's removed from the activeContext
  and from every context in the contextStack.
If an AsyncListener was added in the middle of an async execution chain
then it was possible for the after callback to run w/o the before
callback having had a chance to run.

This make sure the before callback has a chance to run, or the after
callback will be skipped.
This is to prevent an edge case that causees EXC_BAD_ACCESS.
@trevnorris
Copy link
Author

Closing this. I'm going to keep the branch open on my repo as to not loose a lot of the JS logic that I spent so much time on. But for now it'll be removed, and I'll open another PR for that.

@trevnorris trevnorris closed this Aug 8, 2014
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants