setImmediate is not always very immediate #5798
Mailing list link: https://groups.google.com/forum/#!topic/nodejs/A_uo0Mfmk5o (message ID …)
That is certainly how it was designed: to avoid starving the loop we decided to only run one callback per tick. Prior to 0.11, nextTick checks to see how "deep" you are and warns that you may essentially be starving the loop; in 0.11 that check is no longer there and nextTick will soldier on and do what you told it to do, firing every callback you scheduled. The differentiating factor for setImmediate is that we're always yielding to the loop to let it do I/O. Yes, that means callbacks queued with setImmediate will fire more slowly than with nextTick.
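As an illustration of the behaviour being described (not from the thread; fs.stat is just a convenient source of I/O), the one-immediate-per-iteration rule means a ready I/O callback can be dispatched in between queued immediates, while nextTick callbacks all run before the loop gets control at all:

```js
var fs = require('fs');

// Three immediates queued up front. With one immediate per loop iteration,
// only the first runs before the loop polls for I/O again.
setImmediate(function() { console.log('immediate 1'); });
setImmediate(function() { console.log('immediate 2'); });
setImmediate(function() { console.log('immediate 3'); });

// Depending on timing, this callback may run in between the immediates
// rather than after all of them.
fs.stat(__filename, function() {
  console.log('fs.stat callback');
});

// nextTick callbacks, by contrast, all run before any of the above.
process.nextTick(function() { console.log('nextTick 1'); });
process.nextTick(function() { console.log('nextTick 2'); });
```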
The use case, it seemed to me, was to use setImmediate to schedule any …
The difference is that we could move the list out of the way, such that anything that would schedule a new setImmediate (while we're currently processing immediates) would indeed happen on the next tick. This avoids the recursive starvation that could happen with nextTick, but not the general starvation problem for your application -- the question is if we care.
Fwiw, the observed behaviour is what I would expect from reading the docs: "Immediates are queued in the order created, and are popped off the queue once per loop iteration." It's unfortunate that process.nextTick and the timers global setImmediate are in different modules despite being very similar, and that the behaviour is the reverse of what I might guess from the names.
I think we should. It hurts throughput and doesn't behave like I would expect it to. (YMMV, of course. Sam is a counterpoint to my second argument.)
The question is whether we should care that the change would potentially let the user starve his application; we obviously care that what we're doing now may be the exact opposite, slowing down an application and having unintended consequences. On the plus side, it is working exactly like I documented :)
A change like this would put …
I think that if we swapped the names of nextTick and setImmediate, most of the confusion would go away. It seems pretty clear to me that both functions are useful: nextTick allows you to ensure that a function is "always asynchronous" without incurring a penalty in terms of overhead, and setImmediate allows you to distribute computation over many turns of the event loop while ensuring that I/O doesn't get starved. If setImmediate worked through the entire queue of callbacks on the next turn of the event loop, that would reintroduce the possibility of starvation. The current behavior is more expensive, but (IMO) more useful as a counterpart to nextTick. I also think it's more intuitive, but that's a matter of opinion. What this suggests is that the docs need to be clearer in explaining when you'd want to use one instead of the other (and also maybe gently guiding people away from using setTimeout(fn, 0), although maybe we could modify setTimeout(fn, 0) to process the entire callback queue on a further turn of the event loop, because setTimeout(0)'s semantics are sufficiently vague and magic is great!).
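A minimal sketch of the pattern described above, with an invented workload: setImmediate spreads CPU-bound work over many turns of the event loop so that I/O callbacks get a chance to run between chunks.

```js
// Stand-in for a chunk of real CPU-bound work.
function burnCpu() {
  for (var i = 0; i < 1e7; i++);
}

function processChunks(remaining, done) {
  if (remaining === 0) return done();
  burnCpu();
  // Yield to the event loop before tackling the next chunk, so pending I/O
  // callbacks are not starved while the computation is in progress.
  setImmediate(function() {
    processChunks(remaining - 1, done);
  });
}

processChunks(10, function() {
  console.log('all chunks processed');
});
```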
Also, for purely documentary purposes, putting nextTick() on global instead of process, or cross-referencing it from the timers page, would be very useful.
Starvation doesn't have to be an issue, we could do something like this:

```js
var queue = immediateQueue;
immediateQueue = {};
while (!L.isEmpty(queue)) {
  var obj = L.shift(queue);
  obj.cb();
}
```

IOW, operate on a copy of the 'pending callbacks' list and clear the original. If a callback calls setImmediate(), that callback won't get invoked until the next tick. Problem solved, right?
I think conceptually it makes sense to handle one setImmediate per turn of the event loop -- it's simpler and more consistent, even if performance suffers. nextTick is there if you want something with minimal overhead.
I still don't fully understand how this would work w/ multiple callbacks and multiple levels. Example:

```js
function log(n) { console.log(n); }

setImmediate(function() {
  setImmediate(function() {
    log(1);
    setImmediate(function() { log(2); });
    setImmediate(function() { log(3); });
  });
  setImmediate(function() {
    log(4);
    setImmediate(function() { log(5); });
    setImmediate(function() { log(6); });
  });
});

// output: 1 4 2 3 5 6
```

To achieve this you'd need to maintain multiple lists.
Only two lists are needed. There's no difference in the order of execution, because every created immediate always goes on the end; there's only a difference in which tick they happen on.
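A rough sketch of the two-list idea (names invented, not the actual lib/timers.js code): callbacks scheduled while the active list is being drained land on a pending list that is only drained on the next turn, so ordering is preserved but a single turn can't be monopolised.

```js
var active = [];
var pending = [];

function scheduleImmediate(cb) {
  // New immediates always go on the pending list, even if the active list
  // is currently being drained.
  pending.push(cb);
}

function runImmediates() {
  // Swap the lists: anything scheduled from inside a callback below ends up
  // on the fresh pending list and waits for the next turn of the loop.
  active = pending;
  pending = [];
  while (active.length) {
    active.shift()();
  }
}
```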
@tjfontaine that still means I could essentially starve I/O. Not indefinitely, because v8 would eventually crash from filling up an object beyond bounds, but it would change the following doc:
Yes, as I pointed out above, it can starve your application, but not in quite the same pathological way a recursive nextTick would, because as you queue new immediates (while immediates are being processed) they won't happen until the next tick.
Guess my thought was that …
Repurposing a conversation I had about this with @domenic... Given how messy the situation is when dealing with the event loop across multiple JS runtimes, and the absence of standards here, does it make sense to try to reuse names from the browser to mimic semantically similar behaviors in Node, or should we give these functions names that accurately reflect their behavior? Cross-environment consistency has merit, but right now I'm not devoted to the idea of swapping the names of nextTick and setImmediate.
I can see games and other latency-sensitive applications caring about some of those nonexistent functions, but my larger point is that right now this very important class of functions is scattered across the API surface of the runtime due to the desire to provide a more browser-like interface, when it makes more sense (to me, at least) to centralize them in a module that can be used and documented as a single unit.
There are two basic ideas, then, that core should formalize and export. Everything else can be built on those. Removing those globals would certainly cause some upheaval, but if we're going to do it we need to agree to start the deprecation procedure for v0.12.
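As a hedged illustration of "everything else can be built on those" (assuming the two primitives are roughly today's nextTick-style "run before returning to the event loop" and setImmediate-style "run on a later turn"), an invented helper that batches callbacks and flushes the whole batch on the next turn might look like:

```js
var batch = [];

function queueForNextTurn(cb) {
  if (batch.length === 0) {
    // First callback in this batch: arrange a single flush on the next turn.
    setImmediate(function() {
      var cbs = batch;
      batch = [];
      for (var i = 0; i < cbs.length; i++) cbs[i]();
    });
  }
  batch.push(cb);
}

queueForNextTurn(function() { console.log('a'); });
queueForNextTurn(function() { console.log('b'); });
```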
The backwards compatibility of … But that time passed, the dust settled, and applications relying on the current meanings of both of those functions gradually started to appear. Now (almost a year later), should backwards compatibility be broken once again? Could it be that both functions just need to be documented slightly better? For example, …
Would just like to clarify a couple things that @tjfontaine and I discussed on IRC. First is that developers should not think about … The functionality of … So again, don't think of … Again, just to make it really really clear: @othiym23, your point on "run an entire queue of functions on the next turn of the event loop (doesn't exist today)": it does exist. Not pretty, but it does.

```js
setImmediate(function() {
  process.nextTick(cb1);
  process.nextTick(cb2);
  process.nextTick(cb3);
});
```

Here it will schedule a single function to run on the next tick, after which the entire queue of callbacks will also run. Of course this does have the caveat that if there are multiple …
Before the topic derails further, here is what I had in mind:

```diff
diff --git a/lib/timers.js b/lib/timers.js
index f4a6069..d5dc787 100644
--- a/lib/timers.js
+++ b/lib/timers.js
@@ -313,19 +313,24 @@ L.init(immediateQueue);
 function processImmediate() {
-  var immediate = L.shift(immediateQueue);
+  var queue = immediateQueue;
+  immediateQueue = {};
+  L.init(immediateQueue);
+
+  while (L.isEmpty(queue) === false) {
+    var immediate = L.shift(queue);
+    var domain = immediate.domain;
+    if (domain) domain.enter();
+    immediate._onImmediate();
+    if (domain) domain.exit();
+  }
+  // Only round-trip to C++ land if we have to. Calling clearImmediate() on an
+  // immediate that's in |queue| is okay. Worst case is we make a superfluous
+  // call to NeedImmediateCallbackSetter().
   if (L.isEmpty(immediateQueue)) {
     process._needImmediateCallback = false;
   }
-
-  if (immediate._onImmediate) {
-    if (immediate.domain) immediate.domain.enter();
-
-    immediate._onImmediate();
-
-    if (immediate.domain) immediate.domain.exit();
-  }
 }
```

The numbers:

tl;dr: 1,400% speedup.
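The benchmark script itself isn't shown; as a rough stand-in (an assumption, not the actual benchmark), scheduling a large number of immediates back to back exercises the code path the patch changes:

```js
var total = 1e6;
var remaining = total;
var start = process.hrtime();

function next() {
  if (--remaining === 0) {
    // Report how long it took to run through all the immediates.
    var diff = process.hrtime(start);
    var ms = diff[0] * 1e3 + diff[1] / 1e6;
    console.log('%d immediates in %s ms', total, ms.toFixed(1));
    return;
  }
  setImmediate(next);
}

setImmediate(next);
```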
Woah! This is bad. I did not realize it worked this way. Allowing plan-interference attacks via throwing handlers is the opposite of what any asynchronous scheduling mechanism should be about. (See, e.g., slides 44 onward of Mark Miller's presentation.) Have you considered using …
@bnoordhuis you can check if domains are being used before checking if there's a domain by …

@domenic we already do (https://github.com/joyent/node/blob/master/src/node.js#L372-L377). Problem w/ the approach of marching blindly on in the case of an error is that an error callback can also throw. If you hit that, and you're using domains, you're now in an infinite loop.
Oh dear :(. I'll have to give this some thought; there must be a way to avoid plan interference attacks while preserving domain functionality (and not hurting performance, of course).
@trevnorris I completely agree with …
It's equally as true that … I agree with @Mithgol: despite the confusing names, a good first step would be to accurately document the current behaviour (a 1,400% speed improvement would be nice, too), starting with rewriting "On the next loop around the event loop call this callback." If you call nextTick from inside nextTick, won't it happen right away? And aren't there multiple points in a …
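For reference, the "won't it happen right away?" question as a runnable snippet (behaviour as of 0.11, where the tick-depth check is gone):

```js
process.nextTick(function() {
  console.log('outer');
  // Scheduled from inside a nextTick callback: still runs before control
  // returns to the event loop, i.e. before the immediate below.
  process.nextTick(function() {
    console.log('inner');
  });
});

setImmediate(function() {
  console.log('immediate');
});

// Expected output: outer, inner, immediate
```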
A good starting step would be editing http://nodejs.org/docs/latest/api/process.html#process_process_nexttick_callback to add a hyperlink to http://nodejs.org/docs/latest/api/timers.html (so that any human reader can compare the two and make an informed choice). The description of …
Previously only one cb per turn of the event loop was processed at a time, which is not exactly what is meant by "immediate". Fixes nodejs#5798
The update changes the execution order, which now seems incorrect:

```js
setImmediate(function() {
  console.log('0');
  process.nextTick(function() {
    console.log('2');
    process.nextTick(function() {
      console.log('3');
    });
  });
});

setImmediate(function() {
  console.log('4');
  process.nextTick(function() {
    console.log('5');
  });
});

// before: 0 2 3 4 5
// now:    0 4 2 5 3
```

I expect that callbacks passed to …
This change almost solidifies my belief that …
Very good.
It is important to document the fairness issue. Besides … From what I understand, each turn of the event loop resets the queues and processes all the callbacks that had been queued. The callbacks are on separate queues, so they are not processed in exactly first-in first-out order, but there is no risk of starvation because all the queues are emptied at every turn of the event loop. But I'm not sure I'm getting it right. Can someone comment?
All deferral mechanisms guarantee execution order to be the same as …
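As a tiny illustrative check of that guarantee: within each mechanism, callbacks run in the order they were scheduled, even though the queues themselves are drained at different points in the turn.

```js
process.nextTick(function() { console.log('tick A'); });
process.nextTick(function() { console.log('tick B'); });

setImmediate(function() { console.log('immediate A'); });
setImmediate(function() { console.log('immediate B'); });

// Expected output: tick A, tick B, immediate A, immediate B
```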
Thanks TJ. I just have one more question: what are the categories, and in which order are they processed?
What about the callbacks from I/O operations? Are they on a separate queue, processed after the setTimeout queue?
OK, but the diagram does not show whether only one I/O callback is dequeued at each turn of the loop or whether all the queued I/O callbacks are processed. I ran a little test and it looks like several I/O callbacks are processed.
@tjfontaine Please remind me tomorrow to write up extensive documentation.
@trevnorris @tjfontaine I've been working on a blog post off and on for weeks now with a bunch of this stuff in there. Here's my current draft: https://gist.github.com/othiym23/6102710 Feel free to suggest revisions, or fork and edit. Big blocker right now is doing enough of a crawl through the C++ to accurately describe how the event loop is actually run. I have the JS side down, but the interaction between …
Here's a quick rundown of what a tick of the libuv event loop looks like. Dispatch in chronological order:

process.nextTick() callbacks weave through pretty much all of the above; just about everything that calls into JS land can set off a dispatch of pending nextTick callbacks... which is pretty fscking stupid IMO, it should be more like setImmediate() and dispatch only once per tick. The downside is that callbacks may sit in the queue longer, which can affect latency and/or throughput. The whole current "call into JS land" scheme is pretty deranged right now. A call to MakeCallback() potentially does four C++ to JS calls: to enter the callback's domain, to invoke the callback, to exit the domain again, and to dispatch pending nextTick callbacks. That's a lot of potential overhead. :-( Anyway, hope that helps. (EDIT: That comment about the four calls into JS land is only applicable when you use domains. If you don't, there's just the callback and the nextTick dispatch.)
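To make that dispatch order concrete, a small example (an illustration, not taken from the comment above) run from inside an I/O callback: nextTick fires as soon as the callback returns, the immediate fires in the same turn's check phase, and the zero-delay timer waits for the next turn's timer phase.

```js
var fs = require('fs');

fs.stat(__filename, function() {
  setTimeout(function() { console.log('timeout'); }, 0);
  setImmediate(function() { console.log('immediate'); });
  process.nextTick(function() { console.log('nextTick'); });
});

// Expected output: nextTick, immediate, timeout
```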
Ironically it was slower when the entire thing was done with a single call …
Apply this patch:
Then run this:
Which prints this:
Suggesting that the callbacks are spread out over multiple ticks, i.e. one per tick.
/cc @tjfontaine