Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Infinite loop in processTimers #23921

Closed
jcormont opened this issue Oct 27, 2018 · 9 comments
Closed

Infinite loop in processTimers #23921

jcormont opened this issue Oct 27, 2018 · 9 comments
Labels
duplicate Issues and PRs that are duplicates of other issues or PRs. timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout.

Comments

@jcormont
Copy link

jcormont commented Oct 27, 2018

  • Version: v11.0.0
  • Platform: Mac OS Mojave
  • Subsystem: timers

There seems to be a critical issue in the timers.js module which causes an infinite loop in certain cases.

I kept hitting this issue at random times in my server code (which heavily depends on timers and async for connection state, timeouts etc), bringing the CPU up to 100%, but I noticed that the issue wasn't in my code at all. The debugger kept stepping through various functions in timers.js, and never any of my own code when I paused the program while it was stuck.

After rewriting my own code several times I got into different infinite loops in the same module (?!) -- therefore I am not sure if the exact issue below is the only one in this module, or if there is a deeper issue here. After another rewrite what finally 'worked' for me was removing calls to clearTimeout. (My timers were just there anyway to introduce timeouts to reject Promises iff they were not resolved yet, so they could easily be ignored; not even sure if there would be a performance benefit to clear timeouts vs rejecting already-resolved promises).

But I did some more digging with the clearTimeout calls back in place and I think I located (one of ) the issue(s), although my own static analysis skills aren't good enough to pinpoint exactly the steps to reproduce.


Looking at the code in processTimers I notice the following:

while (list = queue.peek())
  // loop here, which loops infinitely in my case...

The queue is apparently a priority queue of lists, which are also referenced by lists. Presumably, the list value should be different every time unless there are multiple timers to be fired from the same list.

Within this while loop there is mostly a call to listOnTimeout(list, ...). Notice the following code at the end of listOnTimeout:

  // The current list may have been removed and recreated since the reference
  // to `list` was created. Make sure they're the same instance of the list
  // before destroying.
  if (list === lists[msecs]) {
    delete lists[msecs];
    queue.shift();
  }

Notice that nothing happens to queue at all if the list is not in lists. Therefore the peek call in the top-most while loop condition will return the same list all the time.

Could there be a case where list is not in lists anymore, but still in queue? Apparently there is, and it is related to the effect of clearTimeout...?


Analysis: At the above point in my debugger, list has the following properties:

expiry:17376
id:-9007199254740989
msecs:15000
priorityQueuePosition:1

However lists does not have an entry at all for 15000. At the same time, queue.peek() does in fact return the list value. Stepping into peek reveals that the first element is indeed the same list.

After running queue.removeAt(1), my program continues running normally, confirming that this is indeed the exact issue.

I can't figure out how this situation can ever arise (the code in unenroll, which is called by clearTimeout, looks fine), maybe it's even a bug in PriorityQueue?

In any case, perhaps it's prudent to wrap the call to queue.shift in a separate if-statement, that checks queue.peek just to be sure. Like so:

  // Double check
  if (list === lists[msecs]) {
    delete lists[msecs];
  }
  if (queue.peek() === list) {
    queue.shift();
  }
@richardlau
Copy link
Member

See #23860. Should be fixed by #23870.

@richardlau richardlau added duplicate Issues and PRs that are duplicates of other issues or PRs. timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout. v11.x labels Oct 27, 2018
@jcormont
Copy link
Author

Thanks! Indeed already fixed.

This is what I get for searching only open issues, didn't think an issue like this would be fixed this fast :)

@jcormont
Copy link
Author

jcormont commented Nov 1, 2018

I am reopening this because until now there is no release and it looks like a really critical issue to me. Having an open issue on this will help others figure out what is happening. There was talk of fast-tracking a patch to be released in a day but that hasn't happened.

@jcormont jcormont reopened this Nov 1, 2018
@targos
Copy link
Member

targos commented Nov 2, 2018

The fix just went out in v11.1.0.

@avwo
Copy link

avwo commented Nov 9, 2018

I also have this problem when running whistle on Node v11.1.0 in Windows. There is no such problem in < v11 version.

Infinite loop

@apapirovski
Copy link
Member

apapirovski commented Nov 9, 2018

@avwo see #24203 and the associated PR. The cause is the same after looking at the source for Whistle.

That said, if you have a reproduction — please post it.

@apapirovski
Copy link
Member

@avwo I found the bug. PR upcoming later today.

@apapirovski
Copy link
Member

@avwo Are you certain you're running this on 11.1.0? I just double checked after freshly compiling from master and I don't see this behavior for the whistle tests. I do see this behaviour with 11.0.0. (I was previously running an old build of master when I had confirmed the bug above.)

@avwo
Copy link

avwo commented Nov 10, 2018

@apapirovski I tried it many times and I can definitely have this problem on 11.1.0. This problem is not easy to reproduce. I need to whistle for a while. I am trying to find the code that triggered this bug.
loop2
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
duplicate Issues and PRs that are duplicates of other issues or PRs. timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout.
Projects
None yet
Development

No branches or pull requests

5 participants