Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flaky test-timers-block-eventloop #16310

Closed
targos opened this issue Oct 19, 2017 · 10 comments
Closed

Flaky test-timers-block-eventloop #16310

targos opened this issue Oct 19, 2017 · 10 comments
Labels
flaky-test Issues and PRs related to the tests with unstable failures on the CI. test Issues and PRs related to the tests. timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout.

Comments

@targos
Copy link
Member

targos commented Oct 19, 2017

Test added in #15072

Example failure:
https://ci.nodejs.org/job/node-test-commit-linuxone/9533/nodes=rhel72-s390x/console

not ok 1945 sequential/test-timers-block-eventloop
  ---
  duration_ms: 0.440
  severity: fail
  stack: |-
    assert.js:45
      throw new errors.AssertionError({
      ^
    
    AssertionError [ERR_ASSERTION]: eventloop blocked!
        at Timeout.mustNotCall [as _onTimeout] (/data/iojs/build/workspace/node-test-commit-linuxone/nodes/rhel72-s390x/test/common/index.js:571:12)
        at ontimeout (timers.js:478:11)
        at tryOnTimeout (timers.js:302:5)
        at Timer.listOnTimeout (timers.js:262:5)
  ...
@targos targos added flaky-test Issues and PRs related to the tests with unstable failures on the CI. timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout. test Issues and PRs related to the tests. labels Oct 19, 2017
@gireeshpunathil
Copy link
Member

/cc @nodejs/platform-s390

@gireeshpunathil
Copy link
Member

I guess the reported error is on RHEL on S390

@zhangzifa
Copy link
Contributor

zhangzifa commented Oct 19, 2017

@targos and @gireeshpunathil
I reproduce this at my laptop(thinkpad x240/ubuntu 16.04) with the following steps:

  1. start 4 processes to occupy the cpu while(1);
  2. run the test case every other second.
  3. the error may happen every 10 times running the case.

So I guess maybe the test server was busy when this happens.

Fix is easy: just change the t3 timeout to 200ms.

--- a/test/sequential/test-timers-block-eventloop.js
+++ b/test/sequential/test-timers-block-eventloop.js
@@ -11,7 +11,7 @@ const t2 = setInterval(() => {
   common.busyLoop(15);
 }, 10);
 
-const t3 = setTimeout(common.mustNotCall('eventloop blocked!'), 100);
+const t3 = setTimeout(common.mustNotCall('eventloop blocked!'), 200);

@zhangzifa
Copy link
Contributor

With this fix, failures does not happen after 1000 times running. Is a PR needed?

@gireeshpunathil
Copy link
Member

@zhangzifa - makes sense, the test seem to be walking on a ridge with high sensitivity on the timing of events that does not account into the environmental factors in the system. While we wait to hear from @nodejs/platform-s390 this experiment and the proposal thereon looks reasonable to me, thanks!

@mhdawson
Copy link
Member

@jBarz can you take a look and comment.

@jBarz
Copy link
Contributor

jBarz commented Oct 20, 2017

It seems like we are depending on the following fs.stat call to callback within 50 milliseconds. Otherwise the test fails.

fs.stat('./nonexistent.txt', (err, stats) => {...}

Increasing the timeout is one way to fix this.
@zhangzifa , could we stat something like /dev/zero which won't depend on the speed of the filesystem mount?

@zhangzifa
Copy link
Contributor

@jBarz I tested just now with /dev/zero, anyway, it still fails occasionally when cpu is busy. Anyway, I will update the case to /dev/nonexistent

@gibfahn
Copy link
Member

gibfahn commented Oct 20, 2017

@zhangzifa a PR would be great.

@zhangzifa
Copy link
Contributor

@gibfahn 16314 is there now.

zhangzifa added a commit to zhangzifa/node that referenced this issue Oct 20, 2017
When CPU is busy, the above sequential case fails occasionally,
expand the timeout value to fix it.

Fixes: nodejs#16310
@refack refack closed this as completed in 83b8474 Oct 29, 2017
gibfahn pushed a commit that referenced this issue Oct 30, 2017
When CPU is busy, the above sequential case fails occasionally,
expand the timeout value to fix it.

PR-URL: #16314
Fixes: #16310
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
gibfahn pushed a commit that referenced this issue Oct 30, 2017
When CPU is busy, the above sequential case fails occasionally,
expand the timeout value to fix it.

PR-URL: #16314
Fixes: #16310
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
gibfahn pushed a commit that referenced this issue Oct 31, 2017
When CPU is busy, the above sequential case fails occasionally,
expand the timeout value to fix it.

PR-URL: #16314
Fixes: #16310
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Qard pushed a commit to ayojs/ayo that referenced this issue Nov 2, 2017
When CPU is busy, the above sequential case fails occasionally,
expand the timeout value to fix it.

PR-URL: nodejs/node#16314
Fixes: nodejs/node#16310
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
MylesBorins pushed a commit that referenced this issue Nov 16, 2017
When CPU is busy, the above sequential case fails occasionally,
expand the timeout value to fix it.

PR-URL: #16314
Fixes: #16310
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
MylesBorins pushed a commit that referenced this issue Nov 21, 2017
When CPU is busy, the above sequential case fails occasionally,
expand the timeout value to fix it.

PR-URL: #16314
Fixes: #16310
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
MylesBorins pushed a commit that referenced this issue Nov 28, 2017
When CPU is busy, the above sequential case fails occasionally,
expand the timeout value to fix it.

PR-URL: #16314
Fixes: #16310
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
addaleax pushed a commit to ayojs/ayo that referenced this issue Dec 7, 2017
When CPU is busy, the above sequential case fails occasionally,
expand the timeout value to fix it.

PR-URL: nodejs/node#16314
Fixes: nodejs/node#16310
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flaky-test Issues and PRs related to the tests with unstable failures on the CI. test Issues and PRs related to the tests. timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants