Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: tune test-uv-threadpool-schedule #25358

Closed
wants to merge 2 commits into from
Closed

Conversation

Trott
Copy link
Member

@Trott Trott commented Jan 6, 2019

test-uv-threadpool-schedule has been failing consistently in
node-daily-master CI. It also fails consistently on my personal laptop.
These changes make it pass consistently with current master but fail
consistently with Node.js 10.11 (which was the last release in the 10.x
line before the fix that the test is for). It succeeds/fails as expected
whether or not I am connected to the network.

Fixes: #25305

/ping @gireeshpunathil

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines

test-uv-threadpool-schedule has been failing consistently in
node-daily-master CI. It also fails consistently on my personal laptop.
These changes make it pass consistently with current master but fail
consistently with Node.js 10.11 (which was the last release in the 10.x
line before the fix that the test is for). It succeeds/fails as expected
whether or not I am connected to the network.

Fixes: nodejs#25305
@Trott Trott added the flaky-test Issues and PRs related to the tests with unstable failures on the CI. label Jan 6, 2019
@nodejs-github-bot nodejs-github-bot added the test Issues and PRs related to the tests. label Jan 6, 2019
@Trott
Copy link
Member Author

Trott commented Jan 6, 2019

CI job that runs the internet test: https://ci.nodejs.org/job/node-test-commit-custom-suites/815/

That and the linter are the only relevant ones in this case.

@Trott
Copy link
Member Author

Trott commented Jan 6, 2019

Unfortunately, that still didn't quite work...

16:33:10 not ok 25 internet/test-uv-threadpool-schedule
16:33:10   ---
16:33:10   duration_ms: 0.213
16:33:10   severity: fail
16:33:10   exitcode: 1
16:33:10   stack: |-
16:33:11     assert.js:351
16:33:11         throw err;
16:33:11         ^
16:33:11     
16:33:11     AssertionError [ERR_ASSERTION]: fast I/O took longer to complete, actual: 11, expected: 3.5
16:33:11         at GetAddrInfoReqWrap.onResolve (/home/iojs/build/workspace/node-test-commit-custom-suites/default/test/internet/test-uv-threadpool-schedule.js:41:12)
16:33:11         at GetAddrInfoReqWrap.callback (/home/iojs/build/workspace/node-test-commit-custom-suites/default/test/common/index.js:376:15)
16:33:11         at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:57:17)
16:33:11   ...

@Trott Trott added the wip Issues and PRs that are still a work in progress. label Jan 6, 2019
@Trott
Copy link
Member Author

Trott commented Jan 6, 2019

OK, updated it again. This passes/fails as expected with different versions of Node.js with and without network connectivity.

CI (internet): https://ci.nodejs.org/job/node-test-commit-custom-suites/816/
CI (lite): https://ci.nodejs.org/job/node-test-pull-request-lite-pipeline/2151/

@Trott Trott removed the wip Issues and PRs that are still a work in progress. label Jan 6, 2019
@Trott
Copy link
Member Author

Trott commented Jan 6, 2019

That seemed to work that time. Collaborators, please 👍 here to fast-track to fix the node-daily-master job.

@Trott Trott added the fast-track PRs that do not need to wait for 48 hours to land. label Jan 6, 2019
@Trott
Copy link
Member Author

Trott commented Jan 6, 2019

(Unrelated failure in the custom-suites run is fixed in #25365.)

@gireeshpunathil
Copy link
Member

@Trott - I understand the rational behind this change, and may be the right thing to do eventually as well.

However, the relaxation of the expectation (fast I/O to finish at least in half time of that of slow I/O) seems (to me) to be too wide to be a determinant of the libuv function that this test tests.

For the failures, what confidence do we have in terms of the test calculation being too strict in a load-dependant environment versus libuv function failure?

In short, until someone (an SME) answering this key question, I am concerned that this change is just a peripheral remedy.

@Trott
Copy link
Member Author

Trott commented Jan 6, 2019

In short, until someone (an SME) answering this key question, I am concerned that this change is just a peripheral remedy.

As written, here are the things that would seem to significantly affect the test:

  • network speed (or more generally, speed at which DNS queries get a response)
  • network being enabled or disabled
  • load on the machine
  • I/O issues
  • myriad environmental possibilities (e.g., if the directory being written to is NFS-mounted or not)

Given all that, "divide by 2" vs. "divide by 100" vs. "divide by some other magic number" is a matter of calibrating in the real world.

  • With or without the libuv fix, divide-by-100 fails consistently for me locally and on CI.
  • Divide-by-2 on the other hand fails for me consistently when the libuv fix is absent and passes for me consistently when the libuv fix is present. It also seems to pass in CI (although data set is smaller there). It also works as expected (fails without libuv fix, passes with libuv fix) when the network is disabled.

I think the real fix (and I think you've talked about this, or at least alluded to it) is to change the approach of the test to more reliably measure threadpool activity rather than take performance ratios as a proxy. And yeah, if a SME can weigh in on if that's possible and (if so) how to achieve it, that would definitely be better. @nodejs/libuv

But I think this is what makes sense for now based on real-world observations.

@gireeshpunathil
Copy link
Member

@Trott - thanks for the detailed description and the reasons for the change. Based on your observations and my own experiments - especially

Divide-by-2 on the other hand fails for me consistently when the libuv fix is absent

gives me sufficient confidence that this change is valid.

I was only trying to reason with factual evidences, hope you did not consider my concerns as blockers.

@Trott
Copy link
Member Author

Trott commented Jan 7, 2019

I was only trying to reason with factual evidences, hope you did not consider my concerns as blockers.

Wasn't sure if it was blocking or not, but it certainly seemed reasonable! I always appreciate your comments.

@Trott Trott added fast-track PRs that do not need to wait for 48 hours to land. author ready PRs that have at least one approval, no pending requests for changes, and a CI started. and removed fast-track PRs that do not need to wait for 48 hours to land. labels Jan 7, 2019
@Trott
Copy link
Member Author

Trott commented Jan 8, 2019

Landed in 9987f1a

@Trott Trott closed this Jan 8, 2019
Trott added a commit to Trott/io.js that referenced this pull request Jan 8, 2019
test-uv-threadpool-schedule has been failing consistently in
node-daily-master CI. It also fails consistently on my personal laptop.
These changes make it pass consistently with current master but fail
consistently with Node.js 10.11 (which was the last release in the 10.x
line before the fix that the test is for). It succeeds/fails as expected
whether or not I am connected to the network.

Fixes: nodejs#25305

PR-URL: nodejs#25358
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
addaleax pushed a commit that referenced this pull request Jan 9, 2019
test-uv-threadpool-schedule has been failing consistently in
node-daily-master CI. It also fails consistently on my personal laptop.
These changes make it pass consistently with current master but fail
consistently with Node.js 10.11 (which was the last release in the 10.x
line before the fix that the test is for). It succeeds/fails as expected
whether or not I am connected to the network.

Fixes: #25305

PR-URL: #25358
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
refack pushed a commit to refack/node that referenced this pull request Jan 14, 2019
test-uv-threadpool-schedule has been failing consistently in
node-daily-master CI. It also fails consistently on my personal laptop.
These changes make it pass consistently with current master but fail
consistently with Node.js 10.11 (which was the last release in the 10.x
line before the fix that the test is for). It succeeds/fails as expected
whether or not I am connected to the network.

Fixes: nodejs#25305

PR-URL: nodejs#25358
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
@BridgeAR BridgeAR mentioned this pull request Jan 16, 2019
BridgeAR pushed a commit to BridgeAR/node that referenced this pull request Jan 16, 2019
test-uv-threadpool-schedule has been failing consistently in
node-daily-master CI. It also fails consistently on my personal laptop.
These changes make it pass consistently with current master but fail
consistently with Node.js 10.11 (which was the last release in the 10.x
line before the fix that the test is for). It succeeds/fails as expected
whether or not I am connected to the network.

Fixes: nodejs#25305

PR-URL: nodejs#25358
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
@MylesBorins MylesBorins mentioned this pull request Jan 24, 2019
BethGriggs pushed a commit that referenced this pull request Apr 28, 2019
test-uv-threadpool-schedule has been failing consistently in
node-daily-master CI. It also fails consistently on my personal laptop.
These changes make it pass consistently with current master but fail
consistently with Node.js 10.11 (which was the last release in the 10.x
line before the fix that the test is for). It succeeds/fails as expected
whether or not I am connected to the network.

Fixes: #25305

PR-URL: #25358
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
@BethGriggs BethGriggs mentioned this pull request May 1, 2019
BethGriggs pushed a commit that referenced this pull request May 10, 2019
test-uv-threadpool-schedule has been failing consistently in
node-daily-master CI. It also fails consistently on my personal laptop.
These changes make it pass consistently with current master but fail
consistently with Node.js 10.11 (which was the last release in the 10.x
line before the fix that the test is for). It succeeds/fails as expected
whether or not I am connected to the network.

Fixes: #25305

PR-URL: #25358
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
MylesBorins pushed a commit that referenced this pull request May 16, 2019
test-uv-threadpool-schedule has been failing consistently in
node-daily-master CI. It also fails consistently on my personal laptop.
These changes make it pass consistently with current master but fail
consistently with Node.js 10.11 (which was the last release in the 10.x
line before the fix that the test is for). It succeeds/fails as expected
whether or not I am connected to the network.

Fixes: #25305

PR-URL: #25358
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Sakthipriyan Vairamani <thechargingvolcano@gmail.com>
@Trott Trott deleted the fix-uv-test branch January 13, 2022 22:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
author ready PRs that have at least one approval, no pending requests for changes, and a CI started. flaky-test Issues and PRs related to the tests with unstable failures on the CI. test Issues and PRs related to the tests.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

investigate flaky internet/test-uv-threadpool-schedule
6 participants