[Converge] Flaky Tests #32

jasnell · 2015-05-22T19:54:56Z

Flaky tests were removed in io.js. They've still been useful in node.js's CI. Need a final decision on this.

e8b21095670543ad6ee8e24e0a1b096b908e20b1 test: update flaky test definitions - nodejs/node-v0.x-archive@e8b2109
@misterdjules @orangemocha @nodejs/tsc

The text was updated successfully, but these errors were encountered:

Fishrock123 · 2015-05-22T19:56:59Z

I think the original flaky tests were fixed in io.js, but there's still one or two that give sporadic failures. The much larger, but not entirely related issue is jenkins messing up certain windows tests.

rvagg · 2015-05-23T00:26:52Z

Please no no no, I have strong objections to this on a philosophical level, flakey tests should be fixed rather than accepted, that's why the whole thing was removed from io.js. I haven't checked but has it been put back in in this repo?

We have ongoing problems with the io.js CI because of Windows+Jenkins crap that we haven't resolved but as far as I'm concern this just means we need to work on it and figure out a fix, not accept it and move on.

jbergstroem · 2015-05-24T22:04:35Z

I have a pretty strong opinion about this as well. Hiding them won't really solve anything.

orangemocha · 2015-05-25T13:12:34Z

I think there might have been a bit of miscommunication around the topic of ‘flaky tests’. The point is not to to accept nor to ignore that tests are failing. The point is to enable an automated merge job, like node-accept-pull-request. (in fact, I think it would be best to rename the issue title to something more along those lines).

node-accept-pull-requests automatically merges changes, after rebasing them on top of the target branch and passing tests. We have found that useful because it pretty much eliminates the chances of anyone introducing regressions in the test set that it runs on. Using this merge job has ensured that a) nobody can forget to run tests on all platforms; b) nobody can make a mistake when pushing directly to the repo; c) changes are tested on the same final rebase that will be pushed to origin, so it’s not possible for change B to be pushed between the time change A was tested and the time it is rebased/pushed.

We could argue whether this process is useful and whether there are good cases for not following it. But in the context of an automated merge job, support for flaky tests is must. Perhaps it’s the terminology ‘flaky tests’ that stirs confusion. You may think of it as mere configuration for the node-accept-pull-request, to tell it which tests should not prevent an automated merge. Such configuration is versioned with the code, as it should be. But it’s not used for anything else other than the automated merge job. Even in the automated merge job, failures for the tests that are marked as flaky are not completely hidden. Builds with failures in flaky tests are marked as ‘yellow’, instead of ‘blue’, and the TAP test results show the tests that failed, with a ‘todo’ next to the outcome.

When a test is marked as flaky, a high-priority issue should be filed to follow up on the test flakiness, so that this test can be re-enabled again for the automated merge job.

In io.js you have been running CI on every PR, which is great. I am pretty sure, you have had cases where -say- John’s PR was tested and some tests failed that you recognized as not due to the John’s PR. Even if you fixed all the tests, this can always happen, as tests can be susceptible to environment or timing issues. When this happened, did you prevent John’s PR to be merged? Of course not. You made the determination that the failure was independent from John’s changes. Flaky test configuration allows node-accept-pull-request to do the same in an automated way.

Having this supported in an automated way allows for the process to scale. We can add more tests and more platform to verify every change that gets pushed to the main repo, and keep it manageable. Otherwise, we'll end up scribbling the same list of tests on a piece of paper. I think we should enhance this functionality, not remove it. It should be noted that this support in our test runner is based on support that was there in an ancient version of the v8 test runner. Since then, the v8 test runner has greatly improved support for this type of configuration.

orangemocha · 2015-05-25T13:19:19Z

My suggestion would be to implement a node-accept-pull-request -like job in the converged CI system, without making mandatory, but giving people a chance to try it and show its usefulness. Only if people find it useful, we can adopt it in a more systematic way. I think that should be pretty uncontroversial.

But there's no point in disabling support for flaky tests in the test runner. It does nothing different unless you pass --flaky-tests=dontcare as an option.

jbergstroem · 2015-05-25T22:09:31Z

We just had a quick talk about this in the build meeting; I think the thing "between us" in terms of mentality is the merge-pr job -- since the amount of flaky tests we're experience is probably around 10 we should see what can be done about them, but to avoid holding things up we should indeed support a flaky flag and make sure we have a high priority on resolving them once flagged as flaky.

jasnell · 2015-05-26T00:40:09Z

Perhaps this as a compromise: incorporate the flaky tests for day to day
commits but require the test failures to be documented and to be fixed
prior to an LTS. That gives a significant amount of leeway.
On May 25, 2015 3:09 PM, "Johan Bergström" notifications@github.com wrote:

We just had a quick talk about this in the build meeting; I think the
thing "between us" in terms of mentality is the merge-pr job -- since the
amount of flaky tests we're experience is probably around 10 we should see
what can be done about them, but to avoid holding things up we should
indeed support a flaky flag and make sure we have a high priority on
resolving them once flagged as flaky.

—
Reply to this email directly or view it on GitHub
nodejs/node#32 (comment).

bnoordhuis · 2015-05-26T19:50:38Z

to avoid holding things up we should indeed support a flaky flag and make sure we have a high priority on resolving them once flagged as flaky

Who is on the hook for flaky tests? If the answer is 'no one in particular', there is no real imperative for anyone to go in and fix them.

I'd love to see automated merges a la Rust, by the way. But with rebase and fast-forward, of course, anything else is blasphemy.

jbergstroem · 2015-05-26T22:44:53Z

@orangemocha perhaps you could elaborate on what we ended yesterdays call with in regards to how you're procedure currently work?

orangemocha · 2015-05-27T13:08:58Z

The current procedure is to file a P-1 issue and assign it to someone, at the time that we mark a test a flaky. We should probably also set a milestone goal. Not sure if we can do much better than that. If by the time the milestone date hits, the flakiness has not been resolved we can decide to hold up the milestone for it, or to postpone the issue. This is to say that unless a test is invalid, it's better to keep it around as flaky than to delete it. As I mentioned, flaky tests are run and their failures are shown (in yellow instead of red). Deleting a test really hides the failure.

But with rebase and fast-forward, of course, anything else is blasphemy.

@bnoordhuis : yes, that's how node-accept-pull-request does it by default. It also has a merge option, which was intended for branch merges.

jasnell · 2015-05-27T17:53:00Z

Given that this is largely a build/CI issue, I'm inclined to defer this to the @nodejs/build WG for resolution.

rvagg · 2015-05-27T23:50:22Z

agreed, we'll be discussing this at our next meeting and possibly beyond, current plan is to wait to experiment with the jenkins merge job and understand the value of flakey test marking and then attempt to move towards a compromise situation that satisfies everyone's concerns

jasnell · 2015-07-09T22:28:04Z

@rvagg @misterdjules ... do we need to keep this issue open?

misterdjules · 2015-07-09T22:45:06Z

@orangemocha and @rvagg know more about whether the flaky tests mechanism will be kept in the converged CI platform, so deferring to them.

rvagg · 2015-07-10T00:28:18Z

Yes, I believe current status is that we've capitulated and agreed that flaky tests will need to be in converged repo in order to make the node-accept-pull-request jenkins job to work satisfactorily, and in fact we may even want it to come back in to io.js for the same reason. Will let @orangemocha update on progress there.

orangemocha · 2015-07-10T14:07:37Z

Update on progress:

node-accept-pull-request is already implemented in the converged CI, and will support the current joyent\node v0.12 workflow as soon as this PR is landed (which should happen in the next day or two). I plan on taking a shortcut for 0.10 and port a simpler job to support the current workflow, since we don't use node-accept-pull-request for it. After that, we can start transitioning joyent\node collaborators to the new CI.
node-accept-pull-request is technically ready to work in io.js though there are two work items (both on my plate) necessary to make it practical for wide adoption:

Adding support for flaky tests in io.js.
Reducing the time spent running the ARM builds

In terms of process for marking tests as flaky, I think the current process that we have used in joyent\node is a reasonable starting point: we open a P-1 issue to investigate the flakiness and assign it to the next milestone, cc'ing the project collaborators. I will document this in the wiki by the time we announce the availability of node-accept-pull-request. If we find that this process is not sufficient we might want to consider assigning to specific collaborators, either the test/area owner or -if that's not available- any collaborator in a round-robin fashion.

With regard to this issue, I think the original misalignment that it was meant to track has been resolved, so it should be fine to close it. 😄

jasnell · 2015-07-10T14:33:43Z

Excellent!

jasnell added the pull-from-joyent-node label May 22, 2015

jasnell added the build-wg label May 27, 2015

jasnell closed this as completed Jul 10, 2015

orangemocha mentioned this issue Aug 18, 2015

CI convergence (flaky tests support, add internet tests) nodejs/node#2424

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Converge] Flaky Tests #32

[Converge] Flaky Tests #32

jasnell commented May 22, 2015

Fishrock123 commented May 22, 2015

rvagg commented May 23, 2015

jbergstroem commented May 24, 2015

orangemocha commented May 25, 2015

orangemocha commented May 25, 2015

jbergstroem commented May 25, 2015

jasnell commented May 26, 2015

bnoordhuis commented May 26, 2015

jbergstroem commented May 26, 2015

orangemocha commented May 27, 2015

jasnell commented May 27, 2015

rvagg commented May 27, 2015

jasnell commented Jul 9, 2015

misterdjules commented Jul 9, 2015

rvagg commented Jul 10, 2015

orangemocha commented Jul 10, 2015

jasnell commented Jul 10, 2015

[Converge] Flaky Tests #32

[Converge] Flaky Tests #32

Comments

jasnell commented May 22, 2015

Fishrock123 commented May 22, 2015

rvagg commented May 23, 2015

jbergstroem commented May 24, 2015

orangemocha commented May 25, 2015

orangemocha commented May 25, 2015

jbergstroem commented May 25, 2015

jasnell commented May 26, 2015

bnoordhuis commented May 26, 2015

jbergstroem commented May 26, 2015

orangemocha commented May 27, 2015

jasnell commented May 27, 2015

rvagg commented May 27, 2015

jasnell commented Jul 9, 2015

misterdjules commented Jul 9, 2015

rvagg commented Jul 10, 2015

orangemocha commented Jul 10, 2015

jasnell commented Jul 10, 2015