
Enforce citgm for major changes #126

Closed
MylesBorins opened this issue May 11, 2017 · 37 comments

Comments

@MylesBorins

We should do this.

At the very least it will document in the thread that we care about these breakages

@jasnell
Member

jasnell commented May 11, 2017

I'm fine with this. Would need a PR adding the requirement to the collaborator docs. If it's not there already, we should make it a required step for all releases also... perhaps even listing the failures in the release notes as known issues.

@addaleax
Member

Sounds good. Two thoughts:

  • Reading CITGM output is hard because there are so many known/expected failures, and it’s not obvious where those come from. Can we have a tracking issue at nodejs/node with all the failures?
  • While error message changes are still semver-major, can we exclude them from this rule?

@jasnell
Member

jasnell commented May 11, 2017

It makes a ton of sense to exclude certain types of semver-majors from this rule:

  1. error message changes
  2. semver-major changes to Node.js' own tooling, build, test framework or benchmarks

Generally, only semver-major changes in src/* and lib/* would be subject to the rule.
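[Editor's note: for illustration only, and not part of the proposed policy, a collaborator could check whether a semver-major PR even falls under this rule by filtering the diff for src/ and lib/. The branch names here are just an example.]

```sh
# Hypothetical pre-check: list files the branch changes under src/ or lib/.
# Any output means the change would be subject to the CITGM requirement.
git fetch origin master
git diff --name-only origin/master...HEAD | grep -E '^(src|lib)/'
```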

@sam-github

Don't we want to know when an error message change breaks a package on npmjs.org?

@jasnell
Member

jasnell commented May 11, 2017

Sure, but those do not necessarily need to be identified before landing the PR... we're also not likely to use such a failure to determine whether or not to land such PRs. If we're changing the error message, we're likely doing so for a very good reason.

@cjihrig

cjihrig commented May 11, 2017

SGTM. Like @addaleax, I'd like easier-to-understand CITGM output.

What happens when a breaking change actually breaks something in CITGM? Would we just never break anything ever again? Is that something we'd decide on a case by case basis?

@jasnell
Member

jasnell commented May 11, 2017

We would need to decide on a case-by-case basis. CITGM is not a gate, it's an early warning system. It helps us understand the potential impact a change may have, but it should never stop us from making any particular change.

@MylesBorins
Author

MylesBorins commented May 12, 2017 via email

@jasnell
Member

jasnell commented May 12, 2017

Oh, I don't doubt that at all @MylesBorins. I just don't believe that failures we happen to see in citgm should block changes that we feel should be made. For example, we will never get out of the cycle of treating error message changes as semver-major unless we continue working towards assigning error codes and making the error messages more consistent -- that work could very well break existing code but the benefits far outweigh the potential disruption to existing modules.

@mhdawson
Member

I think we need to get CITGM to the point where it normally just runs/passes (at least one variant that we require for semver-majors). One approach may be to fix the module versions.

Provided that the jobs run/pass reliably and take less than 20 minutes, I think requiring it for all semver-majors should not be too much to ask. The issue of whether it blocks landing the PR or not can be handled on a case-by-case basis. Requiring the run would mean that we know where we stand earlier on, as opposed to having to figure that out at release time.
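[Editor's note: as a sketch of the kind of pinned run mentioned above, assuming the citgm CLI accepts an npm-style name@version specifier (worth verifying against the current citgm docs), one could test a fixed module version instead of whatever `latest` resolves to. Module and version here are only examples.]

```sh
# Install the tool and run a single module at a pinned version.
# (Assumes citgm accepts npm-style name@version specifiers.)
npm install -g citgm
citgm lodash@4.17.4
```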

@MylesBorins
Author

MylesBorins commented May 12, 2017 via email

@mcollina
Member

Currently there are a bunch of failures in citgm against master. I guess this is to be expected, as we do land breaking changes in master. How can we hold changes to an all-green standard? How do I know my change did not break them further?

I have a very specific case for this, my stream.destroy() PR: nodejs/node#12925.

It would be good to have the "timeouts" reported differently than actual failures.

@MylesBorins
Author

@mcollina unfortunately there is no way for us to introspect the reason for a module's failure, so there is no way to know whether failures are caused by timeouts. I would argue that master would not have a bunch of failures if we gated against citgm...

@mhdawson
Member

It's a bit of a chicken-and-egg problem:

  • we need to gate against citgm to keep citgm green
  • we need citgm to be green to gate on citgm

I think we have to plot a path to getting citgm consistently green, otherwise it's too hard to get the maximum value from it. Once we managed to get the core regression builds passing most of the time, it became much easier to keep them green.

Changes in the modules themselves do add a complication, and if they continuously break citgm independently of changes people are making in core, we'll need a way to decouple from that. We may need both pinned modules for the regression runs for core, and the latest for nightlies/releases. If we can keep citgm green on the latest, all the better, and we only need the one. I just think we have to get to the point where anybody can run it and it pretty much passes unless there is a problem with the change they made.

@refack

refack commented May 16, 2017

my 4¢

  1. I actually think that CITGM should primarily be a gate for semver-minors; those should be breakage-free (unless that's already the case and I just didn't RTFM).

It's a bit of a chicken-and-egg problem:

  • we need to gate against citgm to keep citgm green
  • we need citgm to be green to gate on citgm

  2. If it's possible for the CITGM team to estimate the amount of time needed to get the CI job green, we could call a moratorium on landing breaking changes for that time window.

@addaleax
Member

I actually think that CITGM should primarily be a gate for semver-minors; those should be breakage-free (unless that's already the case and I just didn't RTFM).

Everything but semver-major changes should be breakage-free (by definition).

@refack

refack commented May 17, 2017

Everything but semver-major changes should be breakage-free (by definition).

[asking] Do we test that it's actually breakage-free (i.e. via CITGM), or do we deduce it from code reviews and our regression testing?

@MylesBorins
Author

MylesBorins commented May 17, 2017 via email

@MylesBorins
Author

The TLDR on making CITGM useful is that someone needs to audit the results on a weekly basis and keep the lookup up to date for every release line.

This could potentially be automated... but I'm not sure of a clean / intuitive way to do so.
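[Editor's note: for context, the lookup referred to here is citgm's lookup.json, which records per-module expectations. Below is a minimal sketch of the kind of entries a weekly audit would keep current; the module names are made up and the exact option keys should be checked against the citgm docs.]

```json
{
  "lodash": {
    "npm": true
  },
  "some-flaky-module": {
    "flaky": true
  },
  "some-known-broken-module": {
    "skip": true
  }
}
```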

@addaleax
Member

@nodejs/build We could have a citgm-daily (or citgm-weekly) like node-daily-master, right? That might help.

@mhdawson
Member

@gibfahn or myself can easily setup the required CI jobs. Once we agree on what's needed one of us can volunteer to set them up.

@jasnell
Member

jasnell commented May 17, 2017

Setting up the job is the easy part, yes? The real task will be ensuring that there is someone auditing the results. It would be interesting to see if we could have a page similar in nature to benchmarking.nodejs.org but that summarizes the results of the nightly citgm run.

@refack

refack commented May 17, 2017

I'm volunteering to audit the results, but can't commit to every day. IMHO it should be at least two people...
(Hopefully by doing it a couple of times I will figure out how to automate it)

@Trott
Member

Trott commented May 19, 2017

Setting up the job is the easy part, yes? The real task will be ensuring that there is someone auditing the results. It would be interesting to see if we could have a page similar in nature to benchmarking.nodejs.org but that summarizes the results of the nightly citgm run.

For node-daily-master, I just have it as part of my daily morning routine to check https://ci.nodejs.org/job/node-daily-master/. No special page needed. It might be perfectly usable to check the equivalent page for a daily citgm run.

If @refack is volunteering to stay sufficiently knowledgeable about what a clean citgm run looks like and what a problematic one looks like, then that may be all we need for minimum requirements to be effective. (More people would be better of course, but one is all you need.)

@gibfahn
Member

gibfahn commented May 19, 2017

(More people would be better of course, but one is all you need.)

I think one is most definitely not all you need. You need one person available at any time, which translates to at least five people (in my experience).

@refack

refack commented May 19, 2017

You need one person available at any time, which translates to at least five people (in my experience).

I'm hoping that will be a temporary sitch, and that we'll figure out how to automate it.
FWIW we can start with a weekly (or semi-weekly) job.

@refack

refack commented May 21, 2017

Until we find a better format: https://github.com/nodejs/node/wiki/CITGM-Status

@refack

refack commented May 21, 2017

And a bit more use of the wiki:
last night to today diff

@refack

refack commented May 21, 2017

Question: should we keep a lookup.json just for the CI, so we could choose a baseline and turn it green?
It could be considered like a floating patch, since CITGM's lookup.json takes a wider scope into consideration and should be more parsimonious.
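[Editor's note: one possible shape for this, sketched under the assumption that citgm-all can be pointed at an alternate lookup file (e.g. via a --lookup option; verify with `citgm-all --help`), would be to keep a CI-only lookup in core's tooling and pass it explicitly. The path below is illustrative.]

```sh
# Run the full suite against a CI-only lookup kept alongside core,
# leaving citgm's upstream lookup.json untouched.
# (Path and flag are illustrative; check the current citgm-all CLI options.)
citgm-all --lookup ./tools/citgm-ci-lookup.json
```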

@gibfahn
Member

gibfahn commented May 22, 2017

Question: should we keep a lookup.json just for the CI, so we could choose a baseline and turn it green?

Discussed in nodejs/citgm#407

@refack

refack commented May 22, 2017

Yay, my insomnia pays off...
We have our first hit: nodejs/node#13098 (comment)
(well, obviously not yay for the breakage, but for this validation process)
/cc @aqrln

@refack

refack commented May 22, 2017

Question: when encountering a regression, should I inform the module owners (i.e. open an issue)?
Follow-up: what are the criteria for a verified regression? For example, 2 platforms in 1 run, or 1 platform in 2 runs?

@aqrln

aqrln commented May 22, 2017

@refack

when encountering a regression, should I inform the module owners (i.e. open an issue)?

I think that makes sense and is generally nice.

@mcollina
Member

@refack IMHO before pinging the author we should check whether it was our fault or not. If the original PR was not semver-major and was not expected to break that module, then maybe no; we should try to fix things on our side.

@refack

refack commented May 22, 2017

If the original PR was not semver-major and was not expected to break that module, then maybe no; we should try to fix things on our side.

It's definitely an open question. When there is a range of suspected commits, the module author is probably the best resource for pinning down the "culprit"...
For example, in websockets/ws#1118 there was a range nodejs/node@171a43a...ad4765a. Admittedly, in this case it was fairly easy for me to pin it down (modulo an insomnia-induced mixup).

@MylesBorins
Author

Generally I will either ping people on our repo when we add something to flaky, or alternatively send a fix (the latter is preferable).

@Trott
Member

Trott commented Sep 8, 2017

This issue has been inactive for a while and this repository is now obsolete. I'm going to close this, but feel free to open another issue in a relevant active repository (TSC perhaps?) and include a link back to this issue if this is a subject that should receive continued attention.

@Trott closed this as completed Sep 8, 2017