[JENKINS-49073] Propagate the downstream result, not just FAILURE #24

Merged: dwnusbaum merged 2 commits into jenkinsci:master on Jul 12, 2019

jglick commented Jun 3, 2019

JENKINS-49073

screenshot

Probably improved by jenkinsci/workflow-step-api-plugin#44.

Note that this may also satisfy the stated use case for #13:

> this information can be used by Build Failure Analyzer Plugin to show failures in child jobs on the top trigger job

@dwnusbaum dwnusbaum left a comment

Thanks! FWIW BuildTriggerListener.java line 48 looks like a good place to add a WarningAction when !trigger.propagate && run.getResult() != Result.SUCCESS, but that's unrelated.

jglick commented Jun 3, 2019

> add a WarningAction when !trigger.propagate && run.getResult() != Result.SUCCESS

Note that you have to set expectations carefully, since users may come to expect this warning to appear, when in fact it will only appear if the upstream happens to run longer than the downstream.

@dwnusbaum dwnusbaum merged commit d665e83 into jenkinsci:master Jul 12, 2019
@jglick jglick deleted the propagate-JENKINS-49073 branch July 12, 2019 17:48
basil commented Sep 18, 2019

I'm a bit nervous about upgrading to pipeline-build-step 2.10, because I'm not sure all of the parallel steps in my pipelines will correctly handle the downgrade of build results caused by this change in behavior.

With pipeline-build-step 2.10 and build steps using the default propagate: true, the first branch of a parallel step to finish with a non-SUCCESS result sets the status of the overall build. So with three branches, if the first finishes with SUCCESS, the second with UNSTABLE, and the third with FAILURE, the overall status of the job will be UNSTABLE (a change in behavior from pipeline-build-step 2.9). While this is technically behaving as designed, it's an unfortunate outcome in some of my use cases and may cause regressions in some jobs when we upgrade to pipeline-build-step 2.10.

Any such regressions could be solved by setting propagate: false and doing the propagation manually, as I'm already doing in some of my jobs. I think a better solution would be for the parallel step to support a new mode where it propagates the "worst" result of any branch to the upstream job. Pipeline doesn't support this today and it doesn't look easy to implement, but I at least started a conversation about it in JENKINS-49073 (comment).

Pending a feature like that, I'd feel a lot better about upgrading to pipeline-build-step 2.10 if there were a feature flag to restore the old behavior. That way I'd know that my jobs won't falsely report UNSTABLE when they should report FAILURE when parallel steps are used.
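
To make the three-branch example above concrete, here is a minimal Java sketch. It is a simplified model, not Jenkins code: `BranchResult` and `ParallelOutcomeSketch` are hypothetical stand-ins, and the model assumes the overall status is simply the first non-SUCCESS result in completion order, as described in the comment.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for hudson.model.Result, declared from best to worst.
enum BranchResult { SUCCESS, UNSTABLE, FAILURE }

final class ParallelOutcomeSketch {
    // Model of the behavior described above: the first branch (in completion
    // order) that did not succeed determines the overall result.
    static BranchResult firstNonSuccess(List<BranchResult> inCompletionOrder) {
        return inCompletionOrder.stream()
                .filter(r -> r != BranchResult.SUCCESS)
                .findFirst()
                .orElse(BranchResult.SUCCESS);
    }

    // Model of the proposed mode: the worst result of any branch wins,
    // regardless of the order in which branches finish.
    static BranchResult worst(List<BranchResult> results) {
        return results.stream()
                .max(Comparator.naturalOrder())
                .orElse(BranchResult.SUCCESS);
    }

    public static void main(String[] args) {
        List<BranchResult> branches = Arrays.asList(
                BranchResult.SUCCESS, BranchResult.UNSTABLE, BranchResult.FAILURE);
        System.out.println(firstNonSuccess(branches)); // prints UNSTABLE
        System.out.println(worst(branches));           // prints FAILURE
    }
}
```

The gap between the two printed values is the regression being described: the same set of branch results yields UNSTABLE under the first model and FAILURE under the second.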

jglick commented Sep 18, 2019

> for the parallel step to support a new mode where it propagates the "worst" result of any branch

I think this makes sense, assuming a suitable definition of “worst”: success > … > FlowInterruptedException with UNSTABLE > FlowInterruptedException with FAILURE > … (acc. to comparison on Result) > AbortException > any other exception.

> a feature flag to restore the old behavior

Or an explicit option in the build step, though I think the behavior in this PR is a natural default.
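
The ordering of "worst" proposed in this comment could be sketched roughly as follows. This is only an illustration under stated assumptions: `RankedResult`, `InterruptionStandIn`, `AbortStandIn`, and `WorstOutcomeSketch` are hypothetical stand-ins for `Result`, `FlowInterruptedException`, and `AbortException`, and the numeric ranks are an arbitrary encoding of the ordering, not anything taken from Jenkins.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for hudson.model.Result, declared from best to worst.
enum RankedResult { SUCCESS, UNSTABLE, FAILURE, NOT_BUILT, ABORTED }

// Hypothetical stand-in for FlowInterruptedException: a failure that carries a result.
class InterruptionStandIn extends Exception {
    final RankedResult result;
    InterruptionStandIn(RankedResult result) { this.result = result; }
}

// Hypothetical stand-in for AbortException.
class AbortStandIn extends Exception {}

final class WorstOutcomeSketch {
    // Higher number means worse. Result-carrying failures compare by their
    // result, then comes the abort stand-in, then any other exception.
    static int severity(Throwable failure) {
        if (failure instanceof InterruptionStandIn) {
            return ((InterruptionStandIn) failure).result.ordinal();
        }
        if (failure instanceof AbortStandIn) {
            return RankedResult.values().length;
        }
        return RankedResult.values().length + 1;
    }

    // Returns the worst failure among the branches, or null if no branch failed.
    static Throwable worst(List<Throwable> branchFailures) {
        return branchFailures.stream()
                .max(Comparator.comparingInt(WorstOutcomeSketch::severity))
                .orElse(null);
    }

    public static void main(String[] args) {
        List<Throwable> failures = Arrays.asList(
                new InterruptionStandIn(RankedResult.UNSTABLE),
                new InterruptionStandIn(RankedResult.FAILURE));
        Throwable w = worst(failures);
        System.out.println(((InterruptionStandIn) w).result); // prints FAILURE
    }
}
```

Branches that succeed are represented here by simply not contributing a failure to the list, so success ranks above everything else by construction.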

basil commented Sep 18, 2019

> I think this makes sense, assuming a suitable definition of “worst”: success > … > FlowInterruptedException with UNSTABLE > FlowInterruptedException with FAILURE > … (acc. to comparison on Result) > AbortException > any other exception.

Thanks as always for your pertinent insights, @jglick. I opened jenkinsci/workflow-cps-plugin#325 to address this.

@@ -49,7 +50,8 @@ public void onCompleted(Run<?,?> run, @Nonnull TaskListener listener) {
                     trigger.context.onFailure(trigger.interruption);
                 }
             } else {
-                trigger.context.onFailure(new AbortException(run.getFullDisplayName() + " completed with status " + run.getResult() + " (propagate: false to ignore)"));
+                Result result = run.getResult();
+                trigger.context.onFailure(new FlowInterruptedException(result != null ? result : /* probably impossible */ Result.FAILURE, new DownstreamFailureCause(run)));

One side effect of using FlowInterruptedException instead of AbortException is that retry will no longer retry the build step if it fails (it ignores FlowInterruptedException because of JENKINS-44379). This was reported as a bug in JENKINS-60354.

I'm not really sure how to fix this. I think FlowInterruptedException is overloaded with too many meanings at this point. One approach would be to factor out a ResultCarryingException interface and have FlowInterruptedException and some new FooException implement it. We would then go through everything that currently uses FlowInterruptedException just to carry a result, and switch it to ResultCarryingException when inspecting exceptions and FooException when creating them. FlowInterruptedException itself would be used only to capture actual Pipeline interruption.
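
A rough Java sketch of the refactoring described above might look like this. The names `ResultCarryingException` and `FooException` come from the comment; everything else (`SketchResult`, `FlowInterruptedExceptionStandIn`, `RetryDecisionSketch`, and the method names) is a hypothetical stand-in, and none of this is the actual Jenkins API.

```java
// Hypothetical stand-in for hudson.model.Result.
enum SketchResult { SUCCESS, UNSTABLE, FAILURE, NOT_BUILT, ABORTED }

// The proposed interface: anything that can report a build result.
interface ResultCarryingException {
    SketchResult getResult();
}

// Stand-in for FlowInterruptedException, reserved for actual Pipeline interruption.
class FlowInterruptedExceptionStandIn extends Exception implements ResultCarryingException {
    private final SketchResult result;
    FlowInterruptedExceptionStandIn(SketchResult result) { this.result = result; }
    @Override public SketchResult getResult() { return result; }
}

// The "FooException" placeholder: a step failed with a specific result,
// but nothing was interrupted.
class FooException extends Exception implements ResultCarryingException {
    private final SketchResult result;
    FooException(SketchResult result) { this.result = result; }
    @Override public SketchResult getResult() { return result; }
}

final class RetryDecisionSketch {
    // A retry-like step would skip only true interruptions.
    static boolean shouldRetry(Throwable failure) {
        return !(failure instanceof FlowInterruptedExceptionStandIn);
    }

    // Code that only needs the result inspects the interface, not the concrete type.
    static SketchResult resultOf(Throwable failure) {
        return failure instanceof ResultCarryingException
                ? ((ResultCarryingException) failure).getResult()
                : SketchResult.FAILURE;
    }

    public static void main(String[] args) {
        Throwable downstream = new FooException(SketchResult.UNSTABLE);
        System.out.println(shouldRetry(downstream)); // prints true
        System.out.println(resultOf(downstream));    // prints UNSTABLE
    }
}
```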

jglick commented

I might rather suggest amending RetryStep to check for the presence of a CauseOfInterruption.

dwnusbaum commented Dec 4, 2019

> I might rather suggest amending RetryStep to check for the presence of a CauseOfInterruption.

Good idea, and much simpler to implement, although I do feel like it would be good to separate the different use cases of FlowInterruptedException at some point so that it is easier to reason about changes like this in isolation.

I think catchError and warnError probably have the same issue as retry and would also need to be updated.


Also, it looks like WorkflowRun.doTerm does not set a CauseOfInterruption when it aborts the build, so that would need to be tested and maybe changed since it should be rethrown by retry. WorkflowRun.doKill does not set a CauseOfInterruption either, but I don't think it matters in that case because of the way the build is aborted.

jglick commented

Perhaps just add a fatal flag to FlowInterruptedException, used by true build interruptions, and have steps like retry only ignore it in this mode. I guess that boils down to a similar proposal.
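
As a minimal sketch of the flag idea, assuming a boolean named `fatal` and a simplified retry loop: `FlaggedInterruption` and `RetrySketch` below are hypothetical stand-ins, not the real `FlowInterruptedException` or `RetryStep`.

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for FlowInterruptedException with the suggested flag.
class FlaggedInterruption extends Exception {
    final boolean fatal; // true only for genuine build interruptions
    FlaggedInterruption(String message, boolean fatal) {
        super(message);
        this.fatal = fatal;
    }
}

final class RetrySketch {
    // Runs the body up to maxAttempts times. A fatal interruption is rethrown
    // immediately; any other failure, including a non-fatal interruption such
    // as a downstream build result, triggers another attempt.
    static <T> T retry(int maxAttempts, Callable<T> body) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return body.call();
            } catch (FlaggedInterruption e) {
                if (e.fatal) {
                    throw e; // true interruption: never retry
                }
                last = e;
            } catch (Exception e) {
                last = e;
            }
        }
        throw last; // every attempt failed
    }
}
```

With this shape, a downstream build that finishes UNSTABLE or FAILURE would be created with `fatal` set to false and so remain retryable, while a user abort would set it to true.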
