-
Notifications
You must be signed in to change notification settings - Fork 226
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[TEP-0120] Add proposal for concurrency controls #818
Conversation
00a8977
to
1d102cb
Compare
The appeal of a solution involving a separate CRD is that any higher-level controller (e.g. Triggers, Workflows, Pipelines as Code) could get concurrency controls | ||
for "free" by creating a ConcurrencyControl, and handing the logic off to a separate controller. | ||
However, in practice, it's difficult to do this in a way that achieves good separation of concerns between reconcilers. Because the PipelineRun controller | ||
(or ConcurrencyControl controller) is not responsible for creating PipelineRuns, it has to rely on other components starting PipelineRuns as pending, and matching | ||
these PipelineRuns to the ConcurrencyControl via some strategy such as labels, ownerReferences, or modifying PipelineRuns themselves. | ||
This controller's logic is tied to the logic of these other components, requiring updates to be coordinated and essentially creating an | ||
unwritten contract between reconcilers. This solution adds significant complexity compared to the proposed solution while still requiring some concurrency code to be | ||
added to each higher-level controller that wants to support concurrency controls. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really tried to get this solution working but I struggled to do so in a way that wasn't super complex-- this is my best attempt at explaining why.
1d102cb
to
699747c
Compare
Proof of concept: tektoncd/triggers#1446 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is a "slight" difference of approach in what I had and yours @lbernick.
My initial thought on concurrency was to provides the concurrency control primitives to tektoncd/pipelne so that other tools (be it upstream or elsewhere) could use them (hence having it on PipelineRun
or at least in tektoncd/pipeline
).
What is proposed here, however, is an "end-user" proposal, for Triggers. The two approaches are not antinomic, but one might use the 2nd. I do think the proposal on trigger probably make sense for triggers. But I do think the "primitives for concurrency control" shouldn't be in triggers (Triggers "description" is "… that allows you to create Kubernetes resources based on information it extracts from event payloads.").
As proposed here, any tool that wants to implement concurrency control and that doesn't rely on triggers, will need to "copy" what's proposed (or come up with its own approach) — which was already the case today.
### Reconciler logic | ||
|
||
When a Trigger with a concurrency spec creates a new PipelineRun, it will substitute the parameters in the concurrency key and apply the concurrency key as a label | ||
with key "triggers.tekton.dev/concurrency". It will use an informer to find all PipelineRuns with the same concurrency key from the same Trigger, using label selectors. | ||
The reconciler will patch any matching PipelineRuns as canceled before creating the new PipelineRun, but will not wait for cancelation to complete. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my point of view, the only "job" of triggers should be to create that label and most likely create the PipelineRun
as pending, and have something else handle that label (aka not the triggers controller).
for "free" by creating a ConcurrencyControl, and handing the logic off to a separate controller. | ||
However, in practice, it's difficult to do this in a way that achieves good separation of concerns between reconcilers. Because the PipelineRun controller | ||
(or ConcurrencyControl controller) is not responsible for creating PipelineRuns, it has to rely on other components starting PipelineRuns as pending, and matching | ||
these PipelineRuns to the ConcurrencyControl via some strategy such as labels, ownerReferences, or modifying PipelineRuns themselves. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the PipelineRun controller
(or ConcurrencyControl controller) is not responsible for creating PipelineRuns, it has to rely on other components starting PipelineRuns as pending, and matching
these PipelineRuns to the ConcurrencyControl via some strategy such as labels, ownerReferences, or modifying PipelineRuns themselves
Althought I slightly agree, the PipelineRun
reconciler has all the possibilities to do whatever it wants with the newly created PipelineRun
. We could decide that, if a "conccurency" label is set on a PipelineRun, we make it pending. This is a choice we can make (or not).
This solution assumes that PipelineRuns using concurrency will typically be created by tooling such as Pipelines as Code, a Workflow, or similar, | ||
and would likely not have different concurrency strategies. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which "is" kind of true, as this solution adds it in Triggers.
Yeah, I agree - Triggers is upstream of Pipelines, and isn't necessarily the only place users would want to be able to control concurrency. |
@vdemeester could you elaborate on what a "concurrency primitive" might look like? |
/assign @vdemeester @abayer |
699747c
to
9f357b8
Compare
@abayer @vdemeester @chmouel @dibyom I've opened an alternative POC in the Pipelines repo for "concurrency primitives". PTAL and let me know if this is in line with what you are thinking: tektoncd/pipeline#5501 |
So, I agree with @vdemeester 's comment about the primitive based approach. To me, that means that the solution has to be loosely coupled i.e. we should not require changes to pipelines/triggers code and ideally no changes to the existing crds either. This does not mean we cannot add those changes in later if we decide this is a very core component of say a trigger. Also, this does not mean we cannot provide a more opinionated approach in another higher level crd such as workflows. But the most basic implementation should be flexible and loosely coupled. Here is roughly what I had in mind (let me know if some of this is infeasible)
Some caveats/drawbacks of this approach I can think of
|
thanks @dibyom for the feedback! I am experimenting with a solution in a separate reconciler, similar to what you've described. The one thing I might not end up implementing is the reconciler changing the pipelinerun status from pending -> running, partially because it's a bit of a code smell for the reconciler to edit the spec of the pipelinerun it's reconciling Would be great to hear others' thoughts as well! |
There's one other issue with the mutating admission webhook strategy that @RafaeLeal mentioned to me. Labels and annotations aren't available during the create request (they're patched after creation), so we'd have to mark the pipelinerun as pending during an update request. However I'm not sure how to know during an update request whether it's appropriate to mark a pipelinerun as pending, and our webhook doesn't currently allow running pipelineruns to be changed to pending (although ofc this is something we can change) |
POC number 3 as requested! I haven't yet implemented a mutating admission webhook but that's next on the list. |
9f357b8
to
4a6a28e
Compare
@ all -- I've updated the TEP to list out alternatives, and the proposed solution to basically state that we're going to experiment with a separate reconciler and use it in dogfooding, and we'll design a higher level api for concurrency in workflows once the workflows api exists and after getting experience testing out the experimental project. Also added links to existing POCs. Hoping this can be merged without major changes, please lmk what you think. |
This commit adds design details for canceling concurrent PipelineRuns.
4a6a28e
to
afb5ed6
Compare
@abayer @vdemeester PTAL, hoping to merge during API WG on Monday |
/approve Looks like a good start to me - looking forward to seeing where we end up going with this! |
- Do we want to allow a PipelineRun to be part of multiple concurrency groups? | ||
- For example, when we later support queueing, we might want to allow users to configure a rule like "Cancel all but the last PR per pull request, and only allow 5 CI runs per repo at a time." | ||
- What concurrency strategies should we support? | ||
- For the initial version of this proposal, we will only support cancelation. However, should we support all possible strategies for canceling PipelineRuns, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the default should be "graceful cancellation", aka cancel but run finally tasks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good-- the initial version I'm working on uses regular cancelation as the default but this should be super easy to change.
- For the initial version of this proposal, we will only support cancelation. However, should we support all possible strategies for canceling PipelineRuns, | ||
including cancelation, graceful cancelation, and graceful stopping? | ||
- Should we attempt to prevent users from interfering with concurrency controls? | ||
- Many of the proposed solutions rely on labels, and a user editing labels could change PipelineRun behavior. Is this desired? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so yes. A user can change labels on a Pod
from a TaskRun
and provoke a weird behavior for example. This is very similar here, we cannot completely stop the user from shooting its foot, and that's fine 🙃
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: abayer, vdemeester The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
LGTM in the working group |
This commit adds design details for canceling concurrent PipelineRuns.
/kind tep