[TEP-0120] Add proposal for concurrency controls
This commit adds design details for canceling concurrent PipelineRuns.
lbernick committed Sep 9, 2022
1 parent 4312b76 commit 00a8977
Showing 1 changed file with 327 additions and 0 deletions.
327 changes: 327 additions & 0 deletions teps/0120-canceling-concurrent-pipelineruns.md
@@ -102,6 +102,333 @@ queueing, or more advanced concurrency strategies.

## Proposal

Add a new field `concurrency` to `trigger.spec`, for example:

```yaml
apiVersion: triggers.tekton.dev/v1beta1
kind: Trigger
spec:
  bindings:
  - name: reponame
    value: $(body.repository.full-name)
  template:
    ref: ci-pipeline-template
  concurrency:
    params:
    - name: reponame
    key: $(params.reponame)
    strategy: cancel
```

Any PipelineRuns created by this Trigger with the same concurrency key will be subject to the specified concurrency strategy,
regardless of what namespace the PipelineRuns were created in.

Here's an example EventListener that creates CI PipelineRuns and cancels a running PipelineRun when a new one is triggered
for the same pull request.

```yaml
apiVersion: triggers.tekton.dev/v1beta1
kind: EventListener
metadata:
  name: github-ci-eventlistener
spec:
  triggers:
  - name: github-checks-trigger
    bindings:
    - name: pull-request-id
      value: $(body.check_suite.pull_requests[0].id)
    - name: head-sha
      value: $(body.check_suite.head_sha)
    concurrency:
      params:
      - name: pull-request-id
      key: $(params.pull-request-id)
      strategy: cancel
    interceptors:
    - ref:
        kind: ClusterInterceptor
        name: github
    template:
      spec:
        params:
        - name: head-sha
        resourcetemplates:
        - apiVersion: tekton.dev/v1beta1
          kind: PipelineRun
          spec:
            pipelineRef:
              name: ci-pipeline
            params:
            - name: head-sha
              value: $(tt.params.head-sha)
```

TODO(Lee): Do we want to support multiple concurrency controls per Trigger? There is no clear use case for multiple cancels,
but we might want a cancel and a queue when queueing is supported later.

### Components of Concurrency Spec

#### Params

Concurrency parameter names should match the names of the TriggerBindings and should support substitution in the same way as parameters in TriggerTemplates.

#### Key

A string used to match PipelineRuns to each other. PipelineRuns created by the same Trigger with the same concurrency key are considered part of the same
concurrency "group". We should support parameter substitution, and may choose to support substitution of context-related variables
like [those supported in Pipelines](https://tekton.dev/docs/pipelines/variables/) (for example, `context.pipelineRun.namespace`).
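
As a rough illustration of the parameter substitution described above, the following Python sketch renders a key template against bound parameter values. The helper name `render_concurrency_key`, the exact `$(params.<name>)` grammar, and the choice to raise an error on an unknown parameter are all assumptions made for this example; the proposal does not specify them.

```python
import re
from typing import Dict

# Hypothetical grammar: only $(params.<name>) references are recognized here.
_PARAM_REF = re.compile(r"\$\(params\.([A-Za-z0-9_-]+)\)")


def render_concurrency_key(key_template: str, params: Dict[str, str]) -> str:
    """Replace each $(params.<name>) reference with the bound value."""

    def _lookup(match):
        name = match.group(1)
        if name not in params:
            # Assumption: an unknown parameter is an error, not an empty string.
            raise KeyError(f"concurrency key references unknown param {name!r}")
        return params[name]

    return _PARAM_REF.sub(_lookup, key_template)


# Two events for the same pull request render to the same key,
# so their PipelineRuns land in the same concurrency group.
first = render_concurrency_key("$(params.pull-request-id)", {"pull-request-id": "1234"})
second = render_concurrency_key("$(params.pull-request-id)", {"pull-request-id": "1234"})
```

Failing loudly on an unknown parameter is one possible design; silently rendering an empty string would place unrelated PipelineRuns in the same group.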

#### Strategy

For the initial implementation of this proposal, we will support only canceling PipelineRuns.

Initially supported strategies are:
- "Cancel": [Cancel](https://tekton.dev/docs/pipelines/pipelineruns/#cancelling-a-pipelinerun) any running PipelineRuns with the same key.
  (If we later support canceling TaskRuns, this will be the only cancelation strategy supported for them.)
- "CancelRunFinally": [Gracefully cancel](https://tekton.dev/docs/pipelines/pipelineruns/#gracefully-cancelling-a-pipelinerun) any running PipelineRuns with the same key.
  The new PipelineRun will start without waiting for the existing PipelineRuns' finally Tasks to complete.
- "StopRunFinally": [Gracefully stop](https://tekton.dev/docs/pipelines/pipelineruns/#gracefully-stopping-a-pipelinerun) any running PipelineRuns with the same key.
  The new PipelineRun will start without waiting for the existing PipelineRuns' finally Tasks to complete.
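
One way to picture the three strategies is as a lookup from the strategy name to the `spec.status` value that Tekton Pipelines documents for cancelling, gracefully cancelling, and gracefully stopping a PipelineRun. The sketch below assumes a one-to-one mapping, which the proposal implies through its links but does not state; the function name `cancel_patch` is hypothetical.

```python
import json
from typing import Dict

# Left: strategy names from this proposal.
# Right: PipelineRun spec.status values from the Tekton Pipelines docs.
# The one-to-one mapping is an assumption of this sketch.
STRATEGY_TO_SPEC_STATUS: Dict[str, str] = {
    "Cancel": "Cancelled",
    "CancelRunFinally": "CancelledRunFinally",
    "StopRunFinally": "StoppedRunFinally",
}


def cancel_patch(strategy: str) -> str:
    """Return a JSON merge patch body requesting the given strategy."""
    try:
        spec_status = STRATEGY_TO_SPEC_STATUS[strategy]
    except KeyError:
        raise ValueError(f"unsupported concurrency strategy: {strategy!r}") from None
    return json.dumps({"spec": {"status": spec_status}})


patch_body = cancel_patch("CancelRunFinally")
```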

### Reconciler logic

When a Trigger with a concurrency spec creates a new PipelineRun, it will substitute the parameters in the concurrency key and apply the resulting value to the PipelineRun as a label
with key "triggers.tekton.dev/concurrency". It will use an informer to find all PipelineRuns with the same concurrency key from the same Trigger, using label selectors.
The reconciler will patch any matching PipelineRuns as canceled before creating the new PipelineRun, but will not wait for cancelation to complete.
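
The ordering described above (label, select, patch, then create) can be sketched with an in-memory stand-in for the cluster. This is only an illustration: the `PipelineRun` dataclass, the `create_with_concurrency` function, and the use of a `triggers.tekton.dev/trigger` label to identify the originating Trigger are assumptions of the example, and a real implementation would use an informer and the Kubernetes API rather than a Python list.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

CONCURRENCY_LABEL = "triggers.tekton.dev/concurrency"  # label key from the proposal
TRIGGER_LABEL = "triggers.tekton.dev/trigger"  # assumed label naming the Trigger


@dataclass
class PipelineRun:
    name: str
    labels: Dict[str, str] = field(default_factory=dict)
    spec_status: Optional[str] = None  # set once cancelation is requested
    done: bool = False


def create_with_concurrency(cluster: List[PipelineRun], new_run: PipelineRun,
                            trigger: str, key: str) -> List[str]:
    """Label the new run, request cancelation of matching runs, then create it.

    Returns the names of the runs that were patched as canceled.
    """
    new_run.labels[TRIGGER_LABEL] = trigger
    new_run.labels[CONCURRENCY_LABEL] = key

    canceled = []
    for run in cluster:
        same_group = (run.labels.get(TRIGGER_LABEL) == trigger
                      and run.labels.get(CONCURRENCY_LABEL) == key)
        if same_group and not run.done and run.spec_status is None:
            run.spec_status = "Cancelled"  # patch only; do not wait for it to finish
            canceled.append(run.name)

    cluster.append(new_run)  # creation happens after the patches are issued
    return canceled
```

Because the patch is issued without waiting, an older PipelineRun may still be winding down while the new one starts.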

## Future Work: Workflows

[TEP-0098: Workflows](./0098-workflows.md) proposes creating a Workflows API for an easier end-to-end getting-started experience with Tekton.
End-to-end CI/CD Workflows should support concurrency controls, but we have a lot of flexibility for how concurrency can be configured in Workflows.
For example, we could automatically add cancelation to CI Workflows. Adding concurrency controls to Triggers doesn't prevent us
from also adding them to Workflows in the future, especially if Workflows will be responsible for creating and managing Triggers.

## Alternative Designs

### Separate concurrency CRD (and maybe reconciler)

In this solution, concurrency controls are defined in their own CRDs, and Triggers could reference them. For example:

```yaml
kind: ConcurrencyControl
name: pull-requests
spec:
  params:
  - name: pull-request-id
  key: $(params.pull-request-id)
  strategy: cancelRunFinally
---
kind: Trigger
name: github-pr-trigger
spec:
  template:
    ref: github-template
  bindings:
  - name: pull-request-id
    value: $(body.pull-request.id)
  concurrency:
    ref: pull-requests
```

A ConcurrencyControl would apply to any PipelineRuns in that namespace, and its parameters would be substituted with the PipelineRun's parameters.

Here, the Triggers controller (or a mutating admission webhook) would create all PipelineRuns as "pending", and label them with the ConcurrencyControl
they're referencing (for example: `tekton.dev/concurrencyControl: pull-requests`).
The PipelineRun reconciler (or a new ConcurrencyControl reconciler) would be responsible for patching PipelineRuns to start or cancel them,
and reconciling ConcurrencyControls. It would find the concurrency control specified in the PipelineRun label, calculate the concurrency key, and apply
the key as a label to the PipelineRun (for example: `tekton.dev/concurrencyKey: 1234`). It would then query for any PipelineRuns with the same key
and cancel all but the most recently started.

There are two approaches we could choose to take when a ConcurrencyControl is added or updated. The first approach, which is simpler, is to decide
that new/updated ConcurrencyControls do not apply to currently running PipelineRuns. If we choose this approach, a
ConcurrencyControl CRD doesn’t add much compared to a ConfigMap.

The other possible approach is to update PipelineRun concurrency keys whenever there’s an event related to a ConcurrencyControl.
In this approach, we’d add a custom handler to enqueue all running PipelineRuns when a ConcurrencyControl is updated, since the ConcurrencyControl doesn't "know"
what PipelineRuns it's responsible for. This approach means we’d have to recalculate a PipelineRun’s concurrency keys and cancel matching PipelineRuns
on each reconcile loop. In this scenario, it’s also not guaranteed that PipelineRuns get requeued in any particular order, so we would need to make sure that any PipelineRuns being canceled started before the one being reconciled.
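
The ordering requirement in this approach can be illustrated with a small Python sketch: when a PipelineRun is reconciled, only runs in the same group that started strictly earlier are selected for cancelation, so the result does not depend on the order in which runs are requeued. The `Run` tuple and the `runs_to_cancel` function are hypothetical names used only for this example.

```python
from datetime import datetime, timezone
from typing import List, NamedTuple


class Run(NamedTuple):
    name: str
    concurrency_key: str
    start_time: datetime


def runs_to_cancel(reconciled: Run, running: List[Run]) -> List[str]:
    """Names of running PipelineRuns in the same group that started earlier."""
    return sorted(
        other.name
        for other in running
        if other.name != reconciled.name
        and other.concurrency_key == reconciled.concurrency_key
        and other.start_time < reconciled.start_time
    )


def _at(hour: int) -> datetime:
    return datetime(2022, 9, 9, hour, tzinfo=timezone.utc)


old = Run("old", "1234", _at(1))
mid = Run("mid", "1234", _at(2))
new = Run("new", "1234", _at(3))
unrelated = Run("unrelated", "5678", _at(0))
running = [new, unrelated, old, mid]

# Reconciling the oldest run first cancels nothing; the newest run
# is never selected, whichever run is reconciled.
nothing = runs_to_cancel(old, running)
older_than_new = runs_to_cancel(new, running)
```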

Lastly, we wouldn't be able to prevent users from manipulating PipelineRun concurrency by editing the `tekton.dev/concurrencyControl` label.

The appeal of a solution involving a separate CRD is that any higher-level controller (e.g. Triggers, Workflows, Pipelines as Code) could get concurrency controls
for "free" by creating a ConcurrencyControl, and handing the logic off to a separate controller.
However, in practice, it's difficult to do this in a way that achieves good separation of concerns between reconcilers. Because the PipelineRun controller
(or ConcurrencyControl controller) is not responsible for creating PipelineRuns, it has to rely on other components starting PipelineRuns as pending, and matching
these PipelineRuns to the ConcurrencyControl via some strategy such as labels, ownerReferences, or modifying PipelineRuns themselves.
This controller's logic is tied to the logic of these other components, requiring updates to be coordinated and essentially creating an
unwritten contract between reconcilers. This solution adds significant complexity compared to the proposed solution while still requiring some concurrency code to be
added to each higher-level controller that wants to support concurrency controls.

Related solutions that have been proposed:

- PipelineRun referencing a separate CRD:

```yaml
kind: ConcurrencyControl
name: pull-requests
spec:
  params:
  - name: pull-request-id
  key: $(params.pull-request-id)
  strategy: cancelRunFinally
---
kind: PipelineRun
name: ci-pipeline-run
spec:
  pipelineRef:
    name: ci-pipeline
  concurrency:
    ref: pull-requests
```

- Separate CRD with a role similar to the ["ConfigMap" solution](#cluster-level-concurrency-configmap):

```yaml
kind: ConcurrencyControl
name: pipelinerun-pull-requests
spec:
  kind: PipelineRun
  selector:
    matchLabels:
      tekton.dev/pipeline: "ci-pipeline"
  key: $(spec.params["pull-request"])
  strategy: cancelRunFinally
```

### Configuration on TriggerTemplate

Instead of configuring concurrency on a Trigger, we could allow it to be configured on a TriggerTemplate and make use of the TriggerTemplate params, for example:

```yaml
kind: TriggerTemplate
spec:
  params:
  - name: pull-request-id
  resourcetemplates:
  - apiVersion: tekton.dev/v1beta1
    kind: PipelineRun
    metadata:
      generateName: ci-pipeline-run-
    spec:
      pipelineRef:
        name: ci-pipeline
  concurrency:
    key: $(tt.params.pull-request-id)
    strategy: cancelRunFinally
```

However, TriggerTemplates can be used in multiple Triggers, which may have different concurrency needs.

### Configuration on Pipeline spec

We could add concurrency controls to `pipeline.spec`, and any PipelineRuns of the same Pipeline with the same key would be considered part of the same concurrency group.
For example:

```yaml
kind: Pipeline
metadata:
  name: ci-pipeline
spec:
  concurrency:
    key: $(params.pull-request-id)
    strategy: cancelRunFinally
```

However, different users might want to define different concurrency strategies for the same Pipeline.
For example, one user of the [build-push-gke-deploy Catalog Pipeline](https://github.com/tektoncd/catalog/tree/main/pipeline/build-push-gke-deploy)
might want to cancel concurrent runs for the same image, and another might want to cancel concurrent runs for the same image + cluster combination.

### Configuration on PipelineRun spec

We could add concurrency controls to `pipelineRun.spec`, as originally proposed in
[TEP ~ Automatically manage concurrent PipelineRuns](https://github.com/tektoncd/community/pull/716).
For example:

```yaml
kind: PipelineRun
spec:
  concurrency:
    key: 1234 # Pull request ID
    strategy: cancelRunFinally
```

or within a TriggerTemplate:

```yaml
apiVersion: triggers.tekton.dev/v1beta1
kind: TriggerTemplate
spec:
  params:
  - name: repo
  - name: pull-request-id
  resourcetemplates:
  - apiVersion: tekton.dev/v1beta1
    kind: PipelineRun
    metadata:
      generateName: ci-pipeline-run-
    spec:
      pipelineRef:
        name: ci-pipeline
      concurrency:
        key: $(tt.params.repo)-pr-$(tt.params.pull-request-id)
        strategy: cancel
```

Any PipelineRuns with the same concurrency key, regardless of which Pipeline they reference, will be considered part of the same concurrency group.
We could choose to scope concurrency groups to namespaces or to the cluster.

If two PipelineRuns have the same key but different concurrency strategies, reconciliation will fail.
This solution assumes that PipelineRuns using concurrency will typically be created by tooling such as Pipelines as Code, a Workflow, or similar,
and would likely not have different concurrency strategies.
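
The failure mode described above could be detected with a check along the following lines. This is a sketch under assumptions: the exception type `ConcurrencyConflictError`, the function `validate_group_strategy`, and the representation of existing PipelineRuns as `(key, strategy)` pairs are invented for illustration and are not part of any Tekton API.

```python
from typing import Iterable, Tuple


class ConcurrencyConflictError(Exception):
    """Raised when one concurrency key is used with two different strategies."""


def validate_group_strategy(key: str, strategy: str,
                            existing: Iterable[Tuple[str, str]]) -> None:
    """Fail if any existing run shares the key but declares another strategy.

    `existing` holds (concurrency key, strategy) pairs for running PipelineRuns.
    """
    for other_key, other_strategy in existing:
        if other_key == key and other_strategy != strategy:
            raise ConcurrencyConflictError(
                f"key {key!r} is used with both {strategy!r} and {other_strategy!r}"
            )


running = [("repo-pr-1234", "cancel"), ("repo-pr-5678", "cancelRunFinally")]
validate_group_strategy("repo-pr-1234", "cancel", running)  # same strategy: accepted
```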

This solution is not proposed because concurrency controls are used to manage multiple PipelineRuns, not to specify how a single PipelineRun should execute.
Although we don't have a concept of a "group" of PipelineRuns in the Tekton API, this configuration makes the most sense on an object responsible for creating
or managing multiple PipelineRuns.

### Cluster-level concurrency ConfigMap

We could specify controls in a cluster-level ConfigMap read by the PipelineRun controller, as originally proposed in
[Run concurrency keys/mutexes](https://hackmd.io/GK_1_6DWTvSiVHBL6umqDA). For example:

```yaml
kind: ConfigMap
metadata:
  name: tekton-concurrency-control
  namespace: tekton-pipelines
data:
  rules:
  - name: pipelinerun-pull-requests
    kind: PipelineRun
    selector:
      matchLabels:
        tekton.dev/pipeline: "ci-pipeline"
    key: $(metadata.namespace)-$(spec.params.pull-request-id)
    strategy: cancelRunFinally
```

When reconciling a PipelineRun, the PipelineRun controller would need to check each of the concurrency rules and determine which of the rules it matches,
based on label selectors. For each matching rule, it would compute the concurrency key and add it as a label to the PipelineRun.
If we want to prevent users from interfering with concurrency controls by setting their own labels, we will need to compute the PipelineRun's concurrency keys
from this ConfigMap on each reconcile loop.

This solution implies that PipelineRuns may belong to multiple concurrency groups. If a PipelineRun has multiple concurrency keys,
any running PipelineRuns that have a matching concurrency key will be canceled.
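
To make the rule-matching step concrete, here is a minimal Python sketch that evaluates every rule against a PipelineRun and returns one concurrency key per matching rule. The dictionary shapes, the `$(...)` path syntax handling, and the function names `concurrency_keys` and `_resolve` are assumptions based on the example ConfigMap above, not a defined format.

```python
import re
from typing import Any, Dict, List

_REF = re.compile(r"\$\(([^)]+)\)")


def _resolve(path: str, run: Dict[str, Any]) -> str:
    """Look up a dotted path such as metadata.namespace or spec.params.<name>."""
    prefix = "spec.params."
    if path.startswith(prefix):
        wanted = path[len(prefix):]
        for param in run["spec"].get("params", []):
            if param["name"] == wanted:
                return str(param["value"])
        raise KeyError(path)
    value: Any = run
    for part in path.split("."):
        value = value[part]
    return str(value)


def concurrency_keys(rules: List[Dict[str, Any]], run: Dict[str, Any]) -> List[str]:
    """Return one rendered key for each rule whose label selector matches the run."""
    labels = run["metadata"].get("labels", {})
    keys = []
    for rule in rules:
        wanted = rule["selector"]["matchLabels"]
        if rule["kind"] == "PipelineRun" and all(
            labels.get(name) == value for name, value in wanted.items()
        ):
            keys.append(_REF.sub(lambda m: _resolve(m.group(1), run), rule["key"]))
    return keys


rules = [
    {"name": "pipelinerun-pull-requests", "kind": "PipelineRun",
     "selector": {"matchLabels": {"tekton.dev/pipeline": "ci-pipeline"}},
     "key": "$(metadata.namespace)-$(spec.params.pull-request-id)"},
    {"name": "deploys", "kind": "PipelineRun",
     "selector": {"matchLabels": {"tekton.dev/pipeline": "cd-pipeline"}},
     "key": "$(metadata.namespace)"},
]
run = {
    "metadata": {"namespace": "ci", "labels": {"tekton.dev/pipeline": "ci-pipeline"}},
    "spec": {"params": [{"name": "pull-request-id", "value": "1234"}]},
}
keys = concurrency_keys(rules, run)
```

A PipelineRun that matches several rules would receive several keys, which corresponds to the multiple-group behavior described above.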

If this ConfigMap is edited, the changes will apply only to PipelineRuns created after the edit.

This solution isn't proposed because concurrency strategies aren't defined alongside the functionality that needs to have its concurrency controlled.
This may be a conceptually confusing way to match strategy (e.g. cancel and replace) with functionality (e.g. run CI for a pull request).
In addition, it leaves cluster authors, rather than PipelineRun users, in charge of concurrency.

Related solutions we could explore:
- Defining concurrency rules in a ConfigMap, but restricting configuration to one rule per Pipeline.
- Using [TEP-0085: Per-Namespace Controller Configuration](./0085-per-namespace-controller-configuration.md), we could create namespaced versions of these ConfigMaps.

## Alternative Syntax

TODO

## Design Evaluation

TODO

## References