Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TEP-0080] Proposal: TaskRun Pre/Post steps #502

Closed
wants to merge 2 commits into from

Conversation

mattmoor
Copy link
Member

This is the first of a handful of TEPs I'm putting together to weave a more complete story. Once the constituent TEPs have been staged, I can open up an Uber-TEP to discuss the bigger picture they are trying to compose into (a la Voltron), but want to keep the individual aspects finely scoped.

@tekton-robot tekton-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Aug 19, 2021
@imjasonh
Copy link
Member

The main requirement is a mechanism to enable users to scope the intent of things
like Hermekton's ExecutionPolicy, but the execution of the result must be
equivalent to append(pre, steps, post) modulo features that are sensitive to the
distinction (e.g. Hermekton).

Would it be sufficient to support per-step hermeticity? My (incomplete) understanding of the hermekton TEP was that it was mainly a matter of how the entrypointer ran the step's command+args, which in my (incomplete) understanding would be doable with a per-step executionMode field.

@mattmoor
Copy link
Member Author

@imjasonh possibly, though (to cross the streams a bit) that would also mean that in a world where parameters can have type Step (ref) that the TaskRun wouldn't be in control of whether that step were hermetically executed, which may be undesirable.

To crib my example from there:

params:
    - name: source-step
      description: The step to execute to fetch source
      schema:
        type: dev.tekton.taskruns.Step # This is a strawperson
  presteps:
  - $(params.source-step)
  steps:
  ... # More literal steps

I sort of like the idea of having clear expectations about whether a parameter of type Step should/shouldn't be executed hermetically.

@imjasonh
Copy link
Member

I sort of like the idea of having clear expectations about whether a parameter of type Step should/shouldn't be executed hermetically.

Absolutely.

The way this TEP is framed, pre- and post-steps seem to mainly be a matter of cordoning off which steps are hermetic and which aren't, which doesn't motivate the API change by itself since hermkton is not widely used today (possibly because it's too hard, which this TEP would fix 🙃). I suspect your other planned TEPs help drive this need better, in which case I'm content to wait and see how that all shapes up.

@mattmoor
Copy link
Member Author

I suspect your other planned TEPs help drive this need better

I'm working on them! Just trying to keep the parts bite-sized.

@bobcatfish
Copy link
Contributor

Hey @mattmoor a few initial questions:

  • Would there be some way to reuse existing tasks for pre and post steps e.g. if you wanted to do a git clone as a pre step or would you need to copy the steps?
  • Would post steps run in the case of an error? This was an issue with PipelineResources - e.g. folks would want to do something at the end of a task, like upload results, but if the task steps failed it wouldnt happen so they'd have to do something artificial to make the steps pass - today with onError this gets a bit easier but if you still ultimately want the task to fail you'd have to include a final post step to do that - and it starts to look more and more like the "finally" syntax we already have in pipelines - and is why ive been pursuing running a pipeline in a pod as the solution to the use cases you are describing
  • Would this functionality be available from a pipeline as well (which generates taskruns) or just from a taskrun directly?

You might find TEP-0044 interesting - esp. the task composition alternative which suggests something very similar to what you are proposing but from the context of using a task in a pipeline.

I've been trying to find a way to meet the use cases you are describing and it feels like the main missing piece is being able to use the composition expressed in a pipeline but running in a way that doesnt require volumes to share data, i.e. in one pod - which is why ive been pursuing TEP-0044 and #447

@mattmoor
Copy link
Member Author

Would there be some way to reuse existing tasks for pre and post steps e.g. if you wanted to do a git clone as a pre step or would you need to copy the steps?

As it stands today, yes, however, one of my comments on your PR was about enabling structured substitutions as well as structured params/results, and allowing params / results to have types reflecting those substitutions. The example I gave was specifically around being able to pass a Step as a parameter to a task. I hack around this in mink by just passing a container, but this leaves a lot to be desired.

Would post steps run in the case of an error?

They can if we say we want that, I'm happy to call it "finally" as well, if that's what we want. Ultimately it seems like you'd need to capacity to express "finally" in the TaskRun spec in order for PipelineRun->TaskRun transpilation to WAI with "finally", or go the "straight to Pod" route, which has other gotchas.

Would this functionality be available from a pipeline as well (which generates taskruns) or just from a taskrun directly?

My expectation is that this is part of TaskSpec (lmk if there's a way to better convey that), so Tasks could define these too, and even take aspects of the pre/post work as parameters. This would means the TaskRuns executed by a Pipeline (or anything else) would fundamentally have this available to them.

You might find TEP-0044 interesting - esp. the task composition alternative which suggests something very similar to what you are proposing but from the context of using a task in a pipeline.

I've been trying to find a way to meet the use cases you are describing and it feels like the main missing piece is being able to use the composition expressed in a pipeline but running in a way that doesnt require volumes to share data, i.e. in one pod - which is why ive been pursuing TEP-0044 and #447

I can take a look (there's a lot to read through and grok), but my major concerns in general are that Pipeline is overkill for what we want, and it's likely impossible to fully transpile ~any conformant Pipeline into a Task, so it is a potential minefield for folks using them that way. It'd be good to understand how you hope to help folks navigate what's supported vs. not.

@vdemeester vdemeester added the kind/tep Categorizes issue or PR as related to a TEP (or needs a TEP). label Aug 23, 2021
@jerop
Copy link
Member

jerop commented Aug 23, 2021

/assign @bobcatfish @vdemeester @imjasonh @jerop

@tekton-robot tekton-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 24, 2021
@tekton-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign bobcatfish
You can assign the PR to them by writing /assign @bobcatfish in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot tekton-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 26, 2021
@bobcatfish
Copy link
Contributor

and it's likely impossible to fully transpile ~any conformant Pipeline into a Task

I agree - in the long run as we keep pursuing this, I think we'll break away from the intermediate TaskRun and go straight to a pod (or possibly converge Pipelines and Tasks to be more similar but that's one of the more out there options :D) - there's a bit more about this in the "what comes next" in #447

but my major concerns in general are that Pipeline is overkill for what we want

When you say "overkill" could you elaborate a bit? Usually it seems like the main argument against Pipelines for something like this is because of the overhead of requiring volumes for sharing data, but I'm interested if there are other aspects that make Pipelines overkill if that is no longer a problem?

@mattmoor
Copy link
Member Author

When you say "overkill" could you elaborate a bit?

Pipelines are a DAG abstraction, whereas Tasks are already a sequence of colocated steps. Needing to involve Pipeline just to scope aspects of Tasks semantics in still-sequential execution is why it seems like overkill.

in the long run as we keep pursuing this, I think we'll break away from the intermediate TaskRun and go straight to a pod

I have a large number of concerns about this, many/all of which I think I've mentioned. To list a few here (since it also speaks to the "overkill" comment):

  • I doubt it is possible to capture the full semantics of Pipeline in a single Pod (certainly not without heroic work), so there are conformance issues and ultimately there's a high risk of user confusion around what is/isn't supported by this non-conformant variant
  • There is bound to be a huge amount of redundant code across these implementations, and even well factored as packages, that code will likely become much more complex (needing to reason about things in terms of the more complex DAG model).
  • There is already significant conceptual leakage across Pipeline/Task, and a variety of layering violations. This is bound to make those things substantially worse.

Cloud Build supports DAG execution, and we specifically dropped it from knative/build (👋 taskruns) because it was a complexity nightmare, which was deemed (by folks on the team) more trouble than it was worth. Pipeline would only be more complex to try to fully capture due to the possibility of things like distinct service accounts across TaskRuns.

So I'm genuinely wondering whether there is a 80/20 or even 90/10 sweet spot here, where we can capture the vast majority of use cases with a much simpler model.

Copy link
Member

@vdemeester vdemeester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So reading the proposal, it seems to be solely about hermekton but reading the comments, it seems to be for a broader set of reason why we would want this. I would like to either have the specifics of hermekton be discussed in an update of the hermekton TEP or have this TEP have more use cases and motivation "reasons".

I think this is one of the reason of @imjasonh's comment. If we look solely at hermekton what benefit has this versus updating the hermekton TEP to support per-step hermeticity, and I agree. (ok reading the other comments again is exactly about that 😛).

My expectation is that this is part of TaskSpec (lmk if there's a way to better convey that), so Tasks could define these too, and even take aspects of the pre/post work as parameters. This would means the TaskRuns executed by a Pipeline (or anything else) would fundamentally have this available to them.

You might find TEP-0044 interesting - esp. the task composition alternative which suggests something very similar to what you are proposing but from the context of using a task in a pipeline. […]

I can take a look (there's a lot to read through and grok), but my major concerns in general are that Pipeline is overkill for what we want, and it's likely impossible to fully transpile ~any conformant Pipeline into a Task, so it is a potential minefield for folks using them that way. It'd be good to understand how you hope to help folks navigate what's supported vs. not.

I tend to agree, the initial problem statement of TEP-0044 seems to be very similar to this proposal and pre/post steps could be a possible implementation of TEP-0044. The idea in both being : how to keep Tasks simple and shareable while still making composable without necessarly the need for a Pipeline ?

For example, if we were to allow to inject steps in a TaskRun (before the execution and after), wouldn't this be the wanted behavior without impacting the current Task definition ? I am not entirely sure to see the value of adding those in the TaskSpec for example, compared to having it in the TaskRunSpec. The go-build task would change, but at runtime, when you want to run a quick go build, hermetically, on your project, you would "inject" a git clone before (or a gcs …) and something to report after. Those could be other embedded Step spec, or could refer other Tasks (git-clone for example). This is something that was proposed (a bit differently) here.

I definitely understand the value of being able to use one Task without going with Pipeline complexity for simple use case where you just want to build an image, … This is something that I think we should cover better than we do today. The question is, how ? 😛

Comment on lines +39 to +40
- `pre-steps`: a section of the TaskRun execution prior to the "real" work. This
would generally consist of "setup" and "download" related work.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This point is basically init containers right ? If we were to allow a user to inject init containers (pre-steps, whatever we call them), that would be feeling that need, right ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, although Tekton has the capacity to do this pre-work with sidecars, which could be nice.

Comment on lines +42 to +43
- `post-steps`: a section of the TaskRun execution posterior to the "real" work.
This would generally consist of "tear-down" and "upload" related work.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This point is handled by TEP-0040, aka be able to run steps no matter what was the result before, doesn't it ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are assuming (because I haven't stated as much!) that this runs like a finally block (e.g. regardless of success/failure). I could go either way on this, but I see (at least) three possibilities:

  1. post-steps have finally semantics, which would enable TEP-0040 like cleanup logic.
  2. post-steps have normal semantics, which would complement TEP-0040 in that folks could use post-steps like a finally block, if they so choose.
  3. post-steps augments TEP-0040, so that folks can write onError: post-steps (or something) to indicate that when a step fails, that execution should flow to the post-steps (perhaps without executing the intervening steps).

IMO 2. and 3. leave this as non-conflicting/complementary to TEP-0040, where 1. this might subsume/replace TEP-0040.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, when I first wrote this 2. is what I had in mind, but I think you read it as 1., so it would be good to flesh this out and hopefully that shows why I wasn't thinking of this as conflicting with TEP-0040 😅

Comment on lines +47 to +52
One of the primary motivations for this work is to enable users to help express
their intent when utilizing "[hermekton](./0025-hermekton.md)" builds. Hermekton
lets users disable network access during step execution, but as a `bool` it doesn't
provide any real leeway to have "pre" steps fetch sources (e.g. `git clone`) or
dependencies (e.g. `go mod download`) over the network, or "post" steps publish
results (e.g. `gsutil cp`) outside of that network jail.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading the comment should we have more points here than just kermekton ? Especially as "as a bool it doesn't provide any real leeway to …" could be seem as something to work on as part of hermekton and then as a consequence here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's certainly most complementary to Hermekton at the moment, but as I mention above (re: TEP-0040) there are ways that this could be quite complementary as a way of expressing finally-like semantics (without that necessarily being part of what we do here).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The re-usable unit in Tekton is the Task - in that sense I think it would be nice to keep hermekton features associated to tasks. Cloning a repo and uploading results are things that could be done in tasks specialised for that, and they would not run in hermetic mode.

Comment on lines +56 to +58
Enable TaskRuns to hermetically execute the bulk of their work, while enabling the
execution of "pre" steps that can fetch inputs and "post" steps that can publish
outputs.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

@dlorenc
Copy link
Contributor

dlorenc commented Aug 30, 2021

I'm having a hard time following the comparisons between this and TEP-0040. It's possible they overlap because TEP-0040 is pretty broad, but the actual design still seems to be TBD.

@vdemeester
Copy link
Member

vdemeester commented Aug 30, 2021

I'm having a hard time following the comparisons between this and TEP-0040. It's possible they overlap because TEP-0040 is pretty broad, but the actual design still seems to be TBD.

It definitely is broader, and TEP-0044 design is also TBD, but they share similarities in what they are trying to achieve, aka be able to "inject" some containers before and after a Task without the need to rely on Pipeline. This is why we are comparing to it, and a decision on this TEP would affect TEP-0044 heavily (no matter what the decision is), thus trying to take it into account while reviewing this TEP.

@bobcatfish
Copy link
Contributor

bobcatfish commented Sep 3, 2021

I'm having a hard time following the comparisons between this and TEP-0040.

To me the overlap is in the problem that each TEP is trying to address - this seems to me like it could be a possible solution to the problems described in TEP-0040. TEP-0040 also tries to describe what some of these solutions could be and compares them to each other, so I want to make sure we look at this potential solution in light of the others we've discussed.

(I think im just restating what @vdemeester already said 🙏 )

mattmoor added a commit to mattmoor/community-2 that referenced this pull request Sep 5, 2021
This change offers an alternative "fence"-based approach, which is intended to be less invasive than: tektoncd#502
@mattmoor
Copy link
Member Author

mattmoor commented Sep 5, 2021

FWIW, I posted #511 as a potential alternative to this, in the context of Hermekton.

mattmoor added a commit to mattmoor/community-2 that referenced this pull request Sep 5, 2021
This change offers an alternative "fence"-based approach, which is intended to be less invasive than: tektoncd#502
@tekton-robot
Copy link
Contributor

@mattmoor: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@tekton-robot tekton-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 8, 2021
@jerop jerop removed their assignment Oct 7, 2021
@tekton-robot
Copy link
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 5, 2022
@mattmoor mattmoor closed this Jan 24, 2022
@bobcatfish bobcatfish changed the title Proposal: TaskRun Pre/Post steps [TEP-0080] Proposal: TaskRun Pre/Post steps Feb 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/tep Categorizes issue or PR as related to a TEP (or needs a TEP). lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants