Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TEP-0082 Workspace Hinting #510

Merged
merged 1 commit into from Oct 27, 2021
Merged

TEP-0082 Workspace Hinting #510

merged 1 commit into from Oct 27, 2021

Conversation

ghost
Copy link

@ghost ghost commented Sep 3, 2021

Workspaces are a general-purpose way of declaring portions of a Task's
filesystem to be supplied at runtime. Their general-purpose nature means
that automated systems aren't able to attach meaning to them. A Workspace
is "just a Workspace" to a machine whereas humans can use the "description" field
of a Workspace to figure out if the intented use is a credential, a shared
data volume, etc.

This TEP proposes adding hints to workspace declarations in both Tasks and
Pipelines that allow a system to assign basic meaning (and therefore some
sane default bindings) to workspaces.

@tekton-robot tekton-robot requested review from ncskier and sm43 September 3, 2021 16:42
@tekton-robot tekton-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 3, 2021
@vdemeester
Copy link
Member

/assign

@vdemeester
Copy link
Member

/kind tep

@tekton-robot tekton-robot added the kind/tep Categorizes issue or PR as related to a TEP (or needs a TEP). label Sep 3, 2021
@bobcatfish
Copy link
Contributor

/assign

- name: deploy-key
readOnly: true
optional: true
profile: credential
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does readOnly overrides the profile defaults provided by profile credential ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

readOnly: true is added to the volumeMount in the step's container (this is how it works today with workspaces). I think that would remain the same after this feature is implemented too.

There aren't any defaults attached to this "credential" profile tho - it's just a string hint to the system creating the PipelineRuns / TaskRuns.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Scott, I was confused by :

Each of these profiles can have some fairly reasonable assumptions attached to them. A "credential" Workspace likely should be populated from a Secret or Secret-like volume, for example. A "cache" Workspace might most often be supplied with a long-lived read-only Persistent Volume. A "data" Workspace might be safely assumed to require an ephemeral Persistent Volume that lives only as long as the PipelineRun.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be phrased differently or edited to make it clearer? I do mention directly after this that Most importantly: these specific decisions should be left up to the platform but maybe there's a better way to communicate the point?

Copy link
Contributor

@ScrapCodes ScrapCodes Sep 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be due to my own limitation. I was also wondering to myself, is this hinting mechanism sufficient to address the said use case i.e.

The Tekton Workflows project is currently exploring ways to pass Secrets from a high-level Workflow description into a PipelineRun. This is made considerably more difficult because Pipelines can't indicate which of their Workspaces might be the right one to bind those Secrets to.

e.g. a Pipeline may need multiple secrets from different workspaces, and indicating all of them are secret credential profile - will it be sufficient information to bind?

Forgive me, if this question is as rookie as the previous.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You nailed it: that's the biggest drawback I think. I described it like this in the "Cons" section below:

Given a Task that optionally accepts both an "ssh-key" and "basic-auth" Workspace marked with the "credential" profile, would an automated system still be able to infer a correct mapping?

I'm going to rewrite this section to clear things up, I reckon it would read better with some edits.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rewritten it, hopefully a bit clearer. Lemme know what you think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good !! 😄

@tekton-robot tekton-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 7, 2021
@afrittoli
Copy link
Member

/assign @afrittoli

Copy link
Contributor

@bobcatfish bobcatfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've included a few question but generally makes lots of sense to me and nice coverage of alternatives 👍 👍 👍

I feel the explicit hinting meets the user requirements a bit better than the defaulting option; however I like that the default option gives us like 80% of what we need, plus other features, and doesn't preclude adding the explicit hinting (tho i have a feeling if we go that route we'll eventually add both)

/approve


### Non-Goals

- Adding constraint-checking or any other logic to Pipelines to validate bound Workspaces based on workspace hints.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious why not - even if it wasn't in the initial implementation maybe this would be a reasonable follow on? curious if you excluded this just to keep the scope down or if there's another motivation that it might be interesting to elaborate on

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it's something like: even if the pipeline author hints that a volume claim could be used, the pipeline might still function correctly with other types and we don't want to arbitrarily prevent that?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah you've hit the nail on the head with both your comments here.

If we expand scope to include validation then we'll end up burning a lot of cycles designing the shape of that, how it works, and how configurable it is. I allude to this in the "Questions around future extensions stemming from this change are quite nuanced" bit later on but I'll bring some of that same sentiment up here to support this Non-Goal. I'll also add this as possible future work as well.


### Use Cases (optional)

The Tekton Workflows project is currently exploring ways to pass Secrets from a high-level Workflow description into a PipelineRun. This is made considerably more difficult because Pipelines can't indicate which of their Workspaces might be the right one to bind those Secrets to.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there anything we can link to around this? even a discussion the working group notes maybe

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll see if I can find something in our WG notes and link it in here.


## Requirements

- Hinting must be optional: we don't want to suddenly invalidate every Task or Pipeline that currently includes a Workspace.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kind of a tangent, but maybe this is something we'd eventually require in catalog workspaces 🤔 + @jerop

default:
secret:
secretName: my-docker-json
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm interesting option - my default thought is I don't like encoding this runtime info here, but if it's just a default, maybe that's a nice balance?

this also feels like it might be solving a slightly different use case/requirements 🤔 🤔 🤔 but maybe this is a two birds with one stone situation? it sounds like the implication is that this default provides the "hinting" needed for an outside tool to know what kind of workspace the author intends to be used here 🤔

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly right on the hinting: it's an unspoken assumption here that the outside tool can use defaults to inform how it builds a PipelineRun / TaskRun. Two birds with one stone is spot on.

Another con here is that it embeds the k8s type directly in the workspace declaration (and therefore this bit of our API spec) but 🤷 I think the horse might have left the stable on that issue. We embed the volume types in workspace bindings already.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another con here is that it embeds the k8s type directly in the workspace declaration (and therefore this bit of our API spec) but 🤷 I think the horse might have left the stable on that issue. We embed the volume types in workspace bindings already.

My hope here is that we could define our conformance surface such that the actual workspace binding values are not included - in the current spec, only emptyDir support is required and the result are optional: https://github.com/tektoncd/pipeline/blob/main/docs/api-spec.md#workspacebinding

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's ok to hint that a workspace that is meant to provide credentials, the source of this credentials is up to the consumer of the task - it could be a secret or it could be a PVC provisioned with creds extracted from a vault of some kind.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already mentioned later in the TEP :)


#### Pros:
- Very explicit about the Task's expectations for the content provided.
- Pipelines can override a Tasks' expected types - so, for example, a Task might expect a ConfigMap but a Pipeline might override with a PV instead.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe another pro: adding defaults wouldn't preclude us from adding more explicit hinting later - i.e. there are benefits to providing this functionality in general (e.g. a pipeline that just works out of the box without having to deal with workspaces), AND if this isn't enough we could still add explicit hints (and/or even enforcement) later without needing to undo any of this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add this, thanks.

- Not just an API change - there will be some logic involved here on the Pipelines controller
side to apply the default workspace config to runs.
- Questions around future extensions stemming from this change are quite nuanced:
- What if an author wants the "default" to actually be a requirement and for the TaskRun to fail if it's missing? For example a deploy Task requiring a Secret with name "cluster-key" to exist in the TaskRun's namespace.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ooo good point


#### Cons

- Syntactically different from the "profiles" alternative but otherwise not functionally all that different.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also i feel like this sets an interesting (imo bad?) precedent for embedding information in descriptions - i guess we kind of do something like this with labels but i feel like at least that's the purpose of labels

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I'll include this too.

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 15, 2021
to the external / platform. Our own Workflows project may be able to
utilize these profiles, for example, to make informed choices when
creating a `PipelineRun` based solely on `Pipeline` YAML, supplied list
of volumes and set of Secret references.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@tekton-robot tekton-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 17, 2021
@tekton-robot tekton-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Sep 23, 2021
@tekton-robot tekton-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Oct 6, 2021
@bobcatfish
Copy link
Contributor

looks like this is nearly ready to go but I think we need a review from @afrittoli 🙏

Copy link
Member

@afrittoli afrittoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this!
We do already have API fields that are not directly used by Tekton (see "description") so it may be ok to add more?

Another couple of alternatives:

  • the external system could use loosely coupled metadata - an external json file or annotations added to the task that would add the required extra workspace metadata

  • syntactic alternatives to workspaces which come with special defaults - e.g. credentials are always readonly:

workspaces: 
- name: data
credentials:
- name: git
- name: shortlivedtoken
  readOnly: false

The example above could be used in a task clones a private git repo and that generates a short lived token and stores it for the next task to use.

This would be functionally similar to profiles, but it would require API changes to add new profiles.

/approve


## Requirements

- Hinting must be optional: we don't want to suddenly invalidate every
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+100


- Adding constraint-checking or any other logic to Pipelines to validate
bound Workspaces based on workspace hints. The potential scope related
to a feature like this would be subtly massive. This TEP is trying to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might decide in future that we want to have higher level abstractions in the Tekton API directly - if so, the new API could easily work on top of the hinting mechanism.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, added this extra detail!

Comment on lines 303 to 306
- If a Task has multiple "credential" Workspaces, a system still needs
to figure out how to bind them. This might be automatic (an internal
rule like "names must match") or it could require some UI to be
exposed asking the user what the correct mapping is.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this specific to this alternative? It seems to be a common issue or limitation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think you're right. Removed.

- It's a bit unclear what the incentive for including `profiles` would
be for Catalog Task authors. How would they "figure out" the purpose
and correct values to put in here?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the main issue.
Since the consumer is external, the list valid values would be unknown.
If Tekton defines a list of valid values, it may not be useful to external systems.
It would be easier if those values had some meaning for Tekton pipelines too - e.g. we could consider "credentials" workspaces to be read-only by default. We could have checks to prevent the content of a "credentials" workspace ending up on the taskrun log.

@tekton-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: afrittoli, bobcatfish, vdemeester

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@vdemeester
Copy link
Member

Note this has been approved by 3 owners from different companies, so @sbwsg we can merge as is and iterate over it or wait for some update here, as you prefer 😉

@bobcatfish
Copy link
Contributor

/lgtm

up to you if you want to merge as-is or make any last updates first @sbwsg :

/hold

@tekton-robot tekton-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 25, 2021
@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 25, 2021
Workspaces are a general-purpose way of declaring portions of a Task's
filesystem to be supplied at runtime. Their general-purpose nature means
that automated systems aren't able to attach meaning to them. A Workspace
is "just a Workspace" to a machine whereas humans can use the "description" field
of a Workspace to figure out if the intented use is a credential, a shared
data volume, etc.

This TEP proposes adding hints to workspace declarations in both Tasks and
Pipelines that allow a system to assign basic meaning (and therefore some
sane default bindings) to workspaces.
@tekton-robot tekton-robot removed the lgtm Indicates that a PR is ready to be merged. label Oct 26, 2021
@ghost
Copy link
Author

ghost commented Oct 26, 2021

Thanks for the reviews, I've added the new alternatives @afrittoli proposed. Will need one more lgtm with the pushed changes.

/hold cancel

@tekton-robot tekton-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 26, 2021
@bobcatfish
Copy link
Contributor

Thanks @sbwsg !

/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 27, 2021
@tekton-robot tekton-robot merged commit 16b0c42 into tektoncd:main Oct 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/tep Categorizes issue or PR as related to a TEP (or needs a TEP). lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
Status: Proposed
Development

Successfully merging this pull request may close these issues.

5 participants