Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adding metadata to taskSpec in PipelineTask #2826

Merged
merged 1 commit into from
Aug 20, 2020

Conversation

pritidesai
Copy link
Member

@pritidesai pritidesai commented Jun 17, 2020

Changes

Adding metadata to taskSpec within PipelineTask to allow specifying labels and annotations when a task is embedded using taskSpec. These labels and annotations are propagated to taskRun and then to the pods.

apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: pipelinerun-with-taskspec-to-echo-greetings
spec:
  pipelineSpec:
    tasks:
      - name: echo-greetings
         taskSpec:
           metadata:
             labels: [ …]
             annotations: [...]
         steps:
...

Metadata is already supported as part of Tasks and Pipelines while respective CRDs are created. But was not possible to specify with embedded resources.

These labels are commonly used to identify and filter pods for further actions (such as collecting pod metrics, and cleaning up completed pod with certain labels, etc) even being part of one single Pipeline.

  pipelineSpec:
    tasks:
    - name: echo
      metadata:
        labels:
          pipeline-sdk-type: kfp
      taskSpec:
       ...
    - name: echo2
      metadata:
        labels:
          pipeline-sdk-type: tfx
      taskSpec:
       ...

Closes #2767

Submitter Checklist

These are the criteria that every PR should meet, please check them off as you
review them:

  • Includes tests (if functionality changed/added)
  • Includes docs (if user facing)
  • Commit messages follow commit message best practices
  • Release notes block has been filled in or deleted (only if no user facing changes)

See the contribution guide for more details.

Double check this list of stuff that's easy to miss:

Reviewer Notes

If API changes are included, additive changes must be approved by at least two OWNERS and backwards incompatible changes must be approved by more than 50% of the OWNERS, and they must first be added in a backwards compatible way.

Release Notes

Pipeline authors can now specify metadata while embedding tasks (using taskSpec) into their pipeline.

@tekton-robot tekton-robot requested review from a user and vdemeester June 17, 2020 02:10
@tekton-robot tekton-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jun 17, 2020
@tekton-robot
Copy link
Collaborator

This PR cannot be merged: expecting exactly one kind/ label

Available kind/ labels are:

kind/bug: Categorizes issue or PR as related to a bug.
kind/flake: Categorizes issue or PR as related to a flakey test
kind/cleanup: Categorizes issue or PR as related to cleaning up code, process, or technical debt.
kind/design: Categorizes issue or PR as related to design.
kind/documentation: Categorizes issue or PR as related to documentation.
kind/feature: Categorizes issue or PR as related to a new feature.
kind/misc: Categorizes issue or PR as a miscellaneuous one.

1 similar comment
@tekton-robot
Copy link
Collaborator

This PR cannot be merged: expecting exactly one kind/ label

Available kind/ labels are:

kind/bug: Categorizes issue or PR as related to a bug.
kind/flake: Categorizes issue or PR as related to a flakey test
kind/cleanup: Categorizes issue or PR as related to cleaning up code, process, or technical debt.
kind/design: Categorizes issue or PR as related to design.
kind/documentation: Categorizes issue or PR as related to documentation.
kind/feature: Categorizes issue or PR as related to a new feature.
kind/misc: Categorizes issue or PR as a miscellaneuous one.

@pritidesai
Copy link
Member Author

/kind feature

@tekton-robot tekton-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Jun 17, 2020
@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
internal/builder/v1beta1/pipeline.go 82.0% 82.2% 0.2
pkg/apis/pipeline/v1beta1/pipeline_types.go 62.1% 67.6% 5.5
pkg/apis/pipeline/v1beta1/pipeline_validation.go 96.2% 96.3% 0.1
pkg/reconciler/pipelinerun/pipelinerun.go 83.2% 83.4% 0.3

@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
internal/builder/v1beta1/pipeline.go 82.0% 82.2% 0.2
pkg/apis/pipeline/v1beta1/pipeline_types.go 62.1% 67.6% 5.5
pkg/apis/pipeline/v1beta1/pipeline_validation.go 96.2% 96.3% 0.1
pkg/reconciler/pipelinerun/pipelinerun.go 83.2% 83.4% 0.3

@@ -6,6 +6,9 @@ spec:
pipelineSpec:
tasks:
- name: echo-good-morning
metadata:
Copy link
Member

@jlpettersson jlpettersson Jun 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is its own field now? so it will not necessarily be written "just above" taskSpec. If we could have the Task as a field, optional metadata would work out of the box, as with volumeClaimTemplate where I can declare a PersistentVolumeClaim with or without metadata, https://github.com/tektoncd/pipeline/blob/master/pkg/apis/pipeline/v1beta1/workspace_types.go#L61
I would prefer that.

Copy link
Member Author

@pritidesai pritidesai Jun 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jlpettersson I think your proposal would help avoid adding k8s specific types in the PipelineTask, we can introduce a new field TaskTemplate or EmbeddedTask in PipelineTask which can of type *Task:

type PipelineTask struct {
	TaskTemplate *Task `json:"taskTemplate,omitempty"`

PipelineTask can have one of TaskRef, TaskSpec, or TaskTemplate.

I understand we are looking over to Custom Tasks which would help solve many use cases but this would solve the issue in short term and could be extended to custom tasks if needed. I am looking for feedback on this 🙏

@vdemeester @bobcatfish @jlpettersson thoughts?

Copy link
Member

@jlpettersson jlpettersson Jun 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think something like that is the most proper solution, yes.
We haven't used "Template" for declarations in other places, but it may work. Other alternatives are:

Task *Task 'json:"task, omitempty"'

or

TaskDeclaration *Task 'json:"task, omitempty"'

Most correct description about what this is, is probably: inline task declaration

I don't think this is related to CustomTask at all, it will be the exact same problem. I think CustomTask recently has become the most misunderstood thing about Tekton.

@@ -93,6 +95,9 @@ type PipelineResult struct {
// PipelineTask defines a task in a Pipeline, passing inputs from both
// Params and from the output of previous tasks.
type PipelineTask struct {
// +optional
metav1.ObjectMeta `json:"metadata,omitempty"`

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we had Task *v1beta1.Task instead of (or in addition to - for compatibility) the current TaskSpec *TaskSpec, then metadata would be included as optional - by design

...as how I understand from the volumeClaimTemplate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but this does not work for TaskRef if we want a "custom" metadata - but if we want to use the metdata as-is, I think it should work?

Copy link
Member Author

@pritidesai pritidesai Jun 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @jlpettersson, with taskRef, the referring task can have metadata which should be sufficient like we have today. My focus was to only add metadata with taskSpec as pipeline has no means of specifying task metadata with taskSpec.

@bobcatfish
Copy link
Collaborator

I think this might require a TEP - since it's an API change, and also this is introducing more k8s specific stuff into a PipelineTask - I'd at least like to discuss it before adding it

@bobcatfish
Copy link
Collaborator

looks like @vdemeester didnt feel it required a TEP (#2767 (comment)) that's ok with me, but could we at least discuss this at the API working group and talk about why we need it?

@pritidesai
Copy link
Member Author

pritidesai commented Jun 18, 2020

looks like @vdemeester didnt feel it required a TEP (#2767 (comment)) that's ok with me, but could we at least discuss this at the API working group and talk about why we need it?

yup, will add it for the discussion in the API group 🙏

@Tomcli
Copy link
Contributor

Tomcli commented Jun 22, 2020

Here is one of our use cases with this new API:
With the current API, we can have two tasks in our pipeline that use the same label key:

apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  labels:
    pipelines.kubeflow.org/pipeline-sdk-type: kfp
  name: echo
spec:
  steps:
  - args:
    - echo hello world
    command:
    - sh
    - -c
    image: library/bash:4.4.23
    name: main
---
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  labels:
    pipelines.kubeflow.org/pipeline-sdk-type: tfx
  name: echo2
spec:
  steps:
  - args:
    - echo hello world
    command:
    - sh
    - -c
    image: library/bash:4.4.23
    name: main
---
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: sequential-pipeline
spec:
  tasks:
  - name: echo
    taskRef:
      name: echo
  - name: echo2
    runAfter:
    - echo
    taskRef:
      name: echo2
---
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: sequential-pipeline-run
spec:
  pipelineRef:
    name: sequential-pipeline

Here, we can filter these pods based on a label. e.g. kubectl get pod -l pipelines.kubeflow.org/pipeline-sdk-type=kfp. It is also useful with labelSelector when applying affinity on these tasks.

However, when we try to convert the above pipeline into a single pipelineRun kind, we don't have a way to express the labels or annotations on the task level. Therefore, we want to have the above API to represent the pipelineRun in this way.

apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: sequential-pipeline-run
spec:
  pipelineSpec:
    tasks:
    - name: echo
      metadata:
        labels:
          pipelines.kubeflow.org/pipeline-sdk-type: kfp
      taskSpec:
        steps:
        - args:
          - echo hello world
          command:
          - sh
          - -c
          image: library/bash:4.4.23
          name: main
    - name: echo2
      metadata:
        labels:
          pipelines.kubeflow.org/pipeline-sdk-type: tfx
      runAfter:
      - echo
      taskSpec:
        steps:
        - args:
          - echo hello world
          command:
          - sh
          - -c
          image: library/bash:4.4.23
          name: main

@pritidesai
Copy link
Member Author

pritidesai commented Jun 22, 2020

thanks @Tomcli for the example, it looks great, would also like to know:

  1. How are we utilizing these pods after they are filtered based on the pipeline-sdk-type?
  2. Is it possible to use any other type instead of metadata to specify labels and annotations like json?

I understand the expectation is accurate that the labels and annotations should propagate to each taskRun irrespective of how task definition is specified i.e. taskRef or taskSpec. The biggest drawback with this approach is:

PipelineTask level metadata would cause conflicts with the metadata coming from the task itself (while using taskRef). We could restrict it at the pipeline task level which would avoid any conflict but trying to understand the use cases to better design this.

@Tomcli
Copy link
Contributor

Tomcli commented Jun 22, 2020

thanks @Tomcli for the example, it looks great, would also like to know:

  1. How are we utilizing these pods after they are filtered based on the pipeline-sdk-type?
  2. Is it possible to use any other type instead of metadata to specify labels and annotations like json?

I understand the expectation is accurate that the labels and annotations should propagate to each taskRun irrespective of how task definition is specified i.e. taskRef or taskSpec. The biggest drawback with this approach is:

PipelineTask level metadata would cause conflicts with the metadata coming from the task itself (while using taskRef). We could restrict it at the pipeline task level which would avoid any conflict but trying to understand the use cases to better design this.

  1. We have a metadata service that uses this label to filter out pods that need additional actions (such as collecting pod metrics or cleaning up completed pod with certain labels). K8S labels are useful for selecting objects like this and we want to use it in a single pipelineRun spec as well.

  2. I'm not sure is there any other type that can reproduce the same behavior as k8s labels and annotations. If there's any alternative solution I can also give it a try.

I know there will be conflicts when using the taskRef. We only want to override the labels and annotations if the provided keys don't exist. It's similar to the conflicts we have for volumeClaimTemplate and podTemplate.

@animeshsingh
Copy link

PipelineTask level metadata would cause conflicts with the metadata coming from the task itself (while using taskRef)

This is similar to paradigms where a superset has "default" values, and unless a subset from that superset defines its own values which override the "default", the default gets applied to subset as well.

@pritidesai pritidesai changed the title adding metadata to PipelineTask adding metadata to taskSpec in PipelineTask Jul 1, 2020
Copy link
Member

@vdemeester vdemeester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 👯
One nit, and one question 😉

pkg/apis/pipeline/v1alpha1/pipeline_conversion.go Outdated Show resolved Hide resolved
// Propagate labels and annotations from PipelineRun and TaskSpec to TaskRun.
labels := getTaskrunLabels(pr, rprt.PipelineTask.Name)
annotations := getTaskrunAnnotations(pr)
if rprt.PipelineTask.TaskSpec != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question here : which labels should take precedence ? My gut is that the TaskRun labels should take precedence over the one defined in the Pipeline ; otherwise a Pipeline can define a label that override tekton.dev/Task or something and it could break some UX/UI/…. This code does the other way around (labels define in the Pipeline tasks take over the TaskRun labels).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the way its implemented, labels defined in pipeline would take precedence. You are right, it is possible that pipeline defined tekton.dev label would break and takes over the TaskRun labels. I will change the order here.

Should we apply the same precedence order in Conditions? I can create a separate PR for conditions.

Also, I think labels with tekton.dev/ should be reserved for the platform and we should not allow such labels, anything starting with tekton.dev/. What do you think? We could add this kind of validation in this PR as part of ValidateTaskSpecMetadata.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we apply the same precedence order in Conditions? I can create a separate PR for conditions.

I would think so indeed 😉

Also, I think labels with tekton.dev/ should be reserved for the platform and we should not allow such labels, anything starting with tekton.dev/. What do you think? We could add this kind of validation in this PR as part of ValidateTaskSpecMetadata.

Indeed, we should probably add that kind of validation to ValidateTaskSpecMetadata (for some reason I thought that was already the case 😜 )

Copy link
Member Author

@pritidesai pritidesai Jul 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, in general Metadata is validated for valid Name (no special character . and max length of 63 characters). I have added validation on labels as part of taskSpec metadata to restrict using tekton.dev/Task. Not sure if we can block in general any label with tekton.dev in taskSpec metadata (git and docker credentials use tekton.dev but they belong to k8s Secrets) or should we rather restrict labels with tekton.dev/ from every Tekton resource i.e. Pipeline, Task, etc

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the order so that TaskRun labels takes higher precedence.

@tekton-robot tekton-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 1, 2020
@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
internal/builder/v1beta1/pipeline.go 82.1% 82.8% 0.6
pkg/apis/pipeline/v1beta1/pipeline_types.go 62.1% 64.7% 2.6
pkg/apis/pipeline/v1beta1/pipeline_validation.go 96.2% 96.3% 0.0
pkg/reconciler/pipelinerun/pipelinerun.go 85.9% 86.2% 0.3

@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
internal/builder/v1beta1/pipeline.go 82.1% 82.8% 0.6
pkg/apis/pipeline/v1beta1/pipeline_types.go 62.1% 70.0% 7.9
pkg/apis/pipeline/v1beta1/pipeline_validation.go 96.2% 96.3% 0.0
pkg/reconciler/pipelinerun/pipelinerun.go 86.0% 86.6% 0.7

@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
internal/builder/v1beta1/pipeline.go 82.1% 82.8% 0.6
pkg/apis/pipeline/v1beta1/pipeline_types.go 62.1% 70.0% 7.9
pkg/apis/pipeline/v1beta1/pipeline_validation.go 96.2% 96.3% 0.0
pkg/reconciler/pipelinerun/pipelinerun.go 86.0% 86.7% 0.7

@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
internal/builder/v1beta1/pipeline.go 81.2% 81.8% 0.6
pkg/apis/pipeline/v1beta1/pipeline_types.go 62.1% 60.0% -2.1
pkg/reconciler/pipelinerun/pipelinerun.go 85.6% 86.3% 0.7

@pritidesai
Copy link
Member Author

@vdemeester @imjasonh @bobcatfish @jlpettersson its ready for review, PTAL 🙏

Adding metadata to TaskSpec in PipelineTask to allow specifying metadata.
This metadata will be propogated to taskRun and then to the pods.

```
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: pipelinerun-with-taskspec-to-echo-greetings
spec:
  pipelineSpec:
    tasks:
      - name: echo-greetings
        taskSpec:
          metadata:
            labels: [ …]
          steps:
...
```

Metadata is already supported as part of Tasks and Pipelines while
respective CRDs are created. But was not possible to specify with
embedded resources.
@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
internal/builder/v1beta1/pipeline.go 81.2% 81.8% 0.6
pkg/apis/pipeline/v1beta1/pipeline_types.go 62.1% 60.0% -2.1
pkg/reconciler/pipelinerun/pipelinerun.go 85.6% 86.3% 0.7

@imjasonh
Copy link
Member

imjasonh commented Aug 8, 2020

/lgtm

🎉

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Aug 8, 2020
@vdemeester vdemeester added kind/feature Categorizes issue or PR as related to a new feature. and removed kind/feature Categorizes issue or PR as related to a new feature. labels Aug 11, 2020
Copy link
Member

@vdemeester vdemeester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/meow

@tekton-robot
Copy link
Collaborator

@vdemeester: cat image

In response to this:

/lgtm
/meow

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@tekton-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vdemeester

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 11, 2020
@pritidesai
Copy link
Member Author

/test check-pr-has-kind-label

@pritidesai
Copy link
Member Author

/retest

@vdemeester vdemeester added kind/feature Categorizes issue or PR as related to a new feature. and removed kind/feature Categorizes issue or PR as related to a new feature. labels Aug 19, 2020
@vdemeester
Copy link
Member

/test pull-tekton-pipeline-integration-tests

@vdemeester
Copy link
Member

2020/08/19 08:26:57 main.go:319: Something went wrong: failed to prepare test environment: --provider=gke boskos failed to acquire project: resources not found

Hum, boskos is in trouble 😅

@vdemeester
Copy link
Member

/test pull-tekton-pipeline-integration-tests

@pritidesai
Copy link
Member Author

one more try 🙃

/test pull-tekton-pipeline-integration-tests

@vdemeester
Copy link
Member

/test pull-tekton-pipeline-integration-tests

@vdemeester vdemeester added kind/feature Categorizes issue or PR as related to a new feature. and removed kind/feature Categorizes issue or PR as related to a new feature. labels Aug 20, 2020
@tekton-robot tekton-robot merged commit 74cc03a into tektoncd:master Aug 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/feature Categorizes issue or PR as related to a new feature. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add metadata to PipelineTask to specific additional labels and annotations
8 participants