How to structure kustomize packages for Tekton CD pipelines for Kubeflow #544

Closed
jlewi opened this issue Dec 13, 2019 · 11 comments

@jlewi
Contributor

jlewi commented Dec 13, 2019

We are trying to define Tekton pipelines to continuously build and update our kustomize manifests for Kubeflow applications: #450

We currently have some initial pipelines checked in here:
https://github.com/kubeflow/kubeflow/tree/master/components/base
https://github.com/kubeflow/kubeflow/tree/master/components/profile-controller/ci

The way the kustomize packages are laid out makes a number of operations inconvenient.
I'd like to figure out a good way to restructure the kustomize packages and tekton resources.

Here are some things I'd like to facilitate.

  1. I'd like to define different versions of the pipelines corresponding to different versions of Kubeflow

    • i.e. we'd like to be able to trigger builds from master branches and update kubeflow/manifests@master vs. triggering from release branches and updating the release branch of kubeflow/manifests
  2. We'd like to make it easy for a release engineer to kick off PipelineRuns for all Kubeflow applications or to kick off runs for individual pipelines

The current layout has the following challenges.

  • PipelineRuns are structured as overlays, so it's easy to trigger a run for a specific application
    but not runs for all the applications

  • The PipelineRun name is not unique on each build; so firing off multiple runs of a pipeline isn't easy

    • You either need to delete the previous run or you need to modify the pipeline run.

A couple of ideas/questions

  • Should the PipelineRuns be a completely separate kustomize package from the other resources, e.g. we have the layout

    pipeline-definitions
        base
            pipeline.yaml
            ....
        overlays
            master
                ....
            v0.7
                ....
    run-definitions
        base
            pipelinerun.yaml
        overlays
            master
                ....
            v0.7
                ....
    
    • This would make it easy to do a bulk run of all the pipelines at a specific version

    • We could use kustomize suffix generation to generate unique names for the runs

    • To run a specific pipeline we can use apply's label selector e.g.

      kustomize build --reorder none profile-controller/ci | kubectl apply -l type=run -f -
      
    • Downside is the pipeline definitions and run definitions can't easily share config

      • Is there a way to selectively enable suffix generation?
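
As a rough illustration of the label-selector idea above, the sketch below mimics what `kubectl apply -l type=run` does to the rendered resources: it keeps only those whose labels match the selector. The resource dicts, the resource names, and the `type=definition` label are assumptions made for illustration; nothing here calls kustomize or kubectl.

```python
from typing import Dict, List


def select_by_labels(resources: List[dict], selector: Dict[str, str]) -> List[dict]:
    """Return the resources whose metadata.labels contain every selector pair."""
    selected = []
    for res in resources:
        labels = res.get("metadata", {}).get("labels", {}) or {}
        if all(labels.get(key) == value for key, value in selector.items()):
            selected.append(res)
    return selected


# Hypothetical output of `kustomize build`, already parsed into dicts.
rendered = [
    {"kind": "Pipeline",
     "metadata": {"name": "ci-pipeline", "labels": {"type": "definition"}}},
    {"kind": "PipelineRun",
     "metadata": {"name": "profile-controller-run", "labels": {"type": "run"}}},
    {"kind": "Task", "metadata": {"name": "build-push"}},
]

# Only the PipelineRun carries type=run, so only it would be applied.
runs = select_by_labels(rendered, {"type": "run"})
```

With separate definition and run packages, the same selector could trigger every run at a given version while leaving the shared Pipeline and Task definitions untouched.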
@kkasravi
Contributor

I had refactored pipelines here https://github.com/kkasravi/kfctl-config-files/tree/master/tk/tk-pipelines.
This is based on having kfctl do the generation using a configfile that just has the tk-pipelines manifest with overlays and parameters that are used by the embedded Pipeline and the TaskRefs. I haven't created the configfile to generate the PipelineRun for {build-push-task,pull-request-resource} (you had collapsed this into 1 task). The older style configfile is here https://github.com/kkasravi/kfctl-config-files/blob/master/kfdef/ci-profile-controller-pipeline-run.yaml. It needs to be updated to just the tk-pipelines manifest and the overlay(s) and parameters defined in this manifest.

It solves some of the problems noted

  • PipelineRun uses $(generateName) so each generated PipelineRun would have a unique name
  • Pipeline is embedded in PipelineRun so the config can be shared
  • Tasks are also generated with unique names so each generated PipelineRun would reference different Tasks using the same suffix

@kkasravi
Contributor

I updated https://github.com/kkasravi/kfctl-config-files/blob/master/kfdef/ci-profile-controller-pipeline-run.yaml to work with tk-pipelines.

If you do kfctl build -f ci-profile-controller-pipeline-run.yaml you'll get

WARN[0000] ignoring var image_name specified in kustomize/tk-pipelines/overlays/build-push-task/kustomization.yaml  filename="kustomize/kustomize.go:710"
WARN[0000] ignoring var image_name specified in kustomize/tk-pipelines/overlays/update-manifests-task/kustomization.yaml  filename="kustomize/kustomize.go:710"

and a generated kustomize subdir

@jlewi
Contributor Author

jlewi commented Dec 13, 2019

@kkasravi I really don't want to use kfctl here. Conceptually it is really confusing. I think of Kubeflow as Kubeflow control plane. Not as some general configuration management solution.

We already have two layers of configuration management

  • The substitution/reference functionality Tekton provides
  • Tools like kustomize/helm etc... that could be used to write Tekton pipelines

So it's not clear to me why we would need kfctl.

Is $(generateName) kfctl magic or is that kustomize functionality?

Why do we want unique Task names? I thought we would use well defined Tasks to allow reusability across runs & pipelines.

@kkasravi
Contributor

I'm not suggesting we use kfctl - it's just a way to generate the PipelineRun code.

$(generateName) is kfctl magic that would generate a unique suffix.
instead of

name: $(generateName)

it could be

generateName: ci-profiler-

but you would need to use kubectl create instead of apply

Task names don't need to be unique as long as they're completely parameterized.
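
The `generateName` alternative mentioned above can be sketched as follows. This is a minimal illustration, not the code in either repository: the pipeline name `ci-pipeline` is hypothetical, and the `kubeflow-ci` namespace and `tekton.dev/v1alpha1` API version are taken from the PipelineRun posted later in this thread. The manifest is emitted as JSON, which kubectl accepts alongside YAML.

```python
import json


def pipeline_run_manifest(prefix: str, pipeline: str,
                          namespace: str = "kubeflow-ci") -> dict:
    """Build a PipelineRun that asks the API server to pick a unique name."""
    return {
        "apiVersion": "tekton.dev/v1alpha1",
        "kind": "PipelineRun",
        "metadata": {
            # generateName instead of name: the server appends a random suffix.
            "generateName": prefix,
            "namespace": namespace,
        },
        # "ci-pipeline" below is a hypothetical Pipeline name.
        "spec": {"pipelineRef": {"name": pipeline}},
    }


manifest = pipeline_run_manifest("ci-profiler-", "ci-pipeline")
# Intended to be piped into `kubectl create -f -` rather than `kubectl apply`.
print(json.dumps(manifest, indent=2))
```

Because the object has no fixed `metadata.name`, each `kubectl create` yields a new PipelineRun, which is why `apply` is not usable here.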

@kkasravi
Contributor

@jlewi
I guess my suggestion is to change the existing code to embed the Pipeline into the PipelineRun so there's only one manifest that holds all the overlays and parameters. The overlays would include PipelineResources and Tasks, where Tasks could be deployed early since they're reusable. I believe you can now embed PipelineResources into a PipelineRun.

@kkasravi
Contributor

@jlewi
kfctl would not be involved; just changing the parameters and overlays would allow you to go from profile-controller to a different controller.

@jlewi
Contributor Author

jlewi commented Dec 14, 2019

@kkasravi I don't think generateName works with kustomize (I believe I tried that). I think we could work around that by doing something like

kustomize build ... | yq .metadata.name=someunique-name | kubectl apply -f -
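
The renaming step in that workaround could look roughly like the sketch below, which stamps a random suffix onto `metadata.name` of an already-parsed manifest. The helper name `with_unique_name` and the example resource are hypothetical; parsing the kustomize output and handing the result to kubectl are assumed to happen elsewhere.

```python
import copy
import uuid


def with_unique_name(manifest: dict, base: str) -> dict:
    """Return a copy of the manifest whose metadata.name has a random suffix."""
    stamped = copy.deepcopy(manifest)
    suffix = uuid.uuid4().hex[:8]  # 8 hex characters from a random UUID
    stamped.setdefault("metadata", {})["name"] = f"{base}-{suffix}"
    return stamped


# Hypothetical PipelineRun as rendered by kustomize.
run = {"kind": "PipelineRun", "metadata": {"name": "profile-controller-run"}}
first = with_unique_name(run, "profile-controller-run")
second = with_unique_name(run, "profile-controller-run")
```

Since each result has a concrete name, it can still go through `kubectl apply`, and earlier runs do not need to be deleted first.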

(Thinking out loud) we have two dimensions along which we want to stamp out multiple pipeline definitions

  • version - e.g. master vs. v0.X.Y
  • application - e.g. profile controller vs. jupyter controller

Let's start with the Tasks
https://github.com/kubeflow/kubeflow/blob/master/components/base/task.yaml

Right now we have two

  1. Build and Push
  2. Update Manifests

These are already parameterized using Tekton Resources.

Is there any reason we would need multiple instances of these tasks? i.e. can we just use Tekton parameterization?

Let's assume we don't use kustomize at all. Then we need to do the following

  1. Define a single Pipeline resource

  2. Define a PipelineResource for every application and the cross product of repos and version

    • e.g. 1 resource for kubeflow@master and another for kubeflow@v0.7-branch
    • num repos x num versions could be large
    • One option would be to define one overlay for each version
    • We could also just write a "Make" script to auto-generate these combinations
  3. Define 1 PipelineRun for every application x version

What if we start as follows

  1. Check in YAML specs for all of the resources as listed above
  2. Maybe use kustomize or other tools behind the scenes to generate those manifests when writing by hand becomes burdensome?
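
To make the application x version cross product concrete, here is a minimal sketch of a generator that emits one PipelineRun per combination. The application list, the version labels, and the `ci-pipeline` name are assumptions for illustration; the `kubeflow_version` and `manifests_version` parameter names are borrowed from the PipelineRun example later in the thread.

```python
from itertools import product

# Hypothetical inputs: application names and a label -> git revision map.
APPLICATIONS = ["profile-controller", "jupyter-controller"]
VERSIONS = {"master": "master", "v0-7": "v0.7-branch"}


def pipeline_runs(applications, versions, pipeline="ci-pipeline"):
    """Return one PipelineRun dict per (application, version) pair."""
    runs = []
    for app, (label, revision) in product(applications, versions.items()):
        runs.append({
            "apiVersion": "tekton.dev/v1alpha1",
            "kind": "PipelineRun",
            "metadata": {"name": f"{app}-{label}", "namespace": "kubeflow-ci"},
            "spec": {
                # A single shared Pipeline; only the parameters differ.
                "pipelineRef": {"name": pipeline},
                "params": [
                    {"name": "kubeflow_version", "value": revision},
                    {"name": "manifests_version", "value": revision},
                ],
            },
        })
    return runs


runs = pipeline_runs(APPLICATIONS, VERSIONS)
```

The generated dicts could be serialized and checked in as plain manifests, which matches the "check in YAML specs first, generate later" approach proposed above.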

@kkasravi
Contributor

kkasravi commented Dec 14, 2019

If we embed PipelineResource and Pipeline into PipelineRun, do we reduce the permutations?
For PipelineRun below if we leverage the /spec/params section then these could be propagated down into the embedded PipelineResources (/spec/resources/resourceSpec) and Pipeline (/spec/pipelineSpec). The Tasks would also reference these values set in /spec/params.

apiVersion: tekton.dev/v1alpha1
kind: PipelineRun
metadata:
  name: $(uniquename)
  namespace: kubeflow-ci
spec:
  params:
  - name: docker_target
    value: $(docker_target)
  - name: image_name
    value: $(image_name)
  - name: image_url
    value: $(image_url)
  - name: path_to_context
    value: $(path_to_context)
  - name: path_to_docker_file
    value: $(path_to_docker_file)
  - name: container_image
    value: $(container_image)
  - name: path_to_manifests_dir
    value: $(path_to_manifests_dir)
  - name: kubeflow_version
    value: $(kubeflow_version)
  - name: manifests_version
    value: $(manifests_version)
  pipelineSpec:
    params:
    - name: docker_target
      type: string
    - name: image_name
      type: string
    - name: path_to_context
      type: string
    - name: path_to_docker_file
      type: string
    resources:
    - name: kubeflow
      type: git
    - name: $(params.image_name)
      type: image
    - name: manifests
      type: git
    tasks:
    - name: build-push
      params:
      - name: docker_target
        value: $(params.docker_target)
      - name: image_name
        value: $(params.image_name)
      - name: path_to_context
        value: $(params.path_to_context)
      - name: path_to_docker_file
        value: $(params.path_to_docker_file)
      resources:
        inputs:
        - name: kubeflow
          resource: kubeflow
        outputs:
        - name: $(params.image_name)
          resource: $(params.image_name)
      taskRef:
        kind: namespaced
        name: build-push
    - name: update-manifests
      params:
      - name: container_image
        value: $(params.container_image)
      - name: path_to_manifests_dir
        value: $(params.path_to_manifests_dir)
      resources:
        inputs:
        - name: kubeflow
          resource: kubeflow
        - name: manifests
          resource: manifests
        - from:
          - build-push
          name: $(params.image_name)
          resource: $(params.image_name)
      runAfter:
      - build-push
      taskRef:
        kind: namespaced
        name: update-manifests
  resources:
  - name: kubeflow
    resourceSpec:
      params:
      - name: revision
        value: $(params.kubeflow_version)
      - name: url
        value: git@github.com:kubeflow/kubeflow.git
      type: git
  - name: manifests
    resourceSpec:
      params:
      - name: revision
        value: $(params.manifests_version)
      - name: url
        value: git@github.com:kubeflow/manifests.git
      type: git
  - name: $(params.image_name)
    resourceSpec:
      params:
      - name: url
        value: $(params.image_url)
      type: image
  serviceAccount: tk-pipelines-service-account
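
The propagation of `/spec/params` into the embedded specs relies on `$(params.<name>)` placeholders. The sketch below only illustrates that substitution idea on a parsed manifest; it is not Tekton's implementation, and Tekton itself substitutes only in specific fields, so whether placeholders resolve in positions such as a resource name would need to be verified. The `substitute` helper is hypothetical.

```python
import re

# Matches placeholders of the form $(params.some_name).
_PARAM = re.compile(r"\$\(params\.([A-Za-z0-9_-]+)\)")


def substitute(node, params):
    """Recursively replace $(params.<name>) inside strings of a nested structure."""
    if isinstance(node, str):
        # Unknown parameters are left untouched rather than blanked out.
        return _PARAM.sub(lambda m: params.get(m.group(1), m.group(0)), node)
    if isinstance(node, list):
        return [substitute(item, params) for item in node]
    if isinstance(node, dict):
        return {key: substitute(value, params) for key, value in node.items()}
    return node


# The embedded kubeflow git resource from the PipelineRun above.
resource = {
    "name": "kubeflow",
    "resourceSpec": {
        "type": "git",
        "params": [
            {"name": "revision", "value": "$(params.kubeflow_version)"},
            {"name": "url", "value": "git@github.com:kubeflow/kubeflow.git"},
        ],
    },
}

resolved = substitute(resource, {"kubeflow_version": "master"})
```

Switching to a release branch then means changing a single parameter value instead of maintaining a separate PipelineResource per version.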

@jlewi
Contributor Author

jlewi commented Dec 17, 2019

@kkasravi if we go the route of embedding everything in PipelineRun, does that end up being any different from using multiple resources and relying on kustomize namePrefix to give each resource a unique name?

jlewi pushed a commit to jlewi/testing that referenced this issue Dec 20, 2019
* kubeflow/testing is the central repository where all reusable engprod
  code goes. As a result it makes sense for the reusable Tekton definitions
  and scripts related to continuous building and updating of Kubeflow
  applications to live here

* For now, we will also centralize the definition of pipelines for all
  applications in this repository (as opposed to having them live in
  the application repositories).

  * This should make it easier to manage and automatically update
    all applications.

* Per kubeflow#544 redo how we use kustomize and Tekton to
  parameterize the pipelines.

  * Individual runs of pipelines will rely completely on Tekton parameter
    substitution to create a run to build an image for a specific
    application at a specific commit.

    * The image URL and git commit of source code will be Tekton
      PipelineResources that are inlined in the PipelineRun

    * For any parameters we will use Tekton parameters and inline the values
      in the PipelineRun

  * PipelineRuns will just be created as YAML files and not as kustomize
    overlays

  * Right now we have a single kustomize package which defines the reusable
    elements which are Tekton Tasks and Pipeline resources

    * Right now we have a single Pipeline for all applications but in the future
      we might have application specific pipelines

* rebuild_manifests.sh should use the image tag v0.x.y-${commit} rather
  than the digest.

  * This image tagging scheme will be the basis for determining whether
    the image is already up to date (kubeflow#545)

  * Since we now specify the full image_url rather than using image name
    we need a parameter for the src_image that is used with the kustomize
    edit function.

* Use a separate task for updating the manifests

  * Now that we are using image tags of the form "{TAG}-{COMMIT}" which
    is determined at pipeline construction time; we no longer
    need to pass the digest file between the build-push step and the
    update manifests task which makes it much easier to run
    them as separate task since we don't need a pod volume to share data.

Related to kubeflow#450 - CD pipelines for Kubeflow.
jlewi pushed a commit to jlewi/kubeflow that referenced this issue Dec 21, 2019
* Delete all the Tekton pipelines and scripts for continuous delivery
  of Kubeflow applications because they are moving into kubeflow/testing

* kubeflow/testing#551 is the PR moving the code into kubeflow/testing

Related to: kubeflow/testing#544 redo how we use kustomize and Tekton
            to parameterize the pipelines
k8s-ci-robot pushed a commit that referenced this issue Dec 27, 2019
k8s-ci-robot pushed a commit to kubeflow/kubeflow that referenced this issue Dec 30, 2019
* Delete all the Tekton pipelines and scripts for continuous delivery
  of Kubeflow applications because they are moving into kubeflow/testing

* kubeflow/testing#551 is the PR moving the code into kubeflow/testing

Related to: kubeflow/testing#544 redo how we use kustomize and Tekton
            to parameterize the pipelines
@jlewi
Contributor Author

jlewi commented Dec 30, 2019

In #551 I followed @kkasravi's suggestion, got rid of kustomize, and now just rely on Tekton's parameterization. I think this works pretty well.

@jlewi jlewi closed this as completed Dec 30, 2019