Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Graceful sidecar support #936

Merged
merged 1 commit into from
Jun 17, 2019
Merged

Graceful sidecar support #936

merged 1 commit into from
Jun 17, 2019

Conversation

joseblas
Copy link
Contributor

@joseblas joseblas commented May 31, 2019

Changes

TL;DR Tekton Pipelines now gracefully start and stop with injected sidecars running alongside TaskRun pods.

Long Version :-

TaskRun pods can have sidecars injected by Admission Controllers. This is common practice with Istio where a proxy sidecar is injected into pods so that they can be included as part of a service mesh. Unfortunately there is no built-in Kubernetes mechanism to set the lifetime of a sidecar container to match that of the pod's "primary" container. In practice this means that pods injected with sidecars will not stop until both the primary and all sidecar containers are stopped.

Prior to this commit any non-istio injected sidecars would cause a TaskRun pod to run forever, even when all Step containers of a Task had completed. Istio injection was also disabled completely on TR pods. This commit introduces mechanisms to start and stop sidecars gracefully meaning that a) Step containers wait until all sidecars are in a ready state before starting and b) Sidecar containers are stopped when a TaskRun's Step containers are done. Additionally, the annotation that prevented Istio sidecar injection has been removed. Istio now adds sidecars to TaskRun pods as expected.

No end-to-end tests have been added as part of this PR because doing so would require Istio to be enabled on the CI cluster (or a dummy Admission Controller / Mutating Webhook thing to be written just for injecting containers into test Pods). The Istio requirement would put an undue burden on those users running tests in non-GKE environments. It's currently planned for a follow-up PR to introduce a) user-defined sidecars and b) a Tekton-injected sidecar that performs logging. Once that work is complete it will be much simpler to e2e test this sidecar code.

For reference, see the POC PR that was originally created here: #728

For the design proposal and associated discussion around the original POC please see #727

For the remaining work, test coverage, code organization, etc, see #926

Submitter Checklist

These are the criteria that every PR should meet, please check them off as you
review them:

Release Notes

Sidecar containers injected by, for example, Istio, will now gracefully start up and shutdown in tandem with a Task Steps' containers.

@tekton-robot tekton-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 31, 2019
@googlebot
Copy link

So there's good news and bad news.

👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there.

😕 The bad news is that it appears that one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request.

Note to project maintainer: This is a terminal state, meaning the cla/google commit status will not change from this state. It's up to you to confirm consent of all the commit author(s), set the cla label to yes (if enabled on your project), and then merge this pull request when appropriate.

ℹ️ Googlers: Go here for more info.

@tekton-robot tekton-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label May 31, 2019
@tekton-robot tekton-robot requested review from dlorenc and imjasonh May 31, 2019 12:09
@tekton-robot tekton-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 31, 2019
@ghost ghost mentioned this pull request May 31, 2019
3 tasks
@ghost
Copy link

ghost commented May 31, 2019

/ok-to-test

@tekton-robot tekton-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 31, 2019
@ghost
Copy link

ghost commented May 31, 2019

/test pull-tekton-pipeline-integration-tests

1 similar comment
@ghost
Copy link

ghost commented May 31, 2019

/test pull-tekton-pipeline-integration-tests

@ghost
Copy link

ghost commented May 31, 2019

Hm, from the integration tests:

I0531 14:26:40.473] 2019/05/31 14:26:40 main.go:761: --gcp-project is missing, trying to fetch a project from boskos.
I0531 14:26:40.474] (for local runs please set --gcp-project to your dev project)
I0531 14:26:40.474] 2019/05/31 14:26:40 main.go:773: provider gke, will acquire project type gke-project from boskos
I0531 14:26:40.484] 2019/05/31 14:26:40 main.go:312: Something went wrong: failed to prepare test environment: --provider=gke boskos failed to acquire project: resource not found

This looks suspiciously like a configuration or platform issue.

@ghost ghost mentioned this pull request May 31, 2019
3 tasks
@joseblas
Copy link
Contributor Author

/test all

@joseblas
Copy link
Contributor Author

/retest

@joseblas
Copy link
Contributor Author

joseblas commented Jun 4, 2019

Hi @dicarlo2 @bobcatfish @sbwsg,
thinking about e2e testing I'm thinking about adding Istio to our Boskos config in order to be able to have Sidecars-by-istio in some namespaces to cover this.
Do you agree?

@ghost
Copy link

ghost commented Jun 4, 2019

Yeah I'm in favor of that if you can make it work, but defer to @bobcatfish and also @vdemeester in case they have concerns given our existing test infra. How would this work if someone is trying to run e2e tests locally or on a non-GKE Kubernetes provider?

@joseblas
Copy link
Contributor Author

joseblas commented Jun 4, 2019

Istio/sdecar related tests could be skipped if there is no Istio namespace.

@vdemeester
Copy link
Member

Having to install istio just for the e2e tests seems a bit much (and heavy) I feel. Discussed a bit with @bobcatfish and @sbwsg and we fell we can defer the e2e a bit (until we work on logging sidecar and thus using this as e2e tests for sidecars).

@ghost
Copy link

ghost commented Jun 5, 2019

/retest

@ghost
Copy link

ghost commented Jun 5, 2019

Hey @dicarlo2 I think we're pretty close to having your sidecar PR mergeable here. If you've got some time would you mind reviewing our updates to your work and verifying that the documentation and code comments we've added are accurate and gel with your intentions? Cheers!

@ghost
Copy link

ghost commented Jun 6, 2019

/retest

@ghost ghost changed the title [WIP] Sidecars (Based on Scott and DiCarlo works) Graceful sidecar support Jun 7, 2019
@tekton-robot tekton-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 7, 2019
@dicarlo2
Copy link
Contributor

LGTM, sorry for the delayed response!

@ghost ghost added cla: yes Trying to make the CLA bot happy with ppl from different companies work on one commit and removed cla: no labels Jun 10, 2019
@googlebot
Copy link

A Googler has manually verified that the CLAs look good.

(Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.)

ℹ️ Googlers: Go here for more info.

Copy link
Member

@vdemeester vdemeester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/meow boxes

@tekton-robot
Copy link
Collaborator

@vdemeester: cat image

In response to this:

/lgtm
/meow boxes

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@tekton-robot tekton-robot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jun 17, 2019
@vdemeester
Copy link
Member

/test pull-tekton-pipeline-integration-tests

2 similar comments
@ghost
Copy link

ghost commented Jun 17, 2019

/test pull-tekton-pipeline-integration-tests

@ghost
Copy link

ghost commented Jun 17, 2019

/test pull-tekton-pipeline-integration-tests

@ghost
Copy link

ghost commented Jun 17, 2019

ಠ_ಠ

main.go:312: Something went wrong: failed to prepare test environment: --provider=gke boskos failed to acquire project: resource not found

@ghost
Copy link

ghost commented Jun 17, 2019

/test pull-tekton-pipeline-integration-tests

1 similar comment
@vdemeester
Copy link
Member

/test pull-tekton-pipeline-integration-tests

TL;DR Tekton Pipelines now gracefully start and stop with sidecars
running alongside TaskRun pods.

Long Version :-

TaskRun pods can have sidecars injected by Admission Controllers. This
is common practice with Istio where a proxy sidecar is injected into pods
so that they can be included as part of a service mesh. Unfortunately
there is no built-in Kubernetes mechanism to set the lifetime of a
sidecar container to match that of the pod's "primary" container. In
practice this means that pods injected with sidecars will not stop until
both the primary and all sidecar containers are stopped.

Prior to this commit injected sidecars would cause a TaskRun pod to run
forever, even when all Step containers of a Task had completed. This
commit introduces mechanisms to start and stop sidecars gracefully
meaning that a) Step containers wait until all sidecars are in a ready
state before starting and b) Sidecar containers are stopped when a
TaskRun's Step containers are done.

No end-to-end tests have been added as part of this PR because doing so
would require Istio to be enabled on the CI cluster (or a dummy
Admission Controller / Mutating Webhook thing to be written just for
injecting containers into test Pods). The Istio requirement would put an
undue burden on those users running tests in non-GKE environments. It's
currently planned for a follow-up PR to introduce a) user-defined
sidecars and b) a Tekton-injected sidecar that performs logging. Once that
work is complete it will be much simpler to e2e test this sidecar code.

Further to the above, a number of smaller refactors have taken place in
this PR:

- a sidecars package has been created to encapsulate the logic for
stopping sidecars with the nop container image
- the updateStatusFromPod func has been moved into a taskrun/status
package and broken down into constituent helpers
- a small amount of dead code has been eliminated

Co-authored-by: Jose Blas Camacho Taboada <joseblas@gmail.com>
Co-authored-by: Scott Seaward <sbws@google.com>
@tekton-robot tekton-robot removed the lgtm Indicates that a PR is ready to be merged. label Jun 17, 2019
@googlebot
Copy link

So there's good news and bad news.

👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there.

😕 The bad news is that it appears that one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request.

Note to project maintainer: This is a terminal state, meaning the cla/google commit status will not change from this state. It's up to you to confirm consent of all the commit author(s), set the cla label to yes (if enabled on your project), and then merge this pull request when appropriate.

ℹ️ Googlers: Go here for more info.

@googlebot googlebot added cla: no and removed cla: yes Trying to make the CLA bot happy with ppl from different companies work on one commit labels Jun 17, 2019
@ghost ghost added cla: yes Trying to make the CLA bot happy with ppl from different companies work on one commit and removed cla: no labels Jun 17, 2019
@googlebot
Copy link

A Googler has manually verified that the CLAs look good.

(Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.)

ℹ️ Googlers: Go here for more info.

Copy link
Member

@vdemeester vdemeester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
🙏

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Jun 17, 2019
@tekton-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: joseblas, vdemeester

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot tekton-robot merged commit ace3ab6 into tektoncd:master Jun 17, 2019
@ghost
Copy link

ghost commented Jun 17, 2019

🙌

@dicarlo2
Copy link
Contributor

awesome work!

bobcatfish added a commit to bobcatfish/pipeline that referenced this pull request Aug 19, 2019
Scott has been provided super detailed helpful reviews for a while now
(58 reviews as of
https://tekton.devstats.cd.foundation/d/46/pr-reviews-by-contributor?orgId=1&var-period=d&var-repo_name=tektoncd%2Fpipeline&var-reviewers="sbwsg").

He has contributed useful and technically challenging features such as
tektoncd#905 (recreate pods in face of
ResourceQuota errors), drove completion of
tektoncd#936
(Graceful sidecar support) and also tektoncd#871
(enforcing default TaskRun timeouts).
@bobcatfish bobcatfish mentioned this pull request Aug 19, 2019
1 task
@ghost ghost mentioned this pull request Aug 22, 2019
3 tasks
tekton-robot pushed a commit that referenced this pull request Aug 26, 2019
Scott has been provided super detailed helpful reviews for a while now
(58 reviews as of
https://tekton.devstats.cd.foundation/d/46/pr-reviews-by-contributor?orgId=1&var-period=d&var-repo_name=tektoncd%2Fpipeline&var-reviewers="sbwsg").

He has contributed useful and technically challenging features such as
#905 (recreate pods in face of
ResourceQuota errors), drove completion of
#936
(Graceful sidecar support) and also #871
(enforcing default TaskRun timeouts).
tekton-robot pushed a commit that referenced this pull request Aug 29, 2019
This commit adds user-defined sidecars to the TaskSpec. A sidecar is a
container intended to provide some auxiliary support to a pod's primary
containers. An example usecase is a mock API server for your app to hit
during testing.

This commit doesn't introduce any extra lifecycle management beyond that
added in #936 - sidecars will be stopped without any warning or cleanup
time. At the moment I'm thinking of leaving this as-is until the sidecars
KEP reaches general availability in Kubernetes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cla: yes Trying to make the CLA bot happy with ppl from different companies work on one commit lgtm Indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants