-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Graceful sidecar support #936
Conversation
So there's good news and bad news. 👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there. 😕 The bad news is that it appears that one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request. Note to project maintainer: This is a terminal state, meaning the ℹ️ Googlers: Go here for more info. |
/ok-to-test |
/test pull-tekton-pipeline-integration-tests |
1 similar comment
/test pull-tekton-pipeline-integration-tests |
Hm, from the integration tests:
This looks suspiciously like a configuration or platform issue. |
/test all |
/retest |
Hi @dicarlo2 @bobcatfish @sbwsg, |
Yeah I'm in favor of that if you can make it work, but defer to @bobcatfish and also @vdemeester in case they have concerns given our existing test infra. How would this work if someone is trying to run e2e tests locally or on a non-GKE Kubernetes provider? |
Istio/sdecar related tests could be skipped if there is no Istio namespace. |
Having to install istio just for the e2e tests seems a bit much (and heavy) I feel. Discussed a bit with @bobcatfish and @sbwsg and we fell we can defer the e2e a bit (until we work on logging sidecar and thus using this as e2e tests for sidecars). |
/retest |
Hey @dicarlo2 I think we're pretty close to having your sidecar PR mergeable here. If you've got some time would you mind reviewing our updates to your work and verifying that the documentation and code comments we've added are accurate and gel with your intentions? Cheers! |
/retest |
LGTM, sorry for the delayed response! |
A Googler has manually verified that the CLAs look good. (Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.) ℹ️ Googlers: Go here for more info. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/meow boxes
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/test pull-tekton-pipeline-integration-tests |
2 similar comments
/test pull-tekton-pipeline-integration-tests |
/test pull-tekton-pipeline-integration-tests |
ಠ_ಠ
|
/test pull-tekton-pipeline-integration-tests |
1 similar comment
/test pull-tekton-pipeline-integration-tests |
TL;DR Tekton Pipelines now gracefully start and stop with sidecars running alongside TaskRun pods. Long Version :- TaskRun pods can have sidecars injected by Admission Controllers. This is common practice with Istio where a proxy sidecar is injected into pods so that they can be included as part of a service mesh. Unfortunately there is no built-in Kubernetes mechanism to set the lifetime of a sidecar container to match that of the pod's "primary" container. In practice this means that pods injected with sidecars will not stop until both the primary and all sidecar containers are stopped. Prior to this commit injected sidecars would cause a TaskRun pod to run forever, even when all Step containers of a Task had completed. This commit introduces mechanisms to start and stop sidecars gracefully meaning that a) Step containers wait until all sidecars are in a ready state before starting and b) Sidecar containers are stopped when a TaskRun's Step containers are done. No end-to-end tests have been added as part of this PR because doing so would require Istio to be enabled on the CI cluster (or a dummy Admission Controller / Mutating Webhook thing to be written just for injecting containers into test Pods). The Istio requirement would put an undue burden on those users running tests in non-GKE environments. It's currently planned for a follow-up PR to introduce a) user-defined sidecars and b) a Tekton-injected sidecar that performs logging. Once that work is complete it will be much simpler to e2e test this sidecar code. Further to the above, a number of smaller refactors have taken place in this PR: - a sidecars package has been created to encapsulate the logic for stopping sidecars with the nop container image - the updateStatusFromPod func has been moved into a taskrun/status package and broken down into constituent helpers - a small amount of dead code has been eliminated Co-authored-by: Jose Blas Camacho Taboada <joseblas@gmail.com> Co-authored-by: Scott Seaward <sbws@google.com>
So there's good news and bad news. 👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there. 😕 The bad news is that it appears that one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request. Note to project maintainer: This is a terminal state, meaning the ℹ️ Googlers: Go here for more info. |
A Googler has manually verified that the CLAs look good. (Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.) ℹ️ Googlers: Go here for more info. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
🙏
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: joseblas, vdemeester The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
🙌 |
awesome work! |
Scott has been provided super detailed helpful reviews for a while now (58 reviews as of https://tekton.devstats.cd.foundation/d/46/pr-reviews-by-contributor?orgId=1&var-period=d&var-repo_name=tektoncd%2Fpipeline&var-reviewers="sbwsg"). He has contributed useful and technically challenging features such as tektoncd#905 (recreate pods in face of ResourceQuota errors), drove completion of tektoncd#936 (Graceful sidecar support) and also tektoncd#871 (enforcing default TaskRun timeouts).
Scott has been provided super detailed helpful reviews for a while now (58 reviews as of https://tekton.devstats.cd.foundation/d/46/pr-reviews-by-contributor?orgId=1&var-period=d&var-repo_name=tektoncd%2Fpipeline&var-reviewers="sbwsg"). He has contributed useful and technically challenging features such as #905 (recreate pods in face of ResourceQuota errors), drove completion of #936 (Graceful sidecar support) and also #871 (enforcing default TaskRun timeouts).
This commit adds user-defined sidecars to the TaskSpec. A sidecar is a container intended to provide some auxiliary support to a pod's primary containers. An example usecase is a mock API server for your app to hit during testing. This commit doesn't introduce any extra lifecycle management beyond that added in #936 - sidecars will be stopped without any warning or cleanup time. At the moment I'm thinking of leaving this as-is until the sidecars KEP reaches general availability in Kubernetes.
Changes
TL;DR Tekton Pipelines now gracefully start and stop with injected sidecars running alongside TaskRun pods.
Long Version :-
TaskRun pods can have sidecars injected by Admission Controllers. This is common practice with Istio where a proxy sidecar is injected into pods so that they can be included as part of a service mesh. Unfortunately there is no built-in Kubernetes mechanism to set the lifetime of a sidecar container to match that of the pod's "primary" container. In practice this means that pods injected with sidecars will not stop until both the primary and all sidecar containers are stopped.
Prior to this commit any non-istio injected sidecars would cause a TaskRun pod to run forever, even when all Step containers of a Task had completed. Istio injection was also disabled completely on TR pods. This commit introduces mechanisms to start and stop sidecars gracefully meaning that a) Step containers wait until all sidecars are in a ready state before starting and b) Sidecar containers are stopped when a TaskRun's Step containers are done. Additionally, the annotation that prevented Istio sidecar injection has been removed. Istio now adds sidecars to TaskRun pods as expected.
No end-to-end tests have been added as part of this PR because doing so would require Istio to be enabled on the CI cluster (or a dummy Admission Controller / Mutating Webhook thing to be written just for injecting containers into test Pods). The Istio requirement would put an undue burden on those users running tests in non-GKE environments. It's currently planned for a follow-up PR to introduce a) user-defined sidecars and b) a Tekton-injected sidecar that performs logging. Once that work is complete it will be much simpler to e2e test this sidecar code.
For reference, see the POC PR that was originally created here: #728
For the design proposal and associated discussion around the original POC please see #727
For the remaining work, test coverage, code organization, etc, see #926
Submitter Checklist
These are the criteria that every PR should meet, please check them off as you
review them:
Release Notes