TaskRun that uses multiple PVC-backed workspaces will fail to schedule when using Affinity Assistant #3545
Comments
This is by design, although I can't find anything explicit in the documentation saying that it cannot work with 2 workspaces backed by different PVCs 🤔
This seems strange. I thought we would only generate one affinity assistant for the pipeline run, to effectively co-locate all pods that use the workspace(s) on one node. Perhaps the relationship is one affinity assistant per PVC? @jlpettersson can you help out here?
The current implementation creates an affinity assistant per PVC in the PipelineRun. An alternative is to create an affinity assistant per PipelineRun. They have different pros and cons; I elaborated on this in the new design doc, Task parallelism when using workspace. The placeholder pod also mounts the PVC and "repels" other placeholder pods, so two PVCs cannot be mounted by a single pod. If this were changed to create an affinity assistant per PipelineRun, it would be slightly different, but two different PipelineRuns trying to mount the same PVC could not be executed at the same time, which I think is the main reason to use more than one PVC?
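The scheduling conflict described in the comment above can be illustrated with a toy model. This is a minimal sketch, not Tekton or Kubernetes scheduler code: `placeAssistants` and `nodeForPod` are hypothetical helpers, and the model assumes the "repelling" acts as a preference that spreads placeholder pods across nodes whenever more than one node is available.

```go
package main

import (
	"errors"
	"fmt"
)

// placeAssistants models one placeholder pod per PVC. The pods "repel"
// each other, which is approximated here (an assumption) by spreading
// them round-robin over the available nodes.
// It returns a map from claim name to node name.
func placeAssistants(claims, nodes []string) (map[string]string, error) {
	if len(nodes) == 0 {
		return nil, errors.New("no nodes available")
	}
	placement := make(map[string]string, len(claims))
	for i, claim := range claims {
		placement[claim] = nodes[i%len(nodes)]
	}
	return placement, nil
}

// nodeForPod returns the node on which a pod can run when it must be
// co-located with the placeholder pod of every PVC it mounts.
// ok is false when those placeholder pods sit on different nodes.
func nodeForPod(podClaims []string, placement map[string]string) (node string, ok bool) {
	for _, claim := range podClaims {
		n, found := placement[claim]
		if !found {
			return "", false
		}
		if node != "" && n != node {
			return "", false // the pod would have to be on two nodes at once
		}
		node = n
	}
	return node, node != ""
}

func main() {
	claims := []string{"pvc-a", "pvc-b"}

	// Multi-node cluster: the placeholder pods spread out.
	multi, _ := placeAssistants(claims, []string{"node-1", "node-2"})
	_, ok := nodeForPod(claims, multi)
	fmt.Println("multi-node, pod mounting both PVCs schedulable:", ok)

	// Single-node cluster: everything is forced onto the same node.
	single, _ := placeAssistants(claims, []string{"node-1"})
	_, ok = nodeForPod(claims, single)
	fmt.Println("single-node, pod mounting both PVCs schedulable:", ok)
}
```

Under these assumptions, a pod that mounts both PVCs has no valid node as soon as the placeholder pods end up on different nodes, which matches the behavior reported for multi-node clusters in this issue.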
Issues go stale after 90d of inactivity. /lifecycle stale Send feedback to tektoncd/plumbing.
Stale issues rot after 30d of inactivity. /lifecycle rotten Send feedback to tektoncd/plumbing.
Rotten issues close after 30d of inactivity. /close Send feedback to tektoncd/plumbing.
@tekton-robot: Closing this issue.
We are using v0.23.0, and what we are seeing is one affinity assistant per "use" of a PVC. See here and here (it's the same PVC mounted with 2 different subPaths). If it helps, I can send the final rendered PipelineRun so you don't need to figure it out from the Golang code. Is this expected? I see that at some point it didn't pass validation, but that was fixed. I think the correct behavior is to also check this when creating the affinity assistant and create only one instance per PVC (not per "mount"). In our case we had to disable the affinity assistant: although we only had one PVC, the 2 assistants sometimes landed on different nodes, and this failed with an error. See this commit for more: epinio/epinio@ef6d457
@jimmykarily I think you are right. The affinity assistant only looks at the combination of Workspace Name and PipelineRun Name, as per https://github.com/tektoncd/pipeline/blob/main/pkg/reconciler/pipelinerun/affinity_assistant.go#L56 So if you have two Workspaces with the same PVC but a different subPath set in the WorkspaceBinding, it will not work, as you correctly point out. On the other hand, this would not allow for two concurrent runs of the same pipeline, as they use the same PVC (possibly "pinned" to different nodes). But if the AA could tolerate different subPaths on the PVC, as you suggest, it would allow for parallel Task execution within the same PipelineRun. You could open a feature request about this, and it could be better documented. Thanks for the comment.
@jlpettersson you are right about the concurrent runs of the pipeline, and we do avoid this situation in our code. In our case this makes sense because we use Tekton for application staging, and we don't allow 2 staging processes to run for the same app at the same time. That said, this limitation comes from the fact that we are using a pre-existing PVC, and having 2 affinity assistants doesn't help with that in any way. In fact, the 2 assistants prevent the pipeline from running when they land on different nodes.
or
I could open a PR that updates the documentation (option #2), but I'm not sure I can get the other proposed solution done fast. I would rather open a new issue for that.
Expected Behavior
Pipelines that use multiple PVC-backed workspaces in a TaskRun should run when affinity assistants are enabled.
Actual Behavior
The underlying pods fail to schedule due to multi-attach errors.
Steps to Reproduce the Problem
Here's a sample pipeline that reproduces this issue:
If applied to a multi-node cluster, I see this error within the pod's events:
It seems like the root cause is that the affinity assistants were scheduled onto different nodes. These are the pods that I'm seeing:
Additional Info
Kubernetes version:
Output of kubectl version:
Tekton Pipeline version: