Split builder from operator #583
Conversation
About "Support concurrent builds" we need to be careful as this may end up not reusing a context or not having a base image to reuse for incremental build. |
I'll continue working on improvements, but I think this PR has met the original requirement and has grown large enough for it to be reviewed.
pkg/builder/kaniko/publisher.go (Outdated)
```diff
@@ -111,6 +111,12 @@ func Publisher(ctx *builder.Context) error {
 		args = baseArgs
 	}
 
+	// The pod will be scheduled to nodes that are selected by the persistent volume
```
How can it be shared with the Kaniko pod if the Kubernetes provider allows only RWO volumes? The affinity was there to make the two pods coexist on the same node so that they could share the volume. Local volumes are interesting...
My understanding is that the scheduling of the pods that use a persistent volume should be managed by the platform (Kubernetes). As we are using a PVC with the default storage class, it is up to the configured provisioner for the cluster to specify node affinity for the provisioned persistent volume. Then pods that use this PV will only be scheduled to nodes that are selected by the node affinity.
So multi-node clusters should use `local` volumes instead of `hostPath` volumes. The provisioned `local` volumes should have the node affinity defined for the node they've been provisioned on. This generalises to other kinds of storage devices.
Single-node clusters like those managed by Minikube or MiniShift can still use `hostPath` volumes. This is actually what the provisioner for the currently configured default storage class does.
In any case, I don't think it is manageable to hard-code / handle the scheduling in the operator.
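For illustration, here is a minimal sketch, built with the k8s.io/api types the operator already depends on, of what a provisioned `local` PV carrying node affinity could look like. The PV name, path and hostname are made-up values, not taken from any real provisioner:

```go
package sketch

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// localPV sketches a "local" persistent volume as a dynamic provisioner might
// create it: the volume is pinned to the node it lives on, so any pod that
// claims it can only be scheduled onto that node.
func localPV() *corev1.PersistentVolume {
	return &corev1.PersistentVolume{
		ObjectMeta: metav1.ObjectMeta{Name: "camel-k-builder-pv"},
		Spec: corev1.PersistentVolumeSpec{
			Capacity: corev1.ResourceList{
				corev1.ResourceStorage: resource.MustParse("1Gi"),
			},
			AccessModes: []corev1.PersistentVolumeAccessMode{corev1.ReadWriteOnce},
			PersistentVolumeSource: corev1.PersistentVolumeSource{
				Local: &corev1.LocalVolumeSource{Path: "/mnt/disks/builder"},
			},
			// Node affinity set by the provisioner for the node the volume
			// was provisioned on; the scheduler honours it for consumers.
			NodeAffinity: &corev1.VolumeNodeAffinity{
				Required: &corev1.NodeSelector{
					NodeSelectorTerms: []corev1.NodeSelectorTerm{{
						MatchExpressions: []corev1.NodeSelectorRequirement{{
							Key:      "kubernetes.io/hostname",
							Operator: corev1.NodeSelectorOpIn,
							Values:   []string{"node-1"},
						}},
					}},
				},
			},
		},
	}
}
```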
For example, in a multi-node cluster configured with a default storage class provisioner that uses `local` volumes, the build pod and the Kaniko pod will be scheduled on the same node, per the node affinity defined for the `local` volume provisioned on that node, as they both use the same PVC source and hence the same volume.
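As a rough sketch of that sharing, both pod specs would simply reference the same claim; the claim name and mount path below are illustrative, not the ones used by the operator:

```go
package sketch

import corev1 "k8s.io/api/core/v1"

// sharedBuildVolume returns a volume backed by the single builder PVC that
// both the build pod and the Kaniko pod are meant to mount.
func sharedBuildVolume() corev1.Volume {
	return corev1.Volume{
		Name: "camel-k-builder",
		VolumeSource: corev1.VolumeSource{
			PersistentVolumeClaim: &corev1.PersistentVolumeClaimVolumeSource{
				ClaimName: "camel-k-builder",
			},
		},
	}
}

// withSharedVolume mounts the shared volume into every container of a pod.
// No explicit pod affinity is needed when the bound PV carries node affinity:
// both pods end up on the node selected by the volume.
func withSharedVolume(pod *corev1.Pod) {
	pod.Spec.Volumes = append(pod.Spec.Volumes, sharedBuildVolume())
	for i := range pod.Spec.Containers {
		pod.Spec.Containers[i].VolumeMounts = append(pod.Spec.Containers[i].VolumeMounts,
			corev1.VolumeMount{Name: "camel-k-builder", MountPath: "/workspace"})
	}
}
```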
Ok, I'll tell you what I remember from when I fixed the issue on GKE.
The operator was scheduled on an arbitrary node and the volume was bound correctly. Whenever a build was started, a Kaniko pod was created and scheduled at random; the volume got bound only when the scheduler happened to assign the Kaniko pod to the same node as the operator.
By adding the affinity, I told the cluster where I wanted the Kaniko pod to run, so it could always bind the volume.
That said... now that we're removing the option of sharing data between the operator and Kaniko, removing the affinity makes perfect sense...
😄
The build pods still need to share the build output with Kaniko. And the Kaniko warmer pod needs to share the cache as well to be effective.
I have to test on GKE, but it seems from your experience that the provisioner for the default storage class uses `hostPath`.
In the long term, I think the affinity should come from the persistent volume and the scheduling should be managed by the platform, to support other kinds of volumes and to avoid the operator and the builds being scheduled onto a single node. But in the short term, until we do some more testing, I suggest:
- I add the affinity constraint back (see the sketch after this list)
- I set the build strategy to `pod` when the `kaniko` publish strategy is set
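For reference, a minimal sketch of what the re-added affinity constraint could look like, expressed as an inter-pod affinity on the Kaniko pod spec; the operator label used in the selector is an assumption, not necessarily the label the operator actually sets:

```go
package sketch

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// operatorAffinity forces the Kaniko pod onto the same node as the operator
// pod, so an RWO / hostPath-backed volume can be shared between the two.
func operatorAffinity() *corev1.Affinity {
	return &corev1.Affinity{
		PodAffinity: &corev1.PodAffinity{
			RequiredDuringSchedulingIgnoredDuringExecution: []corev1.PodAffinityTerm{{
				LabelSelector: &metav1.LabelSelector{
					// Assumed label identifying the operator pod.
					MatchLabels: map[string]string{"camel.apache.org/component": "operator"},
				},
				TopologyKey: "kubernetes.io/hostname",
			}},
		},
	}
}
```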
Yeah, right.. we still have 2 pods, even in `pod` mode.
I don't remember whether it uses `hostPath` or another implementation (I think the latter), but I suggest we do some testing on GKE before releasing the next version.
I've just added the constraint back, so it should operate as before. I'll investigate multi-node build support and the eventual removal of the constraint ASAP. I wonder how Knative Build handles the build input / output...
Love the `pod` mode, that is compatible with Knative Build..
+1
Agreed. It should be quite straightforward with the existing building blocks. With that new builder service strategy, a deployment would be created and configured to delegate to the
Fixes #125.

Todo:
- `Build` CRD controller to manage the build pod lifecycle (a hypothetical sketch follows at the end)
- `IntegrationContext` and `Integration` CRs lifecycle

Next possible items:
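To make the first Todo item a bit more concrete, here is a hypothetical, minimal sketch of what a `Build` custom resource type with a controller-managed phase could look like; all field and phase names are illustrative and not taken from the actual CRD:

```go
package sketch

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// BuildPhase tracks where a build is in its pod-backed lifecycle.
type BuildPhase string

const (
	BuildPhaseScheduling BuildPhase = "Scheduling"
	BuildPhaseRunning    BuildPhase = "Running"
	BuildPhaseSucceeded  BuildPhase = "Succeeded"
	BuildPhaseFailed     BuildPhase = "Failed"
)

// BuildSpec describes what to build.
type BuildSpec struct {
	Image        string   `json:"image,omitempty"`
	Dependencies []string `json:"dependencies,omitempty"`
}

// BuildStatus is updated by the controller as the build pod progresses.
type BuildStatus struct {
	Phase   BuildPhase `json:"phase,omitempty"`
	PodName string     `json:"podName,omitempty"`
}

// Build is the custom resource a Build controller would reconcile, creating
// the build pod and reflecting its lifecycle in Status.Phase.
type Build struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   BuildSpec   `json:"spec,omitempty"`
	Status BuildStatus `json:"status,omitempty"`
}
```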