
🌱 Add separate concurrency flag for cluster cache tracker #9116

Conversation

@chrischdi (Member)

What this PR does / why we need it:

Adds a separate flag --clustercachetracker-concurrency to all relevant executables (CAPI, CAPBK, KCP, CAPD), so the concurrency of the created cluster cache tracker can be adjusted independently.

This came up while adding the clustercache tracker to CAPV in this discussion.

Before this PR, the existing concurrency flags were re-used to also set the concurrency of the cluster cache tracker.

Example from @sbueringer of why this change makes sense:

Let's take an example. You have a mgmt cluster with 2k+ workload clusters. The cluster controller is doing some heavy lifting, so you would want to use more workers (for 2k, 10 should still be fine, but it's just an example :D).

So if you want to increase the concurrency of the cluster controller, you don't want the side effect that the cluster cache tracker's controller also gets more workers.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #
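For context, the change boils down to giving the cluster cache tracker's controller its own, independently tunable worker count instead of reusing one of the existing concurrency flags. A minimal controller-runtime sketch of that pattern (illustrative only: the reconciler, the watched object, and the flag wiring are placeholders, not the actual cluster-api code):

```go
package main

import (
	"context"
	"flag"
	"os"

	corev1 "k8s.io/api/core/v1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/controller"
)

// cacheReconciler is a stand-in for the cluster cache tracker's reconciler.
type cacheReconciler struct{}

func (r *cacheReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	return ctrl.Result{}, nil
}

func main() {
	var clusterConcurrency, clusterCacheTrackerConcurrency int
	flag.IntVar(&clusterConcurrency, "cluster-concurrency", 10,
		"Number of clusters to process simultaneously")
	flag.IntVar(&clusterCacheTrackerConcurrency, "clustercachetracker-concurrency", 10,
		"Number of clusters to process simultaneously in the cluster cache tracker")
	flag.Parse()

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{})
	if err != nil {
		os.Exit(1)
	}

	// The cluster controller would keep using clusterConcurrency for its own
	// workers (elided here). Before this PR the same value was also reused for
	// the cache tracker's controller; now that controller gets its own flag.
	if err := ctrl.NewControllerManagedBy(mgr).
		For(&corev1.ConfigMap{}). // placeholder object; the real reconciler watches Cluster objects
		WithOptions(controller.Options{MaxConcurrentReconciles: clusterCacheTrackerConcurrency}).
		Complete(&cacheReconciler{}); err != nil {
		os.Exit(1)
	}

	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		os.Exit(1)
	}
}
```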

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 3, 2023
@sbueringer (Member)

cc @vincepri @killianmuldoon

@chrischdi chrischdi force-pushed the pr-separate-concurrency-cc-tracker branch from 022f1fb to 6700138 Compare August 3, 2023 15:37
@chrischdi (Member, Author)

Marking it as deprecated means the old flag is no longer shown in -h output. Setting it does not prevent the controller from running; instead, the following is printed at startup:

❯ go run ./bootstrap/kubeadm --cluster-concurrency 1
Flag --cluster-concurrency has been deprecated, This flag has no function anymore and is going to be removed in a next release. Use "--clustercachetracker-concurrency" insteaed.
I0804 08:32:11.242317   50353 listener.go:44] "controller-runtime/metrics: Metrics server is starting to listen" addr="localhost:8080"
I0804 08:32:11.248101   50353 webhook.go:158] "controller-runtime/builder: Registering a mutating webhook" GVK="bootstrap.cluster.x-k8s.io/v1beta1, Kind=KubeadmConfig" path="/mutate-bootstrap-cluster-x-k8s-io-v1beta1-kubeadmconfig"
...
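The deprecation behaviour shown above is what pflag provides out of the box. A minimal sketch (flag names taken from the log above; the flag-set name and descriptions are illustrative):

```go
package main

import (
	"fmt"

	"github.com/spf13/pflag"
)

func main() {
	fs := pflag.NewFlagSet("manager", pflag.ExitOnError)

	var oldConcurrency, clusterCacheTrackerConcurrency int
	fs.IntVar(&oldConcurrency, "cluster-concurrency", 10,
		"Number of clusters to process simultaneously")
	fs.IntVar(&clusterCacheTrackerConcurrency, "clustercachetracker-concurrency", 10,
		"Number of clusters to process simultaneously in the cluster cache tracker")

	// MarkDeprecated hides the flag from -h and makes pflag print
	// "Flag --cluster-concurrency has been deprecated, <message>" whenever the
	// flag is used, without stopping the program.
	_ = fs.MarkDeprecated("cluster-concurrency",
		`This flag has no function anymore and is going to be removed in a next release. Use "--clustercachetracker-concurrency" instead.`)

	_ = fs.Parse([]string{"--cluster-concurrency", "1"})
	fmt.Println(clusterCacheTrackerConcurrency) // still 10; the old flag no longer has any effect here
}
```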

@furkatgofurov7 (Member) left a comment:

One nit/question, otherwise looks good to me, thanks

Review threads: bootstrap/kubeadm/main.go (outdated, resolved), main.go (resolved)
@chrischdi chrischdi force-pushed the pr-separate-concurrency-cc-tracker branch from 6700138 to cbf78d4 Compare August 4, 2023 07:22
@chrischdi (Member, Author) commented Aug 4, 2023

One nit/question, otherwise looks good to me, thanks

Sorry about forgetting to push my commit 🤦 ☕

@chrischdi chrischdi force-pushed the pr-separate-concurrency-cc-tracker branch from cbf78d4 to fb958c0 Compare August 4, 2023 07:24
@sbueringer (Member)

Restarted the linter

@chrischdi chrischdi force-pushed the pr-separate-concurrency-cc-tracker branch from fb958c0 to 807e0eb Compare August 4, 2023 15:43
@sbueringer (Member) commented Aug 4, 2023

/lgtm
/approve

/hold
Let's give folks a bit more time to review. I would merge on Monday as written in Slack

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 4, 2023
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 4, 2023
@k8s-ci-robot (Contributor)

LGTM label has been added.

Git tree hash: 5d527676520673ea1c4a4f289c53fd2cdfe6108d

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sbueringer

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 4, 2023
@killianmuldoon (Contributor) left a comment:

/lgtm

@killianmuldoon (Contributor)

/area clustercachetracker

@k8s-ci-robot k8s-ci-robot added the area/clustercachetracker Issues or PRs related to the clustercachetracker label Aug 4, 2023
@furkatgofurov7 (Member) left a comment:

/lgtm

@sbueringer (Member)

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 7, 2023
@k8s-ci-robot k8s-ci-robot merged commit 3c5b047 into kubernetes-sigs:main Aug 7, 2023
@k8s-ci-robot k8s-ci-robot added this to the v1.6 milestone Aug 7, 2023
@@ -134,6 +135,9 @@ func InitFlags(fs *pflag.FlagSet) {
fs.IntVar(&kubeadmControlPlaneConcurrency, "kubeadmcontrolplane-concurrency", 10,
"Number of kubeadm control planes to process simultaneously")

fs.IntVar(&clusterCacheTrackerConcurrency, "clustercachetracker-concurrency", 10,
A Member commented:

clustercachetracker is an implementation detail that's now being exposed to users, could we find a name like child-cluster-concurrency or workload-cluster-concurrency?

@sbueringer (Member) commented Aug 8, 2023:

I couldn't think of a better name that expresses what we do :). So I thought it's a good idea to use something that expresses ~ the name of the controller.

We can use something like workload-cluster-concurrency but it seems misleading to be honest. It's not actually the concurrency of how many workload clusters we can reconcile. Also all our clusters are workload clusters.

Is something like clustercache-concurrency better? It doesn't point directly to CCT but it still expresses somewhat that it's the concurrency of the cache we use for clusters.

A Member commented:

@vincepri wdyt about dropping the flag and just hard-coding to 10? 10 should be enough anyway so there is no need for a flag.

I just wanted to avoid the CCT using a lot of workers when someone needs more workers for the regular cluster reconcilers.

@@ -135,6 +136,9 @@ func initFlags(fs *pflag.FlagSet) {
fs.IntVar(&concurrency, "concurrency", 10,
"The number of docker machines to process simultaneously")

fs.IntVar(&clusterCacheTrackerConcurrency, "clustercachetracker-concurrency", 10,
A Member commented:

If this isn't set, could we actually default to concurrency? It makes some sense to keep these in sync
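A minimal pflag sketch of the fallback being suggested here (illustrative only; the merged PR instead keeps a plain separate default of 10):

```go
package main

import (
	"fmt"

	"github.com/spf13/pflag"
)

func main() {
	fs := pflag.NewFlagSet("manager", pflag.ExitOnError)

	var concurrency, clusterCacheTrackerConcurrency int
	fs.IntVar(&concurrency, "concurrency", 10,
		"The number of docker machines to process simultaneously")
	fs.IntVar(&clusterCacheTrackerConcurrency, "clustercachetracker-concurrency", 10,
		"Number of clusters to process simultaneously")

	_ = fs.Parse([]string{"--concurrency", "50"})

	// Suggested fallback: if the tracker flag was not set explicitly,
	// inherit the value of --concurrency so the two stay in sync.
	if !fs.Changed("clustercachetracker-concurrency") {
		clusterCacheTrackerConcurrency = concurrency
	}

	fmt.Println(clusterCacheTrackerConcurrency) // 50 in this example
}
```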

@sbueringer (Member) commented Aug 8, 2023:

I think usually you wouldn't use the same number of workers. Usually you need more workers for a controller when they otherwise can't keep up with the load. This mainly depends on reconcile duration.

Looking at the Reconcile func of ClusterCacheReconciler, it will always return almost instantly. So even with >10k Clusters I don't think we need more than 10 workers. (With 1 ms reconcile duration, 10 workers can reconcile 10k clusters in 1s)

In fact the only reconciler that I had to run with more than 10 workers with 2k clusters was KubeadmControlPlane (because it had reconcile durations of ~ 1-2 seconds). In that case I used 50 workers.

So while we would have to increase the concurrency of the Cluster controllers above 10 at some point, I think the ClusterCacheReconciler would still be fine with 10.
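A back-of-envelope version of that argument, treating the per-reconcile durations as rough assumptions rather than measurements:

```go
package main

import (
	"fmt"
	"time"
)

// workersNeeded estimates how many workers are required to reconcile every
// object once within the given window, assuming a fixed per-reconcile duration.
func workersNeeded(objects int, perReconcile, window time.Duration) int {
	perWorker := int(window / perReconcile) // reconciles one worker finishes within the window
	return (objects + perWorker - 1) / perWorker
}

func main() {
	// ClusterCacheReconciler returns almost instantly (~1ms assumed):
	// 10 workers get through 10k clusters in about a second.
	fmt.Println(workersNeeded(10000, time.Millisecond, time.Second)) // 10

	// A slow reconciler (~2s assumed, as reported for KubeadmControlPlane) with
	// 2k objects needs far more workers to cycle through everything within, say, a minute.
	fmt.Println(workersNeeded(2000, 2*time.Second, time.Minute)) // 67
}
```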

@chrischdi chrischdi deleted the pr-separate-concurrency-cc-tracker branch August 8, 2023 06:06