Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🌱 Improve release staging build speed #9536

Merged

Conversation

cahillsf
Copy link
Member

What this PR does / why we need it:
Last release cut (slack thread for ref) we were blocked pretty early on in the process waiting for the post-cluster-api-push-images job to complete. We cut two patch releases (1.4.6 and 1.5.1) with each taking ~28 minutes to complete.

This PR improves the speed of the release process by:

  • splitting the original make command into 3 sub-makes to enable parallelization within the cloudbuild job
  • passing -j 8 to the make command to parallelize each sub-make and -O to unify the logging output for each job

The results of my local testing have been promising, with the most recent job with the current changes completing in under 6 minutes: https://a.cl.ly/5zubWldx

Here is some historical testing data from my sandbox:

  • builds with the 3 make targets split into 3 separate targets (3 steps defined in the cloudbuild) with varying -j option vals and machine sizes: https://a.cl.ly/6quw9NJX
  • local run overview from the original config (the job ran in 26 minutes which is consistent with the results from the last patch release cut): https://a.cl.ly/4guRAqQK

changes based on the conversation and work in this reverted PR: #9392

/area release

@k8s-ci-robot k8s-ci-robot added area/release Issues or PRs related to releasing cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Oct 10, 2023
@cahillsf
Copy link
Member Author

cc @furkatgofurov7

Copy link
Contributor

@killianmuldoon killianmuldoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to get this merged as an experiment - but as with the previous version we should be ready to roll back quickly if needed.

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: killianmuldoon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 10, 2023
@@ -1058,7 +1058,9 @@ release-binary: $(RELEASE_DIR)

.PHONY: release-staging
release-staging: ## Build and push container images to the staging bucket
REGISTRY=$(STAGING_REGISTRY) $(MAKE) docker-build-all docker-push-all release-alias-tag
REGISTRY=$(STAGING_REGISTRY) $(MAKE) docker-build-all
REGISTRY=$(STAGING_REGISTRY) $(MAKE) docker-push-all
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

am I correct to understand that docker-push-all and docker-build-all won't and can't run in parallel (because push depends on build)? if so which parts are parallelized by using the -j option? Is it the docker build commands themselves?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

am I correct to understand that docker-push-all and docker-build-all won't and can't run in parallel (because push depends on build)?

yes that's true. in one iteration of the previous PR i got around this with the suggestion of mdbooth by defining "docker-build-% as a dependency of docker-push-%". but this also introduced some complexity in the push target that we decided was not worth the tradeoff

if so which parts are parallelized by using the -j option? Is it the docker build commands themselves?

yep exactly, the default is to run one job at a time, by passing -j 8 each sub-make (e.g. docker-build-all, docker-push-all) make will run 8 jobs in parallel

@CecileRobertMichon
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 11, 2023
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 929d2b0e08e5fb25b720878fe2307ad2ffc12635

@k8s-ci-robot k8s-ci-robot merged commit 11dfbdd into kubernetes-sigs:main Oct 11, 2023
14 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.6 milestone Oct 11, 2023
@cahillsf
Copy link
Member Author

looks like the first job run with the changes was a success, completed in less time than the average of recent runs but not a huge improvement.

there is still the variable queue time within Cloudbuild that i assume is leading to some variation in the build times, will be interesting to see more data in the subsequent jobs

@cahillsf
Copy link
Member Author

taking a sample of the last 15 runs (as of right now) since the change, the jobs have been completing in ~21 minutes. Looking at the 25 runs prior to that, the average was ~28 mins. decent improvement IMO, going to cherry-pick to the release branches so we have this for the next cycle

/cherry-pick release-1.4

@k8s-infra-cherrypick-robot

@cahillsf: new pull request created: #9554

In response to this:

taking a sample of the last 15 runs (as of right now) since the change, the jobs have been completing in ~21 minutes. Looking at the 25 runs prior to that, the average was ~28 mins. decent improvement IMO, going to cherry-pick to the release branches so we have this for the next cycle

/cherry-pick release-1.4

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@cahillsf
Copy link
Member Author

/cherry-pick release-1.5

@k8s-infra-cherrypick-robot

@cahillsf: new pull request created: #9555

In response to this:

/cherry-pick release-1.5

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/release Issues or PRs related to releasing cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants