Pre-heat the artifact caching proxy #4095

Merged

Conversation

MarkEWaite
Contributor

@MarkEWaite MarkEWaite commented Dec 13, 2024

Pre-heat the artifact caching proxy

Try to detect issues with repo.jenkins-ci.org or with the artifact caching proxy earlier, when a failed build is less expensive. This will not detect every issue, but it should catch some of them before the parallel stage starts.

Attempt to detect artifact caching proxy failures earlier in the build process by downloading the artifacts during the prep stage. The prep stage may not need the dependencies, but it is less expensive if the download fails early in the build process rather than failing later during plugin compatibility testing.

This is flawed because it does not account for the number of artifact caching proxy instances that may be running. It might pre-heat only one of the caches, but hopefully that is better than nothing.
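
As a rough illustration of the idea above, the pre-heat step could look like the sketch below. The function name `preheat_dependencies` and the `MVN` override variable are hypothetical and are not taken from `prep.sh`; only the `mvn -ntp dependency:go-offline` invocation comes from this pull request.

```shell
#!/bin/sh
# Hypothetical helper, not the actual prep.sh content.
# MVN is an assumed override so the Maven command can be swapped out.
preheat_dependencies() {
  # -ntp (--no-transfer-progress) keeps download progress out of the log.
  # dependency:go-offline resolves the project's dependencies and plugins,
  # so the requests pass through the artifact caching proxy early.
  if ! "${MVN:-mvn}" -ntp dependency:go-offline; then
    echo "Pre-heat failed: could not resolve dependencies" >&2
    return 1
  fi
}
```

A prep script could then call `preheat_dependencies || exit 1` so that a download failure stops the build before the expensive parallel stage.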

Discussed in 10 Dec 2024 Jenkins infrastructure meeting

Interested in comments from @dduportal and @basil in case this is a waste of effort or a flawed implementation. Does not need to be merged before the weekly release, though it should be harmless, using some extra disc space on the agent that is running prep.sh.

Testing done

Confirmed that mvn -ntp dependency:go-offline has expected output on my computer.

| Pre-heat? | prep.sh duration (mm:ss) | Source |
|---|---|---|
| No pre-heat | 16:30 | Another PR |
| No pre-heat | 18:16 | flyway-api |
| No pre-heat | 15:41 | commons-compress-api |
| No pre-heat | 15:37 | database |
| No pre-heat | 16:00 | byte-buddy |
| No pre-heat | 17:19 | Apply developer label to dependency updates |
| No pre-heat | 16:59 | ssh build agents |
| No pre-heat | 17:15 | AWS SDK v2 |
| No pre-heat | 18:45 | Folders |
| No pre-heat | 19:46 | SAML |
| With pre-heat | 23:35 | This PR build 1 |
| With pre-heat | 17:12 | This PR build 2 |
| With pre-heat | 18:10 | This PR build 3 |
| With pre-heat | 21:06 | This PR build 4 |
| With pre-heat | 17:12 | This PR build 5 |
| With pre-heat | 17:05 | This PR build 6 |
| With pre-heat | 18:44 | This PR build 7 |
| With pre-heat | 17:34 | This PR build 8 |
| With pre-heat | 17:20 | This PR build 9 |
| With pre-heat | 21:21 | This PR build 10 |
| With pre-heat | 17:34 | This PR build 11 |

Submitter checklist

  • Make sure you are opening from a topic/feature/bugfix branch (right side) and not your main branch!
  • Ensure that the pull request title represents the desired changelog entry
  • Please describe what you did
  • Link to relevant issues in GitHub or Jira
  • Link to relevant pull requests, esp. upstream and downstream changes
  • Ensure you have provided tests that demonstrate the feature works or the issue is fixed

@MarkEWaite MarkEWaite requested a review from a team as a code owner December 13, 2024 00:34
Argument was already used elsewhere as `-ntp` so it should use the
same form here.
@MarkEWaite MarkEWaite marked this pull request as draft December 13, 2024 00:49
@MarkEWaite MarkEWaite changed the title Pre-heat the artifact caching proxy [WIP] Pre-heat the artifact caching proxy Dec 13, 2024
@dduportal
Contributor

At first sight, this looks like a nice compromise to keep BOM releases from being slowed down by ACP failures (at least until we stabilize it). Additionally, it will protect BOM builds when the migration of ci.jenkins.io happens.

This is at the cost of adding ~15 to ~20 min to the prep.sh stage. The infrastructure cost added by this PR is nothing compared to the cost of retrying the 2-hour builds.

With my Infra Officer hat, I believe this is a fine addition. But I would want to hear from other BOM maintainers.

A few notes/nitpicks (not mandatory to apply, only food for thought):

  • Reminder that this PR should not exist if ACP was stable enough. Long term? We'll re-evaluate the stability after the migration to AWS (closer to Artifactory).
  • Even if ACP has 2 replicas (with 2 distinct disk caches), this PR is fine as it will be run on both PR and main branch, reducing the probability of failure.
  • Potential optimization: the "pre-heat" step could be a stage, parallel to prep (I'm not aware of any prep stage failing due to deps).
  • Another thing to try if this PR is not merged or does not give the expected results: I would be interested in the results of stashing/unstashing the .m2 cache after this dependency resolution. It would add strain on the controller, but the controller uses a local NVMe disk (both in Azure today and in AWS) and is network-close to the agents (both in Azure and AWS).

@MarkEWaite
Contributor Author

MarkEWaite commented Dec 13, 2024

> This is at the cost of adding ~15 to ~20 min to the prep.sh stage. The infrastructure cost added by this PR is nothing compared to the cost of retrying the 2-hour builds.

Thanks! As far as I can tell from comparing builds of this pull request with builds of other pull requests that only run the prep.sh step, the most I've seen it add was 7 minutes when a 16 minute run of prep.sh without pre-heat became a 23 minute run on the first pre-heat attempt. Later pre-heat runs have ranged from 17 minutes to 21 minutes. The comparison times in the table are the duration of the prep.sh step on builds of this PR and other PR jobs.

@MarkEWaite MarkEWaite marked this pull request as ready for review December 16, 2024 00:14
@MarkEWaite MarkEWaite changed the title [WIP] Pre-heat the artifact caching proxy Pre-heat the artifact caching proxy Dec 16, 2024
@MarkEWaite MarkEWaite added chore Reduces future maintenance and removed work-in-progress labels Dec 16, 2024
@MarkEWaite
Contributor Author

Merging based on 3 approvals and data that shows average build time increases by less than 2 minutes with the mvn dependency:go-offline command added to the prep stage.
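
For anyone who wants to check the "less than 2 minutes" figure, here is a small sketch that averages the prep.sh durations listed in the table in the pull request description. It assumes those durations are in mm:ss and that the table is the data the decision was based on, which this thread does not state explicitly. The helper name `average_mmss` is made up for this example.

```shell
#!/bin/sh
# Average a list of mm:ss durations and print the mean as mm:ss,
# rounded to the nearest second. Hypothetical helper for illustration.
average_mmss() {
  printf '%s\n' "$@" | awk -F: '
    { total += $1 * 60 + $2 }          # convert each duration to seconds
    END {
      mean = int(total / NR + 0.5)     # round the mean to whole seconds
      printf "%d:%02d\n", int(mean / 60), mean % 60
    }'
}

# Durations copied from the table in the pull request description.
echo "No pre-heat:   $(average_mmss 16:30 18:16 15:41 15:37 16:00 \
  17:19 16:59 17:15 18:45 19:46)"
echo "With pre-heat: $(average_mmss 23:35 17:12 18:10 21:06 17:12 \
  17:05 18:44 17:34 17:20 21:21 17:34)"
```

With the table values this prints 17:13 without pre-heat and 18:48 with pre-heat, a difference of roughly a minute and a half.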

@MarkEWaite MarkEWaite merged commit d5101b3 into jenkinsci:master Dec 17, 2024
5 checks passed
@MarkEWaite MarkEWaite deleted the pre-heat-artifact-caching-proxy branch December 17, 2024 01:23
Labels
chore Reduces future maintenance

4 participants