feat: ci: make ci more efficient #9910

Merged
merged 4 commits into from
Jan 13, 2023

Conversation

geoff-vball
Contributor

@geoff-vball geoff-vball commented Dec 19, 2022

Related Issues

Proposed Changes

  • Creates an initial build step that all commands can share so they don't all have to build themselves.
  • Uses medium+ sized containers for most things. Some jobs are kept on 2xl executors because of parallelism/RAM requirements. Medium+ uses 15 credits/min compared to 80/min for 2xl, and barely causes a slowdown in many of our tests. More info on executors here.
  • Did some general cleanup and refactoring.
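To put the per-minute rates above in perspective, here is a rough sketch of the break-even point between the two executor sizes. The 15 and 80 credits/min figures come from the description; the function name and the example job durations are hypothetical and only there for illustration.

```python
# Credit rates quoted in the PR description (credits per minute).
MEDIUM_PLUS_RATE = 15
XXL_RATE = 80  # the "2xl" executor

def job_credits(rate_per_min, minutes):
    """Credits consumed by one job at a given rate and wall-clock duration."""
    return rate_per_min * minutes

# A job may run this many times longer on medium+ before it costs
# as much as the same job on 2xl.
break_even_slowdown = XXL_RATE / MEDIUM_PLUS_RATE

# Hypothetical durations: 4 min on 2xl, 6 min on medium+.
xxl_cost = job_credits(XXL_RATE, 4)
medium_plus_cost = job_credits(MEDIUM_PLUS_RATE, 6)

print(f"break-even slowdown: {break_even_slowdown:.2f}x")
print(f"2xl: {xxl_cost} credits, medium+: {medium_plus_cost} credits")
```

With these made-up durations, the medium+ job is still cheaper even though it runs 50% longer, which is why only jobs with real parallelism or RAM needs stay on 2xl.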

Additional Info

You can look at the stats for a run of master as it is currently here.
You can look at the stats for this branch here.

Based on these results, we should see a decrease in credit usage of almost 75% on each successful run. In addition, I've decreased the timeout for tests from 30m to 20m, and rerunning failed tests will no longer have to rebuild lotus. This should get us some additional savings dealing with our flaky tests (they're next on my list to tackle though).

These changes slow a CI run only slightly, from about 9min to 11min, but they will speed up re-running flaky tests. We can probably target our longer tests with more powerful executors if we want to get this time back down, but I think it's still completely in the realm of acceptable.

There is a chance that running on the less powerful executors might increase test flakiness. We can tune this on a test-by-test basis if we want. I'm hoping to tackle some of the flakiness directly so it's not even a consideration.

Checklist

Before you mark the PR ready for review, please make sure that:

  • Commits have a clear commit message.
  • PR title is in the form of <PR type>: <area>: <change being made>
    • example: fix: mempool: Introduce a cache for valid signatures
    • PR type: fix, feat, build, chore, ci, docs, perf, refactor, revert, style, test
    • area, e.g. api, chain, state, market, mempool, multisig, networking, paych, proving, sealing, wallet, deps
  • New features have usage guidelines and / or documentation updates in
  • Tests exist for new functionality or change in behavior
  • CI is green

@geoff-vball geoff-vball force-pushed the gstuart/deps-caching branch 20 times, most recently from 6d4d43e to 9b0e21d Compare December 20, 2022 21:07
@geoff-vball geoff-vball marked this pull request as ready for review December 21, 2022 04:59
@geoff-vball geoff-vball requested a review from a team as a code owner December 21, 2022 04:59
Contributor

@magik6k magik6k left a comment

Config changes look good.

However I'm not sure if the cost savings are worth slowing each CI iteration by 2min / 20%.

I'm not sure how much each run is in $ terms, but I'd be surprised if it's over single-digit-usd. Dev time is imo worth far more than that (math can/should be done for a more solid argument here).

@magik6k
Contributor

magik6k commented Jan 4, 2023

(do note that you / others on the team will be staring at CI jobs hundreds of times in the next year, waiting to see if the test they wrote works; minutes here turn into days/weeks of dev time)

@geoff-vball
Contributor Author

@magik6k The pricing information I could find is here. The current CI uses about 20k credits per run, which puts the cost per run around 12 USD. With our average usage over the last 90 days, we've been spending about $6K/month (and most people were gone for 2 weeks).

Also note that in the happy path, this slows us down a bit, but it will actually speed up rerunning flaky tests because they won't have to build again.
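For the math requested earlier in the thread, a back-of-the-envelope sketch using only the figures quoted in this conversation might look like the following. The derived values (price per credit, runs per month, monthly savings, added wait) are rough estimates, not measurements. They assume every run costs the full 12 USD and achieves the full 75% reduction, which overstates things since the description says "almost 75%" and only for successful runs.

```python
# Figures quoted in this PR conversation.
CREDITS_PER_RUN = 20_000     # "about 20k credits per run"
USD_PER_RUN = 12.0           # "around 12 USD"
MONTHLY_SPEND_USD = 6_000.0  # "about $6K/month"
EXPECTED_REDUCTION = 0.75    # "almost 75%", treated here as an upper bound
EXTRA_MINUTES_PER_RUN = 2    # 9min -> 11min on the happy path

# Implied price of one credit, derived from the two figures above.
usd_per_credit = USD_PER_RUN / CREDITS_PER_RUN

# Assumption: every run costs the full USD_PER_RUN, reruns included.
runs_per_month = MONTHLY_SPEND_USD / USD_PER_RUN

usd_per_run_after = USD_PER_RUN * (1 - EXPECTED_REDUCTION)
monthly_savings_usd = MONTHLY_SPEND_USD * EXPECTED_REDUCTION

# Extra wall-clock wait added across all runs in a month, in hours.
extra_wait_hours = runs_per_month * EXTRA_MINUTES_PER_RUN / 60

print(f"implied price: {usd_per_credit:.4f} USD/credit")
print(f"runs per month: {runs_per_month:.0f}")
print(f"cost per run after: {usd_per_run_after:.2f} USD")
print(f"monthly savings (upper bound): {monthly_savings_usd:.0f} USD")
print(f"added wait per month: {extra_wait_hours:.1f} hours")
```

The added wait is total pipeline time, not necessarily time a developer spends blocked, and it ignores the time saved when a flaky test is rerun without rebuilding.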

@magik6k
Contributor

magik6k commented Jan 13, 2023

Oooh, the faster re-run makes this objectively better.

@magik6k magik6k merged commit bb800fd into master Jan 13, 2023
@magik6k magik6k deleted the gstuart/deps-caching branch January 13, 2023 19:08
2 participants