feat: ci: make ci more efficient #9910

geoff-vball · 2022-12-19T16:24:19Z

Related Issues

Proposed Changes

Creates an initial build step that all commands can share so they don't all have to build themselves.
Uses medium+ sized containers for most jobs. Some jobs are kept on 2xl executors because of parallelism/RAM requirements. Medium+ costs 15 credits/min compared to 80 credits/min for 2xl, and barely slows down many of our tests. More info on executors here.
Did some general cleanup and refactoring.
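
To make the executor trade-off concrete, here is a rough sketch of the credit arithmetic. It uses only the per-minute rates quoted above (15 credits/min for medium+, 80 credits/min for 2xl). The function name and the example job durations are hypothetical, chosen for illustration, and are not measurements from our CI.

```python
# Credit rates quoted in this PR description (credits per minute).
MEDIUM_PLUS_RATE = 15
XXL_RATE = 80


def job_credits(minutes: float, rate_per_min: int) -> float:
    """Credits consumed by a single job that runs for `minutes`."""
    return minutes * rate_per_min


# Hypothetical job: 10 min on 2xl, 12 min on medium+ (illustrative numbers only).
on_2xl = job_credits(10, XXL_RATE)                   # 800 credits
on_medium_plus = job_credits(12, MEDIUM_PLUS_RATE)   # 180 credits

# A job on medium+ stays cheaper until it takes this many times longer than on 2xl.
break_even_slowdown = XXL_RATE / MEDIUM_PLUS_RATE    # about 5.33

print(on_2xl, on_medium_plus, round(break_even_slowdown, 2))
```

In other words, under these rates a job would have to run more than five times longer on medium+ before it cost more credits than on 2xl.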

Additional Info

You can look at the stats for a run of master as it is currently here.
You can look at the stats for this branch here.

Based on these results, we should see a decrease in credit usage of almost 75% on each successful run. In addition, I've decreased the timeout for tests from 30m to 20m, and rerunning failed tests will no longer have to rebuild lotus. This should get us some additional savings when dealing with our flaky tests (they're next on my list to tackle though).

These changes only slow CI down from about 9min to 11min, and they will speed up re-running flaky tests. We can probably target our longer tests with more powerful executors if we want to get this time back down, but I think it's still completely in the realm of acceptable.

There is a chance that running on the less powerful executors might increase test flakiness. We can tune this on a test-by-test basis if we want. I'm hoping to tackle some of the flakiness directly so it's not even a consideration.

Checklist

Before you mark the PR ready for review, please make sure that:

Commits have a clear commit message.
PR title is in the form of <PR type>: <area>: <change being made>
- example: fix: mempool: Introduce a cache for valid signatures
- PR type: fix, feat, build, chore, ci, docs, perf, refactor, revert, style, test
- area, e.g. api, chain, state, market, mempool, multisig, networking, paych, proving, sealing, wallet, deps
New features have usage guidelines and / or documentation updates in
- Lotus Documentation
- Discussion Tutorials
Tests exist for new functionality or change in behavior
CI is green

magik6k

Config changes look good.

However I'm not sure if the cost savings are worth slowing each CI iteration by 2min / 20%.

I'm not sure how much each run is in $ terms, but I'd be surprised if it's over single-digit-usd. Dev time is imo worth far more than that (math can/should be done for a more solid argument here).

magik6k · 2023-01-04T11:14:22Z

(do note that you / others on the team will be staring at CI jobs hundreds of times in the next year, waiting to see if the test they wrote works; minutes here turn into days/weeks of dev time)

geoff-vball · 2023-01-10T18:03:54Z

@magik6k The pricing information I could find is here. The current CI uses about 20k credits per run, which puts the cost per run around 12 USD. With our average usage over the last 90 days, we've been using about $6K/month (and most people were gone for 2 weeks).

Also note that in the happy path, this slows us down a bit, but it will actually speed up rerunning flaky tests because they won't have to build again.
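
A back-of-the-envelope version of the cost-versus-dev-time math, as a sketch. All inputs are the rough figures from this thread (about 20k credits and 12 USD per run, about $6K/month, an almost 75% credit reduction, 9min to 11min per run). The per-credit price is derived from those figures rather than read off the pricing page. Two things are assumptions: that the 75% reduction applies uniformly to all monthly usage, and that a developer waits out the full extra 2 minutes of every run, which is deliberately pessimistic.

```python
# Rough figures quoted in this PR thread.
CREDITS_PER_RUN = 20_000
USD_PER_RUN = 12.0
USD_PER_MONTH = 6_000.0
CREDIT_REDUCTION = 0.75      # "almost 75%" per successful run
EXTRA_MINUTES_PER_RUN = 2    # about 9 min -> 11 min

# Derived from the figures above, not taken from the pricing page.
usd_per_credit = USD_PER_RUN / CREDITS_PER_RUN
runs_per_month = USD_PER_MONTH / USD_PER_RUN

# Assumption: the reduction applies uniformly to all monthly usage.
monthly_savings_usd = USD_PER_MONTH * CREDIT_REDUCTION

# Pessimistic assumption: someone waits out every extra minute of every run.
extra_wait_hours = runs_per_month * EXTRA_MINUTES_PER_RUN / 60

# Hourly dev cost above which the slowdown would outweigh the savings.
break_even_usd_per_hour = monthly_savings_usd / extra_wait_hours

print(f"{usd_per_credit:.4f} USD/credit, {runs_per_month:.0f} runs/month")
print(f"saves {monthly_savings_usd:.0f} USD/month for {extra_wait_hours:.1f} h of extra waiting")
print(f"break-even dev cost: {break_even_usd_per_hour:.0f} USD/h")
```

This sketch ignores re-runs of flaky tests, which no longer need to rebuild and so should tilt the balance further toward the change.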

magik6k · 2023-01-13T19:06:43Z

Oooh, the faster re-run makes this objectively better.

geoff-vball force-pushed the gstuart/deps-caching branch 20 times, most recently from 6d4d43e to 9b0e21d Compare December 20, 2022 21:07

geoff-vball marked this pull request as ready for review December 21, 2022 04:59

geoff-vball requested a review from a team as a code owner December 21, 2022 04:59

geoff-vball force-pushed the gstuart/deps-caching branch from 9b0e21d to f0aac28 Compare December 21, 2022 15:35

magik6k reviewed Jan 4, 2023

geoff-vball added 4 commits January 10, 2023 12:55

make build step

7c448af

Try medium+

f9404aa

Reorg some jobs

3bae3a3

rerun fails

fd1e0ad

geoff-vball force-pushed the gstuart/deps-caching branch from f0aac28 to fd1e0ad Compare January 10, 2023 17:55

magik6k approved these changes Jan 13, 2023

magik6k merged commit bb800fd into master Jan 13, 2023

magik6k deleted the gstuart/deps-caching branch January 13, 2023 19:08
