feat: ci: make ci more efficient #9910
6d4d43e to 9b0e21d
9b0e21d to f0aac28
Config changes look good.
However I'm not sure if the cost savings are worth slowing each CI iteration by 2min / 20%.
I'm not sure how much each run costs in $ terms, but I'd be surprised if it's over single-digit USD. Dev time is imo worth far more than that (the math can/should be done for a more solid argument here).
(Do note that you / others on the team will be staring at CI jobs hundreds of times in the next year, waiting to see if the test they wrote works; minutes here turn into days/weeks of dev time.)
f0aac28 to fd1e0ad
@magik6k The pricing information I could find is here. The current CI uses about 20k credits per run, which puts the cost per run around 12 USD. With our average usage over the last 90 days, we've been spending about $6K/month (and most people were gone for 2 weeks). Also note that in the happy path, this slows us down a bit, but it will actually speed up rerunning flaky tests because they won't have to build again.
Oooh, the faster re-run makes this objectively better.
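Since the review asked for the math, here is a minimal sketch of the arithmetic behind the figures quoted in the reply above. The constants come from that comment; the function names are illustrative, and the implied run count assumes that all monthly spend consists of full runs at the quoted per-run cost, which is a simplification (reruns and partial runs are ignored).

```python
# Figures quoted in the thread; everything derived from them is an estimate.
CREDITS_PER_RUN = 20_000   # "about 20k credits per run"
USD_PER_RUN = 12.0         # "around 12 USD"
USD_PER_MONTH = 6_000.0    # "about $6K/month"


def usd_per_credit(credits_per_run: float, usd_per_run: float) -> float:
    """Price of a single credit implied by the per-run figures."""
    return usd_per_run / credits_per_run


def implied_runs_per_month(usd_per_month: float, usd_per_run: float) -> float:
    """Number of full runs per month, ASSUMING every dollar of spend
    is a full run at the quoted per-run cost."""
    return usd_per_month / usd_per_run


rate = usd_per_credit(CREDITS_PER_RUN, USD_PER_RUN)
runs = implied_runs_per_month(USD_PER_MONTH, USD_PER_RUN)

print(f"{rate:.4f} USD per credit")
print(f"{runs:.0f} full runs per month (implied)")
```

Under these assumptions the spend corresponds to roughly 500 full runs per month, so the per-run cost sits just above the "single-digit USD" guess in the review.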
Related Issues
Proposed Changes
Additional Info
You can look at the stats for a run of master as it currently stands here.
You can look at the stats for this branch here.
Based on these results, we should see a decrease in credit usage of almost 75% on each successful run. In addition, I've decreased the timeout for tests from 30m to 20m, and rerunning failed tests will no longer have to rebuild lotus. This should get us some additional savings dealing with our flaky tests (they're next on my list to tackle though).
These changes slow CI down only from about 9min to 11min, and they will speed up re-running flaky tests. We can probably target our longer tests with more powerful executors if we want to get this time back down, but I think the slowdown is still well within the realm of acceptable.
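To weigh the credit savings against the added wait, the following rough sketch combines the numbers from this description (almost 75% fewer credits, 9min to 11min) with the roughly 12 USD per run quoted in the discussion. `ASSUMED_RUNS_PER_MONTH` is a hypothetical figure, not a measured one, and the 75% reduction is treated as applying to every run, so the savings are an upper bound. The time saved on reruns of flaky tests is not modelled.

```python
USD_PER_RUN = 12.0             # per-run cost quoted in the discussion
CREDIT_REDUCTION = 0.75        # "almost 75%", used here as an upper bound
OLD_MINUTES, NEW_MINUTES = 9, 11
ASSUMED_RUNS_PER_MONTH = 500   # hypothetical; not a measured value


def projected_cost(usd_per_run: float, reduction: float) -> float:
    """Per-run cost after cutting credit usage by `reduction` (0..1)."""
    return usd_per_run * (1 - reduction)


def extra_wait_hours(runs: int, old_min: float, new_min: float) -> float:
    """Total additional wall-clock wait across `runs` runs, in hours."""
    return runs * (new_min - old_min) / 60


new_cost = projected_cost(USD_PER_RUN, CREDIT_REDUCTION)
monthly_savings = (USD_PER_RUN - new_cost) * ASSUMED_RUNS_PER_MONTH
wait_hours = extra_wait_hours(ASSUMED_RUNS_PER_MONTH, OLD_MINUTES, NEW_MINUTES)
slowdown = (NEW_MINUTES - OLD_MINUTES) / OLD_MINUTES

# Savings per hour of extra waiting: the slowdown only costs more than it
# saves if an hour spent waiting on CI is valued above this figure.
break_even_usd_per_hour = monthly_savings / wait_hours

print(f"cost per run: {new_cost:.2f} USD")
print(f"monthly savings (upper bound): {monthly_savings:.0f} USD")
print(f"extra wait per month: {wait_hours:.1f} h ({slowdown:.0%} slower)")
print(f"break-even value of waiting time: {break_even_usd_per_hour:.0f} USD/h")
```

The break-even figure assumes every extra minute of CI time is fully lost developer time, which overstates the cost when people work on something else while a run is in progress.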
There is a chance that running on the less powerful executors might increase test flakiness. We can tune this on a test-by-test basis if we want. I'm hoping to tackle some of the flakiness directly so it's not even a consideration.
Checklist
Before you mark the PR ready for review, please make sure that:
The PR title is in the form <PR type>: <area>: <change being made>
Example: fix: mempool: Introduce a cache for valid signatures
PR type: fix, feat, build, chore, ci, docs, perf, refactor, revert, style, test
area, e.g. api, chain, state, market, mempool, multisig, networking, paych, proving, sealing, wallet, deps