
ci: print log on test failure and clean cache #262

Closed

Conversation

@gabriele-0201 (Contributor) commented Feb 27, 2024

Cleaning the cargo cache between runs is necessary to avoid using binaries from different PRs. The main thing cargo-cache does is remove git repositories that have been checked out.

`cargo install cargo-cache --no-default-features --features ci-autoclean` should only be needed in the GHA Dockerfile. Currently it is also executed in the GitHub workflow, because the runner needs to be updated manually to pick up that change. The command will be removed from the workflow in a follow-up PR once the runner is updated.
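A minimal sketch of the resulting CI steps; only the install command above is from the PR, and the comments reflect my reading of the cargo-cache README:

```bash
# Install the stripped-down cleaner once (ideally in the runner's Dockerfile).
cargo install cargo-cache --no-default-features --features ci-autoclean

# Built with the ci-autoclean feature, the binary performs the cleanup
# directly when invoked: git checkouts and unpacked registry sources are
# removed, while the compressed originals are kept.
cargo-cache
```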

@gabriele-0201 (Contributor, Author) commented:

Current dependencies on/for this PR:

This stack of pull requests is managed by Graphite.

@gabriele-0201 force-pushed the gab_ci_print_log_and_cache_clean branch 2 times, most recently from 8e8ed0e to d2ef598 on February 27, 2024 at 19:38
@rphmeier (Contributor) left a comment

Does cleaning the cargo cache not cause a complete rebuild from scratch?

Our CI times will go through the roof unless I misunderstand. Though it seems unavoidable without setting up something like sccache.

@pepyakin (Contributor) left a comment

I understand the potential benefit of this in the case of a normal GHA setup. There, each runner starts without any state and has to download and uncompress the cache; when it is done, it updates the cache, compresses it, and uploads it again. Reducing the cache size and storing only the necessary things that are slow to download or rebuild can indeed improve CI performance.

It doesn't apply to our situation: we do not rely on the GHA-provided machines, we use our own. As such, we have plenty of cache which is available immediately.

What am I missing?

@gabriele-0201 (Contributor, Author) commented:

> we have plenty of cache which is available immediately

Yes, but sharing the cache seems to cause another problem: crates like ikura-shim are not recompiled between runs. Yesterday, while running CI on #251, I encountered the shim version of #260. It seems that a shared $CARGO_HOME may be preventing cargo from recognizing and using the new binary versions, sticking to the old ones instead. I need a way to instruct cargo to recompile only the binaries in our repository while still using the cached crates.
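One way to force just the workspace binaries to rebuild while keeping the cached dependencies would be a per-package clean; a sketch, not something this PR does (the package name comes from the discussion above):

```bash
# Drop only ikura-shim's build artifacts; cached dependencies are untouched.
cargo clean -p ikura-shim
# The next build recompiles the shim from the current checkout.
cargo build -p ikura-shim
```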

Following this guide, I first tested cargo-cache, which offers a seamless way to clear the cache on CI via `cargo cache --autoclean`. This command simplifies the cleanup by removing checkouts while preserving the original files needed to reconstruct them on disk.
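Concretely, as I understand the cargo-cache docs, autoclean only removes the parts of the cache that can be re-created locally:

```bash
# Removes git checkouts and unpacked registry sources; the bare git clones
# and the compressed .crate archives are kept, so nothing needs to be
# re-downloaded, only re-extracted.
cargo cache --autoclean
```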

Initially I was uncertain about its impact:

  1. I wondered if it would trigger a full recompilation, given our dependencies on the polkadot-sdk branch
  2. I questioned whether it would eliminate all locally stored binaries (and cause them to be recompiled)

After conducting the test, I found that CI times remained consistent at 5 minutes regardless of using cargo-cache, and the correct shim version was compiled. Now that I understand better what cargo-cache does, I'm more confused than before; I really don't know why it worked.

I want to understand where the local workspace stores incremental build data and how it influences binary recompilation (deleting items from git checkouts may not provide any benefit).

@gabriele-0201 force-pushed the gab_ci_print_log_and_cache_clean branch from d2ef598 to adb8339 on February 28, 2024 at 17:39
@pepyakin (Contributor) commented:

OK, this is a bit clearer, but I am still not convinced that it's a good solution. Specifically, I don't understand the exact reason for the failure and how using cargo-cache mends it.

TBH, I have a suspicion that you are on the wrong path. My reasoning is as follows:

  1. cargo home mainly stores binaries from cargo install and rustup, plus sources from crates.io and git (see the layout sketch after this list).
  2. it's not clear how you came to the conclusion that you got the wrong binary exactly. However, I assume that you ran it using cargo run, not cargo install. Therefore, $CARGO_HOME/bin shouldn't be relevant.
  3. ikura-shim is also not a git/crates.io dependency, so cleaning the source parts of the cargo cache should not help either.
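For reference, a sketch of the typical $CARGO_HOME layout behind points 1 and 3 (per the cargo book; exact entries may vary by cargo version):

```bash
# Show how much each top-level cargo-home area holds:
#   bin/       binaries installed via `cargo install`, plus rustup shims
#   registry/  crates.io data: compressed .crate archives (cache/), the
#              index (index/), and unpacked sources (src/)
#   git/       git dependencies: bare clones (db/) and working-tree
#              checkouts (checkouts/)
du -sh "${CARGO_HOME:-$HOME/.cargo}"/{bin,registry,git}
```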

So it doesn't look like this is the root cause. Maybe removing those caches somehow invalidates the build and triggers rebuilding, or something along those lines?

> I want to understand where the local workspace stores incremental build data and how it influences binary recompilation (deleting items from git checkouts may not provide any benefit)

It shouldn't really matter whether it's the local workspace or anything else, or incremental or not: the compiled artifacts are stored under CARGO_TARGET_DIR, which is configured to be /cargo_target on the runner.
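A quick way to verify this on the runner; a sketch where the /cargo_target path is taken from the comment above and the debug profile directory is an assumption:

```bash
# All workspace build output, incremental data included, lands under the
# configured target dir, not under $CARGO_HOME.
export CARGO_TARGET_DIR=/cargo_target
cargo build -p ikura-shim
ls "$CARGO_TARGET_DIR/debug"             # binaries and deps for this build
ls "$CARGO_TARGET_DIR/debug/incremental" # incremental compilation caches
```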

@gabriele-0201 (Contributor, Author) commented:

> however, I am still not convinced that it's a good solution

I agree; initially I thought cargo-cache was doing something different.

> removing those caches somehow invalidates the build and triggers rebuilding, or something along those lines?

Yes, that's probably what happens. Yesterday it worked only by chance; on the second push, it stopped working.

I will close this PR, fix the last things for #251, and then go back to this problem
