-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: Use correct commit hash as github cache key if submodule is not checked out #107
Conversation
Rust (x.py) will check out submodules when building rustc - at least iirc. Per discussion, postponed till the weekend. |
@@ -28,7 +28,7 @@ jobs: | |||
- uses: dtolnay/rust-toolchain@nightly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for identifying the problem. Would a better solution not be to guarantee that the submodules are checked out as part of the checkout
action?
- uses: dtolnay/rust-toolchain@nightly | |
submodules: true | |
- uses: dtolnay/rust-toolchain@nightly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We talked about this a bit more in Discord yesterday and yes, currently your git submodules are out of sync with their state in git. This PR will therefore produce an out of sync cache — the initial checkout should recurse into submodules.
We decided to postpone that change (and merging this PR) until after deadlines and papers are met, as it has the potential to cause a CI failure :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We came to basically the same conclusion yesterday you also came to now. I will reproduce the gist of the messages here, to also have them in GitHub.
The checkout action seems to not necessarily purge old state when it runs. Therefore, old versions of your source tree can still linger on the runner machines. In the case of Enzyme-Rust an old version of enzyme was checked out into src/tools/enzyme
at some point (Ref: 7e3e90f4287068a41d3fb5a99127ee2857353b04
). Normally the old versions are cleaned up by the checkout action which resets the state to whatever the user/pipeline specified. Here, however, submodules are skipped in the yaml action configuration and therefore the old submodules are never updated.
Consequently, the CI pipeline currently builds with random, existing versions of submodules that do not reflect the state in git. This is quite bad IMHO.
The cache is currently consistent with the state of the runner file tree, so GitHub will restore an old cache but the name of the cache at least accurately describes its contents. With this PR, the cache will be tagged correctly according to git, but as your local state is out of sync with git, this won't really improve things.
To fix this I will try to check out all submodules first and see if that is fast enough in the CI (llvm is a bit hefty, I am not sure how nicely the checkout action optimizes this with existing local state). This is the ideal solution as git, the cache and the local file system will all agree on a single version. If it is too slow we need to be a bit more granular.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I recall seeing x build
check out the submodule automatically so I suspect we are currently building with the correct submodule, but it's tagged with a stale hash (that might even be from the parent repo -- we've seen at least one case where the llvm
and enzyme
caches use the same hash).
This will also initialize all the "doc" submodules, which might prove too much.
Previously, the CI workflow file used
git -C src/tools/enzyme
to try and execute arev-parse
inside the submodule. Unfortunately, this only works if that submodule is actually checked out and enzyme does not specify that. I am not quite sure where they are checked out at all, but clearly they are at some point.The commit hash chosen for the key #86 (comment) here is neither the new nor the old enzyme commit, so this evidently didn't quite work out.
This PR changes the lookup to refer to the submodule instead, which should work regardless of its status on disk.