-
Notifications
You must be signed in to change notification settings - Fork 895
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Race on removing setup file for self-update in GitHub Actions windows-latest worker #2415
Comments
This is a file locking error; it could be an anti virus error of some sort I suppose, though I don't know what github actions runs for that. I wonder if pwsh holds a file handle on the binary. Can you try with cmd or some other shell around the binary? Perhaps have a look at the action configuration we use - we don't see that issue in our configuration. |
OK, let me check if this issue occurs or not when using |
Add |
Yes, I've added the option and now this issue no longer occurs on my project. rhysd/wain@d6856e5 But I think this is just a workaround and it means I need to use old rustup on CI. |
GitHub Actions updates its images (including rustup) quite often. (6 weeks I think). |
But as I described in this PR, rustup version is |
I tried cmd.exe but this issue still seems to occur. |
Anyway, I'll continue to use |
I mean (not sure about the duration) 6 weeks cycle. The new rustup was released not longer than 6 weeks. |
@Darleev Rustup versions are archived at |
Hello,
The issue not only about agent machines, but it is also actual for common windows machines. So, I believe it could be an issue from Rustup side. Could you please take a look? |
I wonder if this is a reason to switch to rename-then-eventually-clean-up which we initially thought might be overkill @rbtcollins ? |
Hmm, these instructions seem to suggestion running rustup while rustup-init is still running, which we would expect to fail with the current code. But lets assume they are not suggesting that. The underlying problem is probably self_update::self_replace() which exits the parent immediately then does file manipulation that will fail if other processes are executing rustup. The thing that we need here is to able to deal with rustup being in use when we try to operate on it, which is the more complicated thing I've sketched out before: we need a way to tell those processes to exit, we need a retry loop to deal with new process execution attempts starting up while we're busy shutting things down. |
Mmm, I think I tried to prototype an approach which renamed things out of the way and then had a background "try and clean things up" which was repeated each time a rustup proxy or rustup was run, swallowing errors. i.e. instead of trying to remove/replace It switched out the "might fail to replace" with a "short period of time where there isn't a $foo.exe present" though, so we discarded it at the time. |
I've written up a holistic approach that details all the bits and should be implementable by anyone with some time and interest and a modicum of windows knowledge. #2441 - this subsumes this bug and other self update issues on windows. |
Fixes these type of errors: https://github.com/firezone/firezone/actions/runs/8973627864/job/24644251114 Using `--no-self-update` as recommended here: rust-lang/rustup#2415 (comment) Probably was a regression introduced by a version bump in the Github runner's Rustup or something
Problem
Recently, I noticed my project's CI often fails when preparing Rust toolchain when running GitHub Actions workflow on windows-latest worker.
https://github.com/rhysd/wain/runs/865787365?check_suite_focus=true
Here is an extracted log:
Steps
.github/workflows/ci.yml
and make commitci.yml
:Possible Solution(s)
I'm not sure the reason of the error. But it seems a race on removing setup file. I know that retry logic is already implemented to handle this, but retry seems not a sufficient solution. I don't have solid approach to fix this.
Notes
Output of
rustup --version
before the self-update:Output of
rustup --version
after the self-update (when success):Output of
rustup show
:Frequency of the error is not 100%. Today I ran the workflow 9 times and failed 5 times due to this error.
The text was updated successfully, but these errors were encountered: