-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Retry git fetch operations when net.git_fetch_with_cli = true
is set
#9870
Conversation
This requires reading the `stderr` output of git matching it's output with a list of regular expressions. Unfortunatelly, this is heuristic, but the best we can do. The implementation is heavily inspired by [chromium git-retry script](https://chromium.googlesource.com/infra/infra/+/b9f6e35d2aa1ce9fcb385e414866e0b1635a6c77/go/src/infra/tools/git/retry_regexp.go#17) Stackoverflow mostly references python implementation (also in chromium codebase), however comments in python code say [it's deprecated](https://chromium.googlesource.com/chromium/tools/depot_tools.git/+/a3d1aaf112499b19d1e2aafe5c33dad3266a6226/git_common.py#77) and reference Go implementation, which this PR is based on. Some rough edges in this PR: - The ProcessBuilder used to spawn `git` process provides methods with `anyhow::Error` API which isn't very convenient for local error handling and leaves some margin for bugs - According to our experience with cargo failing on git CLI errors, not all patterns are covered by `chromium` impl (see the tests for examples), so I had to extend that list - Had to add `regex` crate as a dependency, however, it already is a transitive dependency of `cargo`, it's included via `env_logger`, `globset`, and `ignore` crates, so this doesn't increase the dependency tree
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @ehuss (or someone else) soon. Please see the contribution instructions for more information. |
#[test] | ||
fn git_cli_error_is_spurious() { | ||
// This test is important because it also verifies that the regular | ||
// expressions used to match git CLI transient errors are compilable, | ||
// which is checked only at runtime. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, git diff is not smart enough here.
The actual change is that tests were moved under #[cfg(test)] mod tests
and this new test was added
Hey, @alexcrichton, I believe you are also a good candidate for the review here, since you seem to be an author of the code changed in this PR. |
// The test data is taken from real git failures | ||
assert_retryable( | ||
"fatal: unable to access 'org/repo.git': Failed to connect to github.com port 443: Timed out", | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a rustfmt
bug here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this! Could you also add some tests about errors not being retry-able?
/// The inner error must by constructed from [`ProcessError`] that | ||
/// was returned as a result of running `git` CLI operation. | ||
/// If it isn't true, the code won't panic, but [`Self::is_retryable()`] | ||
/// will always return `false`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of listing this as an invariant I think it might be better to do the downcast on construction of a GitCliError
and have the internals stored as a ProcessError
directly. This constructor could then return Result<Self>
or continue to return Self
and assert that the downcast succeeds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a bit scared to make this an assert!()
, because I added only unit tests that don't invoke the code of fetch_with_git_cli()
. I guess I should add a mocked binary implementation of git
to verify this works more rigorously. Otherwise, I am afraid that this code is invoked very randomly and catching potential panics of cargo in the asserts will be non-deterministic =(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This API was just added here so I think it's fine to have an assert here. It's true, though, that git-fetch-with-cli
is not heavily tested in Cargo's CI. If you'd like, though, then you could make the constructor argument here ProcessError
and then when this is invoked you can use error.downcast()?
to automatically propagate non-ProcessError
errors and only pass in the process error here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, thanks for the suggestions, I'll try to figure this out!
Ok(()) | ||
} | ||
|
||
/// Run the given command assuming it's spawning `git` CLI process | ||
/// retrying all the possible transient errors it may fail with. | ||
fn run_git_cli_with_retries(config: &Config, cmd: &ProcessBuilder) -> CargoResult<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we'll have much use for running this in multiple places, so it should be fine to fold this definition into the function above.
pub fn is_retryable(&self) -> bool { | ||
const GIT_TRANSIENT_ERRORS: &[&str] = &[ | ||
// Error patterns are taken from chromium. | ||
// Please, keep them 1-to-1 exactly the same as in their source code |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that we need to be so strict here. For example I don't think that we'll be looking into their source code every-so-often to see if these lists match, and it's not like someone should send a PR to chromium before updating the list here.
.as_deref() | ||
.map(|stderr| GIT_TRANSIENT_ERRORS_RE.is_match(&String::from_utf8_lossy(stderr))) | ||
.unwrap_or_else(|| { | ||
log::warn!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is something we'll want to assert!
in the constructor of this error because otherwise no one will ever see this message.
|
||
let source = &format!("(?i){}", source); | ||
|
||
regex::Regex::new(source).expect("BUG: invalid git transient error regex") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this may be a use-case for RegexSet
?
Just a short note: I am not gone, I am going to return back to this a bit later (have a lot of stuff in my current project going on), if anyone wants to take over, feel free to do so, but I'll try to finish this anyway |
☔ The latest upstream changes (presumably #10265) made this pull request unmergeable. Please resolve the merge conflicts. |
Closing due to inactivity. We probably don't have capacity to accept a change of this kind at this time. |
Closes #9820
This requires reading the
stderr
output of git matching it's output with a list of regular expressions.Unfortunately, this is heuristic, but the best we can do. The implementation is heavily inspired by chromium git-retry script
Stackoverflow mostly references python implementation (also in chromium codebase), however, comments in python code say it's deprecated and reference Go implementation, which this PR is based on.
Some rough edges in this PR:
git
process provides methods withanyhow::Error
API which isn't very convenient for local error handling and leaves some margin for bugs. It usesProcessError
under the hood, having access to it would be handy, but I didn't want to break the error type tendency in the API of this structchromium
impl (see the tests for examples), so I had to extend that listregex
crate as a dependency, however, it already is a transitive dependency ofcargo
, it's included viaenv_logger
,globset
, andignore
crates, so this doesn't increase the dependency tree.