-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Network retry issue 1602 #2396
Network retry issue 1602 #2396
Conversation
Thanks for the PR @sbeckeriv! Some thoughts of mine:
|
Thanks for the reply. I will move the net_retry logic in to the utils config object. For the logic it self I believe I know how to do this. I am going to write a function that takes a closure and handles the logic. I will figure out a white list enum too.
I think with the closure concept I could push the logic down the stack, however, that would not blanket all network calls. Do you have some thoughts on how I might go about that? I think most network calls are else where crates. I could update curl-rust to support a retry option pretty easy. Make it an http setting like proxy and timeout. I read that libcurl does not directly support it. I need to read more around the ssh side. Let me know your thoughts. |
Nah I can help audit after the infrastructure in place, and then I can also help to ensure that we always push network requests through the same location. I'd also recommend that even if this were an option in curl-rust that we'd still do it in Cargo as we'd likely want to give a UI update as part of it as well anyway. |
Before I get to far thoughts on these changes: Wrapped retry logic: Thanks! It was fun to figure out the with_retry. |
Yeah, that looks great to me! I think we may want to have another callback for "was this error one we can retry from", but other than that it looks like the right infrastructure to me |
Hey, For the "was this error one we can retry from"; error handling is new to me. Should I pass it the CargoResult value or the unwrapped Err? What might that callback look like? How do I check its a Err I care about vs any other error? Thanks! |
@@ -134,7 +140,7 @@ impl<'cfg> PackageRegistry<'cfg> { | |||
Some(&(ref previous, _)) => { | |||
if previous.precise() == namespace.precise() { | |||
debug!("load/match {}", namespace); | |||
return Ok(()) | |||
return Ok(()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps you have rustfmt on auto-run? Could you back out these changes for now? They tend to clutter up the diff unfortunately :(
I think that it may be good to introduce a new trait: trait NetworkError {
pub fn may_be_spurious(&self) -> bool;
} That way you can define the retry function with a |
☔ The latest upstream changes (presumably #2397) made this pull request unmergeable. Please resolve the merge conflicts. |
I pushed down the network call in to the checkout command. I added the NetworkError trait. I am unsure how to apply the bounds in the retry method. I added it as a trait bound to T but the T is the return Ok value for the cargo result not the error. Any advice or examples of how to do this would be great! Thanks again |
/// Example: | ||
/// use util::network; | ||
/// cargo_result = network.with_retry(&config, || something.download()); | ||
pub fn with_retry<T, F>(config: &Config, mut callback: F) -> CargoResult<T> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the signature you want for this method is something along the lines of:
pub fn with_retry<T, E, F>(config: &Config, mut callback: F) -> CargoResult<T>
where F: FnMut() -> Result<T, E>,
E: NetworkError + CargoError
This would then match
on the result of callback
and call maybe_spurious
to see whether it is indeed a spurious error or not.
@sbeckeriv yeah the point of the bounds was to highlight the fact that |
The bounds make perfect sense now. I pushed the config down to the bottom fetch call. I am still getting "mismatched type" errors on the result I am returning. The generics at this level are beyond me. The error text https://gist.github.com/sbeckeriv/3866bd2ab7241da32b7c I have tried many combinations. I thought the error came from how I initialized the result value but I removed that. The normal human err is does not impl network error. I then thought it was because the shell command internally could cause an error but I removed the try with no luck. Do i need to rebuild the result or implement a from? Thanks again, |
No worries! Here's how I might write this function: pub fn with_retry<T, E, F>(config: &Config, mut callback: F) -> CargoResult<T>
where F: FnMut() -> Result<T, E>,
E: NetworkError
{
let mut remaining = try!(config.net_retry());
loop {
match callback() {
Ok(ret) => return Ok(ret),
Err(ref e) if e.maybe_spurious() && remaining > 0 => {
// print to shell
remaining -= 1;
}
Err(e) => return Err(Box::new(e)),
}
}
} Perhaps that can help you out? |
That does! When I first started I decomposed but I went the other way for reasons I do not remember. I need to update my test but it compiles now. Thanks! |
I have the retry at the lowest level fetch. I think this covers all the git actions. I am looking in to the interactions with the registry. |
As I figure out the code I am not sure I want to extern cargo in to crates-io. I am guessing there are separated in to there own project for a reason. Now I would require Config and network to cross back in to creates. I could just copy the concept but there is no config system for cargo-io helpers. Maybe this is figured out in another pull request? Let me know your thoughts. Thanks again! |
} | ||
|
||
impl NetworkError for git2::Error { | ||
fn maybe_spurious(&self) -> bool { true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For git errors this may want to at least check the class
of the error for Net
, and perhaps look at the code
as well to ensure it's not a known non-spurious error.
Looking good! I think though that this may not catch where we download from the registry (using curl), just git so far. API calls with crates.io also look like they may need instrumentation as well, and you can probably just implement |
I updated the error checking. For git2::ErrorCode its not clear which errors relate to network activity. Maybe EOF for a disconnect? In my test I found that not connecting to a host gave an git2::ErrorClass::Os error. I am checking for both Os and Net now. Maybe my test is invalid? For curl errors I left a comment to what I believe the number maps to based off of this website https://curl.haxx.se/libcurl/c/libcurl-errors.html Thanks again for all the help. |
.file("src/lib.rs", ""); | ||
.file("src/lib.rs", "").file(".cargo/config", r#" | ||
[net] | ||
retry=0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory these modifications to tests shouldn't be necessary. All of these are indicative of errors that can never be resolved, so they shouldn't print out warnings anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. These are leftover from before the trait. I should be able to remove these.
@@ -1,3 +1,4 @@ | |||
extern crate curl_sys; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be added to the crate root?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved it to src/cargo/lib.rs I think thats the "root". Let me know if that is wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah that's it!
Looking good @sbeckeriv, thanks! |
Err(ref e) if e.maybe_spurious() && remaining > 0 => { | ||
try!(config.shell() | ||
.warn(format!("{} {} tries remaining", e, | ||
remaining))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps this could be along the lines of:
let msg = format!("spurious network error ({} tries remaining): {}", remaining, e);
try!(config.shell().warn(msg));
Awesome! Just a minor tweak to some wording, and then perhaps some squashing? Other than that r=me |
4fc800d
to
b3de4c2
Compare
@alexcrichton I made it one commit. Thanks for working with me and all the help! I learned a lot. Becker |
Ok(ret) => return Ok(ret), | ||
Err(ref e) if e.maybe_spurious() && remaining > 0 => { | ||
let msg = format!("spurious network error ({} tries \ | ||
remaining): {}", remaining, e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the remaining count here is actually remaining - 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The retry config is 2 by default.
count is 2
I run once and fail i output
2 tries remains
subtract 1 count is 1
run a second time
1 tries remaining
subtract 1 count is 0
retry again
just error.
The retry count is the count of retries not the count of total tries. I can subtract one from the config or update the docs to be explicitly clear it is the count of retries and not the total network tries?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right yeah that makes sense, nevermind!
Looks like the Windows CI is failing for non-spurious reasons? |
b3de4c2
to
2f04d00
Compare
@bors: r+ 2f04d00ae2b5ddce01b7f5890e5311e287ee198f |
⌛ Testing commit 2f04d00 with merge e2648ce... |
💔 Test failed - cargo-win-msvc-64 |
Sorry I dont have access to a windows box right now. I might tonight. Does windows not load the .cargo/config file or does it output extra information? |
It should load that, yeah, but I suspect the problem here is that the exact error message is slightly different. It should be possible to just use |
Dearest Reviewer, This branch resolves rust-lang#1602 which relates to retrying network issues automatically. There is a new utility helper for retrying any network call. There is a new config called net.retry value in the .cargo.config file. The default value is 2. The documentation has also been updated to reflect the new value. Thanks Becker
2f04d00
to
7b03532
Compare
Works locally. Lets see! |
Network retry issue 1602 Dearest Reviewer, This branch resolves #1602 which relates to retrying network issues automatically. There is a new utility helper for retrying any network call. There is a new config called net.retry value in the .cargo.config file. The default value is 2. The documentation has also been updated to reflect the new value. Thanks Becker
☀️ Test successful - cargo-cross-linux, cargo-linux-32, cargo-linux-64, cargo-mac-32, cargo-mac-64, cargo-win-gnu-32, cargo-win-gnu-64, cargo-win-msvc-32, cargo-win-msvc-64 |
Dearest Reviewer,
This branch resolves #1602 which relates to retrying network
issues automatically. There is a new utility helper for retrying
any network call.
There is a new config called net.retry value in the .cargo.config
file. The default value is 2. The documentation has also been
updated to reflect the new value.
Thanks
Becker