-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prepare for not defaulting to master branch for git deps #8522
Conversation
r? @Eh2406 (rust_highfive has picked a reviewer for you, use r? to override) |
r? @ehuss |
a1213b2
to
c18e51f
Compare
src/cargo/core/resolver/encode.rs
Outdated
@@ -578,6 +596,10 @@ impl<'a> ser::Serialize for Resolve { | |||
root: None, | |||
metadata, | |||
patch, | |||
version: match self.version() { | |||
ResolveVersion::V3 => Some(1), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't it rather start at 3? Would be less confusing IMO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only for someone reading Cargo's source code, but for everyone else seeing changes to their Cargo.lock
for the first time it's probably much less surprising to start at 1 instead of 3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The v2 format has changes to Cargo.lock
as well, in fact much larger ones. People wouldn't be seeing it for the first time. But ultimately this question is bikeshedding so I won't continue discussing it. Your choice. shrug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are bikeshedding it, I'd go for 3. I'm imagining a docs or blog post explaining the history of the lockfile format, for example the one explaining the corner cases where this brakes things.
"So now we get to the third change to the format. It is called
version=1
...
back in the first version (the one without a version specified not the one called "1")...
in the new third format (the one called "1" not the original.)"
We are making a complicated toppik harder to talk about. On the other hand if we start with 3, that blog post will have to explain what happened to 1 and 2. That feels like a strateforered one sentence explanation.
", which predates the version field."
Anyway defintly a bikeshed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's absolutely a bikeshed, but I think on balance it'll make Cargo a little more impenetrable for new contributors if the version numbers don't match. I think I'd rather see an explanation once for why we're starting with 3 (and what happened to 2), rather than having a mismatch for all time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm getting back into this sub-discussion for this one comment only. The format has changed prior to the "lockfiles 2.0" change. Some examples:
- 5430db6 has put checksums into Cargo.lock... yes there used to be a time when there were no checksums :p.
- Remove root from cargo lock #4571 has removed
[root]
- Marking Cargo.lock as generated #6180 has added a
@generated
comment at the top - Optimize lock file format for git merge conflicts #7070 added the new format commonly referred to as lockfiles v2
Not guaranteeing for completeness of this list. I'm not sure what the "perfect" version number should be. But if #7070 says that it introduced lockfiles v2, even if that was technically wrong, it's better to continue with 3 than jumping over version numbers or starting with 1 again.
I've also wondered about semver lockfile versions where the major version indicates compatibility breakage (like the one which this PR introduces) and the minor version indicates small changes that don't break older cargo's but cargo may not generate the lockfile 100% the same way, e.g. #6180 would be one such change. If cargo supports writing lockfiles with the major version but the minor version is unsupported, it should do a more thorough check than a strcmp and only update if needed (to prevent needless back and forth, and only have back and forth if there is a different reason to update the lockfile). Basically I propose getting control flow into the body of this if check if the minor version is unsupported but the major is. The reason why I haven't proposed this earlier was because I doubt that there could be minor version increases like the generated comment addition in the future. Maybe my imagination is limited, idk. The version can be made semver in the future as well, it'd require breaking support for old cargo, aka a major number increase, and mess up the nice error messages for old cargo (turn "this version isn't supported" into "expected number, got string"). It's really a bikeshed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just gone ahead and flagged this as version = 3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @alexcrichton for taking the time of writing this great PR! It's really great work! It looks like it's resolving the concerns from #8468 while still preparing for a future with support for non-master default branches.
From what I could gather, currently it doesn't support the new lockfile format (yet) in the sense that it fetches HEAD, right? If a cargo of this PR encounters a lockfile that a newer cargo built with a commit not on the master branch but in the main branch it'll still recognize that though, and only "break" on a cargo update, am I correct with this?
if head != master { | ||
config.shell().warn(&format!( | ||
"\ | ||
fetching `master` branch from `{}` but the `HEAD` \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This warning will fire in the case when someone has switched to the main branch since and the master branch is outdated, right? In that case, the suggestion might be wrong, maybe what the user wants is branch = "main"
. Cargo can't tell the new default branch but hinting at the possibility in the warning would be great (technically, cargo could list all branches and filter the ones which match with HEAD but that would require one would have to fetch basically all branches which is probably not something you'd want to do :p).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes the warning will fire in that case, but Cargo is not using HEAD
and is instead using the dependency as-if branch = "master"
was specified. The suggestion here is to write down in Cargo.toml
how to perform the same build in the future. It's true though that users may prefer to switch to branch = "something-else"
if their repository has both a master
and default other branch.
In general the warnings in this PR aren't great. I've attempted to at least get the bare minimum warnings emitted but honestly it's pretty difficult to get much fancier than that. This behavior of unifying dependencies and fetching is very deep within Cargo and is (at least from what I was doing) difficult to connect to "go change this line in the manifest". Much of this is that Cargo isn't preserving that kind of information to assist in debugging since this has come up rarely as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've thought about "or alternatively place branch = \"<main>\"
if you want to depend on the non-master main branch of a repository." or something like that, but it's very wordy and the warning is probably already a lot of text to sift through.
The resolver change to either unify or not unify |
Oh right there is the issue of projects having such hybrid uses as well, didn't think about it. How does that relate to version 3 lockfile formats?
What do you mean by that? Version 4 lockfile format? If cargo as of this pr unifies |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for putting in all this work!
src/cargo/sources/git/utils.rs
Outdated
//! longer compatible on stable and nightly with each other. The underlying | ||
//! issue is that Cargo was serializing `branch = "master"` *differently* on | ||
//! nightly than it was on stable. Historical implementations of Cargo | ||
//! desugared `branch = "master"` to having not dependency directives in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm having trouble reading this sentence. Can the wording bet tweaked somehow?
IIUC, it desugared branch = "master"
to not having a branch in the source
definition? I'm not sure what "dependency directives" means in this context.
My plan with Cargo is to eventually land a change which differentiates I don't think there's any need for a v4 format for this change, I plan to stop here. @ehuss updated some words! |
This commit is intended to be an effective but not literal revert of rust-lang#8364. Internally Cargo will still distinguish between `DefaultBranch` and `Branch("master")` when reading `Cargo.toml` files, but for almost all purposes the two are equivalent. This will namely fix the issue we have with lock file encodings where both are encoded with no `branch` (and without a branch it's parsed from a lock file as `DefaultBranch`). This will preserve the change that `cargo vendor` will not print out `branch = "master"` annotations but that desugars to match the lock file on the other end, so it should continue to work. Tests have been added in this commit for the regressions found on rust-lang#8468.
This commit implements a warning in Cargo for when a dependency directive is using `DefaultBranch` but the default branch of the remote repository is not actually `master`. We will eventually break this dependency directive and the warning indicates the fix, which is to write down `branch = "master"`.
This commit lays the groundwork for an eventual V3 of the lock file format. The changes in this format are: * A `version` indicator will be at the top of the file so we don't have to guess what format the lock is in, we know for sure. Additionally Cargo now reading a super-from-the-future lock file format will give a better error. * Git dependencies with `Branch("master")` will be encoded with `?branch=master` instead of with nothing. The motivation for this change is to eventually switch Cargo's interpretation of default git branches.
This commit implements a simple warning to indicate when `DefaultBranch` is unified with `Branch("master")`, meaning `Cargo.toml` inconsistently lists `branch = "master"` and not. The intention here is to ensure that all projects uniformly use one or the other to ensure we can smoothly transition to the new lock file format.
40fb733
to
7f65cae
Compare
I've gone over this several times, and I can't think of any issues that haven't already been highlighted. Thanks again Alex! |
📌 Commit 7f65cae has been approved by |
☀️ Test successful - checks-actions |
Thank you both! |
Update cargo 14 commits in aa6872140ab0fa10f641ab0b981d5330d419e927..974eb438da8ced6e3becda2bbf63d9b643eacdeb 2020-07-23 13:46:27 +0000 to 2020-07-29 16:15:05 +0000 - Fix O0 build scripts by default without `[profile.release]` (rust-lang/cargo#8560) - Emphasize git dependency version locking behavior. (rust-lang/cargo#8561) - Update lock file encodings on changes (rust-lang/cargo#8554) - Fix sporadic lto test failures. (rust-lang/cargo#8559) - build-std: Fix libraries paths following upstream (rust-lang/cargo#8558) - Flag git http errors as maybe spurious (rust-lang/cargo#8553) - Display builtin aliases with `cargo --list` (rust-lang/cargo#8542) - Check manifest for requiring nonexistent features (rust-lang/cargo#7950) - Clarify test name filter usage (rust-lang/cargo#8552) - Revert Cargo Book changes for default edition (rust-lang/cargo#8551) - Prepare for not defaulting to master branch for git deps (rust-lang/cargo#8522) - Include `+` for crates.io feature requirements in the Cargo Book section on features (rust-lang/cargo#8547) - Update termcolor and fwdansi versions (rust-lang/cargo#8540) - Cargo book nitpick in Manifest section (rust-lang/cargo#8543)
Normalize SourceID in `cargo metadata`. The SourceID in `cargo metadata` can have different values, but they can be equivalent in Cargo. This results in different serialized forms, which prevents comparing the ID strings. In this particular case, `SourceKind::Git(GitReference::Branch("master"))` is equivalent to `SourceKind::Git(GitReference::DefaultBranch)`, but they serialize differently. The reason these end up differently is because the `SourceId` for a `Package` is created from the `Dependency` declaration. But the `SourceId` in `Cargo.lock` comes from the deserialized file. If you have an explicit `branch = "master"` in the dependency, then versions prior to 1.47 would *not* include `?branch=master` in `Cargo.lock`. However, since 1.47, internally Cargo will use `GitReference::Branch("master")`. Conversely, if you have a new `Cargo.lock` (with `?branch=master`), and then *remove* the explicit `branch="master"` from `Cargo.toml`, you'll end up with another mismatch in `cargo metadata`. The solution here is to use the variant from the `Package` when serializing the resolver in `cargo metadata`. I chose this since the `Package` variant is displayed on other JSON messages (like artifact messages), and I think this is the only place that the resolver variants are exposed (other than `Cargo.lock` itself). I'm not convinced that this was entirely intended, since there is [code to avoid this](https://github.com/rust-lang/cargo/blob/6a38927551df9bbe2dea340bf92d3e53abccf890/src/cargo/core/resolver/encode.rs#L688-L695), and at the time #8522 landed, I did not realize this would change the V2 lock format. However, it's probably too late to try to reverse that, and I don't think there are any other problems other than this `cargo metadata` inconsistency. Fixes #8756
This commit follows through with work started in rust-lang#8522 to change the default behavior of `git` dependencies where if not branch/tag/etc is listed then `HEAD` is used instead of the `master` branch. This involves also changing the default lock file format, now including a `version` marker at the top of the file notably as well as changing the encoding of `branch=master` directives in `Cargo.toml`. If we did all our work correctly then this will be a seamless change. First released on stable in 1.47.0 (2020-10-08) Cargo has been emitting warnings about situations which may break in the future. This means that if you don't specify `branch = 'master'` but your HEAD branch isn't `master`, you've been getting a warning. Similarly if your dependency graph used both `branch = 'master'` as well as specifying nothing, you were receiving warnings as well. These two situations are broken by this commit, but it's hoped that by giving enough times with warnings we don't actually break anyone in practice.
Change git dependencies to use `HEAD` by default This commit follows through with work started in #8522 to change the default behavior of `git` dependencies where if not branch/tag/etc is listed then `HEAD` is used instead of the `master` branch. This involves also changing the default lock file format, now including a `version` marker at the top of the file notably as well as changing the encoding of `branch=master` directives in `Cargo.toml`. If we did all our work correctly then this will be a seamless change. First released on stable in 1.47.0 (2020-10-08) Cargo has been emitting warnings about situations which may break in the future. This means that if you don't specify `branch = 'master'` but your HEAD branch isn't `master`, you've been getting a warning. Similarly if your dependency graph used both `branch = 'master'` as well as specifying nothing, you were receiving warnings as well. These two situations are broken by this commit, but it's hoped that by giving enough times with warnings we don't actually break anyone in practice.
This PR is the "equivalent" of #8503 for the nightly channel of Rust (the master branch of Cargo). The purpose of this change is to fix the breakage from #8364 but to still pave a path forward to enabling the feature added in #8364, namely reinterpreting dependency directives without a
branch
directive as depending onHEAD
insead of depending onmaster
.This is a series of commits which implements a few mitigation strategies, but they're all adding up into a larger plan to roll out this change with, ideally, no breaking changes for those "reasonably" keeping up with Rust. The changes here are:
DefaultBranch
andBranch("master")
, so it knows what you wrote in the manifest. These two, however, hash and equate together so everything in Cargo considers them equivalent.DefaultBranch
and the branch'smaster
branch doesn't actually matchHEAD
. This avenue of breakage didn't arise but I added it here for completionism about not having to deal with breakage from this change again.version
marker at the top of the file as well as encoding git dependencies different. TodayDefaultBranch
andBranch("master")
encode the same way, but in the future they will encode differently.DefaultBranch
andBranch("master")
in your build. The intention here is to get everyone on one or the other so the transition to the new lock file format can go smoothly.With all of these pieces in place the intention is that there is no breakage from #8364 as well. Additionally projects that would break will receive warnings. (note that projects "broken" today, those we've got issues for, will likely not get warnings, but that's because it was accidental we broke them). The long-term plan is to let this bake for quite some time, and then as part of the same change land an update to the new lock file format as well as "fixing"
SourceId
to consider these two directives different. When this change happens we should have guaranteed, for all currently active projects, they're either only usingbranch = "master"
or no directive at all. We can therefore unambiguosly read the previousCargo.lock
to transition it into a newCargo.lock
which will either have?branch=master
or nothing.I'm not sure how long this will need to bake for because unfortunately version changes to
Cargo.lock
cannot be taken lightly. We haven't even bumped the default format for old lock files toV2
yet, they're still onV1
. Hopefully in getting this in early-ish, though, we can at least get the ball rolling.Closes #8468