-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: deduplicate dependencies by artifact target #11478
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @ehuss (or someone else) soon. Please see the contribution instructions for more information. |
6308495
to
1a01ae0
Compare
FYI: @Byron |
Byron probably is working on gitoxide integration. I will take a look then :) r? @weihanglo BTW, could we have some tests for this patch? |
@weihanglo I was tagging him since he wrote this code originally. |
@weihanglo I'd love to do that, but I was having troubles coming up with a minimal reproduction of this issue, it works (as in, upstream fails) with Do you agree with the approach overall? |
Update: I took my time to untangle the |
1a01ae0
to
e151080
Compare
I managed to reproduce the error in rvolosatovs@440517b, which the fix is conveniently based on top of, the test fails on that commit and is fixed by the child commit. The way I understand this, it's a good old "dependency hell" compilation issue where |
Sorry I haven't yet got time to do a full review on this. Just note that I think the bug isn't related to workspace, so maybe use path dependencies to make the gist of the test case clearer? |
☔ The latest upstream changes (presumably #11434) made this pull request unmergeable. Please resolve the merge conflicts. |
e151080
to
d0b97ae
Compare
@weihanglo I've rebased on top of latest You should be able to confirm both test case failures on |
I have a branch at https://github.com/bstrie/cargo/tree/fix/bindeps-target that adds regression tests for both #10525 and #10837. I can confirm that these tests both fail on the current head, and both succeed on top of this PR. @rvolosatovs , can you pull this branch and add it to this PR? It is already rebased on top of the current revision. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, I only have minor style and documentation comments. I'm not an authority on Cargo's codebase, but this looks like a low-risk addition that is unlikely to affect anyone that isn't using bindeps. @weihanglo , do you agree that this addresses what you described in your comment at #10837 (comment) ? This patch is mercifully only 60 logical lines excluding the tests, so I hope that it is not too much burden on you to review. I have confirmed that the test cases satisfactorily cover #10837, and once my own branch has been merged it will cover #10525 as well.
@weihanglo I had a style question about test cases, in my branch where I add the regression test for #10525 I copied the code verbatim from the issue, using the mock crates I listed there, but would you prefer that I "normalize" them to all the same dummy names used in the other test cases? |
d0b97ae
to
75f47f6
Compare
I've integrated @bstrie's test case for #10525 in d9959b7 @weihanglo PTAL |
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks reasonable to me. Thank you for posting this fix!
tests/testsuite/artifact_dep.rs
Outdated
} | ||
|
||
#[cargo_test] | ||
fn issue_10525() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a descriptive title.
And other tests should be decouple_from
instead of deduplication
, no?
fn issue_10525() { | |
fn same_target_transitive_dependency_decouple_from_hostdep() { | |
// See https://github.com/rust-lang/cargo/issues/10525 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I simplified this test case quite a bit and it turned out that it has nothing to do with proc macros and seems to be a duplicate of #11463 #10837
I left the test in regardless, since it has a slightly different structure.
Note, that I uncovered a pretty bizarre inconsistency - if e
crate is renamed to a
, the issue does not occur in upstream. How is naming related here is beyond me, but it does seem that there are more things broken the resolver
cc @bstrie
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part of my rationale in leaving in the proc macros was to exercise different codepaths as compared to the other test. In addition, while this test case is greatly simplified compared to the original reproducer, care was taken to not simplify it too much, in order to produce the same error message as the original maximized bug. I don't mind if it gets simplified further, although this runs the risk of reducing the value of this test as a regression test. I leave the judgment here up to @weihanglo as to whether or not this test is valuable to have, and whether or not it should exercise the proc macro machinery.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I second @bstrie's opinion of going through the code path of proc macro feature resolution. In the adapted version of test, if you change target = "…"
to a target other than host, it compiles. However, in the original one, since it uses hostdeps, it fails to compile for any kind of target specified. This is written down in the PR description of #10525:
Any other value in the
target
field (e.g.x86_64-unknown-linux-musl
) produces the same error, so it is the presence of this key that is essential.
src/cargo/ops/cargo_compile/mod.rs
Outdated
/// building the unit graph, and then run this second pass which will try to | ||
/// combine shared dependencies safely. By adding a hash of the dependencies | ||
/// to the `Unit`, this allows the `CompileKind` to be changed back to `Host` | ||
/// without fear of an unwanted collision. | ||
/// without fear of an unwanted collision for build-dependencies. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand change here correctly, it strives to share artifact deps with other same deps for the same target, right? I'd recommend this having its own commit. And maybe need a test to verify this behaviour?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
During the construction of unit_graph
, artifact_target_for_features
was set to decouple the dependencies, in this step, duplicates are merged. I am not sure a separate commit makes sense here, since this is only relevant for artifact dependencies with artifact_target_for_features
set, so I could have a single commit, which breaks artifact deps feature, since cargo would fail to compile such packages with errors about duplicate paths and a follow-up fixing the issue, but I don't see why would that be preferrable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed it is not required to split into two commits.
…I could have a single commit, which breaks artifact deps feature, since cargo would fail to compile such packages with errors about duplicate paths…
Could you point out which test would fail if Cargo doesn't rebuild this? I had a hard time finding that 😞.
The other change request from me was a test to verify that the decoupled artifact deps are eventually shard after rebuild. That could prevent any regression I believe. I could construct a test later if you want one from me :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It indeed looks like this case is not covered by existing test suite, but the resulting compiler fails on real-word codebase with a wall of errors like these without the dependency sharing:
warning: output filename collision.
The lib target `webpki` in package `webpki v0.22.0` has the same output filename as the lib target `webpki` in package `webpki v0.22.0`.
Colliding filename is: /home/rvolosatovs/src/github.com/enarx/enarx/target/x86_64-unknown-linux-musl/debug/deps/libwebpki-19484ee9a1315a74.rmeta
The targets should have unique names.
Consider changing their names to be unique or compiling them separately.
This may become a hard error in the future; see <https://github.com/rust-lang/cargo/issues/6313>.
warning: output filename collision.
The lib target `webpki-roots` in package `webpki-roots v0.22.4` has the same output filename as the lib target `webpki-roots` in package `webpki-roots v0.22.4`.
Colliding filename is: /home/rvolosatovs/src/github.com/enarx/enarx/target/x86_64-unknown-linux-musl/debug/deps/libwebpki_roots-f41f1a48aec910b8.rlib
The targets should have unique names.
Consider changing their names to be unique or compiling them separately.
This may become a hard error in the future; see <https://github.com/rust-lang/cargo/issues/6313>.
warning: output filename collision.
The lib target `webpki-roots` in package `webpki-roots v0.22.4` has the same output filename as the lib target `webpki-roots` in package `webpki-roots v0.22.4`.
Colliding filename is: /home/rvolosatovs/src/github.com/enarx/enarx/target/x86_64-unknown-linux-musl/debug/deps/libwebpki_roots-f41f1a48aec910b8.rmeta
The targets should have unique names.
Consider changing their names to be unique or compiling them separately.
This may become a hard error in the future; see <https://github.com/rust-lang/cargo/issues/6313>.
warning: output filename collision.
The lib target `witx` in package `witx v0.9.1 (https://github.com/bytecodealliance/wasmtime?rev=6f50ddaaf2ab8205b6e850361dd2cc662819f431#6f50ddaa)` has the same output filename as the lib target `witx` in package `witx v0.9.1 (https://github.com/bytecodealliance/wasmtime?rev=6f50ddaaf2ab8205b6e850361dd2cc662819f431#6f50ddaa)`.
Colliding filename is: /home/rvolosatovs/src/github.com/enarx/enarx/target/debug/deps/libwitx-158c96bbbf8bb0ec.rlib
The targets should have unique names.
Consider changing their names to be unique or compiling them separately.
This may become a hard error in the future; see <https://github.com/rust-lang/cargo/issues/6313>.
warning: output filename collision.
The lib target `witx` in package `witx v0.9.1 (https://github.com/bytecodealliance/wasmtime?rev=6f50ddaaf2ab8205b6e850361dd2cc662819f431#6f50ddaa)` has the same output filename as the lib target `witx` in package `witx v0.9.1 (https://github.com/bytecodealliance/wasmtime?rev=6f50ddaaf2ab8205b6e850361dd2cc662819f431#6f50ddaa)`.
Colliding filename is: /home/rvolosatovs/src/github.com/enarx/enarx/target/debug/deps/libwitx-158c96bbbf8bb0ec.rmeta
The targets should have unique names.
Consider changing their names to be unique or compiling them separately.
This may become a hard error in the future; see <https://github.com/rust-lang/cargo/issues/6313>.
thread 'main' panicked at 'build script output collision for proc-macro2 v1.0.43/c9c8981de1417d5d
old=BuildOutput { library_paths: [], library_links: [], linker_args: [], cfgs: ["use_proc_macro", "wrap_proc_macro", "proc_macro_span"], check_cfgs: [], env: [], metadata: [], rerun_if_changed: ["build.rs"], rerun_if_env_changed: [], warnings: [] }
new=BuildOutput { library_paths: [], library_links: [], linker_args: [], cfgs: ["use_proc_macro", "wrap_proc_macro", "proc_macro_span"], check_cfgs: [], env: [], metadata: [], rerun_if_changed: ["build.rs"], rerun_if_env_changed: [], warnings: [] }', src/cargo/core/compiler/custom_build.rs:1010:39
I'll work on adding a test case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See 548b252
I've now split this into two commits, first commit basically introducing the bug, next one adding test reproducing it and then the fix
4540be6
to
bdc4f31
Compare
@weihanglo I believe all comments are addressed. |
bdc4f31
to
d73330d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your update. This looks really nice. We're getting closer to merge!
but uncovered a concerning naming-related inconsistency #11478 (comment) , which could be related to lexicographic sorting of the dependencies in some way.
I believe it's actually the same issue. This line makes Cargo skip redundant unit computation. Before this pull request, either with package named a
or e
cannot help discriminate the difference of package c
being built for artifact dep or normal dep.
Expand to read the detail
Below is how unit_dependencies
are computed with the package named a
.
("foo", [])
-> unit_deps: [("a", []), ("bar", [])]
("a", [])
-> unit_deps: [("c", []), ("d", ["feature"])]
("c", [])
-> unit_deps: [("d", ["feature"])]
("d", ["feature"])
-> unit_deps: []
("d", ["feature"]) # cache hit, skipping...
("bar", [])
-> unit_deps: [("b", [])]
("b", [])
-> unit_deps: [("c", [])]
("c", []) # cache hit, skipping...
You can see that UnitDep
s of a
are computed first, so we get only one c
, which depends on d
with feature
enabled. Therefore, when we compute unit deps of b
, cache hit for c
. so they reuse d
with feature
enabled for the whole dependency graph.
If we change package name a
to e
:
("foo", [])
-> unit_deps: [("bar", []), ("e", [])]
("bar", [])
-> unit_deps: [("b", [])]
("b", [])
-> unit_deps: [("c", [])]
("c", [])
-> unit_deps: [("d", [])]
("d", [])
-> unit_deps: []
("e", [])
-> unit_deps: [("c", []), ("d", ["feature"])]
("c", []) # cache hit, skipping...
("d", ["feature"])
-> unit_deps: []
Cargo now computes bar
first, and get a c
with d without any feature. Next when computing deps of e
, Cargo skips and reuses preivous c
. That leads to havning two different versions of d
in the subgraph of e
. (One with feature
enabled and one without). And that confuses rustc.
src/cargo/ops/cargo_compile/mod.rs
Outdated
/// building the unit graph, and then run this second pass which will try to | ||
/// combine shared dependencies safely. By adding a hash of the dependencies | ||
/// to the `Unit`, this allows the `CompileKind` to be changed back to `Host` | ||
/// without fear of an unwanted collision. | ||
/// without fear of an unwanted collision for build-dependencies. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed it is not required to split into two commits.
…I could have a single commit, which breaks artifact deps feature, since cargo would fail to compile such packages with errors about duplicate paths…
Could you point out which test would fail if Cargo doesn't rebuild this? I had a hard time finding that 😞.
The other change request from me was a test to verify that the decoupled artifact deps are eventually shard after rebuild. That could prevent any regression I believe. I could construct a test later if you want one from me :)
tests/testsuite/artifact_dep.rs
Outdated
} | ||
|
||
#[cargo_test] | ||
fn issue_10525() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I second @bstrie's opinion of going through the code path of proc macro feature resolution. In the adapted version of test, if you change target = "…"
to a target other than host, it compiles. However, in the original one, since it uses hostdeps, it fails to compile for any kind of target specified. This is written down in the PR description of #10525:
Any other value in the
target
field (e.g.x86_64-unknown-linux-musl
) produces the same error, so it is the presence of this key that is essential.
This is a unit test reproducing the bindep transitive dependency bug based upon examples from - rust-lang#10837 - rust-lang#11463 Signed-off-by: Roman Volosatovs <rvolosatovs@riseup.net>
Signed-off-by: Roman Volosatovs <rvolosatovs@riseup.net>
Signed-off-by: Roman Volosatovs <rvolosatovs@riseup.net>
d73330d
to
107f266
Compare
Signed-off-by: Roman Volosatovs <rvolosatovs@riseup.net>
This prevents collisions for transitive dependencies Signed-off-by: Roman Volosatovs <rvolosatovs@riseup.net>
107f266
to
385bba3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks really great! Both of the commit history and patch are quite clear. Thank you @rvolosatovs for working on this patch. Also thanks @bstrie for taking care of artifact dependencies for a long while. I might not be a good communicator but still happy to collaborate with you :)
The only concern I have is artifact_target_for_features
being duplicated. It is not that bad as we already added a link to UnitFor::artifact_target_for_features
. If @Byron or someone from the Cargo team feels it is wrong, please call it out. Thanks!
@bors r+ |
☀️ Test successful - checks-actions |
7 commits in c994a4a638370bc7e0ffcbb0e2865afdfa7d4415..2381cbdb4e9b07090f552d34a44a529b6e620e44 2022-12-18 21:50:58 +0000 to 2022-12-23 12:19:27 +0000 - fix: deduplicate dependencies by artifact target (rust-lang/cargo#11478) - Add warning if potentially-scrapable examples are skipped due to dev-dependencies (rust-lang/cargo#11503) - Don't scrape examples from library targets by default (rust-lang/cargo#11499) - Stabilize terminal-width (rust-lang/cargo#11494) - Make sure that hash of `SourceId` is stable (rust-lang/cargo#11501) - Use workspace lockfile when running `cargo package` and `cargo publish` (rust-lang/cargo#11477) - Show `--help` if there is no man page for subcommand (rust-lang/cargo#11473)
Update cargo 7 commits in c994a4a638370bc7e0ffcbb0e2865afdfa7d4415..2381cbdb4e9b07090f552d34a44a529b6e620e44 2022-12-18 21:50:58 +0000 to 2022-12-23 12:19:27 +0000 - fix: deduplicate dependencies by artifact target (rust-lang/cargo#11478) - Add warning if potentially-scrapable examples are skipped due to dev-dependencies (rust-lang/cargo#11503) - Don't scrape examples from library targets by default (rust-lang/cargo#11499) - Stabilize terminal-width (rust-lang/cargo#11494) - Make sure that hash of `SourceId` is stable (rust-lang/cargo#11501) - Use workspace lockfile when running `cargo package` and `cargo publish` (rust-lang/cargo#11477) - Show `--help` if there is no man page for subcommand (rust-lang/cargo#11473) r? `@ghost`
What does this PR try to resolve?
In cases when a compile target is specified for a bindep and the crate depending on it, cargo fails to deduplicate the crate dependencies and attempts to build the dependent crate only once with non-deterministic feature set, which breaks e.g. https://github.com/rvolosatovs/musl-bindep-feature-bug
Fix the issue by including the optional artifact compile target in the
Unit
in order to avoid wrongfully deduplicating the dependent cratesFixes #11463
Fixes #10837
Fixes #10525
Note, that this issue is already accounted for by
cargo
, but in different context a similar situation can occur while building the build script, which:That's why this PR is simply reusing the existing functionality for deduplication
How should we test and review this PR?
Build https://github.com/rvolosatovs/musl-bindep-feature-bug
Additional information
This is based on analysis by @weihanglo in #10837 (comment)
I experimented with adding the whole
UnitFor
to the internal unit struct, but that seems unnecessary.It would probably be nicer to refactor
IsArtifact
and instead turn it into a 3-variant enum with a possible compile target, but I decided against that to minimize the diff. Perhaps it's worth a follow-up?