Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: deduplicate dependencies by artifact target #11478

Merged
merged 6 commits into from
Dec 23, 2022

Conversation

rvolosatovs
Copy link
Contributor

@rvolosatovs rvolosatovs commented Dec 13, 2022

What does this PR try to resolve?

In cases when a compile target is specified for a bindep and the crate depending on it, cargo fails to deduplicate the crate dependencies and attempts to build the dependent crate only once with non-deterministic feature set, which breaks e.g. https://github.com/rvolosatovs/musl-bindep-feature-bug

Fix the issue by including the optional artifact compile target in the Unit in order to avoid wrongfully deduplicating the dependent crates

Fixes #11463
Fixes #10837
Fixes #10525

Note, that this issue is already accounted for by cargo, but in different context a similar situation can occur while building the build script, which:

  1. may be built for different target than the actual package target
  2. may contain dependencies with different feature sets than the same dependencies in the dependency graph of the package itself

That's why this PR is simply reusing the existing functionality for deduplication

How should we test and review this PR?

Build https://github.com/rvolosatovs/musl-bindep-feature-bug

Additional information

This is based on analysis by @weihanglo in #10837 (comment)
I experimented with adding the whole UnitFor to the internal unit struct, but that seems unnecessary.

It would probably be nicer to refactor IsArtifact and instead turn it into a 3-variant enum with a possible compile target, but I decided against that to minimize the diff. Perhaps it's worth a follow-up?

@rustbot
Copy link
Collaborator

rustbot commented Dec 13, 2022

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @ehuss (or someone else) soon.

Please see the contribution instructions for more information.

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 13, 2022
@rvolosatovs
Copy link
Contributor Author

cc @bstrie @weihanglo

@npmccallum
Copy link
Contributor

FYI: @Byron

@weihanglo
Copy link
Member

Byron probably is working on gitoxide integration. I will take a look then :)

r? @weihanglo

BTW, could we have some tests for this patch?

@rustbot rustbot assigned weihanglo and unassigned ehuss Dec 13, 2022
@npmccallum
Copy link
Contributor

@weihanglo I was tagging him since he wrote this code originally.

@rvolosatovs
Copy link
Contributor Author

rvolosatovs commented Dec 13, 2022

BTW, could we have some tests for this patch?

@weihanglo I'd love to do that, but I was having troubles coming up with a minimal reproduction of this issue, it works (as in, upstream fails) with rsa crate, but that's probably not something we want to depend upon in tests. Do you have some clever ideas on how to trigger this bug by using just local crates?
I'll try to do a bit more exploration myself meanwhile

Do you agree with the approach overall?

@rvolosatovs
Copy link
Contributor Author

Update: I took my time to untangle the rsa->num_bigint_dig->rand->rand_core dependency chain and managed to reproduce the bug by only using local crates (based on this dependency chain) rvolosatovs/musl-bindep-feature-bug@eab0536
I will proceed with adding a test case based on this. For now it's hard for me to wrap my head around it to simplify (if possible at all)

@rvolosatovs
Copy link
Contributor Author

rvolosatovs commented Dec 13, 2022

I managed to reproduce the error in rvolosatovs@440517b, which the fix is conveniently based on top of, the test fails on that commit and is fixed by the child commit.
I tried a few things, but unable to simplify the testcase further. Note that edition = is important and removing it at the top-level eliminates the issue

The way I understand this, it's a good old "dependency hell" compilation issue where foo::Interface does not implement foo::Interface, because bar depending on foo is given a different version of foo than it expected... albeit in a new light.
The feature bit is what triggers this difference, even though the feature itself does not affect the functionality

@rvolosatovs rvolosatovs marked this pull request as ready for review December 13, 2022 20:02
@weihanglo
Copy link
Member

Sorry I haven't yet got time to do a full review on this. Just note that I think the bug isn't related to workspace, so maybe use path dependencies to make the gist of the test case clearer?
Besides, for #10837 { …, lib = ture, target = "<target>" } it is worth one test as well. Thank you!

@bors
Copy link
Collaborator

bors commented Dec 14, 2022

☔ The latest upstream changes (presumably #11434) made this pull request unmergeable. Please resolve the merge conflicts.

@rvolosatovs
Copy link
Contributor Author

rvolosatovs commented Dec 14, 2022

Sorry I haven't yet got time to do a full review on this. Just note that I think the bug isn't related to workspace, so maybe use path dependencies to make the gist of the test case clearer? Besides, for #10837 { …, lib = ture, target = "<target>" } it is worth one test as well. Thank you!

@weihanglo I've rebased on top of latest master, added a test case with lib = true based on example from #10837 and updated the docs in a few places.

You should be able to confirm both test case failures on HEAD^ of this PR

@bstrie
Copy link
Contributor

bstrie commented Dec 15, 2022

I have a branch at https://github.com/bstrie/cargo/tree/fix/bindeps-target that adds regression tests for both #10525 and #10837. I can confirm that these tests both fail on the current head, and both succeed on top of this PR. @rvolosatovs , can you pull this branch and add it to this PR? It is already rebased on top of the current revision.

@bstrie
Copy link
Contributor

bstrie commented Dec 15, 2022

On further inspection it looks like the test that was added yesterday covers the #10837 case. I've updated my branch to only add a test for #10525.

Copy link
Contributor

@bstrie bstrie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, I only have minor style and documentation comments. I'm not an authority on Cargo's codebase, but this looks like a low-risk addition that is unlikely to affect anyone that isn't using bindeps. @weihanglo , do you agree that this addresses what you described in your comment at #10837 (comment) ? This patch is mercifully only 60 logical lines excluding the tests, so I hope that it is not too much burden on you to review. I have confirmed that the test cases satisfactorily cover #10837, and once my own branch has been merged it will cover #10525 as well.

src/cargo/core/compiler/unit_dependencies.rs Outdated Show resolved Hide resolved
src/cargo/core/compiler/unit.rs Outdated Show resolved Hide resolved
@bstrie
Copy link
Contributor

bstrie commented Dec 16, 2022

@weihanglo I had a style question about test cases, in my branch where I add the regression test for #10525 I copied the code verbatim from the issue, using the mock crates I listed there, but would you prefer that I "normalize" them to all the same dummy names used in the other test cases?

@rvolosatovs
Copy link
Contributor Author

I've integrated @bstrie's test case for #10525 in d9959b7 @weihanglo PTAL

@bstrie
Copy link
Contributor

bstrie commented Dec 16, 2022

LGTM

Copy link
Member

@weihanglo weihanglo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable to me. Thank you for posting this fix!

src/cargo/core/compiler/unit.rs Outdated Show resolved Hide resolved
tests/testsuite/artifact_dep.rs Outdated Show resolved Hide resolved
}

#[cargo_test]
fn issue_10525() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a descriptive title.

And other tests should be decouple_from instead of deduplication, no?

Suggested change
fn issue_10525() {
fn same_target_transitive_dependency_decouple_from_hostdep() {
// See https://github.com/rust-lang/cargo/issues/10525

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I simplified this test case quite a bit and it turned out that it has nothing to do with proc macros and seems to be a duplicate of #11463 #10837
I left the test in regardless, since it has a slightly different structure.

Note, that I uncovered a pretty bizarre inconsistency - if e crate is renamed to a, the issue does not occur in upstream. How is naming related here is beyond me, but it does seem that there are more things broken the resolver

cc @bstrie

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Part of my rationale in leaving in the proc macros was to exercise different codepaths as compared to the other test. In addition, while this test case is greatly simplified compared to the original reproducer, care was taken to not simplify it too much, in order to produce the same error message as the original maximized bug. I don't mind if it gets simplified further, although this runs the risk of reducing the value of this test as a regression test. I leave the judgment here up to @weihanglo as to whether or not this test is valuable to have, and whether or not it should exercise the proc macro machinery.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I second @bstrie's opinion of going through the code path of proc macro feature resolution. In the adapted version of test, if you change target = "…" to a target other than host, it compiles. However, in the original one, since it uses hostdeps, it fails to compile for any kind of target specified. This is written down in the PR description of #10525:

Any other value in the target field (e.g. x86_64-unknown-linux-musl) produces the same error, so it is the presence of this key that is essential.

src/cargo/ops/cargo_compile/mod.rs Outdated Show resolved Hide resolved
/// building the unit graph, and then run this second pass which will try to
/// combine shared dependencies safely. By adding a hash of the dependencies
/// to the `Unit`, this allows the `CompileKind` to be changed back to `Host`
/// without fear of an unwanted collision.
/// without fear of an unwanted collision for build-dependencies.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand change here correctly, it strives to share artifact deps with other same deps for the same target, right? I'd recommend this having its own commit. And maybe need a test to verify this behaviour?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During the construction of unit_graph, artifact_target_for_features was set to decouple the dependencies, in this step, duplicates are merged. I am not sure a separate commit makes sense here, since this is only relevant for artifact dependencies with artifact_target_for_features set, so I could have a single commit, which breaks artifact deps feature, since cargo would fail to compile such packages with errors about duplicate paths and a follow-up fixing the issue, but I don't see why would that be preferrable

Copy link
Member

@weihanglo weihanglo Dec 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed it is not required to split into two commits.

…I could have a single commit, which breaks artifact deps feature, since cargo would fail to compile such packages with errors about duplicate paths…

Could you point out which test would fail if Cargo doesn't rebuild this? I had a hard time finding that 😞.

The other change request from me was a test to verify that the decoupled artifact deps are eventually shard after rebuild. That could prevent any regression I believe. I could construct a test later if you want one from me :)

Copy link
Contributor Author

@rvolosatovs rvolosatovs Dec 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It indeed looks like this case is not covered by existing test suite, but the resulting compiler fails on real-word codebase with a wall of errors like these without the dependency sharing:

warning: output filename collision.
The lib target `webpki` in package `webpki v0.22.0` has the same output filename as the lib target `webpki` in package `webpki v0.22.0`.
Colliding filename is: /home/rvolosatovs/src/github.com/enarx/enarx/target/x86_64-unknown-linux-musl/debug/deps/libwebpki-19484ee9a1315a74.rmeta
The targets should have unique names.
Consider changing their names to be unique or compiling them separately.
This may become a hard error in the future; see <https://github.com/rust-lang/cargo/issues/6313>.
warning: output filename collision.
The lib target `webpki-roots` in package `webpki-roots v0.22.4` has the same output filename as the lib target `webpki-roots` in package `webpki-roots v0.22.4`.
Colliding filename is: /home/rvolosatovs/src/github.com/enarx/enarx/target/x86_64-unknown-linux-musl/debug/deps/libwebpki_roots-f41f1a48aec910b8.rlib
The targets should have unique names.
Consider changing their names to be unique or compiling them separately.
This may become a hard error in the future; see <https://github.com/rust-lang/cargo/issues/6313>.
warning: output filename collision.
The lib target `webpki-roots` in package `webpki-roots v0.22.4` has the same output filename as the lib target `webpki-roots` in package `webpki-roots v0.22.4`.
Colliding filename is: /home/rvolosatovs/src/github.com/enarx/enarx/target/x86_64-unknown-linux-musl/debug/deps/libwebpki_roots-f41f1a48aec910b8.rmeta
The targets should have unique names.
Consider changing their names to be unique or compiling them separately.
This may become a hard error in the future; see <https://github.com/rust-lang/cargo/issues/6313>.
warning: output filename collision.
The lib target `witx` in package `witx v0.9.1 (https://github.com/bytecodealliance/wasmtime?rev=6f50ddaaf2ab8205b6e850361dd2cc662819f431#6f50ddaa)` has the same output filename as the lib target `witx` in package `witx v0.9.1 (https://github.com/bytecodealliance/wasmtime?rev=6f50ddaaf2ab8205b6e850361dd2cc662819f431#6f50ddaa)`.
Colliding filename is: /home/rvolosatovs/src/github.com/enarx/enarx/target/debug/deps/libwitx-158c96bbbf8bb0ec.rlib
The targets should have unique names.
Consider changing their names to be unique or compiling them separately.
This may become a hard error in the future; see <https://github.com/rust-lang/cargo/issues/6313>.
warning: output filename collision.
The lib target `witx` in package `witx v0.9.1 (https://github.com/bytecodealliance/wasmtime?rev=6f50ddaaf2ab8205b6e850361dd2cc662819f431#6f50ddaa)` has the same output filename as the lib target `witx` in package `witx v0.9.1 (https://github.com/bytecodealliance/wasmtime?rev=6f50ddaaf2ab8205b6e850361dd2cc662819f431#6f50ddaa)`.
Colliding filename is: /home/rvolosatovs/src/github.com/enarx/enarx/target/debug/deps/libwitx-158c96bbbf8bb0ec.rmeta
The targets should have unique names.
Consider changing their names to be unique or compiling them separately.
This may become a hard error in the future; see <https://github.com/rust-lang/cargo/issues/6313>.
thread 'main' panicked at 'build script output collision for proc-macro2 v1.0.43/c9c8981de1417d5d
old=BuildOutput { library_paths: [], library_links: [], linker_args: [], cfgs: ["use_proc_macro", "wrap_proc_macro", "proc_macro_span"], check_cfgs: [], env: [], metadata: [], rerun_if_changed: ["build.rs"], rerun_if_env_changed: [], warnings: [] }
new=BuildOutput { library_paths: [], library_links: [], linker_args: [], cfgs: ["use_proc_macro", "wrap_proc_macro", "proc_macro_span"], check_cfgs: [], env: [], metadata: [], rerun_if_changed: ["build.rs"], rerun_if_env_changed: [], warnings: [] }', src/cargo/core/compiler/custom_build.rs:1010:39

I'll work on adding a test case

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See 548b252

I've now split this into two commits, first commit basically introducing the bug, next one adding test reproducing it and then the fix

@rvolosatovs rvolosatovs force-pushed the fix/bindeps-target branch 3 times, most recently from 4540be6 to bdc4f31 Compare December 19, 2022 19:12
@rvolosatovs
Copy link
Contributor Author

rvolosatovs commented Dec 19, 2022

@weihanglo I believe all comments are addressed.
Note, that I added the new field to Debug implementation, which I missed earlier and improved a few docs
I also refactored #10525 test case significantly and found out that it's unrelated to proc macros, but uncovered a concerning naming-related inconsistency #11478 (comment) , which could be related to lexicographic sorting of the dependencies in some way.

Copy link
Member

@weihanglo weihanglo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your update. This looks really nice. We're getting closer to merge!


but uncovered a concerning naming-related inconsistency #11478 (comment) , which could be related to lexicographic sorting of the dependencies in some way.

I believe it's actually the same issue. This line makes Cargo skip redundant unit computation. Before this pull request, either with package named a or e cannot help discriminate the difference of package c being built for artifact dep or normal dep.

Expand to read the detail

Below is how unit_dependencies are computed with the package named a.

("foo", [])
    -> unit_deps: [("a", []), ("bar", [])]
("a", [])
    -> unit_deps: [("c", []), ("d", ["feature"])]
("c", [])
    -> unit_deps: [("d", ["feature"])]
("d", ["feature"])
    -> unit_deps: []
("d", ["feature"])  # cache hit, skipping...
("bar", [])
    -> unit_deps: [("b", [])]
("b", [])
    -> unit_deps: [("c", [])]
("c", []) # cache hit, skipping...

You can see that UnitDeps of a are computed first, so we get only one c, which depends on d with feature enabled. Therefore, when we compute unit deps of b, cache hit for c. so they reuse d with feature enabled for the whole dependency graph.

If we change package name a to e:

("foo", [])
    -> unit_deps: [("bar", []), ("e", [])]
("bar", [])
    -> unit_deps: [("b", [])]
("b", [])
    -> unit_deps: [("c", [])]
("c", [])
    -> unit_deps: [("d", [])]
("d", [])
    -> unit_deps: []
("e", [])
    -> unit_deps: [("c", []), ("d", ["feature"])]
("c", []) # cache hit, skipping...
("d", ["feature"])
    -> unit_deps: []

Cargo now computes bar first, and get a c with d without any feature. Next when computing deps of e, Cargo skips and reuses preivous c. That leads to havning two different versions of d in the subgraph of e. (One with feature enabled and one without). And that confuses rustc.

src/cargo/core/compiler/unit.rs Outdated Show resolved Hide resolved
tests/testsuite/artifact_dep.rs Outdated Show resolved Hide resolved
/// building the unit graph, and then run this second pass which will try to
/// combine shared dependencies safely. By adding a hash of the dependencies
/// to the `Unit`, this allows the `CompileKind` to be changed back to `Host`
/// without fear of an unwanted collision.
/// without fear of an unwanted collision for build-dependencies.
Copy link
Member

@weihanglo weihanglo Dec 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed it is not required to split into two commits.

…I could have a single commit, which breaks artifact deps feature, since cargo would fail to compile such packages with errors about duplicate paths…

Could you point out which test would fail if Cargo doesn't rebuild this? I had a hard time finding that 😞.

The other change request from me was a test to verify that the decoupled artifact deps are eventually shard after rebuild. That could prevent any regression I believe. I could construct a test later if you want one from me :)

src/cargo/ops/cargo_compile/mod.rs Outdated Show resolved Hide resolved
}

#[cargo_test]
fn issue_10525() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I second @bstrie's opinion of going through the code path of proc macro feature resolution. In the adapted version of test, if you change target = "…" to a target other than host, it compiles. However, in the original one, since it uses hostdeps, it fails to compile for any kind of target specified. This is written down in the PR description of #10525:

Any other value in the target field (e.g. x86_64-unknown-linux-musl) produces the same error, so it is the presence of this key that is essential.

rvolosatovs and others added 4 commits December 22, 2022 14:45
This is a unit test reproducing the bindep transitive dependency bug based upon examples from
- rust-lang#10837
- rust-lang#11463

Signed-off-by: Roman Volosatovs <rvolosatovs@riseup.net>
Signed-off-by: Roman Volosatovs <rvolosatovs@riseup.net>
Signed-off-by: Roman Volosatovs <rvolosatovs@riseup.net>
Signed-off-by: Roman Volosatovs <rvolosatovs@riseup.net>
This prevents collisions for transitive dependencies

Signed-off-by: Roman Volosatovs <rvolosatovs@riseup.net>
Copy link
Member

@weihanglo weihanglo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really great! Both of the commit history and patch are quite clear. Thank you @rvolosatovs for working on this patch. Also thanks @bstrie for taking care of artifact dependencies for a long while. I might not be a good communicator but still happy to collaborate with you :)

The only concern I have is artifact_target_for_features being duplicated. It is not that bad as we already added a link to UnitFor::artifact_target_for_features. If @Byron or someone from the Cargo team feels it is wrong, please call it out. Thanks!

@weihanglo
Copy link
Member

@bors r+

@bors
Copy link
Collaborator

bors commented Dec 23, 2022

📌 Commit 385bba3 has been approved by weihanglo

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 23, 2022
@bors
Copy link
Collaborator

bors commented Dec 23, 2022

⌛ Testing commit 385bba3 with merge 2381cbd...

@bors
Copy link
Collaborator

bors commented Dec 23, 2022

☀️ Test successful - checks-actions
Approved by: weihanglo
Pushing 2381cbd to master...

@bors bors merged commit 2381cbd into rust-lang:master Dec 23, 2022
@rvolosatovs rvolosatovs deleted the fix/bindeps-target branch December 23, 2022 13:07
bors pushed a commit to rust-lang-ci/rust that referenced this pull request Dec 23, 2022
7 commits in c994a4a638370bc7e0ffcbb0e2865afdfa7d4415..2381cbdb4e9b07090f552d34a44a529b6e620e44
2022-12-18 21:50:58 +0000 to 2022-12-23 12:19:27 +0000

- fix: deduplicate dependencies by artifact target (rust-lang/cargo#11478)
- Add warning if potentially-scrapable examples are skipped due to dev-dependencies (rust-lang/cargo#11503)
- Don't scrape examples from library targets by default (rust-lang/cargo#11499)
- Stabilize terminal-width (rust-lang/cargo#11494)
- Make sure that hash of `SourceId` is stable (rust-lang/cargo#11501)
- Use workspace lockfile when running `cargo package` and `cargo publish` (rust-lang/cargo#11477)
- Show `--help` if there is no man page for subcommand (rust-lang/cargo#11473)
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 23, 2022
Update cargo

7 commits in c994a4a638370bc7e0ffcbb0e2865afdfa7d4415..2381cbdb4e9b07090f552d34a44a529b6e620e44 2022-12-18 21:50:58 +0000 to 2022-12-23 12:19:27 +0000

- fix: deduplicate dependencies by artifact target (rust-lang/cargo#11478)
- Add warning if potentially-scrapable examples are skipped due to dev-dependencies (rust-lang/cargo#11503)
- Don't scrape examples from library targets by default (rust-lang/cargo#11499)
- Stabilize terminal-width (rust-lang/cargo#11494)
- Make sure that hash of `SourceId` is stable (rust-lang/cargo#11501)
- Use workspace lockfile when running `cargo package` and `cargo publish` (rust-lang/cargo#11477)
- Show `--help` if there is no man page for subcommand (rust-lang/cargo#11473)

r? `@ghost`
@ehuss ehuss added this to the 1.68.0 milestone Dec 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
7 participants