Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cargo sometimes chooses incorrect features #3812

Closed
jethrogb opened this issue Mar 10, 2017 · 11 comments · Fixed by #3843
Closed

Cargo sometimes chooses incorrect features #3812

jethrogb opened this issue Mar 10, 2017 · 11 comments · Fixed by #3843

Comments

@jethrogb
Copy link
Contributor

I've been experiencing “random rebuilds”: sometimes when running cargo build twice in a row, the second time it rebuilds some crates even though nothing changed.

I've spent some time debugging this issue and it seems to be caused by a low but non-negligible probability of incorrect feature selection for one of the crates in my dependency graph. This will then lead to a different fingerprint being computed and a rebuild happening.

When I run RUST_LOG=cargo::core::resolver=trace cargo build --lib --verbose|&egrep 'activating (tempdir|rand)', most of the time I get output like this:

TRACE:cargo::core::resolver: activating rand v0.3.14 (https://github.com/jethrogb/rand?branch=rdrand#b052c179) (replacing rand v0.3.14)
TRACE:cargo::core::resolver: activating rand v0.3.14 (https://github.com/jethrogb/rand?branch=rdrand#b052c179) (replacing rand v0.3.14)
TRACE:cargo::core::resolver: activating tempdir v0.3.5
TRACE:cargo::core::resolver: activating rand v0.3.14 (https://github.com/jethrogb/rand?branch=rdrand#b052c179) (replacing rand v0.3.14)
TRACE:cargo::core::resolver: activating rand v0.3.14 (https://github.com/jethrogb/rand?branch=rdrand#b052c179) (replacing rand v0.3.14)
TRACE:cargo::core::resolver: activating tempdir v0.3.5

or this:

TRACE:cargo::core::resolver: activating rand v0.3.14 (https://github.com/jethrogb/rand?branch=rdrand#b052c179) (replacing rand v0.3.14)
TRACE:cargo::core::resolver: activating tempdir v0.3.5
TRACE:cargo::core::resolver: activating rand v0.3.14 (https://github.com/jethrogb/rand?branch=rdrand#b052c179) (replacing rand v0.3.14)
TRACE:cargo::core::resolver: activating tempdir v0.3.5

But sometimes, I get this:

TRACE:cargo::core::resolver: activating rand v0.3.14 (https://github.com/jethrogb/rand?branch=rdrand#b052c179) (replacing rand v0.3.14)
TRACE:cargo::core::resolver: activating rand v0.3.14 (https://github.com/jethrogb/rand?branch=rdrand#b052c179) (replacing rand v0.3.14)
TRACE:cargo::core::resolver: activating tempdir v0.3.5
TRACE:cargo::core::resolver: activating tempdir v0.3.5
TRACE:cargo::core::resolver: activating rand v0.3.14 (https://github.com/jethrogb/rand?branch=rdrand#b052c179) (replacing rand v0.3.14)
TRACE:cargo::core::resolver: activating rand v0.3.14 (https://github.com/jethrogb/rand?branch=rdrand#b052c179) (replacing rand v0.3.14)

or this:

TRACE:cargo::core::resolver: activating rand v0.3.14 (https://github.com/jethrogb/rand?branch=rdrand#b052c179) (replacing rand v0.3.14)
TRACE:cargo::core::resolver: activating tempdir v0.3.5
TRACE:cargo::core::resolver: activating tempdir v0.3.5
TRACE:cargo::core::resolver: activating rand v0.3.14 (https://github.com/jethrogb/rand?branch=rdrand#b052c179) (replacing rand v0.3.14)
TRACE:cargo::core::resolver: activating rand v0.3.14 (https://github.com/jethrogb/rand?branch=rdrand#b052c179) (replacing rand v0.3.14)

When that happens, the incorrect features are selected for the rand crate. As far as I can tell, the resolver logic in cargo is run twice. If during the first time tempdir was activated after rand but during the second time tempdir is activated before rand, then the features that tempdir specifies for the rand crate are being used.

Note that tempdir is not actually ever built since it is a build-dependency of a crate that is a dependency of another crate, but not on my platform.

I suspect the non-determinism is coming from HashMaps or ReadDir. I have no clue where to go from here and could use some debugging help. I unfortunately can't share my Cargo.toml and have not been able to reproduce the issue in a smaller test case. My dependency graph includes about 400 crates and I'm using crates.io, git, and path dependencies as well as a [replace] section.

You may ignore the following, putting it here for my records:

Useful breakpoints:
1       breakpoint     keep y   0x0000561ecd6d04a7 in cargo::ops::cargo_rustc::fingerprint::calculate at /tmp/cargo/src/cargo/ops/cargo_rustc/fingerprint.rs:365
	stop only if unit->pkg->manifest.summary.package_id.inner.ptr.pointer.__0->data.name.vec.buf.ptr.pointer.__0[0] == 'r' && unit->pkg->manifest.summary.package_id.inner.ptr.pointer.__0->data.name.vec.buf.ptr.pointer.__0[1] == 'a' && unit->pkg->manifest.summary.package_id.inner.ptr.pointer.__0->data.name.vec.buf.ptr.pointer.__0[2] == 'n' && unit->pkg->manifest.summary.package_id.inner.ptr.pointer.__0->data.name.vec.buf.ptr.pointer.__0[3] == 'd' && fingerprint.ptr.pointer.__0->data.features.vec.buf.ptr.pointer.__0[13] != 'r'
2       breakpoint     keep y   0x0000561ecd6ea51a in cargo::ops::cargo_rustc::compile at /tmp/cargo/src/cargo/ops/cargo_rustc/mod.rs:226
3       breakpoint     keep y   0x0000561ecd642034 in cargo::core::resolver::{{impl}}::resolve_features at /tmp/cargo/src/cargo/core/resolver/mod.rs:972
	stop only if candidate->package_id.inner.ptr.pointer.__0->data.name.vec.buf.ptr.pointer.__0[0] == 'r' && candidate->package_id.inner.ptr.pointer.__0->data.name.vec.buf.ptr.pointer.__0[1] == 'a' && candidate->package_id.inner.ptr.pointer.__0->data.name.vec.buf.ptr.pointer.__0[2] == 'n' && candidate->package_id.inner.ptr.pointer.__0->data.name.vec.buf.ptr.pointer.__0[3] == 'd'
4       breakpoint     keep y   0x0000561ecd635868 in cargo::core::resolver::activate at /tmp/cargo/src/cargo/core/resolver/mod.rs:308
	stop only if candidate->summary.ptr.pointer.__0->value.package_id.inner.ptr.pointer.__0->data.name.vec.buf.ptr.pointer.__0[0] == 'r' && candidate->summary.ptr.pointer.__0->value.package_id.inner.ptr.pointer.__0->data.name.vec.buf.ptr.pointer.__0[1] == 'a' && candidate->summary.ptr.pointer.__0->value.package_id.inner.ptr.pointer.__0->data.name.vec.buf.ptr.pointer.__0[2] == 'n' && candidate->summary.ptr.pointer.__0->value.package_id.inner.ptr.pointer.__0->data.name.vec.buf.ptr.pointer.__0[3] == 'd'

Useful displays:
1:   y  unit->pkg->manifest.summary.package_id.inner.ptr.pointer.__0->data.name
2:   y  fingerprint.ptr.pointer.__0->data.features
3:   y  used_features.map.table.size
4:   y  *method
5:   y  parent->RUST$ENCODED$ENUM$0$None.__0->ptr.pointer.__0->value.package_id.inner.ptr.pointer.__0->data.name
@alexcrichton
Copy link
Member

Thanks for the report, something definitely sounds fishy! Note that cargo's resolver is slightly nondeterminstic (due to usage of hashmap somewhere), but in general that shouldn't affect this. It's normal to take different paths each time to reach the same resolution graph.

Can you try running your builds with RUST_LOG=cargo::ops::cargo_rustc::fingerprint=info? That should print out information about why a crate is being rebuilt and should confirm the resolution issue.

Also, if you've got a full log of RUST_LOG=debug of two runs where one was cached and the other rebuilt spuriously, that'd be supef helpful as well! (a local reproduction would also be helpful)

@jethrogb
Copy link
Contributor Author

jethrogb commented Mar 10, 2017

I had modified fingerprint.rs to give additional information:

diff --git a/src/cargo/ops/cargo_rustc/fingerprint.rs b/src/cargo/ops/cargo_rustc/fingerprint.rs
index 25c734a..ff47558 100644
--- a/src/cargo/ops/cargo_rustc/fingerprint.rs
+++ b/src/cargo/ops/cargo_rustc/fingerprint.rs
@@ -127,7 +127,7 @@ pub fn prepare_target<'a, 'cfg>(cx: &mut Context<'a, 'cfg>,
 /// `DependencyQueue`, but it also needs to be retained here because Cargo can
 /// be interrupted while executing, losing the state of the `DependencyQueue`
 /// graph.
-#[derive(Serialize, Deserialize)]
+#[derive(Serialize, Deserialize, Debug)]
 pub struct Fingerprint {
     rustc: u64,
     features: String,
@@ -168,12 +168,13 @@ fn deserialize_deps<D>(d: D) -> Result<Vec<(String, Arc<Fingerprint>)>, D::Error
     }).collect())
 }
 
-#[derive(Serialize, Deserialize, Hash)]
+#[derive(Serialize, Deserialize, Hash, Debug)]
 enum LocalFingerprint {
     Precalculated(String),
     MtimeBased(MtimeSlot, PathBuf),
 }
 
+#[derive(Debug)]
 struct MtimeSlot(Mutex<Option<FileTime>>);
 
 impl Fingerprint {
@@ -251,7 +252,7 @@ impl Fingerprint {
         }
         for (a, b) in self.deps.iter().zip(old.deps.iter()) {
             if a.1.hash() != b.1.hash() {
-                bail!("new ({}) != old ({})", a.0, b.0)
+                bail!("new ({} {:#?}) != old ({} {:#?})", a.0, a.1, b.0, b.1)
             }
         }
         Ok(())

This is the line you're looking for:

INFO:cargo::ops::cargo_rustc::fingerprint: fingerprint error for uuid v0.3.1 (URL redacted): new (rand v0.3.14 (https://github.com/jethrogb/rand?branch=rdrand#b052c179) Fingerprint {
    rustc: 14803062808521079198,
    features: "[\"core_io\", \"default\", \"libc\", \"rdrand\", \"std\"]",
    target: 17555397676896223371,
    profile: 14528549471338536373,
    deps: [
        (
            "core_io v0.1.20161028",
            Fingerprint {
                rustc: 14803062808521079198,
                features: "[]",
                target: 4212954048505086387,
                profile: 14528549471338536373,
                deps: [],
                local: Precalculated(
                    "0.1.20161028"
                ),
                memoized_hash: Mutex { data: None },
                rustflags: []
            }
        ),
        (
            "libc v0.2.17 (URL redacted)",
            Fingerprint {
                rustc: 14803062808521079198,
                features: "[\"default\", \"use_std\"]",
                target: 8311546696561608402,
                profile: 14528549471338536373,
                deps: [],
                local: Precalculated(
                    "14d241bb57413a3f8794827fb66bac8f7b93b2d8"
                ),
                memoized_hash: Mutex { data: Some(13088676079623628845) },
                rustflags: []
            }
        )
    ],
    local: Precalculated(
        "b052c17960b8e8ead7af61acc07fbbc2b7458792"
    ),
    memoized_hash: Mutex { data: Some(11817633241878475375) },
    rustflags: []
}) != old (rand v0.3.14 (https://github.com/jethrogb/rand?branch=rdrand#b052c179) Fingerprint {
    rustc: 0,
    features: "",
    target: 0,
    profile: 0,
    deps: [],
    local: Precalculated(
        ""
    ),
    memoized_hash: Mutex { data: Some(11580094398159781064) },
    rustflags: []
})

I do have the outputs in both cases with RUST_LOG=cargo=trace but they will require redacting before I can publish them.

11580094398159781064 is pretty much the same fingerprint except with features: "[\"core_io\", \"rdrand\"]"

@alexcrichton
Copy link
Member

Yeah if you can gist that between two runs that's most helpful. We don't save off the entire fingerprint (as it'd literally take up gigabytes of data) so just the memoized hash is restored (which is all we need for a comparison).

The two runs should have different "new" fingerprints where we can confirm it's different feature selection. The full RUST_LOG=cargo=debug output should also have info about the complete resolution graph that Cargo generated which should also help for diagnosing this.

If you'd prefer you can send me links privately on IRC if you'd prefer to avoid posting here as well!

@jethrogb
Copy link
Contributor Author

Going the other way:

INFO:cargo::ops::cargo_rustc::fingerprint: fingerprint error for uuid v0.3.1 (URL redacted): new (rand v0.3.14 (https://github.com/jethrogb/rand?branch=rdrand#b052c179) Fingerprint {
    rustc: 14803062808521079198,
    features: "[\"core_io\", \"rdrand\"]",
    target: 17555397676896223371,
    profile: 14528549471338536373,
    deps: [
        (
            "core_io v0.1.20161028",
            Fingerprint {
                rustc: 14803062808521079198,
                features: "[]",
                target: 4212954048505086387,
                profile: 14528549471338536373,
                deps: [],
                local: Precalculated(
                    "0.1.20161028"
                ),
                memoized_hash: Mutex { data: None },
                rustflags: []
            }
        )
    ],
    local: Precalculated(
        "b052c17960b8e8ead7af61acc07fbbc2b7458792"
    ),
    memoized_hash: Mutex { data: Some(11580094398159781064) },
    rustflags: []
}) != old (rand v0.3.14 (https://github.com/jethrogb/rand?branch=rdrand#b052c179) Fingerprint {
    rustc: 0,
    features: "",
    target: 0,
    profile: 0,
    deps: [],
    local: Precalculated(
        ""
    ),
    memoized_hash: Mutex { data: Some(11817633241878475375) },
    rustflags: []
})

As I said, the root difference is in the features (and further dependencies selected based on that).

I e-mailed you the trace logs.

@alexcrichton
Copy link
Member

@jethrogb to confirm, you're using an up-to-date Cargo right?

@jethrogb
Copy link
Contributor Author

I've tried several versions including the latest nightly as of last Friday when I filed this issue.

@alexcrichton
Copy link
Member

Ok cool, I'm still digging into this.

If you can get me access to the various Cargo.tomls (at least around the problematic ones) that may make debugging easier, but I'm not certain I'll need that yet (sorry haven't had a chance to really dive in yet)

@alexcrichton
Copy link
Member

@jethrogb can gist the Cargo.toml sections for the following dependency edges? (basically just the dependency configuration here)

  • uuid -> rand
  • tempdir -> rand
  • backend -> rand
  • model_types -> uuid
  • api_model -> uuid
  • db_model -> uuid
  • backend -> uuid
  • iron-test -> uuid

@alexcrichton
Copy link
Member

Aha nevermind, I've minimized it!

Working on a fix.

alexcrichton added a commit to alexcrichton/cargo that referenced this issue Mar 17, 2017
Previously Cargo had a bug where if a crate *without* a default feature was
overridden with one that did indeed have a default feature then the default may
not end up getting activated by accident. The fix was to avoid returning too
quickly hen activating dependencies until after we've inspected and learned
about replacements.

Closes rust-lang#3812
@alexcrichton
Copy link
Member

Fix: #3843

@jethrogb
Copy link
Contributor Author

Fantastic! Will try right away.

bors added a commit that referenced this issue Mar 18, 2017
…atklad

Fix overriding mixing with default features

Previously Cargo had a bug where if a crate *without* a default feature was
overridden with one that did indeed have a default feature then the default may
not end up getting activated by accident. The fix was to avoid returning too
quickly hen activating dependencies until after we've inspected and learned
about replacements.

Closes #3812
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants