Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cargo clean -p removes fingerprints for any package that shares a prefix with the target #10069

Closed
jonhoo opened this issue Nov 11, 2021 · 7 comments · Fixed by #10621
Closed
Assignees
Labels

Comments

@jonhoo
Copy link
Contributor

jonhoo commented Nov 11, 2021

Problem

cargo clean -p foo will remove the fingerprint files for all crates that start with foo- due to this glob:

let pkg_dir = format!("{}-*", pkg.name());

Steps

#!/bin/bash
rm -rf clean-any-prefix
mkdir clean-any-prefix
cd clean-any-prefix

cargo new foo
cd foo
echo 'tracing-subscriber = { version = "*", default-features = false }' >> Cargo.toml
echo 'tracing = { version = "*", default-features = false }' >> Cargo.toml
cargo check
cp -r target target.old
cargo clean -p tracing
diff -qr target.old target | grep subscriber
Only in target.old/debug/.fingerprint: tracing-subscriber-4444b7d26073268f

Possible Solution(s)

This one is trickier to fix than #10068. The "best" option is likely to walk all the versions in the Resolve and remove each one explicitly rather than relying on a glob. This would also allow addressing this TODO:

if spec.version().is_some() {
config.shell().warn(&format!(
"version qualifier in `-p {}` is ignored, \
cleaning all versions of `{}` found",
spec_str,
spec.name()
))?;
}

Barring that, the removal could walk the results of the glob and then do filtering using something that checks for exactly {pkg.name()}-{semver-pattern} (maybe with a regex).

Notes

No response

Version

cargo 1.56.0 (4ed5d137b 2021-10-04)
release: 1.56.0
commit-hash: 4ed5d137baff5eccf1bae5a7b2ae4b57efad4a7d
commit-date: 2021-10-04
@jonhoo jonhoo added the C-bug Category: bug label Nov 11, 2021
@russweas
Copy link
Contributor

@rustbot claim

@russweas
Copy link
Contributor

Maybe I'm missing something, but don't filenames in target follow the pattern {name}-{metadata-hash}? For example, on my machine, the files are named foo-23e89febcb105717 and serde-c8c3e7c3b6ca906a. I've been looking through cargo's source code, but the only references to the second part of the filenames I could find is in compilation_files.rs, specifically generated by this single (private) function:

fn compute_metadata(
unit: &Unit,
cx: &Context<'_, '_>,
metas: &mut HashMap<Unit, MetaInfo>,
) -> MetaInfo {

I'm not sure how to access this from cargo_clean.rs without a total refactor of the repo to expose compute_metadata(). Or is there some way to access it using the semver-pattern you referenced in the issue?

I understand the codebase enough to do the rest of the refactor here, but I just can't figure out how to go from the semver to the filenames in target.

I think there's a pretty simple solution for the -p flag mixing up packages, though. All you have to do is split the glob filename by the last hyphen (or whatever other naming scheme is used for a particular filetype), and compare the first half of this split to the desired package name. This wouldn't address the version issue, though, and in general I feel that walking the Resolve would be a much more robust solution than relying on globs.

@jonhoo
Copy link
Contributor Author

jonhoo commented Nov 12, 2021

Oh, yeah, you're right — matching on semver wouldn't work here. Good catch!

I agree, removing explicitly each version from the Resolve seems like the better way to go here, and it'll also fix that TODO!

@russweas
Copy link
Contributor

I'm still not sure how to determine a filename from the Resolve though.

Even if I get a match between the spec's version and the version of a package I visit in a Resolve walk, there's still no way to map that matching package to the build artifacts in the target directory without a major refactor of compilation_files.rs. Even then, some data that's really deep in the compilation code like the LTO status is required to generate the hash, so I'm not sure it'd be possible to do the refactor without bootstrapping compilation processes too.

I can push a PR that fixes -p deleting packages that start with the same string as the desired package's name using simple String utils, but I'm still a little lost on how I could use Resolve to obtain the filenames in target.

@jonhoo
Copy link
Contributor Author

jonhoo commented Nov 13, 2021

Hmm, I see what you mean. I wonder if the way to go here instead then is to use glob to produce a list of files, and then walk that list and look for things that match {pkg.name()}-<something that looks like a hash>?

@CosmicHorrorDev
Copy link
Contributor

Filtering the entries returned by the glob seems like it would be a workable solution.

@russweas are you still planning on working on this, or would you mind if I took a stab at it?

@russweas
Copy link
Contributor

@LovecraftianHorror go for it! I just graduated, moved, and started a new job, won't have time for open-source for a while...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants