Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hold the mutate exclusive lock when vendoring #12509

Merged
merged 1 commit into from
Dec 20, 2023

Conversation

Rustin170506
Copy link
Member

@Rustin170506 Rustin170506 commented Aug 16, 2023

What does this PR try to resolve?

close #12254

Held the mutate exclusive lock when vendoring. This would block all other processes which want to read or change the unpacked files.

See: #12509 (comment)

r? @weihanglo

@rustbot rustbot added Command-vendor S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 16, 2023
@Rustin170506 Rustin170506 marked this pull request as draft August 16, 2023 02:45
@ehuss
Copy link
Contributor

ehuss commented Aug 16, 2023

I assume this is related to #12254?

@weihanglo
Copy link
Member

Thanks and sorry I didn't speak it more clearly. I would expect to see acquiring a lock there first. If somebody can working on digging into the issue deeper, we can perhaps remove this piece of code when we feel safe.

Thinking a bit more on the integrity of cargo-vendor, re-unpacking source looks like a safer solution before we can find a way to resolve #9455. We never know if some other tools changes unpacked registry sources, which is already broken when you run cargo build today, unfortunately 😢.

@ehuss
Copy link
Contributor

ehuss commented Aug 16, 2023

I would expect to see acquiring a lock there first.

Just FYI, I don't think the existing locking would help here, since the problem is with the build process which does not hold a lock. I am currently reworking locking for the garbage collection support, which adds read locks to the build process. It might be usable for this situation.

I'm also curious, the other alternative from the original issue was to read from the .crate files instead of the src cache. I'm wondering how difficult that would be to support?

@Rustin170506
Copy link
Member Author

I'm also curious, the other alternative from the original issue was to read from the .crate files instead of the src cache. I'm wondering how difficult that would be to support?

Do you mean we need to unpack it again from .cargo/registry/cache/xxx/some-pkg.crate?

@ehuss
Copy link
Contributor

ehuss commented Aug 17, 2023

Do you mean we need to unpack it again from .cargo/registry/cache/xxx/some-pkg.crate?

Yea, to have cargo vendor directly read from the .crate and extract it into the vendor folder instead of reading from the src cache. I'm not sure what the API for that would look like, I'm just curious what kind of options there are or how difficult that would be to do.

@Rustin170506
Copy link
Member Author

I'm not sure what the API for that would look like, I'm just curious what kind of options there are or how difficult that would be to do.

I think it can be done. The tricky part is that we have to calculate the checksum for each file we want to copy. So we may need another unpack_package function with filters and checksum calculation.

But there is an easy way to do it, we can unpack it to a temp folder first and then execute the original logic. But I think this has a big impact on performance.

@Rustin170506
Copy link
Member Author

I think it can be done. The tricky part is that we have to calculate the checksum for each file we want to copy. So we may need another unpack_package function with filters and checksum calculation.

@ehuss Do you think this is a reasonable way to move forward? I am happy to give it a try.

@ehuss
Copy link
Contributor

ehuss commented Sep 23, 2023

I'm looking over the vendor code, and I think it looks like it would be quite challenging to directly extract from the .crate files. There would need to be two variants of cp_sources, one that reads from a source directory (like git dependencies) and one from .crate files. It also seems complex, since there are multiple workspaces and multiple sets of Sources (a potentially different set from each workspace). It could possibly build a union of all the sources, and then ask the appropriate Source to handle extracting into the target directory. That seems like it could be a pretty large change to me, though, and quite hard to avoid duplicating some of the code.

#12706 is introducing the cache locking support. That might be a much easier option, since it would be a roughly one line change to acquire an exclusive lock. It's still unpleasant, since this remove_dir_all stuff has a fairly significant performance penalty, and it would be nice to remove it.

Unfortunately I don't think I have the time to dig into exactly how to architect a design that would support extracting .crate files directly. Perhaps @weihanglo can help if you want to continue down that path? It would be nice (due to the significant performance improvement), but it seems like it would require a lot of changes.

@weihanglo
Copy link
Member

r? weihanglo

Would find a time looking into it and discuss with hi-rustin :)

Signed-off-by: hi-rustin <rustin.liu@gmail.com>
@Rustin170506 Rustin170506 marked this pull request as ready for review December 18, 2023 01:52
@Rustin170506 Rustin170506 changed the title Revert alexcrichton/cargo-vendor#131 Hold the mutate exclusive lock when vendoring Dec 18, 2023
@rustbot
Copy link
Collaborator

rustbot commented Dec 18, 2023

Could not assign reviewer from: weihanglo.
User(s) weihanglo are either the PR author, already assigned, or on vacation, and there are no other candidates.
Use r? to specify someone else to assign.

@Rustin170506
Copy link
Member Author

#12706 is introducing the cache locking support. That might be a much easier option, since it would be a roughly one line change to acquire an exclusive lock. It's still unpleasant, since this remove_dir_all stuff has a fairly significant performance penalty, and it would be nice to remove it.

I've added the exclusive lock now. But it is still very difficult to remove remove_dir_all stuff because we don't know the reason of #5956. I don't think it's safe to remove this code unless we force users not to change the unpacked file in the src directory.

@weihanglo @ehuss Should we move forward with this PR? Or do you have any other thoughts?

@ehuss
Copy link
Contributor

ehuss commented Dec 18, 2023

I'm fine with moving forward with this. I agree that it seems like removing remove_dir_all will be quite challenging.

Copy link
Member

@weihanglo weihanglo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works as expected. Thank you.

Still not pleasant, as now cargo hold the mutate exclusive lock for the entire duration of cargo vendor.

I wonder if there is a way to “downgrade” from MutateExclusive to DownloadExclusive/Shared so that cargo doesn't need to hold the lock that long.

@weihanglo
Copy link
Member

@bors r+

@bors
Copy link
Collaborator

bors commented Dec 20, 2023

📌 Commit 2e77dc1 has been approved by weihanglo

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 20, 2023
@bors
Copy link
Collaborator

bors commented Dec 20, 2023

⌛ Testing commit 2e77dc1 with merge 9a77459...

@bors
Copy link
Collaborator

bors commented Dec 20, 2023

☀️ Test successful - checks-actions
Approved by: weihanglo
Pushing 9a77459 to master...

@bors bors merged commit 9a77459 into rust-lang:master Dec 20, 2023
20 checks passed
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 22, 2023
Update cargo

10 commits in 1a2666ddd14cf0a255d4ddb61c63531c259a7b39..363a2d11320faf531f6aacd1ea067c6bc08343b9
2023-12-17 17:53:53 +0000 to 2023-12-22 03:12:42 +0000
- refactor: centralize git checkouts and db paths (rust-lang/cargo#13187)
- Bump to 0.78.0; update changelog (rust-lang/cargo#13192)
- refactor: custom error types for `cargo-util-schemas` (rust-lang/cargo#13186)
- chore(deps): update rust crate handlebars to `v4.5.0` (rust-lang/cargo#13168)
- Hold the mutate exclusive lock when vendoring (rust-lang/cargo#12509)
- refactor: clean up package metadata (rust-lang/cargo#13184)
- ci: check SemVer for cargo-util-schemas on CI (rust-lang/cargo#13185)
- refactor(schemas): Pull out as `cargo-util-schemas` (rust-lang/cargo#13178)
- chore(rustfix): rename Readme.md to README.md (rust-lang/cargo#13181)
- chore(rustfix): remove useless clippy rules and fix a typo (rust-lang/cargo#13182)

r? ghost
@rustbot rustbot added this to the 1.77.0 milestone Dec 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Command-vendor S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cargo vendor modifies the package cache without locking
5 participants