-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
parallelize x.py test tidy #81833
parallelize x.py test tidy #81833
Conversation
(rust-highfive has picked a reviewer for you, use r? to override) |
☔ The latest upstream changes (presumably #81768) made this pull request unmergeable. Please resolve the merge conflicts. |
11ae980
to
426a033
Compare
rebased and switched from rayon to crossbeam-utils. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ~5x win you list is nice, and it doesn't look like this adds too much complexity, so it's overall not too bad... I am inclined to merge, but would like to spend a bit of time thinking about it.
src/tools/tidy/src/bins.rs
Outdated
let mut temp_path = path.join("tidy-test-file"); | ||
// | ||
// We also append the thread ID to avoid threads trampling on each others files. | ||
let file_name = format!("t{}.tidy-test-file", std::thread::current().id().as_u64()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implies we're visiting the same path more than once, which seems surprising to me - why is this necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this isn't actually needed. I did this change before the one in filter_dirs
. I'll retest with only the filter_dirs
change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thread '' panicked at 'Deleted temp file: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/tools/tidy/src/bins.rs:43:46
Looks like adding the thread ID was necessary after all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not satisfied with just an error message - something is wrong with at least my mental model of the code, and that leads me to believe this may just be hiding some other bug. I would like to see an explanation for why the error is happening.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One possibility is that the first create fails and then the 2nd attempt uses the output dir which is shared between the passes.
rust/src/tools/tidy/src/bins.rs
Lines 36 to 38 in 3c10a88
match fs::File::create(&temp_path).or_else(|_| { | |
temp_path = output.join("tidy-test-file"); | |
fs::File::create(&temp_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we not parallelize this pass perhaps to avoid this? I'm not convinced by your explanation I guess and it feels subtle; I'd rather have slow correct code than wrong parallel code :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could, but I don't quite understand the concern. The tests are largely non-mutating. This is the only mutating part and the fix I did would be necessary for correctness anyway due to to the possibility mentioned above.
So what do you fear might be going wrong beyond this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The modified comment in the code doesn't suggest a correctness fix to me, so maybe I overlooked that - if there's a correctness fix here, I'm not yet seeing how we've avoided the problem by way of thread ID... does this problem occur before your PR? If so, then I don't see how adding thread ID helps, as previously it'd always be zero, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I was trying to say is that it is a necessary part of making these checks concurrent because the code path creating the files may create them in either the source or the output folder and the output folder is shared. So if two checks run concurrently then they might create the file twice and try to delete it twice which will fail if it's not a unique file name.
Your concern seems to be that it's not sufficient, but I don't see anything else that might trigger this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW (I guess we're moving it to a separate PR) I think I'd suggest we move this out of the parallel code or cache the results; they shouldn't change and doing it just once seems simpler than trying to make sure we don't clobber each other. It also helps us avoid accidentally getting different results in different threads, which could be quite confusing.
Also, one thing we'll want to do is respect -j when passed to x.py, for both tidy and fmt checking. Ideally we'd also utilize the jobserver crate to properly handle parallelism when running under a larger make, but that seems complicated. |
Bootstrap currently doesn't provide a jobserver, right? Only cargo does afaict. Getting the testsuite to run under a jobserver would be nice since it could unlock some additional room for parallelism - e.g. building stage 0 compiletest or rustdoc concurrent with the stage 1 compile - but that seems like a much bigger task overall. |
@rustbot label: +S-waiting-on-review -S-waiting-on-author |
f61f81b
to
feea5b1
Compare
This comment has been minimized.
This comment has been minimized.
feea5b1
to
a8c1351
Compare
Local measurement suggests that 70% of the tidy runtime comes from rustfmt, so it might be good to just skip the parallelization of the tidy part (avoiding complexity there), and retain just the rustfmt parallelism; ignore is already walking the tree in parallel today so there's not much added complexity there. Does that make sense to you? |
I'd prefer if the runtime were even lower, rather than higher. But if that's the only way to get this merged... what choice do I have? I would prefer to understand your concerns though. |
I'll split the tidy changes into a separate PR so they can be discussed on their own. |
Well, there is a cost to the parallelism being added to tidy - crossbeam is not the lightest of dependencies, and I'd expect a bug report down the line that we're not handling jobserver tokens right (when -j1 is passed to make, rather than x.py directly). Ultimately I think we can agree that there's a cost to the added parallelism; reduced but not eliminated by the safety guarantees Rust provides. I'm not sure that justifying that tradeoff for a few seconds of time is worth it, but it's also true that I don't really run tidy locally at all; others' workflows may involve calling it more often, and that may mean that they care more about this. It's hard for me to accept added code and complexity for something that at least for me is primarily a CI cost. The rustfmt parallelism is presumably where the majority of the wins this PR brings come from, and so I'm more willing to accept those - they're also in theory less impactful from a maintenance perspective, as they're entirely local (i.e., we just call rustfmt with more processes at once). I wouldn't be surprised if we could get similar wins without any parallelism, just batching ~15-20 files to rustfmt at a time, but maybe my calculus is off there. Finding a way to avoid --skip-children might be even more rewarding |
Yeah, I run it frequently when doing interactive git rebase. I was also thinking of using it as a commit hook but it's still too slow for that.
It's already batching (N = 8). I measured performance win from batch sizes in this discussion and larger batches provide barely any speedups. I also ran this under perf, most of the wins are from parallelism.
I can write this without crossbeam if you want, 1 thread per check, up to
Eh, this only runs for a few seconds, so it shouldn't screw with anyone's build system. I mean jobserver support for all of bootstrap is a great idea, but by no means trivial, at least if we want to fully utilize it e.g. by parallelizing build phases and ordering them before test phases. But I don't see it as a killer feature in this particular case. There are much bigger fish not obeying jobserver tokens, e.g. compiletest which runs for minutes. |
a8c1351
to
eeaed93
Compare
old: ``` real 0m11.123s user 0m14.495s sys 0m5.227s ``` new: ``` real 0m2.767s user 0m13.014s sys 0m1.691s ```
eeaed93
to
c071970
Compare
Updated timings with just the rustfmt changes
|
@bors r+ rollup Ok, thanks for splitting things out. tidy and rustfmt are pretty orthogonal anyway so it seems good to land them separately. I think I'm still interested in seeing the tidy changes - see my last comment on the one direct problem I wanted to remove before landing those changes. |
📌 Commit c071970 has been approved by |
… r=Mark-Simulacrum parallelize x.py test tidy Running tidy on individual commits when rewriting git history was somewhat of an annoyance, so I have parallelized it a bit. running `time ./x.py test tidy` with warm IO caches: old: ``` real 0m11.123s user 0m14.495s sys 0m5.227s ``` new: ``` real 0m1.834s user 0m13.545s sys 0m3.094s ``` There's further room for improvement (<0.9s should be feasible) but that would require bigger changes.
… r=Mark-Simulacrum parallelize x.py test tidy Running tidy on individual commits when rewriting git history was somewhat of an annoyance, so I have parallelized it a bit. running `time ./x.py test tidy` with warm IO caches: old: ``` real 0m11.123s user 0m14.495s sys 0m5.227s ``` new: ``` real 0m1.834s user 0m13.545s sys 0m3.094s ``` There's further room for improvement (<0.9s should be feasible) but that would require bigger changes.
Rollup of 11 pull requests Successful merges: - rust-lang#81300 (BTree: share panicky test code & test panic during clear, clone) - rust-lang#81706 (Document BinaryHeap unsafe functions) - rust-lang#81833 (parallelize x.py test tidy) - rust-lang#81966 (Add new `rustc` target for Arm64 machines that can target the iphonesimulator) - rust-lang#82154 (Update RELEASES.md 1.50 to include methods stabilized in rust-lang#79342) - rust-lang#82177 (Do not delete bootstrap.exe on Windows during clean) - rust-lang#82181 (Add check for ES5 in CI) - rust-lang#82229 (Add [A-diagnostics] bug report template) - rust-lang#82233 (try-back-block-type test: Use TryFromSliceError for From test) - rust-lang#82302 (Remove unsafe impl Send for CompletedTest & TestResult) - rust-lang#82349 (test: Print test name only once on timeout) Failed merges: r? `@ghost` `@rustbot` modify labels: rollup
…lacrum Parallelize tidy Split off from rust-lang#81833 While that PR brings wall time of `x.py test tidy` down to 0m2.847s adding this one on top should bring it down to 0m1.673s. r? `@Mark-Simulacrum` Previous concerns can be found at rust-lang#81833 (comment) and rust-lang#81833 (comment)
Running tidy on individual commits when rewriting git history was somewhat of an annoyance, so I have parallelized it a bit.
running
time ./x.py test tidy
with warm IO caches:old:
new:
There's further room for improvement (<0.9s should be feasible) but that would require bigger changes.