Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore files in .gitignore in tidy #106440

Merged
merged 3 commits into from
Mar 5, 2023
Merged

Ignore files in .gitignore in tidy #106440

merged 3 commits into from
Mar 5, 2023

Conversation

jyn514
Copy link
Member

@jyn514 jyn514 commented Jan 4, 2023

  • Switch from walkdir to ignore. This required various changes to make skip thread-safe.
  • Ignore build anywhere in the source tree, not just at the top-level. We support this in bootstrap, we should support it in tidy too.

As a nice side benefit, this also makes tidy a bit faster.

Before:

; hyperfine -i '"/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "/home/gh-jyn514/rust2" "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0/bin/cargo" "/home/gh-jyn514/rust2/build" "32"'
Benchmark 1: "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "/home/gh-jyn514/rust2" "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0/bin/cargo" "/home/gh-jyn514/rust2/build" "32"
  Time (mean ± σ):      1.080 s ±  0.008 s    [User: 2.616 s, System: 3.243 s]
  Range (min … max):    1.069 s …  1.099 s    10 runs

After:

; hyperfine '"/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "/home/gh-jyn514/rust2" "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0/bin/cargo" "/home/gh-jyn514/rust2/build" "32"'
Benchmark 1: "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "/home/gh-jyn514/rust2" "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0/bin/cargo" "/home/gh-jyn514/rust2/build" "32"
  Time (mean ± σ):     705.0 ms ±   1.4 ms    [User: 3179.1 ms, System: 1517.5 ms]
  Range (min … max):   702.3 ms … 706.9 ms    10 runs

r? @the8472

@rustbot rustbot added A-testsuite Area: The testsuite used to check the correctness of rustc S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 4, 2023
@jyn514
Copy link
Member Author

jyn514 commented Jan 4, 2023

I originally tried making all these checks parallel before realizing that you'd already done that at the top level 🤦 just not at the file level which is the thing I did. If you want the code it's here: https://github.com/rust-lang/rust/compare/master...jyn514:rust:parallel-tidy?expand=1)

but not sure how useful it is given that walk is within the checks and not at the top-level, seems pretty hard to be able to intersperse the checks with each other.

@the8472
Copy link
Member

the8472 commented Jan 4, 2023

Switching to file level parallelism may be better on many-core systems because doing it at the top level is fairly coarse and the longest-running tasks (style.rs was the slowest last time I checked) end up dominating wall-time.
I did the top-level parallelization because it was good enough back then, but tidy has grown since.

I'll do a comparison of this branch vs. the file-parallel one on my machine.

@jyn514
Copy link
Member Author

jyn514 commented Jan 4, 2023

oh sorry I forgot to mention, it saved nearly no time at all :/ first image is with this PR and second image is with file-level parallelism.
image

image

@the8472
Copy link
Member

the8472 commented Jan 4, 2023

Yes, it appears so. But it's not saturating my CPU, so there's still some bottleneck.

# master
$ ./x test tidy && hyperfine -w 5 '"build/x86_64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "." "build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "build" "48"'
Benchmark #1: "build/x86_64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "." "build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "build" "48"
  Time (mean ± σ):     612.7 ms ±  20.5 ms    [User: 2.189 s, System: 1.378 s]
  Range (min … max):   572.2 ms … 643.1 ms    10 runs


# this PR
Benchmark #1: "build/x86_64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "." "build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "build" "48"
  Time (mean ± σ):     607.8 ms ±  32.2 ms    [User: 2.635 s, System: 1.411 s]
  Range (min … max):   559.6 ms … 658.3 ms    10 runs


# file-parallel branch
Benchmark #1: "build/x86_64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "." "build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "build" "48"
  Time (mean ± σ):     635.7 ms ±  24.6 ms    [User: 3.046 s, System: 1.639 s]
  Range (min … max):   597.7 ms … 687.2 ms    10 runs

Thread timeline of this PR, you can see it's waiting on some cargo calls from tidy, the other checks finish much faster:

image

Same of the parallel version, more threads for the tidy work but it's waiting for cargo anyway:

image

If those cargo calls could be parallelized and sped up then the file-parallel version might end up faster, especially if the work could be spread more evenly across the threads (ignore maybe lacks workstealing?).

@jyn514
Copy link
Member Author

jyn514 commented Jan 4, 2023

Oh, interesting - I think all the cargo calls are in src/tools/tidy/src/deps.rs, which is why they're serialized - I think it shouldn't be too hard to spawn them all in parallel and then wait on each :)

I don't want to block this PR on that change though, I think it's already a good improvement even without the file-level parallelism.

src/tools/tidy/src/walk.rs Outdated Show resolved Hide resolved
@jyn514 jyn514 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 5, 2023
@bors

This comment was marked as resolved.

- Switch from `walkdir` to `ignore`. This required various changes to
  make `skip` thread-safe.
- Ignore `build` anywhere in the source tree, not just at the top-level.
  We support this in bootstrap, we should support it in tidy too.

As a nice side benefit, this also makes tidy a bit faster.

Before:
```
; hyperfine -i '"/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "/home/gh-jyn514/rust2" "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0/bin/cargo" "/home/gh-jyn514/rust2/build" "32"'
Benchmark 1: "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "/home/gh-jyn514/rust2" "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0/bin/cargo" "/home/gh-jyn514/rust2/build" "32"
  Time (mean ± σ):      1.080 s ±  0.008 s    [User: 2.616 s, System: 3.243 s]
  Range (min … max):    1.069 s …  1.099 s    10 runs
```

After:
```
; hyperfine '"/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "/home/gh-jyn514/rust2" "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0/bin/cargo" "/home/gh-jyn514/rust2/build" "32"'
Benchmark 1: "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "/home/gh-jyn514/rust2" "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0/bin/cargo" "/home/gh-jyn514/rust2/build" "32"
  Time (mean ± σ):     705.0 ms ±   1.4 ms    [User: 3179.1 ms, System: 1517.5 ms]
  Range (min … max):   702.3 ms … 706.9 ms    10 runs
```
`WalkBuilder` handles top-level paths differently than `fn walk` used
to: it doesn't run the `skip` function to determine if it should be
skipped, instead assuming the top-level function is always included.

This is a reasonable assumption; adapt our code so it doesn't make
pointless calls to `walk`.
@the8472
Copy link
Member

the8472 commented Mar 5, 2023

Is walkdir still used anywhere else in tidy? Otherwise it could be removed from its Cargo.toml

@jyn514
Copy link
Member Author

jyn514 commented Mar 5, 2023

It's used in mir_opt_tests:

let files = walkdir::WalkDir::new(&path.join("mir-opt")).into_iter();

@the8472
Copy link
Member

the8472 commented Mar 5, 2023

Would replacing it there reduce build times? If not r=me.

@jyn514
Copy link
Member Author

jyn514 commented Mar 5, 2023

I don't think so, the only files there are .rs, .mir, .diff, and .html files, and all of them are looked at (intentionally).

@bors r=the8472 rollup

@bors
Copy link
Contributor

bors commented Mar 5, 2023

📌 Commit 98334f6 has been approved by the8472

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 5, 2023
Comment on lines -51 to +60
if t!(File::open(entry.path()), entry.path()).read_to_string(&mut contents).is_err() {
contents.clear();
}
f(&entry, &contents);
let mut file = t!(File::open(entry.path()), entry.path());
t!(file.read_to_end(&mut contents), entry.path());
let contents_str = match std::str::from_utf8(&contents) {
Ok(s) => s,
Err(_) => return, // skip this file
};
f(&entry, &contents_str);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why replacingread_to_string with read_to_end and from_utf8?

Copy link
Member Author

@jyn514 jyn514 Mar 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The behavior on error is different. Before it would call f on an empty file, now it skips calling f altogether (and gives a hard error if entry.path() is missing).

bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 5, 2023
…iaskrgr

Rollup of 7 pull requests

Successful merges:

 - rust-lang#106440 (Ignore files in .gitignore in tidy)
 - rust-lang#108613 (Remove `llvm.skip-rebuild` option)
 - rust-lang#108616 (Sync codegen defaults with compiler defaults and add a ping message so they stay in sync)
 - rust-lang#108618 (Rename `src/etc/vscode_settings.json` to `rust_analyzer_settings.json`)
 - rust-lang#108626 (rustdoc-json: switch from HashMap to FxHashMap to fix non-determinism)
 - rust-lang#108744 (Don't ICE when encountering bound var in builtin copy/clone bounds)
 - rust-lang#108749 (Clean up rustdoc-js tester.js file)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 2f8bf34 into rust-lang:master Mar 5, 2023
@rustbot rustbot added this to the 1.70.0 milestone Mar 5, 2023
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Mar 18, 2023
Speed up tidy quite a lot

I highly recommend reviewing this commit-by-commit. Based on rust-lang#106440 for convenience.

## Timings

These were collected by running `x test tidy -v` to copy paste the command, then using [`samply record`](https://github.com/mstange/samply).

before (8 threads)
![image](https://user-images.githubusercontent.com/23638587/222965319-352ad2c8-367c-4d74-960a-e4bb161a6aff.png)

after (8 threads) ![image](https://user-images.githubusercontent.com/23638587/222965323-fa846f4e-727a-4bf8-8e3b-1b7b40505cc3.png)

before (64 threads) ![image](https://user-images.githubusercontent.com/23638587/222965302-dc88020c-19e9-49d9-a87d-cad054d717f3.png)
after (64 threads) ![image](https://user-images.githubusercontent.com/23638587/222965335-e73d7622-59de-41d2-9cc4-1bd67042a349.png)

The last commit makes tidy use more threads, so comparing "before (8 threads)" to "after (64 threads)" is IMO the most realistic comparison. Locally, that brings the time for me to run tidy down from 4 to .9 seconds, i.e. the majority of the time for `x test tidy` is now spend running `fmt --check`.

r? `@the8472`
@jyn514 jyn514 deleted the tidy branch May 20, 2023 22:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-testsuite Area: The testsuite used to check the correctness of rustc S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants