Ignore files in .gitignore in tidy #106440

jyn514 · 2023-01-04T04:44:03Z

Switch from walkdir to ignore. This required various changes to make skip thread-safe.
Ignore build anywhere in the source tree, not just at the top-level. We support this in bootstrap, we should support it in tidy too.

As a nice side benefit, this also makes tidy a bit faster.

Before:

; hyperfine -i '"/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "/home/gh-jyn514/rust2" "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0/bin/cargo" "/home/gh-jyn514/rust2/build" "32"'
Benchmark 1: "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "/home/gh-jyn514/rust2" "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0/bin/cargo" "/home/gh-jyn514/rust2/build" "32"
  Time (mean ± σ):      1.080 s ±  0.008 s    [User: 2.616 s, System: 3.243 s]
  Range (min … max):    1.069 s …  1.099 s    10 runs

After:

; hyperfine '"/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "/home/gh-jyn514/rust2" "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0/bin/cargo" "/home/gh-jyn514/rust2/build" "32"'
Benchmark 1: "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "/home/gh-jyn514/rust2" "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0/bin/cargo" "/home/gh-jyn514/rust2/build" "32"
  Time (mean ± σ):     705.0 ms ±   1.4 ms    [User: 3179.1 ms, System: 1517.5 ms]
  Range (min … max):   702.3 ms … 706.9 ms    10 runs

r? @the8472

jyn514 · 2023-01-04T04:47:35Z

I originally tried making all these checks parallel before realizing that you'd already done that at the top level 🤦 just not at the file level which is the thing I did. If you want the code it's here: https://github.com/rust-lang/rust/compare/master...jyn514:rust:parallel-tidy?expand=1)

but not sure how useful it is given that walk is within the checks and not at the top-level, seems pretty hard to be able to intersperse the checks with each other.

the8472 · 2023-01-04T05:37:19Z

Switching to file level parallelism may be better on many-core systems because doing it at the top level is fairly coarse and the longest-running tasks (style.rs was the slowest last time I checked) end up dominating wall-time.
I did the top-level parallelization because it was good enough back then, but tidy has grown since.

I'll do a comparison of this branch vs. the file-parallel one on my machine.

jyn514 · 2023-01-04T05:45:08Z

oh sorry I forgot to mention, it saved nearly no time at all :/ first image is with this PR and second image is with file-level parallelism.

the8472 · 2023-01-04T06:40:12Z

Yes, it appears so. But it's not saturating my CPU, so there's still some bottleneck.

# master
$ ./x test tidy && hyperfine -w 5 '"build/x86_64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "." "build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "build" "48"'
Benchmark #1: "build/x86_64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "." "build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "build" "48"
  Time (mean ± σ):     612.7 ms ±  20.5 ms    [User: 2.189 s, System: 1.378 s]
  Range (min … max):   572.2 ms … 643.1 ms    10 runs


# this PR
Benchmark #1: "build/x86_64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "." "build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "build" "48"
  Time (mean ± σ):     607.8 ms ±  32.2 ms    [User: 2.635 s, System: 1.411 s]
  Range (min … max):   559.6 ms … 658.3 ms    10 runs


# file-parallel branch
Benchmark #1: "build/x86_64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "." "build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "build" "48"
  Time (mean ± σ):     635.7 ms ±  24.6 ms    [User: 3.046 s, System: 1.639 s]
  Range (min … max):   597.7 ms … 687.2 ms    10 runs

Thread timeline of this PR, you can see it's waiting on some cargo calls from tidy, the other checks finish much faster:

Same of the parallel version, more threads for the tidy work but it's waiting for cargo anyway:

If those cargo calls could be parallelized and sped up then the file-parallel version might end up faster, especially if the work could be spread more evenly across the threads (ignore maybe lacks workstealing?).

jyn514 · 2023-01-04T16:48:39Z

Oh, interesting - I think all the cargo calls are in src/tools/tidy/src/deps.rs, which is why they're serialized - I think it shouldn't be too hard to spawn them all in parallel and then wait on each :)

I don't want to block this PR on that change though, I think it's already a good improvement even without the file-level parallelism.

src/tools/tidy/src/walk.rs

- Switch from `walkdir` to `ignore`. This required various changes to make `skip` thread-safe. - Ignore `build` anywhere in the source tree, not just at the top-level. We support this in bootstrap, we should support it in tidy too. As a nice side benefit, this also makes tidy a bit faster. Before: ``` ; hyperfine -i '"/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "/home/gh-jyn514/rust2" "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0/bin/cargo" "/home/gh-jyn514/rust2/build" "32"' Benchmark 1: "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "/home/gh-jyn514/rust2" "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0/bin/cargo" "/home/gh-jyn514/rust2/build" "32" Time (mean ± σ): 1.080 s ± 0.008 s [User: 2.616 s, System: 3.243 s] Range (min … max): 1.069 s … 1.099 s 10 runs ``` After: ``` ; hyperfine '"/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "/home/gh-jyn514/rust2" "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0/bin/cargo" "/home/gh-jyn514/rust2/build" "32"' Benchmark 1: "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0-tools-bin/rust-tidy" "/home/gh-jyn514/rust2" "/home/gh-jyn514/rust2/build/aarch64-unknown-linux-gnu/stage0/bin/cargo" "/home/gh-jyn514/rust2/build" "32" Time (mean ± σ): 705.0 ms ± 1.4 ms [User: 3179.1 ms, System: 1517.5 ms] Range (min … max): 702.3 ms … 706.9 ms 10 runs ```

`WalkBuilder` handles top-level paths differently than `fn walk` used to: it doesn't run the `skip` function to determine if it should be skipped, instead assuming the top-level function is always included. This is a reasonable assumption; adapt our code so it doesn't make pointless calls to `walk`.

the8472 · 2023-03-05T12:07:00Z

Is walkdir still used anywhere else in tidy? Otherwise it could be removed from its Cargo.toml

jyn514 · 2023-03-05T12:09:52Z

It's used in mir_opt_tests:

rust/src/tools/tidy/src/mir_opt_tests.rs

Line 9 in 40ba0e8

let files = walkdir::WalkDir::new(&path.join("mir-opt")).into_iter();

the8472 · 2023-03-05T12:21:30Z

Would replacing it there reduce build times? If not r=me.

jyn514 · 2023-03-05T12:29:32Z

I don't think so, the only files there are .rs, .mir, .diff, and .html files, and all of them are looked at (intentionally).

@bors r=the8472 rollup

bors · 2023-03-05T12:29:34Z

📌 Commit 98334f6 has been approved by the8472

It is now in the queue for this repository.

klensy · 2023-03-05T12:48:40Z

src/tools/tidy/src/walk.rs

-        if t!(File::open(entry.path()), entry.path()).read_to_string(&mut contents).is_err() {
-            contents.clear();
-        }
-        f(&entry, &contents);
+        let mut file = t!(File::open(entry.path()), entry.path());
+        t!(file.read_to_end(&mut contents), entry.path());
+        let contents_str = match std::str::from_utf8(&contents) {
+            Ok(s) => s,
+            Err(_) => return, // skip this file
+        };
+        f(&entry, &contents_str);


Why replacingread_to_string with read_to_end and from_utf8?

The behavior on error is different. Before it would call f on an empty file, now it skips calling f altogether (and gives a hard error if entry.path() is missing).

…iaskrgr Rollup of 7 pull requests Successful merges: - rust-lang#106440 (Ignore files in .gitignore in tidy) - rust-lang#108613 (Remove `llvm.skip-rebuild` option) - rust-lang#108616 (Sync codegen defaults with compiler defaults and add a ping message so they stay in sync) - rust-lang#108618 (Rename `src/etc/vscode_settings.json` to `rust_analyzer_settings.json`) - rust-lang#108626 (rustdoc-json: switch from HashMap to FxHashMap to fix non-determinism) - rust-lang#108744 (Don't ICE when encountering bound var in builtin copy/clone bounds) - rust-lang#108749 (Clean up rustdoc-js tester.js file) Failed merges: r? `@ghost` `@rustbot` modify labels: rollup

Speed up tidy quite a lot I highly recommend reviewing this commit-by-commit. Based on rust-lang#106440 for convenience. ## Timings These were collected by running `x test tidy -v` to copy paste the command, then using [`samply record`](https://github.com/mstange/samply). before (8 threads) ![image](https://user-images.githubusercontent.com/23638587/222965319-352ad2c8-367c-4d74-960a-e4bb161a6aff.png) after (8 threads) ![image](https://user-images.githubusercontent.com/23638587/222965323-fa846f4e-727a-4bf8-8e3b-1b7b40505cc3.png) before (64 threads) ![image](https://user-images.githubusercontent.com/23638587/222965302-dc88020c-19e9-49d9-a87d-cad054d717f3.png) after (64 threads) ![image](https://user-images.githubusercontent.com/23638587/222965335-e73d7622-59de-41d2-9cc4-1bd67042a349.png) The last commit makes tidy use more threads, so comparing "before (8 threads)" to "after (64 threads)" is IMO the most realistic comparison. Locally, that brings the time for me to run tidy down from 4 to .9 seconds, i.e. the majority of the time for `x test tidy` is now spend running `fmt --check`. r? `@the8472`

rustbot assigned the8472 Jan 4, 2023

rustbot added A-testsuite Area: The testsuite used to check the correctness of rustc S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 4, 2023

rust-cloud-vms bot force-pushed the tidy branch from d998000 to ba6cb33 Compare January 4, 2023 05:24

the8472 reviewed Jan 5, 2023

View reviewed changes

src/tools/tidy/src/walk.rs Outdated Show resolved Hide resolved

jyn514 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 5, 2023

This comment was marked as resolved.

Sign in to view

jyn514 added 3 commits March 5, 2023 05:44

Reuse allocations between files

97cffd5

jyn514 force-pushed the tidy branch from ba6cb33 to 98334f6 Compare March 5, 2023 11:53

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 5, 2023

klensy reviewed Mar 5, 2023

View reviewed changes

matthiaskrgr mentioned this pull request Mar 5, 2023

Rollup of 7 pull requests #108771

Merged

jyn514 mentioned this pull request Mar 5, 2023

Speed up tidy quite a lot #108772

Merged

bors merged commit 2f8bf34 into rust-lang:master Mar 5, 2023

rustbot added this to the 1.70.0 milestone Mar 5, 2023

joboet mentioned this pull request Mar 18, 2023

Ignore files in .gitignore in mir opt check #109323

Merged

davidlattimore mentioned this pull request Mar 19, 2023

compiler/rustc_mir_build/src/build ignored by .gitignore #109340

Closed

jyn514 deleted the tidy branch May 20, 2023 22:33

jyn514 mentioned this pull request May 20, 2023

tidy should not traverse untracked directories #69291

Open

Zalathar mentioned this pull request Nov 24, 2024

Add a tidy rule forbidding modules named build #133404

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Ignore files in .gitignore in tidy #106440

Ignore files in .gitignore in tidy #106440

Uh oh!

jyn514 commented Jan 4, 2023

Uh oh!

jyn514 commented Jan 4, 2023

Uh oh!

the8472 commented Jan 4, 2023

Uh oh!

jyn514 commented Jan 4, 2023

Uh oh!

the8472 commented Jan 4, 2023 •

edited

Loading

Uh oh!

jyn514 commented Jan 4, 2023

Uh oh!

Uh oh!

This comment was marked as resolved.

the8472 commented Mar 5, 2023

Uh oh!

jyn514 commented Mar 5, 2023

Uh oh!

the8472 commented Mar 5, 2023

Uh oh!

jyn514 commented Mar 5, 2023

Uh oh!

bors commented Mar 5, 2023

Uh oh!

klensy Mar 5, 2023

Uh oh!

jyn514 Mar 5, 2023 •

edited

Loading

Uh oh!

Uh oh!

Ignore files in .gitignore in tidy #106440

Ignore files in .gitignore in tidy #106440

Uh oh!

Conversation

jyn514 commented Jan 4, 2023

Uh oh!

jyn514 commented Jan 4, 2023

Uh oh!

the8472 commented Jan 4, 2023

Uh oh!

jyn514 commented Jan 4, 2023

Uh oh!

the8472 commented Jan 4, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jyn514 commented Jan 4, 2023

Uh oh!

Uh oh!

This comment was marked as resolved.

the8472 commented Mar 5, 2023

Uh oh!

jyn514 commented Mar 5, 2023

Uh oh!

the8472 commented Mar 5, 2023

Uh oh!

jyn514 commented Mar 5, 2023

Uh oh!

bors commented Mar 5, 2023

Uh oh!

klensy Mar 5, 2023

Choose a reason for hiding this comment

Uh oh!

jyn514 Mar 5, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

the8472 commented Jan 4, 2023 •

edited

Loading

jyn514 Mar 5, 2023 •

edited

Loading