(Big performance change) Do not run lints that cannot emit #125116
rustbot has assigned @michaelwoerister. Use |
These commits modify the If this was unintentional then you should revert the changes before this PR is merged. Some changes occurred in src/tools/clippy cc @rust-lang/clippy |
cc @nnethercote @Kobzol, the perf wizards. Could you please give this PR a look and tell me if there are any obvious performance issues on the filtering? |
@bors try @rust-timer queue |
…r=<try>
(Big performance change) Do not run lints that cannot emit
Before this change, adding a lint was difficult because every lint carried some overhead: all lints would run, no matter their default level or whether the user had `#![allow]`ed them. This PR changes that. The change improves both the Rust lint infrastructure and Clippy, but Clippy will see the most benefit, as it has about 900 registered lints (and growing!).
So yeah, with this little patch we filter all lints pre-linting, and remove any lint that is either:
- Manually `#![allow]`ed in the whole crate,
- Allowed in the command line, or
- Not manually enabled with `#[warn]` or similar, and its default level is `Allow`
I opened this PR to get some feedback, mainly on performance. We have lots of `Lock`s, `with_lock` and similar functions (also lots of cloning), so the filtering performance is not the best. In an older iteration, instead of doing this in the parsing phase, we developed a visitor with the same function but without so many locks. Would reverting to that approach help? I'm not sure tbh.
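The three filtering conditions above can be read as a single predicate over each registered lint. The following is a minimal sketch, not the PR's implementation: `LintInfo`, `can_emit`, `lints_to_run`, and the lint names are all hypothetical, and the model assumes the crate-level and command-line state has already been collected into plain booleans.

```rust
/// Lint levels, reduced to the three this sketch needs.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Level {
    Allow,
    Warn,
    Deny,
}

/// Hypothetical per-lint summary gathered before linting starts.
/// These fields are assumptions for illustration, not rustc types.
#[derive(Clone, Debug)]
pub struct LintInfo {
    pub name: &'static str,
    pub default_level: Level,
    /// The lint is `#![allow]`ed for the whole crate.
    pub allowed_in_crate: bool,
    /// The lint is allowed on the command line.
    pub allowed_on_cli: bool,
    /// The lint is enabled with `#[warn]` or similar somewhere in the crate.
    pub manually_enabled: bool,
}

/// Returns `false` for any lint matching one of the three conditions
/// listed in the description, i.e. a lint that cannot emit.
pub fn can_emit(lint: &LintInfo) -> bool {
    if lint.allowed_in_crate || lint.allowed_on_cli {
        return false;
    }
    // Allow-by-default lints only run when someone turned them on.
    !(lint.default_level == Level::Allow && !lint.manually_enabled)
}

/// Names of the lints that survive the pre-lint filter, in input order.
pub fn lints_to_run(lints: &[LintInfo]) -> Vec<&'static str> {
    lints.iter().filter(|l| can_emit(l)).map(|l| l.name).collect()
}

/// A small made-up lint set covering each condition once.
pub fn example_lints() -> Vec<LintInfo> {
    let lint = |name, default_level, crate_allow, cli_allow, enabled| LintInfo {
        name,
        default_level,
        allowed_in_crate: crate_allow,
        allowed_on_cli: cli_allow,
        manually_enabled: enabled,
    };
    vec![
        lint("warn_by_default", Level::Warn, false, false, false),
        lint("warn_but_crate_allowed", Level::Warn, true, false, false),
        lint("deny_but_cli_allowed", Level::Deny, false, true, false),
        lint("allow_by_default", Level::Allow, false, false, false),
        lint("allow_but_enabled", Level::Allow, false, false, true),
    ]
}
```

With the made-up set from `example_lints()`, only `warn_by_default` and `allow_but_enabled` pass the filter. The sketch deliberately ignores how the flags are computed, which is where the locking and cloning cost mentioned above would sit.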
@lqd haven't you tried something like this before? 🤔 |
☀️ Try build successful - checks-actions |
We've tried a few different things yes, and so has blyxyas -- it maybe wasn't exactly like this, but I encountered annoyances like: some slow const eval loadbearing lint that shouldn't be ignored, lints that would be allowed unexpectedly because cargo allows lints unconditionally on dependencies (arguably the most common usage, and where perf gains would show up AFAICT) but some may trigger FCWs or are required to lint on dependencies despite being allowed, et cetera. Refactoring and fixing all these were too costly compared to the gains at the time, as rustc's lints were fast enough on dependencies, also a "rarer" use-case. That being said, we've added and uplifted more lints since then, including possibly costly ones like the non local impls one, and the situation may also be different for clippy itself (but we won't see that in the perf.rlo results, only locally with the clippy dedicated commands IIUC) |
Finished benchmarking commit (cc1d40f): comparison URL.
Overall result: ❌ regressions - ACTION NEEDED
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): Results (primary 2.3%, secondary -0.5%). This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: Results (secondary 2.4%). This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 676.788s -> 676.098s (-0.10%) |
The benchmark doesn't check clippy, right? As lqd hinted at as well? And without splitting allow-by-default rustc lints it does nothing without clippy, so I think this just shows how much time it takes to filter them (can someone else confirm this :3c) Thus, basically nothing it seems :3 (So @blyxyas maybe the cloning is ok?) |
Yeah, the benchmarks currently don't check Clippy, that's why I'm currently benchmarking on a different server via SSH (a server that we got explicitly to benchmark Clippy). I'll post the results here when they arrive :) Also, it currently doesn't check builtin lints because I'm having some issues checking that. That's also part of why I decided to open the PR, maybe someone has some idea (I'll see if I can read the previous attempts by lqd, maybe I can learn something from them) EDIT: Seems like lqd hasn't pushed their attempts, I'll have to keep trying new approaches by myself. |
Okis, here are the results (Wall time, Clippy) Wall time❌ Max RSS❌ Instructions❌ Cycles[ +0.42%, +25.48%] |
Those wall times are proof that this optimization has a lot of potential. The main drawback is that the filtering / parsing code is not fast enough, so in some scenarios (I haven't been able to determine exactly what they have in common) the optimization goes backwards. But a ~70% improvement in wall time is great, and we should look more into it. |
I wouldn't draw too many conclusions from these results, they seem to be quite unstable (there is also a 190% walltime regression). Note that even for PRs that don't have large perf. impacts, we can see ~30% walltime swings even on the stable benchmarking server (https://perf.rust-lang.org/compare.html?start=9e7aff794539aa040362f4424eb29207449ffce0&end=44fa5fd39a1d2af41bd7f43bc246a5e4f6d94696&stat=wall-time&nonRelevant=true). |
I've changed the system, we're back to using visitors (I've benchmarked this new commit, it should have 0 regressions and about -0.66% improvement) @bors try @rust-timer queue |
☔ The latest upstream changes (presumably #131235) made this pull request unmergeable. Please resolve the merge conflicts. |
Do you intend to investigate https://github.com/rust-lang/rust/pull/125116/files#r1770598991 as a follow-up? Thanks for the work anyway! |
☀️ Test successful - checks-actions |
Finished benchmarking commit (4d88de2): comparison URL.
Overall result: no relevant changes - no action needed
@rustbot label: -perf-regression
Instruction count: This benchmark run did not return any relevant results for this metric.
Max RSS (memory usage): Results (secondary -4.7%). This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This benchmark run did not return any relevant results for this metric.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 781.782s -> 783.187s (0.18%) |
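The bootstrap percentages in the two benchmark reports are plain relative changes, (after - before) / before. A small sketch to reproduce them, where `percent_change` and `format_change` are hypothetical helper names and not part of the perf tooling:

```rust
/// Relative change from `before` to `after`, in percent.
/// Negative values mean the second measurement was faster.
pub fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

/// Formats the change with two decimals, matching the precision
/// used in the bootstrap lines of the reports.
pub fn format_change(before: f64, after: f64) -> String {
    format!("{:.2}%", percent_change(before, after))
}
```

Applied to the figures in this thread, `format_change(676.788, 676.098)` yields `-0.10%` for the first try build and `format_change(781.782, 783.187)` yields `0.18%` for the merged commit. Both are well inside the wall-time swings discussed earlier in the conversation.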
Before this change, adding a lint was a difficult matter because it always had some overhead involved. This was because all lints would run, no matter their default level, or if the user had `#![allow]`ed them. This PR changes that. This change would improve both the Rust lint infrastructure and Clippy, but Clippy will see the most benefit, as it has about 900 registered lints (and growing!)
So yeah, with this little patch we filter all lints pre-linting, and remove any lint that is either:
- Manually `#![allow]`ed in the whole crate,
- Allowed in the command line, or
- Not manually enabled with `#[warn]` or similar, and its default level is `Allow`
As some lints need to run, this PR also adds loadbearing lints. On a lint declaration, you can use the `@eval_always = true` marker to label it as loadbearing. A loadbearing lint will never be filtered (it will always run).
Fixes #106983
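The interaction between the filter and loadbearing lints can be sketched as an override on top of the filter result. This is an illustrative model only: the `Lint` struct, its `eval_always` field (named after the marker in the description), and the `should_run` and `split_lints` helpers are assumptions, not the data structures used in rustc.

```rust
/// Hypothetical model of a registered lint after level resolution.
#[derive(Clone, Debug)]
pub struct Lint {
    pub name: &'static str,
    /// Result of the pre-lint filter: can this lint emit in this crate?
    pub can_emit: bool,
    /// Stands in for a declaration marked `@eval_always = true`.
    pub eval_always: bool,
}

/// A loadbearing lint always runs; any other lint runs only if it can emit.
pub fn should_run(lint: &Lint) -> bool {
    lint.eval_always || lint.can_emit
}

/// Splits the registered lints into (lints to run, lints filtered out).
pub fn split_lints(lints: Vec<Lint>) -> (Vec<Lint>, Vec<Lint>) {
    lints.into_iter().partition(should_run)
}

/// Made-up lints covering the interesting combinations.
pub fn example_lints() -> Vec<Lint> {
    vec![
        Lint { name: "ordinary_enabled", can_emit: true, eval_always: false },
        Lint { name: "ordinary_allowed", can_emit: false, eval_always: false },
        Lint { name: "loadbearing_allowed", can_emit: false, eval_always: true },
    ]
}
```

In this model `loadbearing_allowed` is kept even though it cannot emit, which matches the stated rule that a loadbearing lint is never filtered; only `ordinary_allowed` is dropped.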