-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove more PtrToPtr
casts in GVN
#126844
Conversation
Some changes occurred to MIR optimizations cc @rust-lang/wg-mir-opt |
This comment has been minimized.
This comment has been minimized.
Some changes occurred in coverage tests. cc @Zalathar |
This comment has been minimized.
This comment has been minimized.
9aac432
to
49d353b
Compare
This comment has been minimized.
This comment has been minimized.
2d8caf9
to
093ccad
Compare
This comment has been minimized.
This comment has been minimized.
093ccad
to
dd545e1
Compare
Thanks! This is as usual more code than I wish it took to implement MIR transforms. But also I never thought of adding more casting cleanups to GVN. I'll re-assess some of the MIR in I'm going to send this to the benchmark queue on the off-chance that this produces a huge amount of more work for GVN or some other mir-opt pass to do. But if I see anything other than "whoa that's a lot of time in inside MIR transforms" I'll approve this when the perf run comes back. @bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
Remove more `PtrToPtr` casts in GVN This addresses two things I noticed in MIR: 1. `NonNull::<T>::eq` does `(a as *mut T) == (b as *mut T)`, but it could just compare the `*const T`s, so this removes `PtrToPtr` casts that are on both sides of a pointer comparison, so long as they're not fat-to-thin casts. 2. `NonNull::<T>::addr` does `transmute::<_, usize>(p as *const ())`, but so long as `T: Thin` that cast doesn't do anything, and thus we can directly transmute the `*const T` instead. r? mir-opt
In theory, could MIR adopt something like Cranelift's isle to describe transformations? |
I don't think a project of that scale is worth it for MIR. I'd much rather see someone design an IR that's suited to implementing optimizations first, and co-design it with a DSL for writing optimizations. Not that I have the kind of time/patience/expertise to do any of those things. |
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
@tgross35 I'd love to have something like that. TBH the I do agree with @saethlin that it'd be nicer to have as SSA-primary IR on which to do this, though, rather than the GVN approximations to that. |
Finished benchmarking commit (71e9c33): comparison URL. Overall result: ❌✅ regressions and improvements - ACTION NEEDEDBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)Results (primary 1.3%, secondary 1.2%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResults (primary 0.4%, secondary -2.3%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeResults (primary -0.0%, secondary 0.1%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 693.553s -> 695.303s (0.25%) |
@bors r+ |
I wonder if we should also simplify |
☀️ Test successful - checks-actions |
@cjgillot That makes sense to me, though I guess we need to be careful about things like I would love to have a better way to write helpers for all of this, but haven't found a way yet. Because what you're describing here sounds very similar to set of things that the |
Finished benchmarking commit (d7c5937): comparison URL. Overall result: ❌✅ regressions and improvements - ACTION NEEDEDNext Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)Results (primary 1.9%, secondary -2.1%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResults (primary 1.1%, secondary -3.8%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeResults (primary -0.0%, secondary 0.1%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 694.171s -> 695.122s (0.14%) |
Lots of churn here. Fun stuff like ripgrep-opt-full being +2.33% cycles but -5.03% wall time :/ I think it's probably time to look more at exactly what we're MIR inlining, now that a bunch of PRs like this have shrunk things enough that they've started inlining more in MIR. Maybe we want a different strategy that's less purely threshold based but tees it up to take more advantage of LLVM's inlining after ours... |
What we really want is less total MIR post-mono. I wonder if we could start by doing some analysis of the MIR inliner decisions on codebases like ripgrep. I suspect some inlinings reduce total monomorphic MIR and some increase it. Some examples would be good. If all it comes down to is how much the caller is instantiated, that would be a bummer but maybe we could do a PGO sort of thing to the standard library MIR based on instantiations in the perf suite. |
Visiting for weekly rustc-perf triage
@rustbot label: +perf-regression-triaged |
This addresses two things I noticed in MIR:
NonNull::<T>::eq
does(a as *mut T) == (b as *mut T)
, but it could just compare the*const T
s, so this removesPtrToPtr
casts that are on both sides of a pointer comparison, so long as they're not fat-to-thin casts.NonNull::<T>::addr
doestransmute::<_, usize>(p as *const ())
, but so long asT: Thin
that cast doesn't do anything, and thus we can directly transmute the*const T
instead.r? mir-opt