-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
More aggressive inlining in the fast_reject
code.
#137760
Conversation
This code is very hot in a couple of benchmarks.
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
More aggressive inlining in the `fast_reject` code. This code is very hot in a couple of benchmarks. r? `@ghost`
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (ecb5e9e): comparison URL. Overall result: ✅ improvements - no action neededBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. @bors rollup=never Instruction countThis is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.
Max RSS (memory usage)This benchmark run did not return any relevant results for this metric. CyclesResults (primary -2.2%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 770.531s -> 769.678s (-0.11%) |
Hmm, locally I got ~5% icount improvements on |
ty::Adt(rhs_def, rhs_args) => { | ||
lhs_def == rhs_def && self.args_may_unify_inner(lhs_args, rhs_args, depth) | ||
// This call site can be hot. | ||
lhs_def == rhs_def | ||
&& self.inlined_args_may_unify_inner(lhs_args, rhs_args, depth) | ||
} | ||
_ => false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the perf impact of using
ty::Adt(lhs_def, lhs_args) => match rhs.kind() {
ty::Adt(rhs_def, rhs_args) if lhs_def == rhs_def => if lhs_args.len() == 1 {
// This code is very hot and a lot of generic ADTs have
// just a single generic argument.
self.inlined_arg_may_unify(lhs_args.as_slice()[0], rhs_args.as_slice()[0], depth)
} else
self.args_may_unify_inner(lhs_args, rhs_args, depth)
},
_ => false,
},
and not adding inlined_args_may_unify_inner
?
I don't like call-sites using inlined_args_may_unify
if we can avoid it/it doesn't have a meaningful perf impact to not do so 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This call is hot, but the two calls to inlined_args_may_unify
are also hot. So only having the len==1 optimization at this call site would reduce its impact.
I think this isn't worth persisting with. The improvements are smaller than I'd hoped, and they only affect a small fraction of real programs. |
This code is very hot in a couple of benchmarks.
r? @lcnr