-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add a deep fast_reject routine #97345
Conversation
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit 9fe410feb297c11ac09a5185e1fc80e544b6cc97 with merge 5dd29bd39b3743aaf3c0c2669be7c188fe375735... |
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit c2a5d73b7644a936194dc63896be12ca64d1894c with merge b1f8b07304a489ac1719a510512c8f14ac90de0f... |
@bors abort |
This comment has been minimized.
This comment has been minimized.
☀️ Try build successful - checks-actions |
Queued b1f8b07304a489ac1719a510512c8f14ac90de0f with parent acb5c16, future comparison URL. |
Finished benchmarking commit (b1f8b07304a489ac1719a510512c8f14ac90de0f): comparison url. Instruction count
Max RSS (memory usage)Results
CyclesResults
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. @bors rollup=never Footnotes |
ideally we extend that deep fast reject to also use it for coherence. That should probably give us some further speedups rust/compiler/rustc_trait_selection/src/traits/coherence.rs Lines 79 to 105 in b2eba05
For that we need a deep fast reject where both sides are inference vars, implementing that now |
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit df22932377f13307f936f9bc81ed70bef76bf5e5 with merge 3cd8dbec5da846625b1ccec2294b2656ad747c1a... |
@rust-timer queue df22932 |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
@rust-timer queue 3cd8dbec5da846625b1ccec2294b2656ad747c1a |
It has a type This Zulip thread has more details. |
ty::Adt(obl_def, obl_substs) => match k { | ||
&ty::Adt(impl_def, impl_substs) => { | ||
obl_def == impl_def | ||
&& iter::zip(obl_substs, impl_substs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess if obl_def == impl_def
then the substs are guaranteed to have the same length?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
} | ||
_ => false, | ||
}, | ||
ty::Slice(obl_ty) => matches!(k, &ty::Slice(impl_ty) if types_may_unify(obl_ty, impl_ty)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could use matches!
-with-an-if
in more of these cases... probably good to have consistency one way or the other. Since this is the only one at the moment, maybe a match
would be better here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my issue with matches is that it somewhat breaks formatting which is annoying.
Making this consistent is a bit annoying 😆 going to keep it this way for now '^^
// Ideally we would walk the existential predicates here or at least | ||
// compare their length. But considering that the relevant `Relate` impl | ||
// actually sorts and deduplicates these, that doesn't work. | ||
matches!(k, ty::Dynamic(impl_preds, ..) if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, this one is a matches!
too.
Great additional improvements on Even better, I measured with a trunk version of The changes mostly look good. I have a few nits above, I'll let you decide how to deal with them. But please squash together commits 1, 3, and 4 before merging. |
@bors delegate=lcnr |
✌️ @lcnr can now approve this pull request |
I measured some benchmarks that we know from this analysis have some similarities to
Instruction counts measured locally for
Woo! |
This comment has been minimized.
This comment has been minimized.
`match_impl` has two call sites. For one of them (within `rematch_impl`) the fast reject test isn't necessary, because any rejection would represent a compiler bug. This commit moves the fast reject test to the other `match_impl` call site, in `assemble_candidates_from_impls`. This lets us move the fast reject test outside the `probe` call in that function. This avoids the taking of useless snapshots when the fast reject test succeeds, which gives a performance win when compiling the `bitmaps` and `nalgebra` crates. Co-authored-by: name <n.nethercote@gmail.com>
@bors r=nnethercote |
📌 Commit bff7b51 has been approved by |
☀️ Test successful - checks-actions |
Finished benchmarking commit (4a99c5f): comparison url. Instruction count
Max RSS (memory usage)Results
CyclesResults
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. @rustbot label: -perf-regression Footnotes |
continues the work on #97136.
r? @nnethercote
Actually agree with you on the match structure 😆 let's see how that impacted perf 😅