-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[perf] Disable inline_in_all_cgus when LTO is enabled #65052
Conversation
r? @estebank (rust_highfive has picked a reviewer for you, use r? to override) |
@bors try @rust-timer queue A perf run can't hurt, I guess. |
Awaiting bors try build completion |
[perf] Disable inline_in_all_cgus when LTO is enabled I tried to run lolbench with this flag altered; however lolbench either errored or got stuck, so I end up running only 3 of the benchmarks manually. One showed no changes, another one showed 20% perf improvement somehow, and another showed slight (1%) regression. So I'm thinking of benchmarking it with perf instead; if we still have ThinLTO on CI then we can measure stage2 compiler performance as a metric of how much it affects generated code.
☀️ Try build successful - checks-azure |
Queued 8b81ee4 with parent 0221e26, future comparison URL. |
Finished benchmarking try commit 8b81ee4, comparison URL. |
Everything within the margin of noise, is ThinLTO used for benchmarks? |
It should be, yes, for release builds by default I believe. |
@ishitatsuyuki |
@bjorn3 I am aware of that, and what I expected is that this change will indirectly affect stage2 compiler, which in turn can be benchmarked for performance difference. So far what I have heard is that ThinLTO is enabled for both compiler builds and perf, so everything being identical sounds really strange. Something more likely is that the code/flag being changed has no effect at all, so maybe I should ask who had implemented this originally - |
@@ -65,7 +65,7 @@ pub trait MonoItemExt<'tcx>: fmt::Debug { | |||
fn instantiation_mode(&self, tcx: TyCtxt<'tcx>) -> InstantiationMode { | |||
let inline_in_all_cgus = | |||
tcx.sess.opts.debugging_opts.inline_in_all_cgus.unwrap_or_else(|| { | |||
tcx.sess.opts.optimize != OptLevel::No | |||
tcx.sess.opts.optimize != OptLevel::No && if tcx.sess.lto() == Lto::No |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an if
after the &&
. Why does this code compile?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh good catch, and yeah I have no idea why this happened to compile. I'll push a fix later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strange. I can't reproduce on play.rust-lang.org.
Oh shit this file is dead code. The code is inside |
5da9200
to
3e0a02a
Compare
@Mark-Simulacrum Sorry, can I get another perf run? |
@bors try @rust-timer queue |
Awaiting bors try build completion |
@bors try |
1 similar comment
@bors try |
[perf] Disable inline_in_all_cgus when LTO is enabled I tried to run lolbench with this flag altered; however lolbench either errored or got stuck, so I end up running only 3 of the benchmarks manually. One showed no changes, another one showed 20% perf improvement somehow, and another showed slight (1%) regression. So I'm thinking of benchmarking it with perf instead; if we still have ThinLTO on CI then we can measure stage2 compiler performance as a metric of how much it affects generated code.
☀️ Try build successful - checks-azure |
Queued c5643b3 with parent 9e35a28, future comparison URL. |
Finished benchmarking try commit c5643b3, comparison URL. |
Checked this and it seems it’s mostly a big lose. Closing. |
I tried to run lolbench with this flag altered; however lolbench either errored or got stuck, so I end up running only 3 of the benchmarks manually. One showed no changes, another one showed 20% perf improvement somehow, and another showed slight (1%) regression.
So I'm thinking of benchmarking it with perf instead; if we still have ThinLTO on CI then we can measure stage2 compiler performance as a metric of how much it affects generated code.