-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
slice: #[inline] a couple iterator methods. #96711
Conversation
The one I care about and actually saw in the wild not getting inlined is clone(). We ended up doing a whole function call for something that just copies two pointers. I ended up marking as_slice / as_ref as well because make_slice is inline(always) itself, and is also the kind of think that can kill performance in hot loops if you expect it to get inlined. But happy to undo those.
Hey! It looks like you've submitted a new PR for the library teams! If this PR contains changes to any Examples of
|
(rust-highfive has picked a reviewer for you, use r? to override) |
Hm, I would have expected all of these methods to be generic and so automatically eligible for inlining. Do you have examples of code that doesn't see them inlined? Is this with incremental on perhaps? r? @nikic to check if we should be filing such bugs upstream or otherwise investigating. |
I hit this on a release Firefox build (which IIRC use I think I saw it in a non-incremental build but I can test this again if needed (not sure if that's your question). In a somewhat simplified example it does get inlined: https://rust.godbolt.org/z/eoo7Wcrfj, but the real code is much more complicated so... :( |
@nikic this is ready for review |
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit 93e587b with merge 4cb99581146765206ae6636357b2792e9bf77c12... |
☀️ Try build successful - checks-actions |
Queued 4cb99581146765206ae6636357b2792e9bf77c12 with parent f382c27, future comparison URL. |
Finished benchmarking commit (4cb99581146765206ae6636357b2792e9bf77c12): comparison URL. Overall result: ✅ improvements - no action neededBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. @bors rollup=never Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Footnotes |
☀️ Test successful - checks-actions |
Finished benchmarking commit (0265a3e): comparison URL. Overall result: ❌✅ regressions and improvements - ACTION NEEDEDNext Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Footnotes |
Perf results are more positive than negative, I think that's all that matters for this kind of change. The regressions are minor ones in secondary benchmarks. @rustbot label: +perf-regression-triaged |
The one I care about and actually saw in the wild not getting inlined is
clone(). We ended up doing a whole function call for something that just
copies two pointers.
I ended up marking as_slice / as_ref as well because make_slice is
inline(always) itself, and is also the kind of think that can kill
performance in hot loops if you expect it to get inlined. But happy to
undo those.