Tentatively #[inline] Option::from #102434
Probably not gonna have much of an impact because into can't be inlined.
(rust-highfive has picked a reviewer for you, use r? to override) |
@SoniEx2 I'd be mildly surprised that this isn't already being inlined. Do you have some code in release mode where this isn't being inlined? Happy to put this through a perf run, but before doing so, do you have a godbolt link or similar showing that this isn't being inlined? |
we do not. we're just throwing stuff at the wall and seeing what sticks. tho we were surprised to find it not marked |
Usually LLVM manages to figure out inlining for tiny functions like this on its own; I'd recommend trying some experiments with godbolt, typing code at rustc and seeing what the assembly looks like, and if you see a call to a tiny stub function that should be inlined but isn't, that's a good place to add |
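A minimal sketch of the kind of experiment suggested above, for pasting into godbolt. The function names `wrap_from` and `wrap_into` are hypothetical, and the flag (`-C opt-level=3` as a stand-in for release mode) is an assumption about the setup, not something stated in this thread.

```rust
// Hypothetical probe functions; the names are illustrative only.
// Compile with something like `-C opt-level=3` and read the assembly.

/// Calls `From<T> for Option<T>` directly.
pub fn wrap_from(x: u32) -> Option<u32> {
    Option::from(x)
}

/// Reaches the same impl through the blanket `Into` impl.
pub fn wrap_into(x: u32) -> Option<u32> {
    x.into()
}
```

If the assembly for either function contains a call to a separate `from` or `into` symbol rather than just building the `Some` value in place, that would be evidence the stub is not being inlined in that configuration.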
Since this is a generic function, it gets duplicated into the codegen units of the caller anyways, making inlining possible. The inlinehint from it doesn't have too much of an impact usually and LLVM will just inline it. |
@Nilstrieb in that case why was this useful? |
I guess my question should have been phrased as "Why was that function not inlined in the first place (without an inline attribute), since this is also generic?". Apologies for the poor formulation. Is there a difference between blanket impls and generic impls (i.e. |
No, there is no difference between those. LLVM's inliner (and MIR inlining) just inline whatever their heuristics say should be inlined, which is often correct, but not always. Functions being generic just allows inlining in the first place (because the function implementation has to be copied into the user codegen unit); whether it actually happens is up to the inliner. |
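To make the point concrete, here is a rough sketch using a hypothetical stand-in type (`Wrapped<T>` and `wrap` are made up for illustration; this is not the standard library source). It only shows where the attribute would sit on a generic `From` impl and how a caller reaches it through `into`.

```rust
/// Hypothetical stand-in for `Option<T>`, used only for illustration.
#[derive(Debug, PartialEq)]
pub struct Wrapped<T>(pub T);

impl<T> From<T> for Wrapped<T> {
    // `#[inline]` is a hint. Because the impl is generic, it is already
    // instantiated in the caller's codegen unit, so inlining is possible
    // either way; the inliner's heuristics make the final decision.
    #[inline]
    fn from(value: T) -> Self {
        Wrapped(value)
    }
}

/// Goes through the blanket `impl<T, U: From<T>> Into<U> for T`,
/// which in turn calls the `from` above.
pub fn wrap(x: u64) -> Wrapped<u64> {
    x.into()
}
```

Removing the attribute and comparing the generated code for `wrap` is one way to see whether the hint changes anything for a given target and optimization level.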
A lot of our generic methods don't end up getting inlined when optimizing for size. At a previous job I was doing a lot of embedded stuff, and at -Copt-level=z (even with -Zbuild-std) we'd frequently see dozens of copies of tiny generic functions. FWIW, this would only happen on certain targets (I guess the set of LLVM passes we run depends somewhat on the target? Since not all targets exhibited this issue), which makes the issue more annoying. Some (but not all) of these would be inlined anyway, but end up in the output regardless, but this is an LLVM bug (some discussion in #96624). This was never that big of a deal (cost was likely under a kilobyte of code total which didn't make a difference for us), so I never dug that deeply (also IMO ideally we wouldn't need to mark these with While I never saw Option::from, I suspect this is just because it doesn't get used much in that code, since this is absolutely the kind of thing we'd see copies of. |
Because this is a recurring problem, I've spent some time trying to understand just why Basically, if we ignore the opt-for-size case (where |
@bors try @rust-timer queue The queue is empty anyway, so it doesn't hurt to see what this does I suppose. |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit 73fb318 with merge 71bfd8ddf728cd51898f6b895569e3ab104ff9e1... |
☀️ Try build successful - checks-actions |
Queued 71bfd8ddf728cd51898f6b895569e3ab104ff9e1 with parent 744e397, future comparison URL. |
Finished benchmarking commit (71bfd8ddf728cd51898f6b895569e3ab104ff9e1): comparison URL.
Overall result: no relevant changes - no action needed
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
@bors rollup=never
Instruction count: This benchmark run did not return any relevant results for this metric.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment. |
The cycle and RSS results aren't real: the perf. run hit a period where the perf. machine has a different config, which isn't reflected in the master perf. results yet. Seeing as there are no instruction count improvements, I'd be inclined to close this. |
@SoniEx2
Maybe close it? Thank you. |
Probably not gonna have much of an impact because into can't be inlined. But let's try it?
(please perf run this whenever)