Optimizing the rest of bool's Ord implementation #114721
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @cuviper (or someone else) soon. Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (
I'm not sure whether the perf run will exercise this, but let's see:
⌛ Trying commit b75351e with merge 8104587bead96343e80021e6a1277aba1825b518...
FWIW, cg-gcc improves the assertion to a single branch:

```assembly
example::clamp:
        cmp dl, sil
        jb .L6
        or esi, edi
        mov eax, esi
        and eax, edx
        ret
example::clamp.cold:
.L6:
        ; panic code...
```
☀️ Try build successful - checks-actions
Finished benchmarking commit (8104587bead96343e80021e6a1277aba1825b518): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never

Instruction count: This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Cycles: This benchmark run did not return any relevant results for this metric.

Binary size: This benchmark run did not return any relevant results for this metric.

Bootstrap: 631.671s -> 631.905s (0.04%)
Ah, right, like #105840 found out, LLVM is really bad at understanding the subtraction-
This difference is also visible in C: https://godbolt.org/z/WcWvx1rYM

Anyway, the Rust codegen is still an improvement, and being neutral on rust-timer is fine, just apparently unused.

@bors r+ rollup
…r=cuviper

Optimizing the rest of bool's Ord implementation

After coming across issue rust-lang#66780, I realized that the other functions provided by Ord (`min`, `max`, and `clamp`) were similarly inefficient for bool. This change provides implementations for them in terms of boolean operators, resulting in much simpler assembly and faster code.

Fixes issue rust-lang#114653

[Comparison on Godbolt](https://rust.godbolt.org/z/5nb5P8e8j)

`max` assembly before:
```assembly
example::max:
        mov eax, edi
        mov ecx, eax
        neg cl
        mov edx, esi
        not dl
        cmp dl, cl
        cmove eax, esi
        ret
```

`max` assembly after:
```assembly
example::max:
        mov eax, edi
        or eax, esi
        ret
```

`clamp` assembly before:
```assembly
example::clamp:
        mov eax, esi
        sub al, dl
        inc al
        cmp al, 2
        jae .LBB1_1
        mov eax, edi
        sub al, sil
        movzx ecx, dil
        sub dil, dl
        cmp dil, 1
        movzx edx, dl
        cmovne edx, ecx
        cmp al, -1
        movzx eax, sil
        cmovne eax, edx
        ret
.LBB1_1:
        ; identical assert! code
```

`clamp` assembly after:
```assembly
example::clamp:
        test edx, edx
        jne .LBB1_2
        test sil, sil
        jne .LBB1_3
.LBB1_2:
        or dil, sil
        and dil, dl
        mov eax, edi
        ret
.LBB1_3:
        ; identical assert! code
```
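For readers who want to see the idea without reading assembly, here is a minimal sketch of what "in terms of boolean operators" could look like. It is not copied from the actual change: the free functions `bool_min`, `bool_max`, `bool_clamp`, and `agrees_with_ord` are hypothetical names used only for illustration. The `(v | lo) & hi` form is inferred from the `or`/`and` pair in the "after" assembly above. Since `false < true`, `min` behaves like AND and `max` like OR.

```rust
// Hypothetical helper names, for illustration only; the real change
// lives in the `Ord for bool` impl rather than in free functions.

/// With `false < true`, the smaller of two bools is their conjunction.
fn bool_min(a: bool, b: bool) -> bool {
    a & b
}

/// With `false < true`, the larger of two bools is their disjunction.
fn bool_max(a: bool, b: bool) -> bool {
    a | b
}

/// Clamp as "raise to at least `lo`, then lower to at most `hi`".
/// Like `Ord::clamp`, this panics when `lo > hi`.
fn bool_clamp(v: bool, lo: bool, hi: bool) -> bool {
    assert!(lo <= hi, "lo must not be greater than hi");
    (v | lo) & hi
}

/// Exhaustively compares the sketches with the `Ord` methods.
/// There are only 4 input pairs for min/max and 6 valid triples for clamp.
fn agrees_with_ord() -> bool {
    let vals = [false, true];
    for &a in &vals {
        for &b in &vals {
            if bool_min(a, b) != Ord::min(a, b) || bool_max(a, b) != Ord::max(a, b) {
                return false;
            }
            for &v in &vals {
                // Here `a` plays the role of `lo` and `b` the role of `hi`.
                if a <= b && bool_clamp(v, a, b) != Ord::clamp(v, a, b) {
                    return false;
                }
            }
        }
    }
    true
}
```

Calling `agrees_with_ord()` from a `main` function or a unit test checks every input combination, which is cheap here because the input space is so small.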
…iaskrgr

Rollup of 7 pull requests

Successful merges:

- rust-lang#114721 (Optimizing the rest of bool's Ord implementation)
- rust-lang#114746 (Don't add associated type bound for non-types)
- rust-lang#114779 (Add check before suggest removing parens)
- rust-lang#114859 (Add trait related queries to SMIR's rustc_internal)
- rust-lang#114861 (fix typo: affect -> effect)
- rust-lang#114867 ([nit] Fix a comment typo.)
- rust-lang#114871 (Update the link in the docs of `std::intrinsics`)

r? `@ghost`
`@rustbot` modify labels: rollup