Auto merge of #41764 - scottmcm:faster-reverse, r=brson
Make [u8]::reverse() 5x faster

Since LLVM doesn't vectorize the loop for us, do unaligned reads of a larger type and use LLVM's bswap intrinsic to do the reversing of the actual bytes. cfg!-restricted to x86 and x86_64, as I assume it wouldn't help on things like ARMv5.

Also makes [u16]::reverse() a more modest 1.5x faster by loading/storing u32 and swapping the u16s with ROT16.

Thank you ptr::*_unaligned for making this easy :)

Benchmark results (from my i5-2500K):

```text
# Before
test slice::reverse_u8  ... bench: 273,836 ns/iter (+/- 15,592) =  3829 MB/s
test slice::reverse_u16 ... bench: 139,793 ns/iter (+/- 17,748) =  7500 MB/s
test slice::reverse_u32 ... bench:  74,997 ns/iter (+/-  5,130) = 13981 MB/s
test slice::reverse_u64 ... bench:  47,452 ns/iter (+/-  2,213) = 22097 MB/s

# After
test slice::reverse_u8  ... bench:  52,170 ns/iter (+/-  3,962) = 20099 MB/s
test slice::reverse_u16 ... bench:  93,330 ns/iter (+/-  4,412) = 11235 MB/s
test slice::reverse_u32 ... bench:  74,731 ns/iter (+/-  1,425) = 14031 MB/s
test slice::reverse_u64 ... bench:  47,556 ns/iter (+/-  3,025) = 22049 MB/s
```

If you're curious about the assembly, instead of doing this

```
movzx eax, byte ptr [rdi]
movzx ecx, byte ptr [rsi]
mov byte ptr [rdi], cl
mov byte ptr [rsi], al
```

it does this

```
mov rax, qword ptr [rdx]
mov rbx, qword ptr [r11 + rcx - 8]
bswap rbx
mov qword ptr [rdx], rbx
bswap rax
mov qword ptr [r11 + rcx - 8], rax
```
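For readers who want to see the idea in source form, here is a minimal sketch of the technique described above. It is not the code from this commit: the function name `reverse_bytes_chunked` is hypothetical, it is written against today's stable `std` rather than libcore internals, and it omits the `cfg!` restriction to x86 and x86_64 mentioned in the message. It reads 8 bytes from each end with `ptr::read_unaligned`, reverses the bytes inside each word with `u64::swap_bytes` (which typically lowers to `bswap` on x86_64), and writes each word to the opposite end.

```rust
use std::mem::size_of;
use std::ptr;

/// Reverses `s` in place by swapping 8-byte chunks taken from both ends.
/// Hypothetical helper for illustration only; not the function from the patch.
pub fn reverse_bytes_chunked(s: &mut [u8]) {
    const CHUNK: usize = size_of::<u64>();
    let len = s.len();
    let base = s.as_mut_ptr();
    let mut i = 0;

    // Process one chunk from the front and one from the back per iteration.
    // The condition guarantees the two chunks never overlap.
    while i + CHUNK <= len / 2 {
        // SAFETY: both chunks lie fully inside the slice (see the loop
        // condition), and the unaligned read/write functions do not
        // require `u64` alignment.
        unsafe {
            let front = base.add(i) as *mut u64;
            let back = base.add(len - i - CHUNK) as *mut u64;
            let a = ptr::read_unaligned(front);
            let b = ptr::read_unaligned(back);
            // Reversing the bytes inside each word and exchanging the two
            // words reverses the 16 bytes they cover as a unit.
            ptr::write_unaligned(front, b.swap_bytes());
            ptr::write_unaligned(back, a.swap_bytes());
        }
        i += CHUNK;
    }

    // Fewer than 2 * CHUNK bytes remain in the middle; swap them one by one.
    let (mut lo, mut hi) = (i, len - i);
    while lo + 1 < hi {
        hi -= 1;
        s.swap(lo, hi);
        lo += 1;
    }
}
```

Calling `reverse_bytes_chunked(&mut buf)` on a `Vec<u8>` should leave it in the same state as `buf.reverse()`. Because `swap_bytes` reverses the in-memory byte order regardless of the platform's endianness, the sketch does not depend on the target being little-endian; whether it is actually faster on a given target is a separate question that the benchmarks above answer only for the author's machine.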
Showing 4 changed files with 86 additions and 0 deletions.