Faster slice PartialOrd #28436
Merged
Commits on Sep 16, 2015
Specialize `PartialOrd` for totally ordered primitive types
Knowing the result of equality comparison can enable additional optimizations in LLVM. Additionally, this makes it obvious that `partial_cmp` on totally ordered types cannot return `None`.
Commit 1614173
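As a rough illustration of the idea in this commit message, the sketch below writes `partial_cmp` for a totally ordered type by testing equality first and then ordering, so every branch visibly returns `Some`. The `Meters` newtype is hypothetical and only stands in for a primitive integer type; this is not the code from the commit.

```rust
use std::cmp::Ordering;

// Hypothetical newtype standing in for a totally ordered primitive.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Meters(u32);

impl PartialOrd for Meters {
    fn partial_cmp(&self, other: &Meters) -> Option<Ordering> {
        // Check equality first, then ordering. No branch can produce `None`,
        // which is what "totally ordered" means for this type.
        if self.0 == other.0 {
            Some(Ordering::Equal)
        } else if self.0 < other.0 {
            Some(Ordering::Less)
        } else {
            Some(Ordering::Greater)
        }
    }
}
```

Whether the equality-first form actually produces better machine code depends on the compiler version and optimization level, so it is worth benchmarking before relying on it.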
Reusing the same idea as in rust-lang#26884, we can exploit the fact that the length of slices is known, hence we can use a counted loop instead of iterators, which means that we only need a single counter, instead of having to increment and check one pointer for each iterator.

Using the generic implementation of the boolean comparison operators (`lt`, `le`, `gt`, `ge`) provides further speedup for simple types. This happens because the loop scans elements checking for equality and dispatches to element comparison or length comparison depending on the result of the prefix comparison.

```
test u8_cmp ... bench: 14,043 ns/iter (+/- 1,732)
test u8_lt ... bench: 16,156 ns/iter (+/- 1,864)
test u8_partial_cmp ... bench: 16,250 ns/iter (+/- 2,608)
test u16_cmp ... bench: 15,764 ns/iter (+/- 1,420)
test u16_lt ... bench: 19,833 ns/iter (+/- 2,826)
test u16_partial_cmp ... bench: 19,811 ns/iter (+/- 2,240)
test u32_cmp ... bench: 15,792 ns/iter (+/- 3,409)
test u32_lt ... bench: 18,577 ns/iter (+/- 2,075)
test u32_partial_cmp ... bench: 18,603 ns/iter (+/- 5,666)
test u64_cmp ... bench: 16,337 ns/iter (+/- 2,511)
test u64_lt ... bench: 18,074 ns/iter (+/- 7,914)
test u64_partial_cmp ... bench: 17,909 ns/iter (+/- 1,105)
```

```
test u8_cmp ... bench: 6,511 ns/iter (+/- 982)
test u8_lt ... bench: 6,671 ns/iter (+/- 919)
test u8_partial_cmp ... bench: 7,118 ns/iter (+/- 1,623)
test u16_cmp ... bench: 6,689 ns/iter (+/- 921)
test u16_lt ... bench: 6,712 ns/iter (+/- 947)
test u16_partial_cmp ... bench: 6,725 ns/iter (+/- 780)
test u32_cmp ... bench: 7,704 ns/iter (+/- 1,294)
test u32_lt ... bench: 7,611 ns/iter (+/- 3,062)
test u32_partial_cmp ... bench: 7,640 ns/iter (+/- 1,149)
test u64_cmp ... bench: 7,517 ns/iter (+/- 2,164)
test u64_lt ... bench: 7,579 ns/iter (+/- 1,048)
test u64_partial_cmp ... bench: 7,629 ns/iter (+/- 1,195)
```
Commit d04b8b5
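The counted-loop approach described in this commit message can be sketched roughly as follows. The free functions `slice_cmp` and `slice_lt` are hypothetical names chosen for illustration; they are not the trait implementations from the pull request, only a minimal version of the same technique.

```rust
use std::cmp::{self, Ordering};

/// Lexicographic comparison with one loop counter instead of two iterators.
fn slice_cmp<T: Ord>(a: &[T], b: &[T]) -> Ordering {
    // Only the common prefix can contain a differing pair of elements.
    let l = cmp::min(a.len(), b.len());
    for i in 0..l {
        match a[i].cmp(&b[i]) {
            Ordering::Equal => (),
            non_eq => return non_eq,
        }
    }
    // The common prefix is equal, so the shorter slice orders first.
    a.len().cmp(&b.len())
}

/// Generic `lt`: scan for the first unequal pair, then dispatch to an
/// element comparison, or to a length comparison if the prefix is equal.
fn slice_lt<T: PartialOrd>(a: &[T], b: &[T]) -> bool {
    let l = cmp::min(a.len(), b.len());
    for i in 0..l {
        if a[i] != b[i] {
            return a[i] < b[i];
        }
    }
    a.len() < b.len()
}
```

For example, `slice_cmp(&[1, 2], &[1, 2, 3])` finds an equal two-element prefix and then falls through to the length comparison. This version still indexes with `a[i]` and `b[i]`, so bounds checks may remain in the generated code.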
Reuse cmp in totally ordered types
Instead of manually defining it, `partial_cmp` can simply wrap the result of `cmp` for totally ordered types.
Commit bf9254a
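A minimal sketch of what "wrapping the result of `cmp`" looks like for a totally ordered type. The `Id` newtype is hypothetical and stands in for a primitive type; the body of `cmp` here is an assumption made for the example.

```rust
use std::cmp::Ordering;

// Hypothetical newtype standing in for a totally ordered primitive.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Id(u32);

impl Ord for Id {
    fn cmp(&self, other: &Id) -> Ordering {
        // The single place where the ordering is defined.
        if self.0 == other.0 {
            Ordering::Equal
        } else if self.0 < other.0 {
            Ordering::Less
        } else {
            Ordering::Greater
        }
    }
}

impl PartialOrd for Id {
    fn partial_cmp(&self, other: &Id) -> Option<Ordering> {
        // A total order always has an answer, so just wrap `cmp` in `Some`.
        Some(self.cmp(other))
    }
}
```

Defining the ordering once in `cmp` keeps the two traits from drifting apart.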
Remove boundary checks in slice comparison operators
In order to get rid of all range checks, the compiler needs to explicitly see that the slices it iterates over are as long as the loop variable upper bound.

This further improves the performance of slice comparison:

```
test u8_cmp ... bench: 4,761 ns/iter (+/- 1,203)
test u8_lt ... bench: 4,579 ns/iter (+/- 649)
test u8_partial_cmp ... bench: 4,768 ns/iter (+/- 761)
test u16_cmp ... bench: 4,607 ns/iter (+/- 580)
test u16_lt ... bench: 4,681 ns/iter (+/- 567)
test u16_partial_cmp ... bench: 4,607 ns/iter (+/- 967)
test u32_cmp ... bench: 4,448 ns/iter (+/- 891)
test u32_lt ... bench: 4,546 ns/iter (+/- 992)
test u32_partial_cmp ... bench: 4,415 ns/iter (+/- 646)
test u64_cmp ... bench: 4,380 ns/iter (+/- 1,184)
test u64_lt ... bench: 5,684 ns/iter (+/- 602)
test u64_partial_cmp ... bench: 4,663 ns/iter (+/- 1,158)
```
Commit 369a9dc
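One way to make the slice lengths visible to the compiler, sketched under the assumption that re-slicing both inputs to the common length is the mechanism meant here. `slice_cmp_sliced` is a hypothetical free function, not the method from the pull request.

```rust
use std::cmp::{self, Ordering};

fn slice_cmp_sliced<T: Ord>(a: &[T], b: &[T]) -> Ordering {
    let l = cmp::min(a.len(), b.len());

    // Re-slice both inputs to exactly `l` elements. After this, the loop
    // bound and both slice lengths are the same value, which gives the
    // optimizer a chance to drop the per-element range checks.
    let lhs = &a[..l];
    let rhs = &b[..l];

    for i in 0..l {
        match lhs[i].cmp(&rhs[i]) {
            Ordering::Equal => (),
            non_eq => return non_eq,
        }
    }

    // Equal common prefix: the original lengths decide the result.
    a.len().cmp(&b.len())
}
```

The result is the same as without the re-slicing; only the information available to the optimizer changes. Whether the checks are actually removed depends on the compiler version and optimization level.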
Commit 08b9edf
Explain explicit slicing in slice cmp and partial_cmp methods
The explicit slicing is needed in order to enable additional range check optimizations in the compiler.
Commit 74dc146
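For the `partial_cmp` side, a sketch of how the same explicit slicing might look, with the explanatory comment placed where the commit message suggests it belongs. `slice_partial_cmp` is a hypothetical name, and the comment wording is an assumption rather than a quote from the source.

```rust
use std::cmp::{self, Ordering};

fn slice_partial_cmp<T: PartialOrd>(a: &[T], b: &[T]) -> Option<Ordering> {
    let l = cmp::min(a.len(), b.len());

    // Slice to the loop iteration range to enable bound check
    // elimination in the compiler.
    let lhs = &a[..l];
    let rhs = &b[..l];

    for i in 0..l {
        match lhs[i].partial_cmp(&rhs[i]) {
            Some(Ordering::Equal) => (),
            // Either a definite ordering or `None` for incomparable
            // elements; both end the scan.
            non_eq => return non_eq,
        }
    }

    a.len().partial_cmp(&b.len())
}
```

Unlike the `cmp` variant, this one can return `None`, for instance when a `NaN` is the first element that differs.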