Add slice::sort_by_cached_key as a memoised sort_by_key #48639
Changes from 14 commits
@@ -102,6 +102,7 @@ use core::mem::size_of;
 use core::mem;
 use core::ptr;
 use core::slice as core_slice;
+use core::{u8, u16, u32};

 use borrow::{Borrow, BorrowMut, ToOwned};
 use boxed::Box;
@@ -1302,7 +1303,12 @@ impl<T> [T] {

     /// Sorts the slice with a key extraction function.
     ///
-    /// This sort is stable (i.e. does not reorder equal elements) and `O(n log n)` worst-case.
+    /// This sort is stable (i.e. does not reorder equal elements) and `O(m n log(m n))`
+    /// worst-case, where the key function is `O(m)`.
+    ///
+    /// For expensive key functions (e.g. functions that are not simple property accesses or
+    /// basic operations), [`sort_by_cached_key`](#method.sort_by_cached_key) is likely to be
+    /// significantly faster, as it does not recompute element keys.
     ///
     /// When applicable, unstable sorting is preferred because it is generally faster than stable
     /// sorting and it doesn't allocate auxiliary memory.
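As an illustration (not part of the diff): with a key that allocates, such as `String`, `sort_by_key` re-evaluates the key roughly twice per comparison via `merge_sort(self, |a, b| f(a).lt(&f(b)))`, i.e. `O(n log n)` key calls, while `sort_by_cached_key` evaluates it exactly once per element. A minimal sketch, assuming a toolchain where the `slice_sort_by_cached_key` feature gate added by this PR is available:

```rust
#![feature(slice_sort_by_cached_key)]

fn main() {
    // An "expensive" key: formatting each element allocates a String.
    let mut a = vec![107, 3, -42, 1000, 8];
    let mut b = a.clone();

    // Recomputes the String key for every comparison.
    a.sort_by_key(|x| x.to_string());

    // Computes each String key once, sorts (key, index) pairs, then
    // applies the resulting permutation to the slice.
    b.sort_by_cached_key(|x| x.to_string());

    // Both produce the same (lexicographic) order.
    assert_eq!(a, b);
}
```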
@@ -1328,12 +1334,82 @@ impl<T> [T] {
     /// ```
     #[stable(feature = "slice_sort_by_key", since = "1.7.0")]
     #[inline]
-    pub fn sort_by_key<B, F>(&mut self, mut f: F)
-        where F: FnMut(&T) -> B, B: Ord
+    pub fn sort_by_key<K, F>(&mut self, mut f: F)
+        where F: FnMut(&T) -> K, K: Ord
     {
         merge_sort(self, |a, b| f(a).lt(&f(b)));
     }

+    /// Sorts the slice with a key extraction function.
+    ///
+    /// During sorting, the key function is called only once per element.
+    ///
+    /// This sort is stable (i.e. does not reorder equal elements) and `O(m n + n log n)`
+    /// worst-case, where the key function is `O(m)`.
+    ///
+    /// For simple key functions (e.g. functions that are property accesses or
+    /// basic operations), [`sort_by_key`](#method.sort_by_key) is likely to be
+    /// faster.
+    ///
+    /// # Current implementation
+    ///
+    /// The current algorithm is based on [pattern-defeating quicksort][pdqsort] by Orson Peters,
+    /// which combines the fast average case of randomized quicksort with the fast worst case of
+    /// heapsort, while achieving linear time on slices with certain patterns. It uses some
+    /// randomization to avoid degenerate cases, but with a fixed seed to always provide
+    /// deterministic behavior.
+    ///
+    /// In the worst case, the algorithm allocates temporary storage in a `Vec<(K, usize)>` the
+    /// length of the slice.
+    ///
+    /// # Examples
+    ///
+    /// ```
+    /// #![feature(slice_sort_by_cached_key)]
+    /// let mut v = [-5i32, 4, 32, -3, 2];
Review comment (quoting the CI doctest failure):

    [01:04:41] ---- slice.rs - slice::[T]::sort_by_cached_key (line 1367) stdout ----
    [01:04:41] error[E0658]: use of unstable library feature 'slice_sort_by_cached_key' (see issue #34447)
    [01:04:41]  --> slice.rs:1370:3
    [01:04:41]   |
    [01:04:41] 6 | v.sort_by_cached_key(|k| k.to_string());
    [01:04:41]   |   ^^^^^^^^^^^^^^^^^^
    [01:04:41]   |
    [01:04:41]   = help: add #![feature(slice_sort_by_cached_key)] to the crate attributes to enable
+    ///
+    /// v.sort_by_cached_key(|k| k.to_string());
+    /// assert!(v == [-3, -5, 2, 32, 4]);
+    /// ```
+    ///
+    /// [pdqsort]: https://github.com/orlp/pdqsort
+    #[unstable(feature = "slice_sort_by_cached_key", issue = "34447")]
+    #[inline]
+    pub fn sort_by_cached_key<K, F>(&mut self, f: F)
+        where F: FnMut(&T) -> K, K: Ord
+    {
+        // Helper macro for indexing our vector by the smallest possible type, to reduce allocation.
+        macro_rules! sort_by_key {
+            ($t:ty, $slice:ident, $f:ident) => ({
+                let mut indices: Vec<_> =
Review comment: Would be nice to be able to use [...]. Just a suggestion, though. Maybe leave that for a future PR?

Author reply: That's a good idea; I might leave that for a future PR though, yeah.

Review comment: Rather than pulling in a full crate (which is mostly about encapsulating stuff in a single type that can be moved as a unit), you can reproduce that functionality with a couple of stack variables. Something like:

    pub fn foo<T>(s: &[T], f: &Fn(&T) -> T) {
        let iter = s.iter().map(f).enumerate().map(|(i, k)| (k, i as u16));
        let mut vec: Vec<_>;
        let mut array: [_; ARRAY_LEN];
        const ARRAY_LEN: usize = 32;
        let len = iter.len();
        let mapped = if len <= ARRAY_LEN {
            array = unsafe {
                std::mem::uninitialized()
            };
            for (x, slot) in iter.zip(&mut array) {
                *slot = x
            }
            &mut array[..len]
        } else {
            vec = iter.collect();
            &mut vec[..]
        };
        // …
    }

Review comment: (Just a caveat, that's the gist of it and the array code needs to take more precautions to be safe.)

Author reply: @bluss Oh, do you mean panic-safe? Yes, I forgot about that, the array should be inside a [...].

Review comment: And use [...].
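As a hedged aside (not part of the diff or the reviewer's exact proposal): on later toolchains the same stack-or-heap staging can be written with `MaybeUninit` instead of `mem::uninitialized()`. `STACK_LEN` and `with_key_index_pairs` are made-up names, and a production version would still want a drop guard so that already-computed keys are not leaked if the key function panics:

```rust
use std::mem::MaybeUninit;

const STACK_LEN: usize = 32;

// Illustrative helper (not the PR's code): hands `body` the (key, index)
// pairs for `s`, keeping them on the stack when the slice is short.
fn with_key_index_pairs<T, K: Ord, R>(
    s: &[T],
    mut f: impl FnMut(&T) -> K,
    body: impl FnOnce(&mut [(K, usize)]) -> R,
) -> R {
    let len = s.len();
    if len <= STACK_LEN {
        // Stack path: an array of `MaybeUninit` needs no initialization;
        // only the first `len` slots are ever written and later read.
        let mut buf: [MaybeUninit<(K, usize)>; STACK_LEN] =
            unsafe { MaybeUninit::uninit().assume_init() };
        for (i, x) in s.iter().enumerate() {
            buf[i] = MaybeUninit::new((f(x), i));
        }
        // Safety: the first `len` slots were just initialized.
        let pairs = unsafe {
            std::slice::from_raw_parts_mut(buf.as_mut_ptr() as *mut (K, usize), len)
        };
        let result = body(&mut *pairs);
        // Drop the keys we created. If `f` or `body` panics this is skipped
        // and the keys leak, which is safe but is exactly the "more
        // precautions" caveat mentioned above.
        unsafe { std::ptr::drop_in_place(pairs) };
        result
    } else {
        // Heap path: collect into a Vec<(K, usize)> as usual.
        let mut vec: Vec<(K, usize)> =
            s.iter().map(|x| f(x)).enumerate().map(|(i, k)| (k, i)).collect();
        body(&mut vec)
    }
}

fn main() {
    let v = [30_u32, 10, 20];
    let order = with_key_index_pairs(&v, |&x| x, |pairs| {
        pairs.sort_unstable();
        pairs.iter().map(|&(_, i)| i).collect::<Vec<_>>()
    });
    assert_eq!(order, vec![1, 2, 0]);
}
```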
+                    $slice.iter().map($f).enumerate().map(|(i, k)| (k, i as $t)).collect();
+                // The elements of `indices` are unique, as they are indexed, so any sort will be
+                // stable with respect to the original slice. We use `sort_unstable` here because
+                // it requires less memory allocation.
+                indices.sort_unstable();
+                for i in 0..$slice.len() {
Review comment: We still need some kind of proof (even an informal one would be okay) that this loop takes linear time.

Review comment: We can use the pictures in this article. I'm having a harder time proving it is correct rather than linear time, but it looks good to me. It should be linear time because the inner [...]. Whenever we walked k + 1 links in a cycle, the [...]. That means that the inner while loop's body can only be visited [...].
+                    let mut index = indices[i].1;
+                    while (index as usize) < i {
+                        index = indices[index as usize].1;
+                    }
+                    indices[i].1 = index;
+                    $slice.swap(i, index as usize);
+                }
+            })
+        }
Review comment: What's the time complexity of this [...]?

Author reply: Ah, I was too concerned about correctness to notice that I might have slipped up with the time complexity! I think this is quadratic in the worst case, given some thought. I'll switch to the one you suggested; that seems like a safer choice in any respect.

Author reply: Actually, I think I can easily modify my method to reduce it to linear. It also seems to have a lower constant factor overhead than the one in the article, so a win both ways :)
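For concreteness (not part of the diff), the loop under discussion can be read as a stand-alone routine; `apply_sorted_indices` is an illustrative name, and the comments spell out the counting argument for linearity:

```rust
// Illustrative stand-alone version of the loop above; not the PR's code.
// `indices[i].1` starts out as the original position of the element that
// belongs at position `i` (the result of sorting (key, index) pairs).
fn apply_sorted_indices<T, K>(slice: &mut [T], mut indices: Vec<(K, usize)>) {
    for i in 0..slice.len() {
        let mut index = indices[i].1;
        // A position `index < i` has already been processed and its former
        // occupant was swapped away; `indices[index].1` (overwritten below)
        // records where it went, so we follow that trail.
        while index < i {
            index = indices[index].1;
        }
        // Record where position `i`'s current occupant is about to go, so a
        // later iteration can find it, then place the correct element at `i`.
        indices[i].1 = index;
        slice.swap(i, index);
    }
    // Why this is linear: each execution of the `while` body corresponds to
    // one earlier swap that displaced the element currently being located.
    // Every iteration performs at most one displacing swap, and every
    // displaced element is located exactly once (when its final position is
    // reached), so the inner loop runs at most n times across the whole pass.
}

fn main() {
    // Keys sorted ascending: 'a' came from index 2, 'b' from 0, 'c' from 1.
    let mut v = ['b', 'c', 'a'];
    apply_sorted_indices(&mut v, vec![('a', 2), ('b', 0), ('c', 1)]);
    assert_eq!(v, ['a', 'b', 'c']);
}
```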
+
+        let sz_u8 = mem::size_of::<(K, u8)>();
+        let sz_u16 = mem::size_of::<(K, u16)>();
+        let sz_u32 = mem::size_of::<(K, u32)>();
+        let sz_usize = mem::size_of::<(K, usize)>();
+
+        let len = self.len();
+        if sz_u8 < sz_u16 && len <= ( u8::MAX as usize) { return sort_by_key!( u8, self, f) }
+        if sz_u16 < sz_u32 && len <= (u16::MAX as usize) { return sort_by_key!(u16, self, f) }
+        if sz_u32 < sz_usize && len <= (u32::MAX as usize) { return sort_by_key!(u32, self, f) }
+        sort_by_key!(usize, self, f)
Review comment: Here we have four cases, and each one will use a separate monomorphized version of [...]. However, my feeling is that in many cases [...]. Let's add another set of conditions to these ifs to remove the code bloat whenever possible, like this:

    let size_u8 = mem::size_of::<(K, u8)>();
    let size_u16 = mem::size_of::<(K, u16)>();
    let size_u32 = mem::size_of::<(K, u32)>();
    let size_usize = mem::size_of::<(K, usize)>();

    if size_u8 < size_u16 && len <= ( u8::MAX as usize) { ... }
    if size_u16 < size_u32 && len <= (u16::MAX as usize) { ... }
    if size_u32 < size_usize && len <= (u32::MAX as usize) { ... }

Author reply: I did check that modifying the sizes like this had a positive effect, though I don't think I kept the results now; I can do more later if you'd like to see them.
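As an illustration of the guards being suggested (not part of the diff): alignment padding often makes the smaller index types pointless, in which case the corresponding monomorphization is skipped entirely. The key types below are arbitrary examples, and the exact sizes assume typical tuple layout on a 64-bit target:

```rust
use std::mem::size_of;

fn main() {
    // With a u64 key, (u64, u8) is padded out to 16 bytes, the same as
    // (u64, u16), (u64, u32) and (u64, usize), so all three guards are
    // false and only the usize version of the macro is instantiated.
    assert_eq!(size_of::<(u64, u8)>(), size_of::<(u64, u16)>());
    assert_eq!(size_of::<(u64, u16)>(), size_of::<(u64, u32)>());

    // With a u8 key each step is a real saving, so a slice of up to
    // u8::MAX elements genuinely gets the compact (u8, u8) representation.
    assert!(size_of::<(u8, u8)>() < size_of::<(u8, u16)>());
    assert!(size_of::<(u8, u16)>() < size_of::<(u8, u32)>());
}
```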
+    }
+
     /// Sorts the slice, but may not preserve the order of equal elements.
     ///
     /// This sort is unstable (i.e. may reorder equal elements), in-place (i.e. does not allocate),
@@ -1410,7 +1486,7 @@ impl<T> [T] {
     /// elements.
     ///
     /// This sort is unstable (i.e. may reorder equal elements), in-place (i.e. does not allocate),
-    /// and `O(n log n)` worst-case.
+    /// and `O(m n log(m n))` worst-case, where the key function is `O(m)`.
     ///
     /// # Current implementation
     ///

@@ -1420,8 +1496,9 @@ impl<T> [T] {
     /// randomization to avoid degenerate cases, but with a fixed seed to always provide
     /// deterministic behavior.
     ///
-    /// It is typically faster than stable sorting, except in a few special cases, e.g. when the
-    /// slice consists of several concatenated sorted sequences.
+    /// Due to its key calling strategy, [`sort_unstable_by_key`](#method.sort_unstable_by_key)
+    /// is likely to be slower than [`sort_by_cached_key`](#method.sort_by_cached_key) in
+    /// cases where the key function is expensive.
     ///
     /// # Examples
     ///
@@ -1435,9 +1512,8 @@ impl<T> [T] {
     /// [pdqsort]: https://github.com/orlp/pdqsort
     #[stable(feature = "sort_unstable", since = "1.20.0")]
     #[inline]
-    pub fn sort_unstable_by_key<B, F>(&mut self, f: F)
-        where F: FnMut(&T) -> B,
-              B: Ord
+    pub fn sort_unstable_by_key<K, F>(&mut self, f: F)
+        where F: FnMut(&T) -> K, K: Ord
     {
         core_slice::SliceExt::sort_unstable_by_key(self, f);
     }
@@ -425,6 +425,14 @@ fn test_sort() {
                 v.sort_by(|a, b| b.cmp(a));
                 assert!(v.windows(2).all(|w| w[0] >= w[1]));

+                // Sort in lexicographic order.
+                let mut v1 = orig.clone();
+                let mut v2 = orig.clone();
+                v1.sort_by_key(|x| x.to_string());
+                v2.sort_by_cached_key(|x| x.to_string());
+                assert!(v1.windows(2).all(|w| w[0].to_string() <= w[1].to_string()));
+                assert!(v1 == v2);
+
Review comment: Since [...]

Author reply: Good point!
                 // Sort with many pre-sorted runs.
                 let mut v = orig.clone();
                 v.sort();
@@ -477,24 +485,29 @@ fn test_sort_stability() {
             // the second item represents which occurrence of that
             // number this element is, i.e. the second elements
             // will occur in sorted order.
-            let mut v: Vec<_> = (0..len)
+            let mut orig: Vec<_> = (0..len)
                 .map(|_| {
                     let n = thread_rng().gen::<usize>() % 10;
                     counts[n] += 1;
                     (n, counts[n])
                 })
                 .collect();

-            // only sort on the first element, so an unstable sort
+            let mut v = orig.clone();
+            // Only sort on the first element, so an unstable sort
             // may mix up the counts.
             v.sort_by(|&(a, _), &(b, _)| a.cmp(&b));

-            // this comparison includes the count (the second item
+            // This comparison includes the count (the second item
             // of the tuple), so elements with equal first items
             // will need to be ordered with increasing
             // counts... i.e. exactly asserting that this sort is
             // stable.
             assert!(v.windows(2).all(|w| w[0] <= w[1]));
+
+            let mut v = orig.clone();
+            v.sort_by_cached_key(|&(x, _)| x);
+            assert!(v.windows(2).all(|w| w[0] <= w[1]));
         }
     }
 }
Review comment: This is a good pointer to make but I think we are normally cautious and omit this; otherwise we have the docs for a stable method recommending an experimental method (for the next few releases). This is how we handled `sort_unstable_by`; the mention of it in `sort_by`'s doc first showed up in Rust 1.20 when it went stable.

Author reply: Ahh, that's a reasonable decision. I'll get rid of those comments for now.