Merge #292
292: Permutations r=jswrenn a=tobz1000

Fixes #285.

This implementation is based on the Python implementation specified in the issue. There are quite a few points which should be considered.

## Method name, _k_-less form

The adaptor function is called `permutations`, for user familiarity. However, since the size of the output can be specified, a more mathematically accurate name would probably be `k_permutations`. Perhaps two adaptor functions would be better: `.k_permutations(k: usize)` and `.permutations()` (which just sets `k = vals.len()`)?

## Item value ordering/distinctness

Input items are not inspected in any way; they are treated purely by their initial index. This means that:

  1. Permutations are yielded in lexicographical order of the index, not by any `PartialOrd` implementation of the items.
  2. Identical items will not be detected, and will result in some identical permutations.

Ordering by item value (point __1__) can be achieved by the user by collecting and sorting their input first.

Avoiding duplicate permutations (point __2__) is a little trickier, but can be achieved by filtering the output. However, I think there is a more efficient algorithm that avoids duplicates. Maybe we should provide this option?
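
To make the two points concrete, here is a minimal std-only sketch. It is not the code in this PR: `k_permutations` and `unique_permutations` are hypothetical helper names, and the recursive approach is chosen only for brevity. It generates permutations purely by index, so input order decides output order and equal items yield repeated results unless the output is filtered.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Hypothetical helper: every k-permutation of `items`, in lexicographic
/// order of the item *indices*. Items are cloned but never compared.
fn k_permutations<T: Clone>(items: &[T], k: usize) -> Vec<Vec<T>> {
    fn recurse<T: Clone>(
        items: &[T],
        k: usize,
        used: &mut Vec<bool>,
        current: &mut Vec<usize>,
        out: &mut Vec<Vec<T>>,
    ) {
        if current.len() == k {
            // Map the chosen indices back to cloned items.
            out.push(current.iter().map(|&i| items[i].clone()).collect());
            return;
        }
        for i in 0..items.len() {
            if !used[i] {
                used[i] = true;
                current.push(i);
                recurse(items, k, used, current, out);
                current.pop();
                used[i] = false;
            }
        }
    }

    let mut out = Vec::new();
    // When k exceeds the input length there are no permutations at all.
    if k <= items.len() {
        recurse(items, k, &mut vec![false; items.len()], &mut Vec::new(), &mut out);
    }
    out
}

/// Hypothetical helper: filters the output so each distinct permutation
/// is kept once (first occurrence wins). Needs `Eq + Hash` on the items.
fn unique_permutations<T: Clone + Eq + Hash>(items: &[T], k: usize) -> Vec<Vec<T>> {
    let mut seen = HashSet::new();
    k_permutations(items, k)
        .into_iter()
        .filter(|p| seen.insert(p.clone()))
        .collect()
}
```

With this sketch, `k_permutations(&[3, 1, 2], 2)` starts with `[3, 1]`, while sorting the input first makes it start with `[1, 2]`. Filtering still generates every duplicate before discarding it, which is why a dedicated algorithm could be more efficient.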

## Permutations from source buffer/indices

In addition to the iterator adaptor, I've added `Permutations::from_vals`, to create directly from a `Vec` or a slice. This saves some cloning compared to using `source.[into_]iter().permutations(k)`.

There is also `Permutations::new(n, k)`, which is functionally equivalent to `(0..n).permutations(k)`, but a little faster (about 0.6x the run time).

But perhaps you would consider these unnecessary additions to the API?
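
For a sense of how much output an index-based source of size `n` produces, the number of k-permutations is n × (n − 1) × … × (n − k + 1). The sketch below computes that count with overflow checking; `k_permutation_count` is a hypothetical helper and not part of this PR.

```rust
/// Hypothetical helper: number of k-permutations of n items.
/// Returns Some(0) when k > n, and None if the product overflows `usize`.
fn k_permutation_count(n: usize, k: usize) -> Option<usize> {
    if k > n {
        return Some(0);
    }
    // Multiply the k factors (n - k + 1), ..., n. The range is shifted down
    // by one so that `x + 1` cannot overflow; an empty range (k == 0)
    // leaves the initial value 1, the single empty permutation.
    ((n - k)..n).try_fold(1usize, |acc, x| acc.checked_mul(x + 1))
}
```

For example, `k_permutation_count(6, 6)` is `Some(720)`, which is the number of items each benchmark iteration walks through with `PERM_COUNT = 6`, and `k_permutation_count(3, 2)` is `Some(6)`, matching the `(5..8).permutations(2)` doc example.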

## `PermutationSource` trait

These different implementations (from `Vec`/slice/just indices) are achieved with the trait `PermutationSource`. It's visible to the user to implement for other structures if they wish, but this is probably of limited value. There's not much harm in allowing the user to access it, but again, maybe it's just API bloat (or a future breaking change if it's later removed or changed).

On the other hand, perhaps there are other places in the library which could benefit from taking a source generically? Any adaptor where input elements are used more than once will need to store them, and it might be more efficient to allow users to supply the memory directly. This could be done in another PR.

For completeness, I also made full implementations of the three variations, without the trait, for benchmarking. The pure-indices implementation is about 10% slower using the trait, but `Vec`s and slices are unaffected (or even a little faster...).
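
As an illustration of the general idea only, a source trait of this kind could look like the sketch below. The trait name `IndexedSource`, its methods, and `apply_indices` are all assumptions made for this example; the actual `PermutationSource` definition in the PR may differ.

```rust
/// Hypothetical stand-in for the idea behind `PermutationSource`:
/// anything that can report a length and hand out an item by index.
trait IndexedSource {
    type Item;
    fn source_len(&self) -> usize;
    fn source_get(&self, index: usize) -> Self::Item;
}

/// Owned buffer: items are cloned out of the `Vec`.
impl<T: Clone> IndexedSource for Vec<T> {
    type Item = T;
    fn source_len(&self) -> usize { self.len() }
    fn source_get(&self, index: usize) -> T { self[index].clone() }
}

/// Borrowed buffer: items are cloned out of the slice.
impl<'a, T: Clone> IndexedSource for &'a [T] {
    type Item = T;
    fn source_len(&self) -> usize { self.len() }
    fn source_get(&self, index: usize) -> T { (*self)[index].clone() }
}

/// Pure indices: a count `n` stands for the items `0..n`, so nothing is stored.
impl IndexedSource for usize {
    type Item = usize;
    fn source_len(&self) -> usize { *self }
    fn source_get(&self, index: usize) -> usize { index }
}

/// Maps one index permutation onto any source.
fn apply_indices<S: IndexedSource>(source: &S, indices: &[usize]) -> Vec<S::Item> {
    indices.iter().map(|&i| source.source_get(i)).collect()
}
```

The permutation logic would then only ever deal in indices, and each source decides how an index becomes an item. That is also where a small overhead for the pure-index case could plausibly come from, since the lookup goes through the trait instead of being the identity inline.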

## `Combinations` changes

I've made small changes to the `Combinations` documentation to match `Permutations` - most significantly, replacing the letter `n` with `k`. I've also changed all uses of the variable `n` to `k` in the implementation... but maybe this is considered commit noise :)

There are some other potential changes which would bring `Combinations` and `Permutations` more in line with one another.

As mentioned above, `Combinations` might benefit from using a `_____Source` trait, to allow direct memory buffers.

My `Permutations` implementation doesn't make use of `LazyBuffer`. Perhaps the buffer is useful for iterators which never complete, and have a very small `k` value. Has any benchmarking been done?

Either way, it makes sense for both adaptors to either use or not use `LazyBuffer`. Maybe it could be integrated into the `_______Source` trait if it's useful?
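
To show why a lazy buffer matters for sources that never complete, here is a simplified std-only sketch. `LazyBuf` is a hypothetical reduction that borrows the field and method names visible in the `lazy_buffer.rs` diff (`it`, `done`, `buffer`, `get_next`, `len`, `is_done`); it is not the library's actual implementation.

```rust
/// Simplified sketch of a lazily filled buffer over an iterator.
struct LazyBuf<I: Iterator> {
    it: I,
    done: bool,
    buffer: Vec<I::Item>,
}

impl<I: Iterator> LazyBuf<I> {
    fn new(it: I) -> Self {
        LazyBuf { it, done: false, buffer: Vec::new() }
    }

    /// Pulls one more item from the source into the buffer.
    /// Returns false once the source is exhausted.
    fn get_next(&mut self) -> bool {
        if self.done {
            return false;
        }
        match self.it.next() {
            Some(item) => {
                self.buffer.push(item);
                true
            }
            None => {
                self.done = true;
                false
            }
        }
    }

    /// Number of items buffered so far.
    fn len(&self) -> usize {
        self.buffer.len()
    }

    /// True once the source has reported exhaustion.
    fn is_done(&self) -> bool {
        self.done
    }
}
```

With an unbounded source such as `0u64..`, calling `get_next` three times buffers exactly three items and `is_done` stays false, so an adaptor with a small `k` can start yielding without ever collecting the whole input. An eager collect would never return in that case.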

---

Let me know what you think when you have the chance. (Sorry for submitting two reasonably big PRs at once. I hope you get a chance to go through it all eventually!)

Co-authored-by: Toby Dimmick <tobydimmick@pm.me>
bors[bot] and tobz1000 authored Sep 17, 2019
2 parents a6e9713 + 88834c9 commit eb8f01e
Showing 9 changed files with 567 additions and 47 deletions.
44 changes: 43 additions & 1 deletion benches/bench1.rs
@@ -7,10 +7,11 @@ use test::{black_box};
use itertools::Itertools;

use itertools::free::cloned;
use itertools::Permutations;

use std::iter::repeat;
use std::cmp;
use std::ops::Add;
use std::ops::{Add, Range};

mod extra;

@@ -762,3 +763,44 @@ fn all_equal_default(b: &mut test::Bencher) {

b.iter(|| xs.iter().dedup().nth(1).is_none())
}

const PERM_COUNT: usize = 6;

#[bench]
fn permutations_iter(b: &mut test::Bencher) {
struct NewIterator(Range<usize>);

impl Iterator for NewIterator {
type Item = usize;

fn next(&mut self) -> Option<Self::Item> {
self.0.next()
}
}

b.iter(|| {
for _ in NewIterator(0..PERM_COUNT).permutations(PERM_COUNT) {

}
})
}

#[bench]
fn permutations_range(b: &mut test::Bencher) {
b.iter(|| {
for _ in (0..PERM_COUNT).permutations(PERM_COUNT) {

}
})
}

#[bench]
fn permutations_slice(b: &mut test::Bencher) {
let v = (0..PERM_COUNT).collect_vec();

b.iter(|| {
for _ in v.as_slice().iter().permutations(PERM_COUNT) {

}
})
}
28 changes: 14 additions & 14 deletions src/combinations.rs
@@ -2,12 +2,12 @@ use std::fmt;

use super::lazy_buffer::LazyBuffer;

/// An iterator to iterate through all the `n`-length combinations in an iterator.
/// An iterator to iterate through all the `k`-length combinations in an iterator.
///
/// See [`.combinations()`](../trait.Itertools.html#method.combinations) for more information.
#[must_use = "iterator adaptors are lazy and do nothing unless consumed"]
pub struct Combinations<I: Iterator> {
n: usize,
k: usize,
indices: Vec<usize>,
pool: LazyBuffer<I>,
first: bool,
@@ -17,27 +17,27 @@ impl<I> fmt::Debug for Combinations<I>
where I: Iterator + fmt::Debug,
I::Item: fmt::Debug,
{
debug_fmt_fields!(Combinations, n, indices, pool, first);
debug_fmt_fields!(Combinations, k, indices, pool, first);
}

/// Create a new `Combinations` from a clonable iterator.
pub fn combinations<I>(iter: I, n: usize) -> Combinations<I>
pub fn combinations<I>(iter: I, k: usize) -> Combinations<I>
where I: Iterator
{
let mut indices: Vec<usize> = Vec::with_capacity(n);
for i in 0..n {
let mut indices: Vec<usize> = Vec::with_capacity(k);
for i in 0..k {
indices.push(i);
}
let mut pool: LazyBuffer<I> = LazyBuffer::new(iter);

for _ in 0..n {
for _ in 0..k {
if !pool.get_next() {
break;
}
}

Combinations {
n: n,
k: k,
indices: indices,
pool: pool,
first: true,
@@ -52,18 +52,18 @@ impl<I> Iterator for Combinations<I>
fn next(&mut self) -> Option<Self::Item> {
let mut pool_len = self.pool.len();
if self.pool.is_done() {
if pool_len == 0 || self.n > pool_len {
if pool_len == 0 || self.k > pool_len {
return None;
}
}

if self.first {
self.first = false;
} else if self.n == 0 {
} else if self.k == 0 {
return None;
} else {
// Scan from the end, looking for an index to increment
let mut i: usize = self.n - 1;
let mut i: usize = self.k - 1;

// Check if we need to consume more from the iterator
if self.indices[i] == pool_len - 1 && !self.pool.is_done() {
@@ -72,7 +72,7 @@ impl<I> Iterator for Combinations<I>
}
}

while self.indices[i] == i + pool_len - self.n {
while self.indices[i] == i + pool_len - self.k {
if i > 0 {
i -= 1;
} else {
@@ -84,14 +84,14 @@ impl<I> Iterator for Combinations<I>
// Increment index, and reset the ones to its right
self.indices[i] += 1;
let mut j = i + 1;
while j < self.n {
while j < self.k {
self.indices[j] = self.indices[j - 1] + 1;
j += 1;
}
}

// Create result vector based on the indices
let mut result = Vec::with_capacity(self.n);
let mut result = Vec::with_capacity(self.k);
for i in self.indices.iter() {
result.push(self.pool[*i].clone());
}
12 changes: 6 additions & 6 deletions src/combinations_with_replacement.rs
@@ -11,7 +11,7 @@ where
I: Iterator,
I::Item: Clone,
{
n: usize,
k: usize,
indices: Vec<usize>,
// The current known max index value. This increases as pool grows.
max_index: usize,
@@ -24,7 +24,7 @@ where
I: Iterator + fmt::Debug,
I::Item: fmt::Debug + Clone,
{
debug_fmt_fields!(Combinations, n, indices, max_index, pool, first);
debug_fmt_fields!(Combinations, k, indices, max_index, pool, first);
}

impl<I> CombinationsWithReplacement<I>
@@ -39,16 +39,16 @@ where
}

/// Create a new `CombinationsWithReplacement` from a clonable iterator.
pub fn combinations_with_replacement<I>(iter: I, n: usize) -> CombinationsWithReplacement<I>
pub fn combinations_with_replacement<I>(iter: I, k: usize) -> CombinationsWithReplacement<I>
where
I: Iterator,
I::Item: Clone,
{
let indices: Vec<usize> = vec![0; n];
let indices: Vec<usize> = vec![0; k];
let pool: LazyBuffer<I> = LazyBuffer::new(iter);

CombinationsWithReplacement {
n,
k,
indices,
max_index: 0,
pool: pool,
@@ -66,7 +66,7 @@ where
// If this is the first iteration, return early
if self.first {
// In empty edge cases, stop iterating immediately
return if self.n == 0 || self.pool.is_done() {
return if self.k == 0 || self.pool.is_done() {
None
// Otherwise, yield the initial state
} else {
9 changes: 5 additions & 4 deletions src/lazy_buffer.rs
@@ -2,7 +2,7 @@ use std::ops::Index;

#[derive(Debug, Clone)]
pub struct LazyBuffer<I: Iterator> {
it: I,
pub it: I,
done: bool,
buffer: Vec<I::Item>,
}
@@ -54,14 +54,15 @@ where
}
}

impl<I> Index<usize> for LazyBuffer<I>
impl<I, J> Index<J> for LazyBuffer<I>
where
I: Iterator,
I::Item: Sized,
Vec<I::Item>: Index<J>
{
type Output = I::Item;
type Output = <Vec<I::Item> as Index<J>>::Output;

fn index<'b>(&'b self, _index: usize) -> &'b I::Item {
fn index(&self, _index: J) -> &Self::Output {
self.buffer.index(_index)
}
}
66 changes: 57 additions & 9 deletions src/lib.rs
@@ -117,6 +117,8 @@ pub mod structs {
pub use multipeek_impl::MultiPeek;
pub use pad_tail::PadUsing;
pub use peeking_take_while::PeekingTakeWhile;
#[cfg(feature = "use_std")]
pub use permutations::Permutations;
pub use process_results_impl::ProcessResults;
#[cfg(feature = "use_std")]
pub use put_back_n_impl::PutBackN;
@@ -182,6 +184,8 @@ mod minmax;
mod multipeek_impl;
mod pad_tail;
mod peeking_take_while;
#[cfg(feature = "use_std")]
mod permutations;
mod process_results_impl;
#[cfg(feature = "use_std")]
mod put_back_n_impl;
@@ -1135,7 +1139,7 @@ pub trait Itertools : Iterator {
adaptors::tuple_combinations(self)
}

/// Return an iterator adaptor that iterates over the `n`-length combinations of
/// Return an iterator adaptor that iterates over the `k`-length combinations of
/// the elements from an iterator.
///
/// Iterator element type is `Vec<Self::Item>`. The iterator produces a new Vec per iteration,
@@ -1150,7 +1154,7 @@ pub trait Itertools : Iterator {
/// vec![1, 2, 4],
/// vec![1, 3, 4],
/// vec![2, 3, 4],
/// ]);
/// ]);
/// ```
///
/// Note: Combinations does not take into account the equality of the iterated values.
@@ -1164,16 +1168,15 @@ pub trait Itertools : Iterator {
/// vec![2, 2],
/// ]);
/// ```
///
#[cfg(feature = "use_std")]
fn combinations(self, n: usize) -> Combinations<Self>
fn combinations(self, k: usize) -> Combinations<Self>
where Self: Sized,
Self::Item: Clone
{
combinations::combinations(self, n)
combinations::combinations(self, k)
}

/// Return an iterator that iterates over the `n`-length combinations of
/// Return an iterator that iterates over the `k`-length combinations of
/// the elements from an iterator, with replacement.
///
/// Iterator element type is `Vec<Self::Item>`. The iterator produces a new Vec per iteration,
@@ -1190,15 +1193,60 @@ pub trait Itertools : Iterator {
/// vec![2, 2],
/// vec![2, 3],
/// vec![3, 3],
/// ]);
/// ]);
/// ```
#[cfg(feature = "use_std")]
fn combinations_with_replacement(self, n: usize) -> CombinationsWithReplacement<Self>
fn combinations_with_replacement(self, k: usize) -> CombinationsWithReplacement<Self>
where
Self: Sized,
Self::Item: Clone,
{
combinations_with_replacement::combinations_with_replacement(self, n)
combinations_with_replacement::combinations_with_replacement(self, k)
}

/// Return an iterator adaptor that iterates over all k-permutations of the
/// elements from an iterator.
///
/// Iterator element type is `Vec<Self::Item>` with length `k`. The iterator
/// produces a new Vec per iteration, and clones the iterator elements.
///
/// If `k` is greater than the length of the input iterator, the resultant
/// iterator adaptor will be empty.
///
/// ```
/// use itertools::Itertools;
///
/// let perms = (5..8).permutations(2);
/// itertools::assert_equal(perms, vec![
/// vec![5, 6],
/// vec![5, 7],
/// vec![6, 5],
/// vec![6, 7],
/// vec![7, 5],
/// vec![7, 6],
/// ]);
/// ```
///
/// Note: Permutations does not take into account the equality of the iterated values.
///
/// ```
/// use itertools::Itertools;
///
/// let it = vec![2, 2].into_iter().permutations(2);
/// itertools::assert_equal(it, vec![
/// vec![2, 2], // Note: these are the same
/// vec![2, 2], // Note: these are the same
/// ]);
/// ```
///
/// Note: The source iterator is collected lazily, and will not be
/// re-iterated if the permutations adaptor is completed and re-iterated.
#[cfg(feature = "use_std")]
fn permutations(self, k: usize) -> Permutations<Self>
where Self: Sized,
Self::Item: Clone
{
permutations::permutations(self, k)
}

/// Return an iterator adaptor that pads the sequence to a minimum length of