Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Iterator::fold for .chain(), .cloned(), .map() and the VecDeque iterators. #37315

Merged
merged 3 commits into from
Oct 26, 2016

Conversation

bluss
Copy link
Member

@bluss bluss commented Oct 20, 2016

Chain can do something interesting here where it passes on the fold
into its inner iterators.

The lets the underlying iterator's custom fold() be used, and skips the
regular chain logic in next.

Also implement .fold() specifically for .map() and .cloned() so that any
inner fold improvements are available through map and cloned.

The same way, a VecDeque iterator fold can be turned into two slice folds.

These changes lend the power of the slice iterator's loop codegen to
VecDeque, and to chains of slice iterators, and so on.
It's an improvement for .sum() and .product(), and other uses of fold.

@rust-highfive
Copy link
Collaborator

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

@@ -1829,6 +1811,42 @@ fn wrap_index(index: usize, size: usize) -> usize {
index & (size - 1)
}

/// Returns the two slices that cover the VecDeque's valid range
trait RingSlices : Sized {
fn slice(self, from: usize, to: usize) -> Self;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be replaced with a bound on Index?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we're using either Index or IndexMut

@cristicbz
Copy link
Contributor

cristicbz commented Oct 21, 2016

Github won't let me comment on lines you didn't change. You could also modify the select_fold1_helper on line 2119 to something like:

fn select_fold1<I,B, FProj, FCmp>(it: I,
                                  mut f_proj: FProj,
                                  mut f_cmp: FCmp) -> Option<(B, I::Item)>
    where I: Iterator,
          FProj: FnMut(&I::Item) -> B,
          FCmp: FnMut(&B, &I::Item, &B, &I::Item) -> bool
{
    let mut projected = it.map(|x| (f_proj(&x), x));
    projected.next().map(|(sel_p, sel)| {
        projected.fold((sel_p, sel), |(sel_p, sel), (x_p, x)| {
            if f_cmp(&sel_p, &sel, &x_p, &x) {
                (x_p, x)
            } else {
                (sel_p, sel)
            }
        })
    })
}

This should help min and max when coupled with your map & fold improvements.

@cristicbz
Copy link
Contributor

last could also be implemented as self.fold(None, |_, x| Some(x)) to take advantage of any potential special-casing of fold (though, of course, for Chain specifically there is an even better impl).

@bluss
Copy link
Member Author

bluss commented Oct 21, 2016

As said on irc, I'd prefer to have add-on improvements in a separate PR.

@cristicbz
Copy link
Contributor

Makes sense, I'll create a spearate PR off my branch this gets merged. Very excited about this idea of using fold and a hypothetical fold_while as an iterator-provided internal iteration protocol in addition to the external protocol provided by next.

fn slice(self, from: usize, to: usize) -> Self;
fn split_at(self, i: usize) -> (Self, Self);

fn ring_slices(buf: Self, head: usize, tail: usize) -> (Self, Self) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe some #[inline]-s here and on the slice/split_at methods below would help ensure the generated code stays the same as before? (don't have any data that it doesn't and since code is generic, it'll always be available to inline)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it is generic it should not matter I think. It is sort of a higher level operation anyway.

@bluss bluss changed the title Special case Iterator::fold for .chain(), .flat_map(), .cloned(), .map() and the VecDeque iterators. Implement Iterator::fold for .chain(), .flat_map(), .cloned(), .map() and the VecDeque iterators. Oct 22, 2016
@cristicbz
Copy link
Contributor

Some further validation for this idea: I specialised the fold implementation of the hashmap RawTable's iterators in a similar fashion for some significant perf benefits in #37362.

@bluss
Copy link
Member Author

bluss commented Oct 23, 2016

Cool.

The improvement of .fold() on VecDeque enables unrolling and vectorization. I've used the same idea in ndarray, to much improve its generalized iterator the same way (it has a special case for contiguous data that was already good, but this helps the general cases by an order of magnitude). In ndarray this is not a fringe use case, instead it is used to improve the iteration of non-contiguous arrays across the board, improving in place arithmetic ops, array-to-array mapping and more operations for them.

The reasoning for this PR is

  1. VecDeque's iterator .fold() is an order of magnitude improvement (see issue VecDeque's iterator could be optimized #30805 for the size of that)
  2. Forwarding fold on map, cloned, map, flat_map, is important since that allows the improved versions of fold be used from the underlying iterators.

@bluss
Copy link
Member Author

bluss commented Oct 23, 2016

So while .chain() can be improved to optimize better (and I hope we will, I have some test cases for that), this PR is still relevant until we can make the VecDeque's iterator's .next() optimize better on its own, or ndarray's generalized iterator's .next() optimize better on its own and so on. Let the user's improved .fold() shine through.

@alexcrichton
Copy link
Member

@bors: r+

Nice wins! Neat idea propagating through adapters like map and cloned

@bors
Copy link
Contributor

bors commented Oct 25, 2016

📌 Commit f095f0d has been approved by alexcrichton

@bors
Copy link
Contributor

bors commented Oct 25, 2016

⌛ Testing commit f095f0d with merge 8a2de11...

@bors
Copy link
Contributor

bors commented Oct 25, 2016

💔 Test failed - auto-linux-64-debug-opt

@bluss
Copy link
Member Author

bluss commented Oct 25, 2016

Thank you. I'll have to investigate this llvm assertion..

@bluss
Copy link
Member Author

bluss commented Oct 25, 2016

I ruled out the VecDeque and cloned(), .map() changes, so it must be .chain() or .flat_map() (causing the assertion).

Implement .fold() specifically for .map() and .cloned() so that any
inner fold improvements are available through map and cloned.
Chain can do something interesting here where it passes on the fold
into its inner iterators.

The lets the underlying iterator's custom fold() be used, and skips the
regular chain logic in next.
@bluss bluss changed the title Implement Iterator::fold for .chain(), .flat_map(), .cloned(), .map() and the VecDeque iterators. Implement Iterator::fold for .chain(), .cloned(), .map() and the VecDeque iterators. Oct 25, 2016
@bluss
Copy link
Member Author

bluss commented Oct 25, 2016

I haven't solved it, but I have removed the FlatMap::fold method that caused the debuginfo-related llvm assertion when building librustc. It turned out to be the fold of something using flat_map here that caused it.

This way the other improvements can go in first. The remaining parts should not cause any llvm assertions.

@alexcrichton
Copy link
Member

@bors: r+

@bors
Copy link
Contributor

bors commented Oct 25, 2016

📌 Commit a16626f has been approved by alexcrichton

Manishearth added a commit to Manishearth/rust that referenced this pull request Oct 26, 2016
Implement Iterator::fold for .chain(), .cloned(), .map() and the VecDeque iterators.

Chain can do something interesting here where it passes on the fold
into its inner iterators.

The lets the underlying iterator's custom fold() be used, and skips the
regular chain logic in next.

Also implement .fold() specifically for .map() and .cloned() so that any
inner fold improvements are available through map and cloned.

The same way, a VecDeque iterator fold can be turned into two slice folds.

These changes lend the power of the slice iterator's loop codegen to
VecDeque, and to chains and flat maps of slice iterators, and so on.
It's an improvement for .sum() and .product(), and other uses of fold.
@bors
Copy link
Contributor

bors commented Oct 26, 2016

⌛ Testing commit a16626f with merge 3a25b65...

bors added a commit that referenced this pull request Oct 26, 2016
Implement Iterator::fold for .chain(), .cloned(), .map() and the VecDeque iterators.

Chain can do something interesting here where it passes on the fold
into its inner iterators.

The lets the underlying iterator's custom fold() be used, and skips the
regular chain logic in next.

Also implement .fold() specifically for .map() and .cloned() so that any
inner fold improvements are available through map and cloned.

The same way, a VecDeque iterator fold can be turned into two slice folds.

These changes lend the power of the slice iterator's loop codegen to
VecDeque, and to chains of slice iterators, and so on.
It's an improvement for .sum() and .product(), and other uses of fold.
@bors bors merged commit a16626f into rust-lang:master Oct 26, 2016
@bluss bluss deleted the fold-more branch October 26, 2016 21:59
@bluss bluss added the relnotes Marks issues that should be documented in the release notes of the next release. label Oct 30, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
relnotes Marks issues that should be documented in the release notes of the next release.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants