Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Iterator::advance_by and DoubleEndedIterator::advance_back_by #76909

Merged
merged 1 commit into from
Oct 1, 2020

Conversation

timvermeulen
Copy link
Contributor

This PR adds the iterator method

fn advance_by(&mut self, n: usize) -> Result<(), usize>

that advances the iterator by n elements, returning Ok(()) if this succeeds or Err(len) if the length of the iterator was less than n.

Currently Iterator::nth is the method to override for efficiently advancing an iterator by multiple elements at once. advance_by is superior for this purpose because

  • it's simpler to implement: instead of advancing the iterator and producing the next element you only need to advance the iterator
  • it composes better: iterators like Chain and FlatMap can implement advance_by in terms of advance_by on their inner iterators, but they cannot implement nth in terms of nth on their inner iterators (see Iterator::nth doesn't compose well #60395)
  • the default implementation of nth can trivially be implemented in terms of advance_by and next, which this PR also does

This PR also adds DoubleEndedIterator::advance_back_by for all the same reasons.

I'll make a tracking issue if it's decided this is worth merging. Also let me know if anything can be improved, this went through several iterations so there might very well still be room for improvement (especially in the doc comments). I've written overrides of these methods for most iterators that already override nth/nth_back, but those still need tests so I'll add them in a later PR.

cc @cuviper @scottmcm @Amanieu

@rust-highfive
Copy link
Collaborator

r? @dtolnay

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 19, 2020
@scottmcm
Copy link
Member

scottmcm commented Sep 19, 2020

I'll add that this would also make Skip simpler -- right now it has // nth(n) skips n+1 comments everywhere: for example https://doc.rust-lang.org/1.46.0/src/core/iter/adapters/mod.rs.html#2035-2043

I find this change pretty elegant:

     #[inline]
     fn last(mut self) -> Option<I::Item> {
-         if self.n > 0 {
-             // nth(n) skips n+1
-             self.iter.nth(self.n - 1)?;
-         }
+        self.iter.advance_by(self.n).ok()?;
         self.iter.last()
     }

Both for simpler code and because it folds the n > 0 check into the one that the wrapper iterator's advance_by fundamentally has to do anyway, which should be easier for LLVM to optimize well.


Edit: It would also accomplish the goal I'd mentioned in https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/Eager.20skipping.20for.20iterators/near/209044788. The suggested solution in https://users.rust-lang.org/t/hide-beginning-of-vec/48274/3?u=scottmcm wouldn't need the // consume 0..=15 comment, because of course .advance_by(16).unwrap(); is the eager iterator version of v[16..].

@Amanieu
Copy link
Member

Amanieu commented Sep 20, 2020

The problem with using advance_by instead of nth everywhere is that you lose optimizations for user-defined iterators that define nth (which is stable) but not advance_by (which is not).

@timvermeulen
Copy link
Contributor Author

Yeah, code that calls nth on potentially a user's iterator can't be changed yet (or maybe ever) so we'll unfortunately need to keep most iterator adapters' nth implementations alongside advance_by.

For a couple iterator types that don't wrap another iterator (such as slice::Iter, Chunks, Windows) we could potentially replace the nth implementation, assuming there's no performance impact. And iter::Chain::nth could be changed to call advance_by on the first iterator and nth on the second.

@the8472
Copy link
Member

the8472 commented Sep 21, 2020

The problem with using advance_by instead of nth everywhere is that you lose optimizations for user-defined iterators that define nth (which is stable) but not advance_by (which is not).

If advance_by were implemented on ExactSizeIterator instead then the default implementation could use nth internally by reading the original length before using nth and then returning that amount in the error path.

@timvermeulen
Copy link
Contributor Author

@the8472 You're certainly right, but on ExactSizeIterator it wouldn't be as useful, because the main benefit comes from overriding it on types such as Chain that can implement advance(by:) and not ExactSizeIterator.

Implementing advance_by in terms of nth would also mean we couldn't have the default implementation of nth call advance_by, which would be unfortunate, because implementing advance_by is often simpler than implementing nth.

@Amanieu
Copy link
Member

Amanieu commented Sep 30, 2020

@bors r+

@bors
Copy link
Contributor

bors commented Sep 30, 2020

📌 Commit ecacc75 has been approved by Amanieu

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 30, 2020
bors added a commit to rust-lang-ci/rust that referenced this pull request Oct 1, 2020
Rollup of 12 pull requests

Successful merges:

 - rust-lang#76909 (Add Iterator::advance_by and DoubleEndedIterator::advance_back_by)
 - rust-lang#77153 (Fix recursive nonterminal expansion during pretty-print/reparse check)
 - rust-lang#77202 (Defer Apple SDKROOT detection to link time.)
 - rust-lang#77303 (const evaluatable: improve `TooGeneric` handling)
 - rust-lang#77305 (move candidate_from_obligation_no_cache)
 - rust-lang#77315 (Rename AllocErr to AllocError)
 - rust-lang#77319 (Stable hashing: add comments and tests concerning platform-independence)
 - rust-lang#77324 (Don't fire `const_item_mutation` lint on writes through a pointer)
 - rust-lang#77343 (Validate `rustc_args_required_const`)
 - rust-lang#77349 (Update cargo)
 - rust-lang#77360 (References to ZSTs may be at arbitrary aligned addresses)
 - rust-lang#77371 (Remove trailing space in error message)

Failed merges:

r? `@ghost`
@bors
Copy link
Contributor

bors commented Oct 1, 2020

⌛ Testing commit ecacc75 with merge b218b95...

@bors bors merged commit 8bd4ed9 into rust-lang:master Oct 1, 2020
@rustbot rustbot added this to the 1.48.0 milestone Oct 1, 2020
@ecstatic-morse
Copy link
Contributor

ecstatic-morse commented Oct 6, 2020

@timvermeulen The rollup containing this PR caused a large regression in instruction counts in the deeply-nested benchmark (a stress-test for iterator chaining). It also caused a small instruction count increase across all other benchmarks. #77594 has not fixed this.

Reverting the changes to nth and nth_back should fix this. @timvermeulen could you give this a try? @Amanieu, please consider marking PRs that change widely used iterator methods as rollup=never in the future, or schedule a perf run pre-emptively.

@scottmcm
Copy link
Member

scottmcm commented Oct 6, 2020

That's this one, right? https://github.com/rust-lang/rustc-perf/blob/master/collector/benchmarks/deeply-nested/src/lib.rs

I think that one will regress massively any time an object-safe method is added to Iterator, since it uses Box<dyn Iterator> so all those methods need to be codegened for the vtable. So I don't think reverting the nth changes will fix that part; we might just have to accept it -- unless we decide we just can't add another object-safe method ever again. (Also, that test is massively unrealistic because it's an empty iterator of ()s, and even changing it to an empty iterator of i32 already makes it take 30+ minutes, let alone making it actually have elements.)

That said, the small regressions in other things are worth looking at.

@ecstatic-morse
Copy link
Contributor

ecstatic-morse commented Oct 6, 2020

Ah, didn't even notice the trait object aspect of that benchmark. I'm more concerned about the small regression elsewhere, but perhaps it's a similar issue. I think it's worth doing a perf run with the changes to nth reverted to find out.

@scottmcm
Copy link
Member

scottmcm commented Oct 6, 2020

Yeah, that does sound like a reasonable thing to try. I'd also suggest manually implementing it for slice::Iter -- that's the common iterator usage and apparently hand-coding things for that can help.

@timvermeulen
Copy link
Contributor Author

I was going to add slice::Iter::advance_by next, let's see how that compares.

@timvermeulen
Copy link
Contributor Author

Alright, that doesn't really seem to have changed anything. Let's try reverting the nth changes next, though I don't think that's something we can realistically merge if that turns out to be it, because having nth and advance_by exist independent of each other would get very confusing...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants