Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stabilize step_by by adding it to Iterator (issue #27741) #41439

Merged
merged 3 commits into from
May 19, 2017

Conversation

ivandardi
Copy link
Contributor

Inspired by itertools' take() method. See issue #27741

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @BurntSushi (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@ivandardi
Copy link
Contributor Author

This is my first commit to Rust! :D I probably screwed something up

I may need help with the docstrings. I feel like they can be improved. Also, I couldn't get it to compile without adding the #[stable] tags. Which are the proper tags to put there?

#[inline]
#[stable(feature = "rust1", since = "1.0.0")]
fn every_nth(self, step: usize) -> EveryNth<Self> where Self: Sized {
assert!(step != 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if I'd expect this method to panic, especially if I'm using it programmatically. I'm in no position to define this, but: If this assert stays, I'd add a custom message.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would personally expect the behaviour of .peek() for every_nth(0).next(). Troublesome to implement though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nagisa more like impossible without cloning (peak returns a reference, next would return a value).

From the description ("skipping step elements at a time"), I'd expect every_nth(0) to return every item and every_nth(1) to skip every other item. This would also be consistent with nth.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think it would be best to have that behavior instead. I'll make the changes now.

@@ -145,6 +145,22 @@ fn test_iterator_chain_find() {
}

#[test]
fn test_every_nth_one() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add another test (with #[should_panic]) to test .every_nth(0)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, such a test should be added.

@shepmaster shepmaster added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Apr 21, 2017
@alexcrichton alexcrichton added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Apr 21, 2017
@ollie27
Copy link
Member

ollie27 commented Apr 21, 2017

How about if the number passed to every_nth is the number of elements that it should skip? It would mean every_nth(0) would return every element, every_nth(1) would return every other element and so on. It would match nth and remove the awkward panic.

@ivandardi
Copy link
Contributor Author

@ollie27 How would every_nth(1) work?

Like this:

let mut it = (0..).every_nth(1).take(3);
assert_eq!(it.next(), Some(0));
assert_eq!(it.next(), Some(2));
assert_eq!(it.next(), Some(4));
assert_eq!(it.next(), None);

Or like this:

let mut it = (0..).every_nth(1).take(3);
assert_eq!(it.next(), Some(1));
assert_eq!(it.next(), Some(3));
assert_eq!(it.next(), Some(5));
assert_eq!(it.next(), None);

@ollie27
Copy link
Member

ollie27 commented Apr 21, 2017

@ivandardi that's a good question. Both seem reasonable to me.

@nagisa
Copy link
Member

nagisa commented Apr 21, 2017

How would every_nth(1) work?

Probably the first. You can get the behaviour of the 2nd code sample by skipping before hand, whereas you cannot get the first behaviour from the second.

@leonardo-m
Copy link

This is how Python behaves:

>>> range(10)[::0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: slice step cannot be zero
>>> range(10)[::1]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> range(10)[::2]
[0, 2, 4, 6, 8]

@ivandardi
Copy link
Contributor Author

Hm... It would be good to keep consistency with other languages too. If we keep the Python behavior, then the current implementation is correct and all it's needed is the #[should_panic] test.

After that we need to discuss whether or not to keep the name every_nth for clarity, how to fix the #[stable] tags and possibly the removal of the Step trait altogether.

@nagisa
Copy link
Member

nagisa commented Apr 24, 2017

If we keep the Python behavior, then the current implementation is correct

Too bad Python has a terrible behaviour in a lot of cases. I feel that at this point it is more in the garden of T-libs to decide what’s the optimal behaviour here.

Here’s what the documentation for the behaviour I would prefer:

Returns an iterator adaptor which yields the next element and then skips the specified number of elements before yielding another element. `every_nth(0)` is an identity adaptor.

# Examples

```
let mut it = (0..).every_nth(0).take(2);
assert_eq!(it.next(), Some(0));
assert_eq!(it.next(), Some(1));
assert_eq!(it.next(), None);
```

```
let mut it = (0..).every_nth(2).take(3);
assert_eq!(it.next(), Some(0));
assert_eq!(it.next(), Some(3));
assert_eq!(it.next(), Some(6));
assert_eq!(it.next(), None);
```

Implementation wise it also needs to be decided whether it is fine for iterator to be implemented as

fn next(&mut self) -> Option<Self::Item> { let n = self.next(); self.take(self.n).count(); n }
// as opposed to
fn next(&mut self) -> Option<Self::Item> { if self.first_take { self.next() } else { self.skip(self.n).next() } }

as these two have distinct behaviour in terms of the state the underlying iterator is after a next. Implementation approach will likely dictate the documentation wording as well.

@nagisa nagisa added S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 24, 2017
@nagisa
Copy link
Member

nagisa commented Apr 24, 2017

I’ll take liberty of tagging this as waiting-on-team, as there’s clearly nothing the author can act on.

fn next(&mut self) -> Option<Self::Item> {
let elt = self.iter.next();
if self.step > 0 {
self.iter.nth(self.step - 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If nth returns None here, then the behavior of calling next again is unspecified. Shouldn't that be avoided?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the iterator could be fused so that it never returns a Some(T) again. But I don't know if that's desired behavior, because one could voluntarily .fuse() beforehand.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the problem here is that iter.next() could return Some(..) and internally hit a None, but calling next() again could return Some(..). So the user is not actually informed that the iterator returned None once, making the behavior of every_nth itself unspecified without using fuse.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Itertools fuses the iterator internally too, so I don't see why not fuse it here too. We'd just need to leave it explicit in the documentation.

Copy link
Contributor

@johncf johncf Apr 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can think of a way to avoid fusing. But that would involve consuming an element when every_nth is called, which might be (slightly?) against the "laziness" principle of iterator combinators.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least in the basic use case of stepping through ranges, the compiler should be able to inline the whole thing and get rid of every abstraction (the fuse iterator or the first_take flag, nth etc). For other use cases, it might be impossible to optimise away the branch, but it might also be very bad to consume elements in advance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but it might also be very bad to consume elements in advance.

I wasn't clear enough, I think. I said consume "the first element" in advance, not "all." I'm not quite sure if even consuming the first element in advance could be "bad," except perhaps weakly breaking a principle.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if the fused/first_take approaches have the same branching, the current implementation and the one I pointed out do not have the same complexity from the point of view of work on the inner iterator, because of the early nth.
If the final consumer of the every_nth iterator only takes the first element, the underlying iterator will consume 1+n elements with the current implementation (as in @nagisa's first option), while one might expect it to only consume 1 element (as in the second option).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johncf I think a similar rationale applies to Peekable. It would be easier to write it so that it always contains the current element (or None). Instead it was explicitly coded to keep track of whether the element has been pulled from the inner iterator.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ranma42 I'm guessing, that rationale would indeed matter when the iterator is consuming data from a network stream, for instance. In that case, consuming any element in advance might cause unexpected blocking behavior.

/// assert_eq!(iter.next(), None);
/// ```
#[inline]
#[stable(feature = "rust1", since = "1.0.0")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll want to fix this attribute. It should be given its own feature name and it should be unstable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I plan to do that. I just didn't know how when I made the pull request. Also, the feature name depends on whether we get rid of the Step trait or if we'll keep the step trait and have this as every_nth. I personally prefer the former, since there's some code that already uses step_by and wouldn't break as bad with a change in the implementation. That would mean, however, that I'd have to reimplement Iterator for Range, which is not hard, but would definitely take some time. Once all that gets figured out, I'll put the proper attributes.

Copy link
Member

@scottmcm scottmcm May 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does taking the name step_by really require removing the Step trait?

Removing step, steps_between, and is_negative from Step seems logical, as does replacing core::iter::StepBy with the logic worked out here. But redoing all the Range iterators feels like it shouldn't be necessary for this change, and I don't see how the existence of Step conflicts with it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because if we rename every_nth to step_by, then there's going to be a method name conflict for ranges, because there would be a step_by from the Iterator trait, and a step_by from the Range struct.

Technically we wouldn't need to remove the Step trait, but if we switch the Range::step_by implementation to the iterator version, then there's going to be no use for the Step trait in the core lib. Besides, it looks like the only implementors of Step are the normal integers. Nobody really uses the Step trait to increment an integer when they can just use += 1. And if people really wanted it, we could just implement Iterator for them and have them return the integer plus one. It would work the same.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ivandardi I think we should just go with a separate feature name for now. We can do the removal in another PR.

@BurntSushi
Copy link
Member

cc @rust-lang/libs Are we sufficiently past the step_by stuff that we'd be willing to take every_nth?

@BurntSushi
Copy link
Member

@ivandardi The libs team discussed this today, and it seems like we're all on board with this direction. It looks like there are still some changes you wanted to make before merging though? For example, is it intentional that every_nth(0) panics?

@BurntSushi BurntSushi added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). labels May 9, 2017
@nagisa
Copy link
Member

nagisa commented May 9, 2017

@BurntSushi I feel like there has been some mis-comunication? It was on the libs team to decide on one of the many proposed behaviours, and I do not see any decision wrt that.

@BurntSushi
Copy link
Member

@nagisa I see. I'll take a closer look and try to summarize.

@ivandardi
Copy link
Contributor Author

Yes, it's expected that every_nth(0) panics, otherwise it would enter an infinite loop of yielding the first element forever. It's consistent with the behavior seen in Python too.

I'll submit a commit making the changes to the feature name and making it unstable hopefully later today.

@BurntSushi BurntSushi added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 9, 2017
@nagisa
Copy link
Member

nagisa commented May 10, 2017

Yes, it's expected that every_nth(0) panics, otherwise it would enter an infinite loop of yielding the first element forever. It's consistent with the behavior seen in Python too.

You did agree before that having every_nth(0) be identity iterator transformer could make sense. Have you changed your mind since?

@ivandardi
Copy link
Contributor Author

@nagisa Nope. Makes more sense for every_nth(1) to be the identity, because it means that it will give every 1st element, aka the next element. It also goes in hand with what other languages have, and it's important to have familiar behavior. Yeah, I know I went back and forth in this, but this hasn't been easy to decide! XD But I think everyone is pretty settled on it now.

Also, I was testing with step_by(0) always yielding the first element of the iterable and I actually accidentally crashed my browser a lot of times in the playground while testing it out. If you have a for i in (0..10).every_nth(0), it will loop forever unless you have a break inside the loop. Combine that with the fact that people might have something like

usize step = do_calculation()
for i in (0..len).step_by(step)

where step might accidentally be 0 and it's a really bad thing. I now see why Python throws an exception when the step parameter in their ranges is 0 and that's why I'm also making it panic on step_by(0).

@ollie27
Copy link
Member

ollie27 commented May 10, 2017

Makes more sense for every_nth(1) to be the identity, because it means that it will give every 1st element, aka the next element.

That doesn't match nth:

Like most indexing operations, the count starts from zero, so nth(0) returns the first value, nth(1) the second, and so on.

@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 19, 2017
@ivandardi
Copy link
Contributor Author

Awesome! Yeah, it took one month to get this going! At least I'm glad it works. After this, there needs to be another PR to take care of StepByDeprecated and reimplement Iterator for Ranges.

Oh, and a last question: Is it my Github name that gets added to the contributors list automatically? If yes, then I just updated it. :D

@bors
Copy link
Contributor

bors commented May 19, 2017

⌛ Testing commit 4955517 with merge 2ed2222...

@bors
Copy link
Contributor

bors commented May 19, 2017

💔 Test failed - status-appveyor

@BurntSushi
Copy link
Member

Oh, and a last question: Is it my Github name that gets added to the contributors list automatically? If yes, then I just updated it. :D

Hmm not sure. @steveklabnik ?

@CryZe
Copy link
Contributor

CryZe commented May 19, 2017

I'm pretty sure it's just the name and email you committed as.

@alexcrichton
Copy link
Member

@bors
Copy link
Contributor

bors commented May 19, 2017

⌛ Testing commit 4955517 with merge 543691d...

bors added a commit that referenced this pull request May 19, 2017
Stabilize step_by by adding it to Iterator (issue #27741)

Inspired by itertools' `take()` method. See issue #27741
@bors
Copy link
Contributor

bors commented May 19, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: BurntSushi
Pushing 543691d to master...

@bors bors merged commit 4955517 into rust-lang:master May 19, 2017
scottmcm added a commit to scottmcm/rust that referenced this pull request May 20, 2017
Follow-up to rust-lang#41439 (comment)

While doing so, remove the now-unused `step`, `steps_between`, and `is_negative` methods from `Step`.

Mostly simple, but needed two interesting changes:
* Override `Iterator::size_hint` for `iter::StepBy` (so hints aren't lost)
* Override `Iterator::size_hint` for `ops::RangeFrom` (so `(0..).size_hint()` returns what `(0..).step_by(1).size_hint()` used to)
(It turns out that `(0..).step_by(d)` is used in a bunch of tests, from `cycle` to `vec_deque`.)

Incidentally fixes rust-lang#41477
@birkenfeld
Copy link
Contributor

Isn't there a specialization for the Range types missing?

Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this pull request Jun 1, 2017
…alexcrichton

Deprecate range-specific `step_by`

Deprecation attributes and test updates only.

Was replaced by an any-iterator version in rust-lang#41439

Last follow-up (this release) to rust-lang#42110 (comment)

r? @alexcrichton
Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this pull request Jun 1, 2017
…alexcrichton

Deprecate range-specific `step_by`

Deprecation attributes and test updates only.

Was replaced by an any-iterator version in rust-lang#41439

Last follow-up (this release) to rust-lang#42110 (comment)

r? @alexcrichton
Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this pull request Jun 2, 2017
…alexcrichton

Deprecate range-specific `step_by`

Deprecation attributes and test updates only.

Was replaced by an any-iterator version in rust-lang#41439

Last follow-up (this release) to rust-lang#42110 (comment)

r? @alexcrichton
Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this pull request Jun 2, 2017
…alexcrichton

Deprecate range-specific `step_by`

Deprecation attributes and test updates only.

Was replaced by an any-iterator version in rust-lang#41439

Last follow-up (this release) to rust-lang#42110 (comment)

r? @alexcrichton
bors added a commit that referenced this pull request Jul 4, 2017
Delete deprecated & unstable range-specific `step_by`

Using the new one is annoying while this one exists, since the inherent method hides the one on iterator.

Tracking issue: #27741
Replacement: #41439
Deprecation: #42310 for 1.19
Fixes #41477
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.