Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename Iterator::fold_first to reduce and stabilize it #79805

Merged
merged 3 commits into from
Feb 5, 2021

Conversation

m-ou-se
Copy link
Member

@m-ou-se m-ou-se commented Dec 7, 2020

This stabilizes #![feature(iterator_fold_self)].

The name for this function (originally fold_first) was still an open question, but the discussion on the tracking issue seems to have converged to reduce.

@m-ou-se m-ou-se added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. A-iterators Area: Iterators needs-fcp This change is insta-stable, so needs a completed FCP to proceed. labels Dec 7, 2020
@m-ou-se
Copy link
Member Author

m-ou-se commented Dec 7, 2020

@rfcbot merge

@rfcbot
Copy link

rfcbot commented Dec 7, 2020

Team member @m-ou-se has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Dec 7, 2020
@BurntSushi
Copy link
Member

BurntSushi commented Dec 8, 2020

I'm somewhat skeptical of the name reduce. The analogous routines in Python and Javascript seem to have a very flexible API that subsume both the fold and fold_first APIs. So I'm not sure that reduce really corresponds to the specific case of "returns an element of the type that's in the collection." However, it sounds like this distinction is made by Kotlin/Scala? I'm not familiar with either language. Does anyone have links to the docs for their reduce routines?

In the summary put together by @jagill, they said this:

For new people (not from Kotlin/Scala), it may seem arbitrary which functionality is fold and which is reduce.

And I think that's exactly where I fell. When I saw the name reduce, my next thought was, "wait, don't we already have fold?" That is, I came into this thinking that fold and reduce were essentially synonymous. So my concern with the name fold boils down to that: will other people have the same confusion?

The name fold_first isn't exactly great either, but to me it at least signals that it's a fold but a different kind of fold.

Overall, I'd say my opinion here is pretty weak and this is probably a minor quibble. But it would be good to hear if anyone else shares this concern.

@est31
Copy link
Member

est31 commented Dec 8, 2020

I'm not sure that reduce really corresponds to the specific case of "returns an element of the type that's in the collection." However, it sounds like this distinction is made by Kotlin/Scala? I'm not familiar with either language. Does anyone have links to the docs for their reduce routines?

Kotlin docs are here, and at least the implementation on the Iterable interface (which probably corresponds to Rust's Iterators), the closure can take items beyond just the type itself: it can take items of any type that is a supertype of the item iterating over. Rust is no OOP language and can't really express this concept.

The scala docs are here and seem to accept supertypes as well:

B A type parameter for the binary operator, a supertype of A.

As it's no OOP language, Rust doesn't really have supertype vs subtype relationships for all types. The closest it has are traits. But you can't have trait type params in Rust so there would be no way to encode this constraint.

Maybe having an Into bound would bring us closer to Kotlin and Scala:

fn reduce<V>(f: impl Fn(V, Self::Item) -> V) -> Option<V> where Self::Item: Into<V>,  { /*...*/ }

@imjasonmiller
Copy link

I'm hoping this is the right place to comment. I'm not sure if this was already raised.

Coming from JavaScript my bias and preference is for .reduce(), but while JavaScript has a camel cased .reduceRight, Rust seems to prefix these methods with an r.

With Rust seemingly already having .rfold(), I think it might be hard on the eyes to possibly have .rreduce() — if that functionality might be added and the naming convention will be adhered to? In that regard, I think .rfold_first might be easier to read.

@vchekan
Copy link

vchekan commented Dec 9, 2020

I strongly favored reduce name before I've read wikipedia:
https://en.wikipedia.org/wiki/Fold_(higher-order_function)#In_various_languages

Apparently fold and reduce are used interchangeably in the industry and there is no clear distinction between terms. At least I could not identify distinct names for "aggregate elements by applying an operator to them" vs "aggregate elements by starting with initial value and applying an operator to them".

Offtopic: another curious discovery, somehow I thought that fold can have accumulator type different from collection type, to my surprise it is not so. Seems to me like loss of generalization, unless I am missing something?

@KodrAus
Copy link
Contributor

KodrAus commented Dec 9, 2020

I think I echo @BurntSushi's sentiment a bit in not really favoring reduce. I worry about accumulating a lot of noise on Iterator in the form of niche utility methods that makes it harder to find an appropriate chain of combinators for a given usecase (anecdotally that was my early experience with Rx). I'm still a bit skeptical of the method altogether, but think fold_first is an appropriate name to logically group it under fold rather than as an independent operation called reduce.

@m-ou-se
Copy link
Member Author

m-ou-se commented Dec 9, 2020

See #68125 (comment) for my reasoning for the name reduce.

@BurntSushi
Copy link
Member

@rfcbot concern naming

The more I think about this personally, the more I think reduce isn't the right name for this. The wikipedia link above seems like pretty compelling evidence to me that reduce and fold are used interchangeably in a lot of languages. So I feel like providing two different routines with the names that folks would previously presume to be synonymous is more confusing than something like fold_first.

@jagill
Copy link
Contributor

jagill commented Dec 14, 2020

@KodrAus @BurntSushi Would either/both of you be in favor of stabilizing the feature under the name fold_first? Since @KodrAus both wrote the original code and made the tracking issue, I'm particularly interested in understanding his statement "I'm still a bit skeptical of the method altogether".

I would personally really like this feature stabilized. It's in most major languages, and the comments in the tracking issue suggest that many people (myself included) find it very useful. If the other team members would accept fold_first, I'd move we try to stabilize it under that name.

@BurntSushi
Copy link
Member

I would be okay with fold_first.

@m-ou-se
Copy link
Member Author

m-ou-se commented Dec 14, 2020

I still think reduce and fold are the right names for these two functions. They are fundamentally different, and fold does not necessarily reduce (#68125 (comment)). In languages with function overloads, they are often provided as two overloads of the same name, since they are related. However, there seems to be no language where the meaning of fold and reduce is reversed. So, I wouldn't say that they are used interchangeably.

While reading code, x.reduce(|a, b| a * b) is clear without reading documentation for anyone who is used to this kind of operation. However, if I'd see x.fold_first(|a, b| a * b) without context, I'd probably have to look up what 'first' refers to. The first element is only 'special' when you see this as a variant of fold with an initial element. With a commutative and associative operation like in this example, there's nothing special about the first element, and calling it out in the name just makes it confusing.

While writing code, I don't see any problem with someone looking for the fold function by looking up 'reduce'. They'd end up finding a related function that also links to fold with an explanation. (The doc alias already helps too.)

@BurntSushi
Copy link
Member

While reading code, x.reduce(|a, b| a * b) is clear without reading documentation for anyone who is used to this kind of operation.

I definitely agree with that. My problem is that I perceive reduce and fold as effectively synonymous. So when folks look at the docs and see that both operations are provided, I think that could lead to some confusion. Maybe that isn't as big of a problem as I think it is, but I just remember being pretty confused by it when I saw this issue show up in my mail.

I'm not a huge fan of fold_first myself either. And I admit, if I saw fold_first in the code, the unfamiliarity of the name would likely lead me to look it up in the docs to see what it does.

@m-ou-se
Copy link
Member Author

m-ou-se commented Dec 14, 2020

So when folks look at the docs and see that both operations are provided, I think that could lead to some confusion.

I think this is not much of a problem in the docs, since the signatures are visible, there are examples, there's an explanation, and both functions link to eachother in their documentation.

if I saw fold_first in the code, the unfamiliarity of the name would likely lead me to look it up in the docs to see what it does.

I think this is a much bigger problem. A good name can make all the difference while reading new code. Picking a name that would cause people to have to look things up when there is an alternative that wouldn't, just seems bad.

@BurntSushi
Copy link
Member

I think I might be convinced by that. I'll noodle on it. @KodrAus You also expressed skepticism at the name. What do you think?

@withoutboats
Copy link
Contributor

withoutboats commented Dec 14, 2020

I'm glad we're moving away from the reduce name, but I would rather we not stabilize this API at all.

map/filter/fold are the classic iterator combinator examples, but fold is I think rather famously more difficult for users new to the API to understand. Having more variations on the concept I worry will make things more obscure for users trying to get an intuitive understanding of the API surface. I think reduce would be especially bad for this: its common knowledge that reduce and fold are two names for the same concept; I wouldn't envy users new to these API patterns trying to understand why we have both, how they are different, and which to use for which use cases.

On the other hand, for users who already understand the API this replaces a two-liner: a call to next followed by a call to fold; I don't think it really carries its weight.

@vchekan
Copy link

vchekan commented Dec 16, 2020

I'm glad we're moving away from the reduce name, but I would rather we not stabilize this API at all.

map/filter/fold are the classic iterator combinator examples, but fold is I think rather famously more difficult for users new to the API to understand. Having more variations on the concept I worry will make things more obscure for users trying to get an intuitive understanding of the API surface. I think reduce would be especially bad for this: its common knowledge that reduce and fold are two names for the same concept; I wouldn't envy users new to these API patterns trying to understand why we have both, how they are different, and which to use for which use cases.

FP patterns make it into mainstream with force, so nowadays I would expect all competent developers to have conceptual understanding of "fold" and friends. In fact, the opposite is true: developers would expect "fold with init value" and "fold, starting with 1st element" to exist in any competent language/standard library. Lack of either of these would be a) time wasted in searching for it b) mild disappointment due to rust stdlib falling behind even Java in ergonomics ;)

On the other hand, for users who already understand the API this replaces a two-liner: a call to next followed by a call to fold; I don't think it really carries its weight.

Well, it is two-liner now.
We need to declare additional variable for iterator just to satisfy fold.
The folding operation now is not on a: &T, b: &T but on a: Option<&T>, b: &T , so I have to use additional map inside the folding function.

Here is the difference:

    let v = vec![1,2,3];
    let mut vi = v.iter();
    v.iter().fold(vi.next().cloned(), |a, i| a.map(|a| a + i));

    vec![1,2,3].iter().cloned().fold_first(|a, b| a + b);

@BurntSushi
Copy link
Member

FP patterns make it into mainstream with force, so nowadays I would expect all competent developers to have conceptual understanding of "fold" and friends. In fact, the opposite is true: developers would expect "fold with init value" and "fold, starting with 1st element" to exist in any competent language/standard library. Lack of either of these would be a) time wasted in searching for it b) mild disappointment due to rust stdlib falling behind even Java in ergonomics

I almost never use folds in Rust code, and instead prefer for loops because I think folds are more difficult to read. So I don't really feel like this addition is that great. But, I also don't feel strongly against it either, because folds are occasionally nice and I recognize that reasonable people can disagree. And in particular, I think it's bad to start making decisions on some notion of what a "competent" developer knows or doesn't know.

I would rather not try to litigate the usefulness or clarity of folds in general here. That's just not going to move us forward. But rather, whether this specific API carries its own weight.

@KodrAus
Copy link
Contributor

KodrAus commented Dec 16, 2020

@jagill

wrote the original code

Ah, I eventually approved the original PR after it was shown that we could replace a few ad-hoc helpers in the rustc codebase with it, but I didn't write the feature myself 🙂

I think @withoutboats captured my skepticism very succinctly:

for users who already understand the API this replaces a two-liner: a call to next followed by a call to fold; I don't think it really carries its weight.

Most methods on Iterator that build on lower-level combinators replace just a handful of calls you'd have to otherwise make. We always have to balance that utility against collecting an unbounded number of unrelated methods that becomes harder to find them. I wouldn't want to introduce this as a new "top-level" method called reduce, but wouldn't be against nesting it under fold as fold_first.

@WaffleLapkin
Copy link
Member

@vchekan's code is a good example showing why we need this method - he has written the code that is both incorrect and inefficient:

let v = vec![1,2,3];

// Will result in `7` because .iter() is called twice
let mut vi = v.iter();
v.iter().fold(vi.next().cloned(), |a, i| a.map(|a| a + i));

// Correct version
let mut vi = v.iter();
vi.next().cloned().map(|a| vi.fold(a, Add::add));

(playground)

std method would guard against this problem :)

About the naming: I'm personally still in favour of reduce - IMO it's both nicer and more clear when reading code (also nice if you came from Kotin/Scala/etc background). Also fold1 is better than fold_first because it's already adopted by itertools, Haskell, etc.

@imjasonmiller
Copy link

Also fold1 is better than fold_first because it's already adopted by itertools

As an outsider, I'm not sure if I agree with the assertion of fold1 being better, if that is solely due to itertools and Haskell using it as well? If it's due to familiarity, I think grouping it with fold and using the first suffix makes sense due to its explicitness and it already being omnipresent within Rust, e.g. first, first_mut, split_first and so on?

Coming from JavaScript I preferred reduce at first, but admittedly I do think the given reasoning for grouping it with fold as fold_first is sound.

From what I gather, this functionality is already considered niche, so I assume it won't be added, but reduce also has the added constraint of it becoming difficult to read/scan if it were to have its own counterpart to rfold. I find a possible rfold_first much easier to read than rreduce, even though I suppose one could do .rev().reduce() as well.

@camsteffen
Copy link
Contributor

I like the fold+reduce naming convention. But I am surprised at how seldom it has been adopted by other languages. One notable example that is missing from the Wikipedia page mentioned is Dart.

For me, the difference between fold and reduce is well ingrained in my mind and I like having the distinction. I agree with @m-ou-se and others that they are fundamentally different. But I now realize that this is because I have extensive experience in both Kotlin and Dart.

Rust is in the minority of languages that does not have function overloading. So it is more useful in Rust to take advantage of more "fine-grained" naming conventions. Also, the difference between fold and reduce is more pronounced in strongly typed languages.

rev().reduce() seems fine to me. Are there downsides I'm unaware of?

@traviscross
Copy link
Contributor

Having worked in many languages, I consider either reduce or fold1 (or fold_first if we must be verbose) to be fine.

Fold and reduce are definitely synonyms. There's no question about that, as @BurntSushi has pointed out. But Scheme provided both fold-left and reduce-left for these two cases. That's sufficient precedent for me. It's probably why Scala and Kotlin went that way as well. But like I say, fold1 (or fold_first if we must) is equally clear, has strong precedent from Haskell, and has the advantage of grouping better in the documentation.

Regarding whether the method pulls its weight, this is a case where it's more surprising to not have the method. Nearly every language that provides fold provides fold1, either as a separate function, or in languages that support it, by way of an optional argument. The fact that it's small and easy to implement just makes it that much more aggravating when you go to reach for it and realize it's not there.

@WaffleLapkin
Copy link
Member

There is also precedent in reduce crate which provides reduce extension function for iterators.

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Feb 4, 2021
@m-ou-se m-ou-se force-pushed the iterator-reduce branch 2 times, most recently from a681d48 to 24e0940 Compare February 4, 2021 10:31
@m-ou-se
Copy link
Member Author

m-ou-se commented Feb 4, 2021

Rebased to include a new usage of the feature in compiler/.

@bors r=KodrAus

@bors
Copy link
Contributor

bors commented Feb 4, 2021

📌 Commit 24e0940 has been approved by KodrAus

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Feb 4, 2021
m-ou-se added a commit to m-ou-se/rust that referenced this pull request Feb 4, 2021
Rename Iterator::fold_first to reduce and stabilize it

This stabilizes `#![feature(iterator_fold_self)]`.

The name for this function (originally `fold_first`) was still an open question, but the discussion on [the tracking issue](rust-lang#68125) seems to have converged to `reduce`.
m-ou-se added a commit to m-ou-se/rust that referenced this pull request Feb 4, 2021
Rename Iterator::fold_first to reduce and stabilize it

This stabilizes `#![feature(iterator_fold_self)]`.

The name for this function (originally `fold_first`) was still an open question, but the discussion on [the tracking issue](rust-lang#68125) seems to have converged to `reduce`.
bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 4, 2021
Rollup of 9 pull requests

Successful merges:

 - rust-lang#74304 (Stabilize the Wake trait)
 - rust-lang#79805 (Rename Iterator::fold_first to reduce and stabilize it)
 - rust-lang#81556 (introduce future-compatibility warning for forbidden lint groups)
 - rust-lang#81645 (Add lint for `panic!(123)` which is not accepted in Rust 2021.)
 - rust-lang#81710 (OsStr eq_ignore_ascii_case takes arg by value)
 - rust-lang#81711 (add #[inline] to all the public IpAddr functions)
 - rust-lang#81725 (Move test to be with the others)
 - rust-lang#81727 (Revert stabilizing integer::BITS.)
 - rust-lang#81745 (Stabilize poison API of Once, rename poisoned())

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 5b0acfd into rust-lang:master Feb 5, 2021
@rustbot rustbot added this to the 1.51.0 milestone Feb 5, 2021
@apiraino apiraino removed the to-announce Announce this issue on triage meeting label Feb 11, 2021
@m-ou-se m-ou-se deleted the iterator-reduce branch February 11, 2021 19:12
@pickfire
Copy link
Contributor

@m-ou-se I think reduce being a doc-alias of fold currently was not being discussed. It would be confusing when both fold and reduce to show up when you search for reduce.

By the way, happy chinese new year!

@arnabanimesh
Copy link

@m-ou-se I think reduce being a doc-alias of fold currently was not being discussed. It would be confusing when both fold and reduce to show up when you search for reduce.

By the way, happy chinese new year!

If you search for reduce, it shows fold. And everybody is fine with it. Why should it not be vice versa?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-iterators Area: Iterators disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. needs-fcp This change is insta-stable, so needs a completed FCP to proceed. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. S-waiting-on-fcp Status: PR is in FCP and is awaiting for FCP to complete. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.