Allow loops to return values other than () #352

Closed
wants to merge 8 commits into from

Conversation

ftxqxd
Contributor

@ftxqxd ftxqxd commented Oct 4, 2014

Extend for, loop, and while loops to allow them to return values other than ():

  • add an optional else clause that is evaluated if the loop ended without
    using break;
  • add an optional expression parameter to break expressions to break out of
    a loop with a value.

Rendered view

(See also this discuss thread.)

@netvl

netvl commented Oct 4, 2014

The idea is nice, but, FWIW, your motivating example could be rewritten much more nicely without breaks:

fn find(list: Vec<int>, val: int) -> Option<uint> {
    for (i, v) in list.iter().enumerate() {
        if *v == val {
            return Some(i);
        }
    }
    None
}

so maybe it doesn't quite fit as a motivating example :)

@huonw
Member

huonw commented Oct 4, 2014

Or just list.iter().position(|v| *v == val).
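
For readers on current Rust, the two suggestions above can be sketched roughly as follows. This is only an illustration under some assumptions: the 2014 types `int` and `uint` are swapped for `i32` and `usize`, the list is borrowed as a slice rather than taken by value, and the function names `find_with_return` and `find_with_position` are invented for the example.

```rust
/// Early `return` from inside the loop, as in the snippet above.
/// (Assumption: `i32`/`usize` stand in for the old `int`/`uint`.)
fn find_with_return(list: &[i32], val: i32) -> Option<usize> {
    for (i, v) in list.iter().enumerate() {
        if *v == val {
            return Some(i);
        }
    }
    None
}

/// The same search expressed with `Iterator::position`.
fn find_with_position(list: &[i32], val: i32) -> Option<usize> {
    list.iter().position(|v| *v == val)
}
```

Both functions give the index of the first match, or `None` when the value is absent, without needing `break` at all.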

@blaenk
Contributor

blaenk commented Oct 4, 2014

What are the semantics of breaking to a label with a value?

@codyps

codyps commented Oct 4, 2014

Hmmm... frankly I'd prefer a way to have loops be expressions returning something with the Iterator trait.

@glaebhoerl
Contributor

I would have eventually proposed this myself, so let me try to motivate it:

  1. Observe that the only way a loop can exit is via break. In general, expressions exit together with evaluating to a value. It's therefore straightforward to generalize loop to also evaluate to a value by supplying one to break.
  2. In the case of while and for..in, the loop may also exit "normally" without encountering a break. In this case it's not as obvious what value the loop could evaluate to. One simple solution is to add an optional else block which is evaluated iff no break was encountered.
  3. It turns out that there's precedent for this from Python, which also defines else on loops in exactly the same way. This suggests that we're on the right track, and not sailing off into uncharted waters. (Multiple people also came up with the same idea seemingly independently in the discourse thread.)

Essentially the claim is that having all of our major constructs be useful as expressions would be really, really cool, and this seems like unambiguously the right way to do it.

Iterators are nice, but if already-existing language constructs can be profitably improved and generalized, "iterators are also nice" is not really a compelling reason to miss out on doing so.
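
As a point of reference, `break` with a value on a bare `loop` is accepted by current Rust, so point 1 can be tried directly. The sketch below is illustrative only; the function name `first_power_of_two_above` is invented here and is not part of the RFC.

```rust
/// Returns the smallest power of two strictly greater than `n`.
/// The only way out of `loop` is `break`, so the value of the loop
/// is whatever expression is handed to `break`.
fn first_power_of_two_above(n: u32) -> u64 {
    let mut p: u64 = 1; // u64 so that doubling cannot overflow for any u32 input
    loop {
        if p > u64::from(n) {
            break p; // the whole `loop` expression evaluates to `p`
        }
        p *= 2;
    }
}
```

Because there is no "normal" exit from `loop`, no `else` clause is needed for the expression to always have a value; that question only arises for `while` and `for`, as point 2 notes.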

@blaenk
Contributor

blaenk commented Oct 4, 2014

I agree with this for the reason that it increases consistency with regard to what constructs can be used as expressions.

I'd still appreciate clarification on what a break to a label with a value would mean, though.

@huonw
Member

huonw commented Oct 4, 2014

It turns out that there's precedent for this from Python, which also defines else on loops in exactly the same way. This suggests that we're on the right track, and not sailing off into uncharted waters. (Multiple people also came up with the same idea seemingly independently in the discourse thread.)

@glaebhoerl FWIW, many, many Python users don't know about this, and many that do know about it find remembering the correct semantics hard.

(To be clear, I'm not arguing against this feature: I have also thought about it favourably in the past, though I don't have a particular desire for it to be added pre-1.0.)

I'd still appreciate clarification on what a break to a label with a value would mean, though.

@blaenk What do you mean by this? The fact that something like let x = 'a: loop { break 'a 12; }; is not discussed?

@blaenk
Contributor

blaenk commented Oct 4, 2014

The fact that it is mentioned. I'm on my phone so I can't quote easily, but it says something like "a value can be written after the label if any is present," and I'm wondering what the semantics would be then.

Add an optional expression parameter to break statements following the label (if any).

@huonw
Member

huonw commented Oct 4, 2014

Oh, I missed it, because there's no code example (that seems like something that warrants an example, for clarity).

The semantics are presumably the same as a normal break/return. It jumps to the end of the corresponding loop and sets the value of the loop expression to be the passed value: a labelled break is not particularly different to returning from the whole function from deep inside a loop, it's just replacing "function" with "outer loop".
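
Since no example accompanies the labelled case, here is a minimal sketch of the semantics described above, written against current Rust (where `break` with a value is accepted on `loop`). The function name `find_factor_pair` and its search problem are made up purely for illustration.

```rust
/// Finds the first pair (i, j) in 1..=limit with i * j == target.
fn find_factor_pair(target: u32, limit: u32) -> Option<(u32, u32)> {
    let mut i = 1;
    'outer: loop {
        if i > limit {
            break None; // unlabelled: exits the innermost loop, which is 'outer here
        }
        let mut j = 1;
        loop {
            if j > limit {
                break; // leaves only the inner loop, which evaluates to ()
            }
            if i * j == target {
                // Jumps past the end of 'outer and makes that loop
                // expression evaluate to the value given here.
                break 'outer Some((i, j));
            }
            j += 1;
        }
        i += 1;
    }
}
```

The labelled `break` behaves like a `return` whose target is the outer loop instead of the function: the inner loop is abandoned and never produces a value of its own.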

@blaenk
Contributor

blaenk commented Oct 4, 2014

Right, that makes sense, thanks Huon. For some reason I was forgetting about the semantics of labeled breaks and was thinking they jumped to the loop so that the loop continued again, something like a labeled continue.

@tbu-
Contributor

tbu- commented Oct 5, 2014

For Python: I think it is a relatively unknown feature there, and TBH when I first encountered it I parsed else as "execute this if the loop doesn't execute, i.e. if there's 0 iterations".

@Ericson2314
Contributor

@P1start Thanks for doing this! Can you / do you want to add being able to break out of naked blocks the same way? break would still break to the innermost loop / innermost block by default to not scare C/C++ programmers and be backwards compatible. However using labels one can break out of blocks instead.

Another advantage is that this can be used to implement @glaebhoerl's try..catch, or even just be used instead of it.
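
For what it is worth, recent versions of Rust do accept a labelled `break` with a value out of a plain block, which is close to what is being asked for here. A small sketch, with the function name `classify` and the label `'check` chosen only for the example:

```rust
/// Classifies an integer by breaking out of a labelled block early.
fn classify(n: i32) -> &'static str {
    let label = 'check: {
        if n < 0 {
            break 'check "negative"; // leaves the block with this value
        }
        if n == 0 {
            break 'check "zero";
        }
        "positive" // value of the block when no `break` is taken
    };
    label
}
```

An unlabelled `break` does not target such a block, so existing loops nested around or inside it keep their usual meaning.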

@pnkfelix
Member

pnkfelix commented Oct 9, 2014

assigning to @pnkfelix to shepherd

@pnkfelix pnkfelix self-assigned this Oct 9, 2014
@chris-morgan
Member

This concept and the logical extension of it to Result-based iteration is a concept that’s been around from late last year at least; here’s an almost-complete proposal I wrote for it in April: http://chrismorgan.info/blog/rust-proposal-result-based-iteration.html. At the time I decided not to take it any further (and so that article has never been published—it still isn’t, that’s a draft) as I thought it probably wasn’t going to be accepted, but actually since then we’ve headed much more to using Result for things, so possibly it’d be worth bringing it further now, though switching to Result-based iteration still seems a small stretch to convince people of…

I’m surprised I didn’t do anything with while or loop at the time; it certainly makes sense to expand them also.

@glaebhoerl
Contributor

@chris-morgan That's very interesting! It reminds me more than a little bit of some of the ideas in #243.

@telotortium

I really find else nonintuitive, both to remember and to figure out on the rare occasions I see the code. I thought replacing else with if not break would be more immediately obvious for Rust, and the fact that it uses 3 words wouldn't matter too much since it's a relatively rare construction, both as part of a language (Python is the only language I know that uses else in this way) and even in frequency of use in the languages that do have it.

Unfortunately, not is not a keyword in Rust, nor is any other negative-sounding word. I don't think if not break (or if !break) would be ambiguous after a for or while loop, but special-casing syntax highlighting, as well as the false impression that not is a keyword in other contexts, may be reasons enough to reject this suggestion.

@reem

reem commented Oct 11, 2014

Some ideas that were tossed around on IRC instead of else: then, finally, done, on done.

@glaebhoerl
Contributor

I've thought of finally too. Sadly it has the connotation of "always do this, no matter what", which is the opposite of what we want. For that matter, it's not clear that the other mentioned options escape this potential misinterpretation, either. Maybe there isn't a way to make the meaning inherently obvious without just saying it directly, as in @telotortium's comment. So it might be worth thinking further about possibilities which mention break directly in some way.

For what it's worth, here's a logical justification for else: An else attached to an if runs if the condition evaluates to false. An else attached to a while does the same thing. If a break is encountered, then it never does evaluate to false. It's a little bit tortured, but there is at least some logic to it. (On the other hand, the converse is not true: an else attached to an if is skipped if the condition evaluates to true. With a while, the condition can evaluate to true any number of times without causing the else to be skipped. This intuition is the basis for the other interpretation people are liable to think of, that the else is taken if the loop never runs, i.e. if the condition never evaluates to true.)
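
Because `while ... else` is not valid Rust, one way to experiment with the reading given here is to desugar it by hand into a `loop`: the "else" value is produced exactly when the condition is observed to be false, and a `break` in the body bypasses it. The following is a rough sketch under that assumption; the function name `first_multiple_from` is invented for illustration.

```rust
/// Hand-written equivalent of the hypothetical
///     while i < limit { if i % k == 0 { break Some(i) } i += 1; } else { None }
/// Assumes `k` is non-zero.
fn first_multiple_from(start: u32, limit: u32, k: u32) -> Option<u32> {
    let mut i = start;
    loop {
        if i >= limit {
            // The `while` condition `i < limit` evaluated to false:
            // this is the point where the "else" block would run.
            break None;
        }
        if i % k == 0 {
            break Some(i); // explicit break: the "else" value is skipped
        }
        i += 1;
    }
}
```

Note that a loop which runs zero times also ends up in the "else" branch under this desugaring, since its condition is false on the first check; it is simply not the only case that does.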

@tbu-
Contributor

tbu- commented Oct 11, 2014

@glaebhoerl When I first encountered the else I thought it's only executed if the loop isn't executed at all.

@Ericson2314
Contributor

@chris-morgan That's great! Is it possible to make some sort of recurring macro that would redefine break inside the body of the loop?

@Ericson2314
Contributor

I was just thinking, if next returns a Result, then perhaps Reader could inherit from Iterator.

@glaebhoerl
Contributor

I found this the other day:

MonadPlus m => EitherT e m a is the type of a generator returning the result of type a while yielding intermediate results of type e.

While in our case a Result-based for loop would yield intermediate results of type T before returning a final result of type E. The error/success connotations are weirdly reversed, but the parallel is intriguing.

(I don't understand the meaning of the Haskell type in its full generality either (would need to read the article more thoroughly) -- but instantiating m with IO and expanding the EitherT gets us IO (Either e a), which given Rust's unrestricted IO is very close to our Result<T, E>. But IO does not implement MonadPlus. So I dunno.)

@pnkfelix
Member

@P1start I think this RFC could benefit from a few more examples.

For example, you say that the else clause is optional, but I think all the current examples include an explicit else clause ... I mainly am thinking of examples showing else-free code that would be accepted and rejected based on the control-flow analysis ensuring (or failing to ensure) that an appropriate value is produced from the loop.

Likewise, examples involving labelled breaks are probably warranted (see earlier comments where @huonw described such semantics).

@pnkfelix
Member

Also, in the alternatives section, you note the idea of nobreak being clearer than else, but lament the fact that would require adding a new keyword -- an obvious variant on that would be loop { ... } !break { ... }; I think that should be unambiguous, though perhaps too subtle.

@glaebhoerl
Contributor

Only half serious, but we could add an unless construct (where unless FOO = if !FOO, a la Perl), and then re-use that keyword here as unless break.

@m13253

m13253 commented Jan 18, 2015

@blaenk wrote:

I'd still appreciate clarification on what a break to a label with a value would mean, though.

I would like to state my own idea.

let outer_value = 'outer: loop {
    let inner_value = 'inner: loop {
        break 'outer 42; // Should go to outer_value
        break 'inner 13; // Never reached; lexically feeds inner_value a value for demonstration
    };
};

It is obvious that outer_value will receive the number 42. Since we have already jumped out of the inner loop, it doesn't matter whether inner_value actually received anything.

I'm looking forward to hearing other ideas.

@phaux

phaux commented Jan 19, 2015

Result is something defined in the standard library, not the language itself. In some circumstances libstd is not used at all, so why should we pay for the overhead of introducing a Result automatically?

For the same reason we pay for the overhead of introducing Iterators to be able to use for loops in first place.

by using "else", the compiler helps us to guarantee a value must be returned. If we use Result, the check would be put to runtime.

That's not true. Performance would be identical to using unwrap_or_else. The else clause is just a pointless syntax sugar for a functionality that's already there.

Here are some more examples with Option:

  • Get first even number from vector or 0
let nums = vec![1, 3, 3, 7];
let x = for n in nums.iter() {
  if n % 2 == 0 { break n; }
}.unwrap_or_default();
assert_eq!(x, 0);
  • Get first even number or first number divisible by 3 or 0
let nums = vec![1, 3, 3, 7];
let x = for n in nums.iter() {
  if n % 2 == 0 { break n; }
}.or(for n in nums.iter() {
  if n % 3 == 0 { break n; }
}).unwrap_or_default();
assert_eq!(x, 3);

Basically, you can use any Option method on for loops and pass them as an argument to functions that accept Options. This would get even better if we had proper error handling (unary ? operator)

@m13253

m13253 commented Jan 20, 2015

by using "else", the compiler helps us to guarantee a value must be returned. If we use Result, the check would be put to runtime.
That's not true. Performance would be identical to using unwrap_or_else. The else clause is just a pointless syntax sugar for a functionality that's already there.

I didn't mean a performance issue. The problem is that when we use else (or some other word, rather than Result), the compiler checks and guarantees that the loop must produce a value.

Consider an algorithm that produces the first Fibonacci number larger than x:

let result = {
    let (mut a, mut b) = (0, 1);
    loop {
        if a > x {
            break a;
        }
        (a, b) = (b, a + b);
    }
};

Since it is a loop loop (not while), it must produce a value. The compiler does this check at compile time.

Or if we use Result:

let result = match {
    let (mut a, mut b) = (0, 1);
    while true { // The compiler cannot guarantee a value
        if a > x {
            break Option::Some(a);
        }
        (a, b) = (b, a + b);
    } else {
        None
    }
} { // Unwrap the result by hand
    Some(result) => result, // An extra check at runtime, even though it is unnecessary
    None => panic!()
};

Then we are delaying checks to runtime!

Never delay a check that can be done at compile time to runtime!
We should only use Result when the result can only be determined at runtime (e.g. user input).

@m13253

m13253 commented Jan 20, 2015

But what about the word "else"? Is "else" confusing?

I don't think so.

Take "if-else" as an example:

if expr {
    ... // Execute this when expr is true
} else {
    ... // Execute this when expr is false
}

Then we still can explain "while-else" similarly:

while expr {
    ... // Execute this as long as expr remains true
} else {
    ... // Execute this as soon as expr becomes false
}

If properly documented, I do not think many people will misunderstand it as "Execute this only when the first evaluation of expr is false".

@phaux

phaux commented Jan 21, 2015

I'd prefer if loop could still return None if we passed no value to break and Some if we did. Your example would then just need .unwrap() at the end.

This is still kinda pointless example. See my previous examples and try to convince me that there's a cleaner way for doing it with else clause. Protip: You can't.

Here's another one: Get first even number or first number divisible by 3 and turn it into a String

let nums = vec![1, 3, 3, 7];
let x = for n in nums.iter() {
  if n % 2 == 0 { break n; }
}.or(for n in nums.iter() {
  if n % 3 == 0 { break n; }
}).map(|x| {
  format!("The number was {}", x)
}).unwrap_or( "No such number".to_string() );
// x is a string "The number was 3"

Never delay a check that can be done at compile time to runtime!

I'm just discussing syntax. It would work exactly the same under the hood.

@ftxqxd
Contributor Author

ftxqxd commented Jan 21, 2015

Option is not part of the Rust language (it’s part of the Rust standard library), so it would be a bit weird to build it into loops like that. Also, making loops return Option would remove a lot of the advantages of this RFC, namely the ability for the compiler to gain extra information about the control flow of loops with else. For example, under this RFC, this code would be valid:

let x;
let y = for i in some_vector.into_iter() { 
    if i == some_integer { x = true; break i }
} else {
    x = false; -1
};

The compiler knows that x will be assigned to exactly once, and so x does not need to be marked as mutable or have an initial value. However, if loops returned Option, defaulting to None when no expr’d break was reached, the code would look like this:

let x;
let y = for i in some_vector.into_iter() { 
    if i == some_integer { x = true; break i }
}.unwrap_or_else(|| {
    x = false; -1
});

To the compiler, unwrap_or_else is just a method that takes a closure. The compiler doesn’t know if it will ever be called, or how many times it will be called if it is called at all. So it wouldn’t let this code through: if the closure were called more than once, or called at all after the break had already assigned x, then x (an immutable, uninitialised variable) would be assigned to more than once; and if the closure were never called when the break wasn’t reached, x would not be assigned to at all.
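
The control-flow point can be checked against current Rust, which already performs this kind of analysis for `match` and `if`/`else`. In the sketch below (the function name `locate` is hypothetical, and `Iterator::find` stands in for the proposed `for ... else`), each arm assigns `x` exactly once, so the compiler accepts an uninitialised, immutable `x`. Moving the second assignment into a closure passed to `unwrap_or_else` is rejected, for the reason given above.

```rust
/// Returns (found, value): the first element equal to `wanted`, or -1.
fn locate(values: Vec<i32>, wanted: i32) -> (bool, i32) {
    let x; // no initial value, and not declared `mut`
    let y = match values.into_iter().find(|&i| i == wanted) {
        Some(i) => {
            x = true; // assigned exactly once on this path
            i
        }
        None => {
            x = false; // assigned exactly once on this path
            -1
        }
    };
    (x, y)
}
```

The proposed `else` clause on loops would give the compiler the same two clearly separated paths that the `match` arms provide here.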

@tbu-
Contributor

tbu- commented Jan 21, 2015

@P1start Option is part of the Rust language. It is used for for loops.

@ftxqxd
Contributor Author

ftxqxd commented Jan 21, 2015

It’s still not technically part of the Rust language. It’s not even a lang item. for loops are just hackily hardcoded to use something that resembles an Option in layout (I think). That doesn’t mean that loop couldn’t use Option: Option, Some, and None could be hard-coded into loops as names, or Option could become a lang item properly.

@tbu-
Contributor

tbu- commented Jan 21, 2015

I feel like this is just arguing about details. The Option type is part of the signature of the Iterator trait that is a requirement for writing a for loop. As such I don't see how

Option is not part of the Rust language (it’s part of the Rust standard library), so it would be a bit weird to build it into loops like that.

can be argued for, it is already built into for loops.

@m13253

m13253 commented Jan 21, 2015

(I sent this reply with my mail client, sorry if the layout messed up)

I'd prefer if loop could still return None if we passed no value to break and Some if we did. Your example would then just need .unwrap() at the end.

If you need None, why not write a "None" by hand?

If someone does not need a "None", .unwrap() takes tens of extra CPU cycles to check for the impossible "None".

If you need None, write None explicitly.

This is still kinda pointless example. See my previous examples and try to convince me that there's a cleaner way for doing it with else clause. Protip: You can't.

When you need to return a value, write "break Some(value);", when you need to return None, write "break None;".

@phaux

phaux commented Jan 21, 2015

The compiler doesn’t know if it will ever be called, or how many times it will be
called if it is called at all.

It knows it will be called at most once, because it's of type FnOnce. With the move keyword it could actually work if the compiler were smart enough (it currently isn't). Anyway, the whole point of allowing loops to return a value is to avoid the need to reference variables from outside the loop.

This is still kinda pointless example. See my previous examples and try to convince
me that there's a cleaner way for doing it with else clause. Protip: You can't.

When you need to return a value, write "break Some(value);", when you need to
return None, write "break None;".

No. Let me do it for you:

Get first even number or first number divisible by 3 and turn it into a String

  1. Option version
let nums = vec![1, 3, 3, 7];
let x = for n in nums.iter() {
    if n % 2 == 0 { break n; }
}.or(for n in nums.iter() {
    if n % 3 == 0 { break n; }
}).map(|x| {
    format!("The number was {}", x)
}).unwrap_or( "No such number".to_string() );
  2. else version:
let nums = vec![1, 3, 3, 7];
let x = for n in nums.iter() {
    if n % 2 == 0 { break format!("The number was {}", n); }
}
else {
    for n in nums.iter() {
        if n % 3 == 0 { break format!("The number was {}", n); }
    }
    else {
        "No such number".to_string()
    }
};
assert_eq!(x, "The number was 3");

The Option version is much more concise and more appropriate for a functional language like Rust.

@m13253

m13253 commented Jan 21, 2015

  1. If you introduce Option, the return value of a traditional loop will be an Option<_>::None, which is an incomplete type. And this breaks backward compatibility. In order not to break anything, we should guarantee that a traditional loop returns ().
  2. The compiler should be transparent. Under no circumstances should the compiler force the programmer to use Option. If an Option is necessary, write Some and None by hand.

This is still kinda pointless example. See my previous examples and try to convince me that there's a cleaner way for doing it with else clause. Protip: You can't.

So let me show you it is possible. Just write it.

  1. The version that works with current Rust
let nums = [1_i32, 3, 3, 7];
let found =
    nums.iter().filter(|&x| *x % 2 == 0).next().or_else(||
    nums.iter().filter(|&x| *x % 3 == 0).next()
);
match found {
    Some(x) => format!("The number was {}.", x),
    None => "No such number.".to_string()
}
  2. The version that uses a for-else loop
let nums = [1_i32, 3, 3, 7];
let found = for i in nums.iter() {
    if i % 2 == 0 { break Some(i) }
} else for i in nums.iter() {
    if i % 3 == 0 { break Some(i) }
} else {
    None
};
match found {
    Some(x) => format!("The number was {}.", x),
    None => "No such number.".to_string()
}
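
For comparison, the same task can also be written as a single iterator chain in current Rust. This is just one possible formulation, not something proposed in the thread: the function name `describe` is invented, and `Iterator::find` replaces the `filter(...).next()` pairs used above.

```rust
/// Describes the first even number, else the first number divisible by 3.
fn describe(nums: &[i32]) -> String {
    nums.iter()
        .find(|&&n| n % 2 == 0)
        // Only scan again if no even number was found.
        .or_else(|| nums.iter().find(|&&n| n % 3 == 0))
        .map(|n| format!("The number was {}.", n))
        .unwrap_or_else(|| "No such number.".to_string())
}
```

Here the fallback string is only built when both searches fail, because `unwrap_or_else` takes a closure rather than an already constructed value.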

@pnkfelix
Member

So, I admit I have not followed the conversation here in great detail yet, but I have a couple questions for @P1start , or really, anyone who has been following the conversation:

  • Is there a backwards compatibility issue with the RFC as written? From my understanding, we could add this backwards compatibly post 1.0 (which means I'm then inclined to postpone this until that time).
  • Is there some revision or alternative design posted in the comments that does have a backwards compatibility issue? (Such an issue would lead me to at least try to resolve that, though doing so is not as important as it would be if there were such an issue in the RFC itself as written.)

@ftxqxd
Contributor Author

ftxqxd commented Jan 23, 2015

This is indeed backwards-compatible (or at least I haven’t found a backwards-incompatible subtlety yet) so long as we stick with using else as the keyword (or of course gain a way to backwards-compatibly add keywords to the language post-1.0, as I’ve heard discussed a few times). Among the alternatives and related proposals in the comments, as I understand them, the backwards-incompatible ones are:

  • changing the else keyword in loops to something else, because it would break code that used that keyword as an identifier;
  • making iterators yield Result<T, E> instead of Option<T> (although I believe this could be done in addition to this RFC), because it would break explicit calls to next and implementations of Iterator; and
  • I think making loops yield Option as @phaux proposes is backwards-incompatible as well, because code such as expect_unit(while cond { ... }) would pass None to a function expecting ().
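
The third bullet relies on `while` being an expression of type `()` today. A small sketch of the kind of code that would stop compiling if loops yielded `Option` instead; `expect_unit` is the hypothetical function named in the bullet, and `count_down` is invented for the example.

```rust
/// Accepts only the unit value.
fn expect_unit(_: ()) {}

/// Counts down from `n`, passing the `while` expression itself to
/// `expect_unit`, and returns how many iterations ran.
fn count_down(mut n: u32) -> u32 {
    let mut steps = 0;
    // A `while` loop is an expression of type `()`, so this type-checks.
    expect_unit(while n > 0 {
        n -= 1;
        steps += 1;
    });
    steps
}
```

Code like this is unusual, but it is valid, which is why changing the type of existing loop expressions counts as a breaking change.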

@Ericson2314
Contributor

@pnkfelix I made some comments in http://discuss.rust-lang.org/t/reader-stablization-errors-and-iterators/1345/6 . But I think a better solution than what I proposed there is to temporarily make loops statements, to avoid @P1start's third point.

@pnkfelix
Member

@P1start @Ericson2314 thank you. I'm skeptical that we'd change the Iterator interface at this point.

But changing the looping syntactic forms to be statements rather than expressions might be doable for 1.0, and sounds like it would give us more design freedom here. I'll try to float the idea and/or prototype that particular change.

@aturon
Member

aturon commented Mar 5, 2015

ping @pnkfelix, what's the status?

@pnkfelix
Member

pnkfelix commented Mar 5, 2015

@aturon I haven't had a chance to prototype anything yet. But even in the absence of concrete information about the resulting fallout, I would still recommend that we at least consider making all the looping forms statements rather than expressions.

(I am pretty confident that it is sound to replace all loop-expressions with { loop-stmt }; this replacement would not have been 100% applicable to for-loops until somewhat recently, when #21984 landed, but I think it is now sound for all of our looping forms.)

@pnkfelix
Member

pnkfelix commented Mar 5, 2015

(ah, it is possible there is a reason for while let ... to remain an expression, solely based on the rules about terminating scopes... i really should try to prototype the change.)

@Ericson2314
Contributor

Arguably while let should return the value that didn't match, so hopefully it too could be made a statement in the short term.

@pnkfelix
Member

pnkfelix commented Mar 9, 2015

@aturon okay, now that I've filed RFC #955, I think we can probably postpone this RFC. (What we are able to do with this RFC when we get around to it will depend on how we handle RFC #955, but as stated several times in RFC #955, even if #955 itself is rejected, we could still adopt #352 as written.)

@aturon
Member

aturon commented Mar 10, 2015

@pnkfelix Thanks!

@P1start Thanks for the PR; I'm going to close this as postponed for the moment, and discussion can move to #955 #961 for future proofing.

@aturon aturon closed this Mar 10, 2015
@aturon aturon added the postponed RFCs that have been postponed and may be revisited at a later time. label Mar 10, 2015
kenrick95 added a commit to kenrick95/piece_table that referenced this pull request Oct 11, 2017
@petrochenkov petrochenkov removed the postponed RFCs that have been postponed and may be revisited at a later time. label Feb 24, 2018