Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generator support #43076

Merged
merged 128 commits into from
Aug 28, 2017
Merged

Generator support #43076

merged 128 commits into from
Aug 28, 2017

Conversation

Zoxc
Copy link
Contributor

@Zoxc Zoxc commented Jul 5, 2017

This adds experimental support for generators intended to land once rust-lang/rfcs#2033 is approved.

This is not yet ready to be merged. Things to do:

  • Make closure arguments on generators an error
  • Spot FIXMEs
  • Pass make tidy
  • Write tests
  • Document the current syntax and semantics for generators somewhere
  • Use proper error message numbers
  • Make the implicit argument type default to ()

@rust-highfive
Copy link
Collaborator

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

@alexcrichton
Copy link
Member

cc @rust-lang/compiler, I suspect a number of you may be interested in this!

@bors
Copy link
Contributor

bors commented Jul 6, 2017

☔ The latest upstream changes (presumably #42727) made this pull request unmergeable. Please resolve the merge conflicts.

@alexcrichton alexcrichton added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 6, 2017
@Zoxc Zoxc force-pushed the gen branch 2 times, most recently from 18a9c9b to 0cacc20 Compare July 6, 2017 22:02
@@ -1059,6 +1071,12 @@ pub enum Expr_ {
/// For example, `[1; 5]`. The first expression is the element
/// to be repeated; the second is the number of times to repeat it.
ExprRepeat(P<Expr>, BodyId),

/// A suspension point for generators
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to give an example of Rust syntax; this corresponds to a yield, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In an older version this was separate from yield, now it's the same and it could use an renaming.

ExprSuspend(P<Expr>),

/// The argument to a generator
ExprImplArg(NodeId),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This corresponds to a gen arg, right? I am wondering if we should think about removing this from the PR, since it is not needed for async-await, and I find it hard to imagine this syntax ultimately being stabilized (it just seems rather unlike any other syntax we have in the language). I'm not sure how much actual simplicity would result though, maybe very little.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also against the concept as a whole - it's there to get rid of thread-local state but I do not think it's a satisfactory solution.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skimming through the PR, it seems like this would be a reasonably nice simplification.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eddyb

I am also against the concept as a whole - it's there to get rid of thread-local state but I do not think it's a satisfactory solution.

Does this mean you are also against yield returning a value, sort of like let x = yield 22?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see what you're hinting at. In my hand-desugared examples I ended up having the generator equivalent of IntoIterator produce the first Yield/CoResult/etc. alongside the generator itself. The other options are multiple entry points, taking Option<Input>, or maybe a different suspend model. Maybe a yield without a value and produce the first input.

Anyway, what I'm against is any such scheme used with async IO, when it can be avoided.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having explicitly named arguments would be an alternative which is just as expressive and could probably reuse most of the code dealing with regular arguments. One of the problems with this is that if we allow yield inside function and closure bodies in addition to generator literal bodies we need syntax for another set of arguments. This is less of a problem if we're committed to using compiler plugins for ergonomic async I/O.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to admit that I am finding these last 2 comments a bit confusing. e.g., @eddyb, when you write:

The other options are multiple entry points

I don't really understand what you are referring to. The other options for.. what exactly? In any case, it seems like neither iterators nor futures require the ability to provide "feedback" during execution, so I would personally be happy to "defer" that part for later PRs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Zoxc

Having explicitly named arguments would be an alternative which is just as expressive and could probably reuse most of the code dealing with regular arguments.

I would think so -- those regular arguments wind up being assigned to standard locals in MIR, iirc, almost immediately upon function entry. In fact, I thikn it would "just work" just fine -- unless you meant that, after a yield, the function arguments would be updated "in place" with the new values?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unless you meant that, after a yield, the function arguments would be updated "in place" with the new values?

This is indeed what I meant.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nikomatsakis By "multiple entry points" I meant that you can also have yield produce a value by providing multiple ways to resume a given generator, the one taking no value for the very first time, and then using the one requiring a value. Using the wrong one would "just" panic. If you do have multiple entry points, though, you can also start doing crazier things, like having multiple types that yield can produce, similar to what ends up being done in dynamic language. However, while quite flexible, the dynamic nature of it makes me uneasy and I'm very glad that yield producing no value back to the generator can be used for async I/O.

@@ -1014,7 +1026,7 @@ pub enum Expr_ {
/// A closure (for example, `move |a, b, c| {a + b + c}`).
///
/// The final span is the span of the argument block `|...|`
ExprClosure(CaptureClause, P<FnDecl>, BodyId, Span),
ExprClosure(CaptureClause, P<FnDecl>, BodyId, Span, Option<GeneratorClause>),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's worth adding a comment -- what is an Option<GeneratorClause>?

(Also, I wonder if we should convert this to a struct variant at some point.)

if gen.is_some() {
self.head("gen")?;
space(&mut self.s)?;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't the syntax that users use, right? In that case, we probably shouldn't either. Also, it seems weird that we don't print if this is "movable" or "immovable" -- iirc from the last time I looked, that stuff isn't really used in this branch, maybe we should remove it from this PR too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need syntax to mark a generator as movable or immovable. I haven't come up with any good ideas for that. The best idea I got is using static for immovable generators. The only syntax that seems to be nailed down is yield

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this idea of movable vs immovable a kind of separate, orthogonal proposal? That is, at the moment, I thought we would be starting with only the "movable" kind...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Immovable types is a separate orthogonal proposal. Immovable generators would be a extension to generators which depend on immovable types. Immovable generators are very important for async I/O experiments however.

I do intend to implement immovable generators, but landing that in master would block on the immovable types RFC.

@@ -150,6 +150,9 @@ pub enum TypeVariants<'tcx> {
/// `|a| a`.
TyClosure(DefId, ClosureSubsts<'tcx>),

/// The anonymous type of a generator. Pairs with a TyClosure for closure generators.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment could be expanded. For example, what does "pairs with" mean? My guess is that this is the type assigned to an expression like || yield 22, right? In other words, it represents a "resumable" function that, each time it is called, picks up from the last yield point? (Only it works through the generator trait, not the function traits?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is outdated.

impl Iterator<Item=Ty<'tcx>> + 'tcx
{
let state = tcx.generator_layout(def_id).fields.iter();
let state: Vec<_> = state.map(|d| d.ty.subst(tcx, self.substs)).collect();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silly question, but is it necessary to collect into an intermediate vec here, given that we're returning an impl Iterator?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using Vec is simpler than dealing with lifetime errors

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There shouldn't be any. Can you paste them here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the error I get if I just return state.map(|d| d.ty.subst(tcx, self.substs)):

error[E0477]: the type `core::iter::Map<core::slice::Iter<'_, mir::LocalDecl<'_>>, [closure@src\librustc\ty\sty.rs:290:19: 290:51 tcx:&ty::context::TyCtxt<'a, 'gcx, 'tcx>, self:&ty::sty::ClosureSubsts<'tcx>]>` does not fulfill the required lifetime
   --> src\librustc\ty\sty.rs:287:9
    |
287 |         impl Iterator<Item=Ty<'tcx>> + 'tcx
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
note: type must outlive the lifetime 'tcx as defined on the impl at 282:1
   --> src\librustc\ty\sty.rs:282:1
    |
282 | / impl<'a, 'gcx, 'tcx> ClosureSubsts<'tcx> {
283 | |     /// This returns the types of the MIR locals which had to be stored across suspension points.
284 | |     /// It is calculated in rustc_mir::transform::generator::StateTransform.
285 | |     /// All the types here must be in the tuple in GeneratorInterior.
...   |
302 | |     }
303 | | }
    | |_^

error[E0495]: cannot infer an appropriate lifetime for capture of `tcx` by closure due to conflicting requirements
   --> src\librustc\ty\sty.rs:290:19
    |
290 |         state.map(|d| d.ty.subst(tcx, self.substs))
    |                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
note: first, the lifetime cannot outlive the lifetime 'a as defined on the impl at 282:1...
   --> src\librustc\ty\sty.rs:282:1
    |
282 | / impl<'a, 'gcx, 'tcx> ClosureSubsts<'tcx> {
283 | |     /// This returns the types of the MIR locals which had to be stored across suspension points.
284 | |     /// It is calculated in rustc_mir::transform::generator::StateTransform.
285 | |     /// All the types here must be in the tuple in GeneratorInterior.
...   |
302 | |     }
303 | | }
    | |_^
note: ...so that the reference type `&ty::context::TyCtxt<'a, 'gcx, 'tcx>` does not outlive the data it points at
   --> src\librustc\ty\sty.rs:290:9
    |
290 |         state.map(|d| d.ty.subst(tcx, self.substs))
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
note: but, the lifetime must be valid for the lifetime 'tcx as defined on the impl at 282:1...
   --> src\librustc\ty\sty.rs:282:1
    |
282 | / impl<'a, 'gcx, 'tcx> ClosureSubsts<'tcx> {
283 | |     /// This returns the types of the MIR locals which had to be stored across suspension points.
284 | |     /// It is calculated in rustc_mir::transform::generator::StateTransform.
285 | |     /// All the types here must be in the tuple in GeneratorInterior.
...   |
302 | |     }
303 | | }
    | |_^
note: ...so that the type `core::iter::Map<core::slice::Iter<'_, mir::LocalDecl<'_>>, [closure@src\librustc\ty\sty.rs:290:19: 290:51 tcx:&ty::context::TyCtxt<'a, 'gcx, 'tcx>, self:&ty::sty::ClosureSubsts<'tcx>]>` will meet its required lifetime bounds
   --> src\librustc\ty\sty.rs:287:9
    |
287 |         impl Iterator<Item=Ty<'tcx>> + 'tcx
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: aborting due to previous error(s)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

error[E0495]: cannot infer an appropriate lifetime for capture of tcx by closure due to conflicting requirements

You forgot a move.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

state.map(move |d| d.ty.subst(tcx, self.substs)) gives this error:

error[E0477]: the type `core::iter::Map<core::slice::Iter<'_, mir::LocalDecl<'_>>, [closure@src\librustc\ty\sty.rs:289:19: 289:56 tcx:ty::context::TyCtxt<'a, 'gcx, 'tcx>, self:ty::sty::ClosureSubsts<'tcx>]>` does not fulfill the required lifetime
   --> src\librustc\ty\sty.rs:287:9
    |
287 |         impl Iterator<Item=Ty<'tcx>> + 'tcx {
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
note: type must outlive the lifetime 'tcx as defined on the impl at 282:1
   --> src\librustc\ty\sty.rs:282:1
    |
282 | / impl<'a, 'gcx, 'tcx> ClosureSubsts<'tcx> {
283 | |     /// This returns the types of the MIR locals which had to be stored across suspension points.
284 | |     /// It is calculated in rustc_mir::transform::generator::StateTransform.
285 | |     /// All the types here must be in the tuple in GeneratorInterior.
...   |
301 | |     }
302 | | }
    | |_^

error: aborting due to previous error(s)

Copy link
Contributor

@arielb1 arielb1 Aug 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need move and + 'a I think (to capture the 'a lifetime in 'tcx).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, to change the return type to impl Iterator<Item=Ty<'tcx>> + 'a and use a move closure. This error message really needs to be improved.

@@ -276,6 +279,50 @@ impl<'a, 'gcx, 'acx, 'tcx> ClosureSubsts<'tcx> {
}
}

impl<'a, 'gcx, 'tcx> ClosureSubsts<'tcx> {
pub fn state_tys(self, def_id: DefId, tcx: TyCtxt<'a, 'gcx, 'tcx>) ->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this function could use a comment. I am guessing it returns the types of local variables that are (potentially) saved/restore in the state?

state.into_iter()
}

pub fn field_tys(self, def_id: DefId, tcx: TyCtxt<'a, 'gcx, 'tcx>) ->
Copy link
Contributor

@nikomatsakis nikomatsakis Jul 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, what is field_tys? Comment seems good.

@@ -201,6 +219,19 @@ fn check_aliasability<'a, 'tcx>(bccx: &BorrowckCtxt<'a, 'tcx>,
}
}

fn check_yields<'a, 'tcx>(bccx: &BorrowckCtxt<'a, 'tcx>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect this check to be done in the check_loans code -- basically, each time we reach a yield point, we would check that no loans are in scope.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you point me to an appropriate location in the check_loans code?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I forgot that check_loans uses the ExprUseVisitor. I would add a "callback" to that interface for "yields". So, specifically the expr_use_visitor::Delegate interface. We could add a fn yield(id: ast::NodeId) method. Then, on each callback, we could check whether there are any loans in scope at that point, e.g. by calling CheckLoans::each_in_scope_loan().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Sorry, missed this comment till just now.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was looking around there and that doesn't seem correct. No loan records are created when we get RestrictionResult::Safe, so I assume that means the loans don't show up with CheckLoans::each_in_scope_loan() leading to unsoundness. We also want to ignore ReEarlyBound and ReFree borrows and it looks like that information is lost by then.

@bors
Copy link
Contributor

bors commented Jul 8, 2017

☔ The latest upstream changes (presumably #42996) made this pull request unmergeable. Please resolve the merge conflicts.

types and such.

* Traits like `Send` and `Sync` are automatically implemented for a `Generator`
depending on the captured variables of the environment. Note, though, that
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlike closures, this also depends on the values inside the generator which are live across a suspension point.

generator progresses.

* Generator literals produce a value with a unique type which implements the
`std::ops::Generator` trait. This allows actual execution of the genrator
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/genrator/generator/

@@ -189,6 +190,13 @@ pub use self::range::{RangeInclusive, RangeToInclusive};
#[unstable(feature = "try_trait", issue = "42327")]
pub use self::try::Try;

#[unstable(feature = "generator_trait", issue = "0")]
#[cfg(not(stage0))]
pub use self::generator::State;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

core::ops::State on its own has no connection to generators, so we might want to export this with a more descriptive name or introduce a core::generator module, though this can certainly wait until the future RFC that stabilizes this feature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is in a generator module. Perhaps we should remove the reexport of it?

@leonardo-m
Copy link

Is a simpler to remember syntax like this possible?

fn test() -> impl GeneratorNoReturn<u8> {
   for i in 1 .. 6 {
       yield i
   }
}

@arielb1 arielb1 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 11, 2017
@nikomatsakis
Copy link
Contributor

nikomatsakis commented Jul 11, 2017

So I've read over the code again -- not in depth -- and I remain fairly positive. @Zoxc I want to say that this is nice work. =) That said, I still think we could (and should) try to simplify it as much as possible -- it's basically impossible to review a branch of this size, so the more we can do to prune it down (and then add things in separately later) the better, since those additions can then get reviewed in more depth.

The obvious thing for removal remains gen arg -- it feels like it adds a fair bit of complexity into the PR, and it is not needed for most use cases from what I can tell. (Though I think we may want to add some feature like it back at some point -- I don't personally feel confident of just what I think that feature should look like yet.)

The other candidate is the "movable/immovable" stuff, which if I recall is not (yet) deeply integrated. I'd prefer to layer that in separately as well.

So, before r+, I would like to see:

  • reach consensus about simplifications
    • I remain fairly convinced we should remove them, but if for some reason that is infeasible, I'd like to understand that better
  • I'd like to give one more read over the borrowck and sensitive bits, and review the tests. =)
    • But I don't expect those last reviews to take long on my part.
  • I'd prefer if @eddyb or somebody else on @rust-lang/compiler wanted to chime in and indicate that they've at least given this a preliminary read and don't have any blocking concerns

@rozaliev
Copy link

I'm updating my prototype lib based on coroutines right now. So I wonder if this kind of code is supposed to be working on this branch?

struct A(u32);

fn g1() -> impl Generator<Yield=(), Return=()> {
    move || {
        let a = A(33);
        let inner = a.g2();
        while let State::Yielded(_) = inner.resume(()) {
            yield
        }
    }
}

impl A {
    fn g2<'a>(&'a self) -> impl Generator<Yield=(), Return=()> + 'a {
        move || {
            println!("use it {:?}", self.0);
            yield
        }
    }
}

@kennytm
Copy link
Member

kennytm commented Aug 28, 2017

RLS test still failed.

(BTW could some of the commits be squashed? This PR will bring in 127 commits, many of which are "fixing this and that".)

alexcrichton pushed a commit to rust-lang/rls that referenced this pull request Aug 28, 2017
@alexcrichton
Copy link
Member

@bors: r=arielb1

At this point the commit history is basically an accurate reflection of this history of this feature...

@bors
Copy link
Contributor

bors commented Aug 28, 2017

📌 Commit 98e3ebe has been approved by arielb1

@alexcrichton
Copy link
Member

@bors: r=arielb1

@bors
Copy link
Contributor

bors commented Aug 28, 2017

📌 Commit a996d5e has been approved by arielb1

@bors
Copy link
Contributor

bors commented Aug 28, 2017

⌛ Testing commit a996d5e with merge 9a59d69...

bors added a commit that referenced this pull request Aug 28, 2017
Generator support

This adds experimental support for generators intended to land once rust-lang/rfcs#2033 is approved.

This is not yet ready to be merged. Things to do:
- [x] Make closure arguments on generators an error
- [x] Spot FIXMEs
- [x] Pass make tidy
- [x] Write tests
- [x] Document the current syntax and semantics for generators somewhere
- [x] Use proper error message numbers
- [x] ~~Make the implicit argument type default to `()`~~
@bors
Copy link
Contributor

bors commented Aug 28, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: arielb1
Pushing 9a59d69 to master...

@mgattozzi
Copy link
Contributor

This is super exciting and I can't wait to play around with it on the next nightly

@hadronized
Copy link

Hells yeah, great success! <3

Is there any plan on adding support for coroutines as well? Anyway, ultra good job, I’ll play with it as soon as the nightly lands!

@rushmorem
Copy link

I’ll play with it as soon as the nightly lands!

What do you mean? This is already in the latest nightly. I've been playing with this since yesterday.

@hadronized
Copy link

Yeah I just saw that, and I’ve started playing with it as well! I’ve already pushed a patch to rust.vim as well \o/

Congrats to everyone!

@lilianmoraru
Copy link

This does not seem to use stackless coroutines support from LLVM or I am understanding it wrongly?
Seemed like a good fit for this.

@lilianmoraru
Copy link

Specifically, the first LLVM RFC(cannot edit my comment on mobile): http://lists.llvm.org/pipermail/llvm-dev/2016-July/102337.html

@pftbest
Copy link
Contributor

pftbest commented Sep 6, 2017

@lilianmoraru

You can look at the comments here:

rust-lang/rfcs#1823 (comment)

@valff
Copy link
Contributor

valff commented Sep 12, 2017

Is there a better way to get a generator which doesn't yield?

let mut generator = || {
    if false {
        yield;
    }
    0xDEAD_BEEF
};

@Zoxc
Copy link
Contributor Author

Zoxc commented Sep 12, 2017

@lilianmoraru LLVM's coroutine support require heap allocation, so it's not suitable for Rust.

@valff Currently, that is the best way. The if will be optimized away in MIR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.