Abstract output type parameters #1305

Closed

Kimundi
Member

@Kimundi Kimundi commented Oct 3, 2015

Enable monomorphized return types of a function that are only abstractly defined by a trait interface by introducing a new set of "output" generic parameters that can be defined for a function, and extending trait bounds with the concept of "conditional" bounds.

Rendered

@Kimundi Kimundi changed the title Abstract output type paramters Abstract output type parameters Oct 3, 2015
@hanna-kruppe

I like the general direction of this RFC. It seems to address the OIBIT leaking and the conditional trait bound problem in a comparatively simple and orthogonal way. It's somewhat complex, of course, but all the parts are (or at least seem) immediately obvious to me and I certainly don't consider myself a type system guru.

Regarding the alternative of having only -> impl Trait, without the possibility to define your own output types: At first glance, that seems like a waste of potential to me. Am I right that this option would rule out signatures like two_closures()<FA: Fn(), FB: Fn()> -> (FA, FB)? However, I am not thrilled about the syntax, so my position on this may depend on whether someone figures out a pleasant syntax.
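
For illustration only: on a present-day stable toolchain (which has return-position `impl Trait`, unlike the compiler available when this thread was written), a signature of roughly this shape can be approximated by writing `impl Fn` once per tuple component. The body and the captured `base` value are made up for the example.

```rust
// Each `impl Fn` in the tuple is its own anonymous type; the caller only
// learns that both components can be called and return an `i32`.
fn two_closures() -> (impl Fn() -> i32, impl Fn() -> i32) {
    let base = 10; // hypothetical captured state
    (move || base + 1, move || base * 2)
}
```

Unlike the `<FA: Fn(), FB: Fn()>` form, the two types here cannot be named or given further bounds in a `where` clause, which is the expressiveness gap the question is about.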

Here's my current position (subject to me changing my mind) on the other alternatives:

  • Don't do type. I am not convinced that it's even useful. If there are no trait bounds, it is just a second, slightly longer and non-standard way to spell a type parameter. Adding bounds puts it into the territory of the impl sugar, and looks weird to me (the "double colon" fn(x: type T: Trait)).
    • Edit: Actually it seems like a slightly longer and non-standard way to spell type parameters in any case. Can you illuminate me as to what benefit it should bring? The only thing I can imagine would be that the bounds (if any) are right next to the parameter, but that's really weak and only works if there is only one use of the type, in which case impl still wins.
  • Only support impl in the return position. "Generalized impl" caused a huge racket during the original "unboxed return types" RFC, and because it just adds syntactic sugar, this is not a hill I am willing to die on.
  • The abstract output type created by -> impl Trait should be called Return. Because this is a valid identifier, it can also be used by manually-named abstract output types that fulfill the same function. I think this would be a useful convention.

@tbu-
Contributor

tbu- commented Oct 3, 2015

How about re-using the impl syntax?

abstract type Foo<T> = Option<T>;
impl<T:Copy> Clone for Foo<T>; // This just takes the implementation of the real type.
impl<T> Default for Foo<T>;

This would allow you to customize which traits to export depending on type parameters as well.

@glaebhoerl
Contributor

Like @rkruppe, I like the general direction, but this is a lot of stuff. For lack of any larger or more constructive ideas at the moment, various notes and nits:

  • Having both prefix and postfix <...> on functions would, I fear, lead to Rust getting a reputation as a language with cryptic and unapproachable syntax (even moreso than it already has!).

  • I'm wary of the "everything is a module" slope we'd be slipping down if even functions were to have their own associated items, though we've already slid a ways down it with "UFCS". It would be nice if we could keep our name resolution and uses of :: from further approaching the complexity of C++. (If it were entirely up to me, it would only be legal to write :: after things that were mods.)

  • I'm still not very fond of the impl Trait syntax, which feels to me more like a placeholder syntax for discussion which ended up staying on by default, than something that really coheres with the rest of the language. The syntax impl Trait suggests to me an implementation of that trait (as in, for instance, a vtable), not an abstract type which implements the trait. (I still prefer abstract Trait here.)

  • For the simplest use cases, I still think it would be nicest if we could just write e.g. fn print(thing: Display) and fn printable() -> Display, without attaching any extra qualifiers to the trait. I'm told that this is problematic, because Display already has a meaning as a dynamically-sized type. Independently, there's also been talk of "passing DSTs by value". Can someone better versed in these matters explain to me what the difference between the two notions would be in practice? I would really appreciate it.

  • I wonder how significant an item the "conditional bounds" A: Foo if B: Bar feature would be. The idea is intuitive enough, but when I proposed an analogous feature for GHC a few years ago, Simon PJ's reaction was that "I have no idea how difficult this would to actually implement, but last time I thought about it, it seemed pretty hard" and "The design and implementation would be a significant task".

  • My inclination would be to, as a first step, see how far we could get with just the module-level abstract type feature, which feels like the "cleanest" and most general, orthogonal addition to me. The subtlety is in how to support the original use case of returning e.g. an abstract closure type (which you can't, or don't want to, name).

    As a first guess, what if we also allowed abstract type Foo: Trait; (resp. abstract type Foo where ...) at the module level even without an = Type RHS, and inferred the identity of the type from its use within the module in that case? In other words, the whole module would get to know the true identity of the type, just as if it did have an RHS. So for example, this would work:

    mod foo {
        pub abstract type Foo: Debug;
        pub fn make_foo() -> Foo { true }
        pub fn use_foo(foo: Foo) { println!("{}", if foo { "yes" } else { "no" }) }
    }
    

    This is still somewhat unfortunate, as it involves spooky action-at-a-limited-distance via intra-module-global type inference within mod foo, but importantly, it still preserves the property that all interfaces are explicit and abstractions are watertight for clients of foo. As modules are already Rust's primary abstraction boundary in the language as it is, this seems like it might be livable, or at least moreso than fn make_foo() -> _.
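
    For comparison, the same abstraction boundary can be approximated in existing Rust with a newtype whose field is private to the module. This is only a sketch: it reuses the hypothetical `foo` module from the example, and `use_foo` returns the string instead of printing it so the result can be checked.

```rust
mod foo {
    // The field is private, so only code inside `foo` knows that a `Foo`
    // is really a `bool`; clients see just the type name and its traits.
    #[derive(Debug)]
    pub struct Foo(bool);

    pub fn make_foo() -> Foo {
        Foo(true)
    }

    pub fn use_foo(foo: Foo) -> &'static str {
        if foo.0 { "yes" } else { "no" }
    }
}
```

    The price compared with the inferred form is the explicit wrapping and unwrapping (`Foo(true)`, `foo.0`) inside the module.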

@ftxqxd
Contributor

ftxqxd commented Oct 3, 2015

This proposal seems a little too complicated for my liking, even though I really would like the functionality it allows. I rather like @glaebhoerl’s idea of using abstract type without specifying a type, as that removes the need for <> in return position and potentially even the need for T: Foo if U: Bar in many cases:

fn foo<T>(t: T)<U> -> U
    where U: Clone if T: Clone
{ t }

becomes

abstract type Foo<T>; // this is resolved as equivalent to `T` because of `foo`’s body
impl<T: Clone> Clone for Foo<T> {}
fn foo<T>(t: T) -> Foo<T> { t }

This does however seem like it could be very difficult for the type checker to handle, but I don’t know anywhere near enough about the type checker/inferer to say for certain.
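
As a point of reference, the conditional bound (`Clone` only when `T: Clone`) can already be expressed today if the abstract type is replaced by a hand-written newtype. The sketch below assumes a hypothetical module `hidden` and adds a `peek` accessor purely so the example can observe the wrapped value.

```rust
mod hidden {
    // Stand-in for the abstract `Foo<T>`: the field is private, so other
    // modules cannot tell that `Foo<T>` is just a `T`.
    pub struct Foo<T>(T);

    // Conditional bound: `Foo<T>` is `Clone` only if `T` is `Clone`.
    impl<T: Clone> Clone for Foo<T> {
        fn clone(&self) -> Self {
            Foo(self.0.clone())
        }
    }

    pub fn foo<T>(t: T) -> Foo<T> {
        Foo(t)
    }

    // Accessor added only for the example.
    pub fn peek<T>(f: &Foo<T>) -> &T {
        &f.0
    }
}
```

The difference from the proposed form is that the `Clone` impl has a real body here rather than being forwarded from the underlying type automatically.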

I also have problems with the function::T syntax, because function names don’t currently occupy the module/type namespace, meaning that you could define a module or type with the same name as the function in the same module, making function::T potentially ambiguous:

mod foo { struct T; }
fn foo()<T: Clone> -> T where T: Mul { 42 }
let x: foo::T; // which `foo` are we referring to here?

signature, it may be declared directly inline with the syntax
`type IDENT[: BOUNDS]`.
Example:
```rust

This code block (as well as the one in example 2) is rendering strangely, maybe add a new line and remove the space before the ticks?

@mitchmindtree

Thanks for taking the time to write this up @Kimundi :)

A few things that came to mind:

  • I find myself feeling similarly to @glaebhoerl and @P1start in that this seems to be introducing a lot of stuff for one RFC - it could almost be broken up into several RFCs?
    • abstract type
    • anonymised generics for function return types
    • sugar for generics
  • I'm unsure I understand the purpose of the assignment of () in the abstract type example:
// Could this:
abstract type Foo: Clone = ();

// possibly just be this?
abstract type Foo: Clone;
  • The sugar section might be prone to a lot of bike-shedding and might swamp the comments a little, especially as it is not just sugar for the new feature but also for existing generics - perhaps it may have been a little cleaner to say "here are some sugar ideas for the new feature, here's how they may also apply to existing generics, but all of this can be properly discussed further at a later point in time"? Though I guess it is necessary to at least consider the possibilities in some detail to determine the feasibility of the feature. I guess we'll see how it goes :)

In general I really like where this is going 😸 The ability to talk about the anonymised return types in the same way that we can talk about existing input generics (using where clauses and taking advantage of the type param style) is a huge plus for both consistency and expressiveness 👍

@eddyb
Member

eddyb commented Oct 4, 2015

Keep in mind that "parameter" is quite wrong as parameters are input by definition.

@hanna-kruppe

I only just realized that closures don't fit quite as neatly into this picture as iterator adapters do. It's all very nice that we can define abstract type aliases, but if we need to name the underlying type, then that capability can't be used for closure types. If general output types are omitted and only abstract type aliases remain (which is an option I really want to have available), then you can't have an alias for an unboxed closure because you can't name it. The impl Trait syntax, which is supposedly just syntactic sugar, would be absolutely required for returning unboxed closures. I don't like that idea at all.

This can be avoided either by using the output type machinery, or by following the idea of @glaebhoerl (declaring abstract types without naming the underlying type). But as soon as things depend on input type (and lifetime) parameters, it gets complicated again:

fn make_factory<T: Clone>(x: T) -> impl Fn() -> T {
    move || x.clone()
}

One would have to define an abstract alias Factory<T> or equivalently an output type with an appropriate bound. However, if this is supposed to stand in for the (unnamed) generic closure type (in the proper type theoretic sense that Factory is the same type constructor that the desugared closure type is), then how do we match Factory's type parameters with the type parameters of the closure struct? In the current desugaring, that struct simply gets all the type and lifetime parameters of make_factory, and additionally a type parameter for each of its upvars, so its parameter list is <T, U0>. (@ Compiler hackers: Is that correct?) I see no way to match that up with Factory<T> automatically. So the impl "sugar" seems to be non-optional (if returning unboxed closures is supposed to work in its full generality).

Edit: Oh wait. If output types are really associated types, then of course make_factory::<i32>::Return can easily be an alias for the specific instantiation of the closure type for those type (and lifetime) parameters. Apparently this complicated machinery really is more powerful.
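
For reference, the `make_factory` signature discussed in this comment is accepted as written by a current stable toolchain, where return-position `impl Trait` exists. A minimal sketch under that assumption:

```rust
// The anonymous closure type is hidden behind `impl Fn() -> T`; the caller
// never sees its parameter list, only the interface.
fn make_factory<T: Clone>(x: T) -> impl Fn() -> T {
    move || x.clone()
}
```

Calling `make_factory(String::from("widget"))` gives a value that can be invoked repeatedly, each call producing a fresh clone. The closure's own type parameters never surface in the signature, which sidesteps the question of how to match them against a separately declared `Factory<T>`.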

@m4rw3r

m4rw3r commented Oct 4, 2015

One concern I have is how simple it would be for a library user to use an abstract output type defined in a library in their own function signature.

As a particular example I have parser-combinators; they usually encourage users of the library to write small composable functions returning parsers (and not a lot of inline closures or similar). The impl Trait usage is pretty decent from that standpoint, since it only adds one additional keyword in front of the trait of the return type of the function.

If the usage of an already defined abstract output type in a function signature will require complex and/or lots of code, then this will increase the noise and complexity of any code which encourages users to define composable functions all unified under the same abstract return type.

@Kimundi
Member Author

Kimundi commented Oct 4, 2015

Update

Heads up, @eddyb informed me on IRC that this needs HKT support in the compiler (though it's not explicitly needed on the language level apart from what is used in this proposal).

The issue is, right now if you have something like

trait Foo<T> {
    fn bar<U>();
}

then that's actually internally closer to

fn Foo_bar<Self, T, U>() { /* body determined by parameters */ }

rather than

trait Foo<T> {
    type bar<U>: FnOnce();
}

which would be HKT, and needed to access the output types proposed by this RFC.
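
To make the shape of such an item concrete: an associated type that takes its own type parameter can be written on toolchains with generic associated types (stable since Rust 1.65, long after this thread). The sketch below is not the lowering described above; the `Family` trait and `VecFamily` type are hypothetical names chosen for illustration.

```rust
// A trait with an associated type that takes its own type parameter,
// i.e. a type-level function from `U` to some concrete type.
trait Family {
    type Of<U>;
    fn wrap<U>(value: U) -> Self::Of<U>;
}

// Hypothetical implementor mapping every `U` to `Vec<U>`.
struct VecFamily;

impl Family for VecFamily {
    type Of<U> = Vec<U>;

    fn wrap<U>(value: U) -> Vec<U> {
        vec![value]
    }
}
```

Here `<VecFamily as Family>::Of<i32>` names `Vec<i32>`, which is the kind of per-instantiation lookup an output type on a generic method would need.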

Replies

@rkruppe

  • Yeah, type sugar doesn't really pull its weight, it should probably go.
  • impl only in return position to prevent confusion is also reasonable, though I'm hoping that the explicit existence of output types can explain the sugar more easily: "impl Trait generates a type on the nearest preceding type list". That would also be a reasonable system for extending impl Trait to positions outside of function signatures, like struct Foo { pub bar: impl Trait } being the same as struct Foo<anon: Trait> { pub bar: anon }
  • Defaulting the return type name to Return is reasonable, but really there is no good name because we already have an official name for this type: Output, as used by the closure traits. Which we can't reuse because then referring to function::Output would be ambiguous in all non-generic cases.

@tbu-

I feel like that's getting verbose enough to not be that different from defining a newtype, with or without an impl delegation mechanism.

@glaebhoerl

  • Yeah, I don't expect it to be likely for the additional <> actually landing in the language, for the same reasons you outlined, but I wanted to raise the possibility.

  • I have to disagree about the everything being a module thing, I think that would be quite wonderful ;)

  • Guilty as charged about the impl syntax, though at least in this proposal abstract Trait might be ambiguous with abstract type T if it gets shortened to just abstract T.

  • Just Display already has an established meaning in the type system: it's an unsized type. Overwriting that meaning in a function signature just seems plain weird/problematic/non-compositional to me. The older proposal you mean is the possibility of accepting unsized types as function arguments, like

    fn foo(t: Display) {}

    Display would still be an unsized type here, method calls would still go through a vtable, and it would be implemented internally like

    fn foo(t: &move Display) {}

    that is, it would just be translated as a pointer you could move out of, which is only possible with function arguments because of the function call stack allowing cleanup of the original location after a return (as far as I understand it at least).

  • abstract types only really get introduced by this proposal to have their semantics usable in general, rather than as a special case for function return types, so just having them on their own would be kind of backwards here. And changing them to be defined by another item in the same scope was intentionally not considered (see my reply to @P1start below).
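
The difference between `Display` as an unsized type and a statically dispatched argument, discussed in the reply above, can be sketched in current Rust as follows. The function names are hypothetical, and the by-value `fn foo(t: Display)` form is not used because it is not available on stable; the dynamic variant takes a reference instead.

```rust
use std::fmt::Display;

// Static dispatch: a separate copy is compiled for each concrete type.
fn show_static(thing: impl Display) -> String {
    format!("{}", thing)
}

// Dynamic dispatch: `dyn Display` is unsized, so it is passed behind a
// pointer and `fmt` is looked up through a vtable at run time.
fn show_dynamic(thing: &dyn Display) -> String {
    format!("{}", thing)
}
```

Both produce the same text for the same value; what differs is whether the concrete type is known when the function body is compiled.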

@P1start

  • Defining abstract type items without the actual type being part of the item syntax is something I intentionally didn't propose here, because it seemed to be weird action-at-a-distance to me, but it's definitely a possibility. The issue is stuff like generics in your example needing HKT, as I elaborated above.
  • I kinda glossed over foo::T not really working. I guess you'd actually need to do something like <typeof(foo)>::T.

@mitchmindtree

  • The features can be introduced independently, sure, but then you might lack the discussion about whether they integrate well with each other or not. I had/have that problem with other RFCs, incrementally getting changes into the language in order to get closer to some ideal situation, but it takes a long time, and other RFCs or changed situations can get in the way halfway through.
  • The () is just the actual underlying type, it could have been i32 or bool as well, but there is no need to special case () (Especially because you'd probably never want ())

@rkruppe You don't name the actual type with output type parameters, so using the non-impl syntax is totally fine:

fn foo()<T: Display> -> T { 0_i32 }
fn bar()<U: Fn()> -> U { move || println!("hello {}", foo()) }
abstract type V: Fn() = bar::U;
let f: V = bar();
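
A rough stable-Rust counterpart of `foo` and `bar`, assuming return-position `impl Trait` in place of the proposed output parameters. The closure returns the text instead of printing it so the result can be checked; the named alias `V` has no equivalent here, so the caller relies on inference (`let f = bar();`).

```rust
use std::fmt::Display;

// The concrete type `i32` stays hidden behind the `Display` interface.
fn foo() -> impl Display {
    0_i32
}

// Returns an unnameable closure type that uses `foo`'s abstract result.
fn bar() -> impl Fn() -> String {
    move || format!("hello {}", foo())
}
```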

@hanna-kruppe

@Kimundi

I'm hoping that the explicit existence of output types can explain the sugar more easily

Makes sense. Still, if these extensions are as controversial today as they were during the discussion of the original RFC, I would drop them in a heartbeat.

You don't name the actual type with output type parameters, so using the non-impl syntax is totally fine:

Yes, I realized that afterwards, see the edit. It's mostly a problem for various hypothetical simplifications that I liked because they would remove most of the complex machinery of this RFC.


That said, @eddyb has curbed my enthusiasm soundly. I don't know what he thinks of this RFC and don't want to imply anything, but he certainly pointed out several problems that I completely missed. So not only does this RFC look less rosy than it used to, I'm also questioning my judgement now. The HKT problem was already described by @Kimundi above. Furthermore, most variations along the lines of abstract type (which was the part that I liked most) sneak in some kind of existential types through the back door, without acknowledging it or laying out a plan for generalizing to other examples of existential types. Bottom line is, I now fear that this is more of an attractive nuisance than a future-proof solution.

@glaebhoerl
Contributor

@rkruppe

fn make_factory<T: Clone>(x: T) -> impl Fn() -> T {
    move || x.clone()
}

One would have to define an abstract alias Factory<T> or equivalently an output "parameter" with an appropriate bound. However, if this is supposed to stand in for the (unnamed) generic closure type (in the proper type theoretic sense that Factory is the same type constructor that the desugared closure type is), then how do we match Factory's type parameters with the type parameters of the closure struct? In the current desugaring, that struct simply gets all the type and lifetime parameters of make_factory, and additionally a type parameter for each of its upvars, so its parameter list is <T, U0>. (@ Compiler hackers: Is that correct?) I see no way to match that up with Factory<T> automatically.

If I understand your concern right: a function returning an unboxed, abstract closure type cannot return different closure types with different upvars, because then the result could not be unboxed. So the set of upvars (and corresponding type variables) are fixed, and the abstract return type may only depend on the function's own generic parameters. So using only abstract type, you would in fact express the example like this:

abstract type Factory<T>: Fn() -> T;
fn make_factory<T: Clone>(x: T) -> Factory<T> { ... }

This has two drawbacks: you need to repeat the type/lifetime parameter list on both the abstract type and the fn, and you need to declare a new abstract type for each fn that returns a different concrete type (even if they implement the same interface), which makes it not-very-suitable for the kind of use cases like @m4rw3r mentions. So we probably would need some kind of additional sugar like -> abstract Trait, at least. (Or maybe we could use the general Type @ Interface abstraction operator that @aturon proposed...)

Furthermore, most variations along the lines of abstract type (which was the part that I liked most) sneak in some kind of existential types through the back door

Front door. Module-scoped existential types are what abstract types are. :)

@Kimundi Could you show a concrete example of where you'd need HKTs internally - the surface-level "abstract output types" code along with the internal HKT-implicating code it'd need to be lowered to? I'm having a bit of trouble drawing the connection on my own.

@hanna-kruppe

@glaebhoerl I think you do not understand, which may be because my post went through at least three edits and was at all times a rambling mess. Here is how the closure in make_factory will be desugared (as before, under the assumption that I understand the desugaring correctly):

struct Closure<T, U0> {
    x: U0
}

impl<T> Fn<(), Output=T> for Closure<T, T> {
    fn call(&self) -> T {
        self.x.clone()
    }
}

Note that there are two type parameters and, outside of the impl, there is no causal relationship that would allow us to say "actually there's only one type parameter". The desugared struct has parameters for all type and lifetime parameters of its containing function, and furthermore one type parameter per upvar. And there can be almost any relationship among those, consider for example this function:

fn foo<I: Iterator>(mut i: I, msg: &str) -> ??? {
    let first = i.next();
    move || {
        first.unwrap_or_else(|| panic!(msg))
    }
}

This returns some type that implements FnOnce and (in its impl) is parametrized by <'a, I, &'a str, Option<I::Item>> or something similar. Look at it! Look at it in all its hideous glory. All of that would leak out into the world if you spelled it out on the abstract type. And I believe there is no way around the need to spell it out, if the abstract type is separately named and parametrized.

Perhaps one could define closure desugaring in enough detail that it's possible to reliably express "abstract over" closure types, by just spelling out all lifetimes and throwing all lifetime and type parameters and all upvar types at the abstract type. However, that would be excessively difficult to use, easy to get wrong, and very verbose. Not to mention possibly constrain future evolution of the desugaring.

@hanna-kruppe

@glaebhoerl I have to add that this is all under the assumption of parametricity, i.e. that one generic abstract type would correspond to exactly one underlying generic type. One could "throw out" parametricity and just say: For any given T, there is some concrete ("rank 0") type that implements FnOnce() -> T, and there is not necessarily a relationship among those types for different values of T. That is theoretically possible. You could find that type by substituting the concrete T into make_factory, infer types, and take typeof(the closure) as the FnOnce() -> T type. That seems very unprincipled though: As said before, it forgoes parametricity.

@glaebhoerl
Contributor

@rkruppe It's certainly possible that I don't understand. :)

Considering your second example, suppose we write:

abstract type FooRet<'a, I: Iterator>: FnOnce() -> I::Item + 'a;
fn foo<'a, I: Iterator>(mut i: I, msg: &'a str) -> FooRet<'a, I> {
    let first = i.next();
    move || {
        first.unwrap_or_else(|| panic!(msg))
    }
}

Given this, why couldn't the compiler infer:

abstract type FooRet<'a, I: Iterator> = anon_foo_Closure<'a, I, &'a str, Option<I::Item>>;

?

As for the complexity of the FooRet definition we had to write ourselves: the duplication of the parameter list to the left of the : is certainly unfortunate (as I acknowledged earlier), but even with something like abstract Trait / impl Trait, I think the part to the right of the : is the same as what you'd have to write in as being the Trait? (Though I'm not quite sure about the 'a. Looking at it, I think even the fn foo definition I wrote seems like it would qualify for lifetime elision if I wanted it to, and it's just the abstract type FooRet where it has to be explicit.)
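
For comparison, here is roughly how the same function can be written on a current stable toolchain with return-position `impl Trait`. This is a sketch under a few assumptions: the iterator is advanced with `next()`, the closure returns `I::Item` (which is what `unwrap_or_else` yields), an `I: 'a` bound is added so the captured `Option<I::Item>` is known to outlive `'a`, and `panic!` uses a format string.

```rust
// The part after `impl` is the same interface one would write to the right
// of the `:` in the abstract type declaration.
fn foo<'a, I: Iterator + 'a>(mut i: I, msg: &'a str) -> impl FnOnce() -> I::Item + 'a {
    let first = i.next();
    move || first.unwrap_or_else(|| panic!("{}", msg))
}
```

The generic parameter list is written only once, and the closure's upvar types never appear in the signature.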

Separately, it's not quite clear to me why the anonymous closure types need to have a type parameter for each of their upvars... considering for instance a function fn blah<T> whose body contains a lambda with a single upvar v: Vec<T> (say), then why should the generated closure type be struct Closure<T, U1> { v: U1 }, with U1 later specified to be Vec<T>, rather than simply struct Closure<T> { v: Vec<T> }? The lambda's environment cannot be generic over more things than the enclosing function is generic over, so it seems like the same generic parameter list which the fn itself has should suffice.
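
To illustrate the alternative shape being asked about, here is a hand-written stand-in for such a closure that is generic over `T` only. It is a sketch, not the compiler's actual desugaring: the closure body (`v.len()`) is invented for the example, and an inherent `call` method is used because implementing the `Fn` traits for a custom type is not possible on stable Rust.

```rust
// Hand-written stand-in for a closure like `move || v.len()` inside `blah`.
// It is generic over `T` only, mirroring the enclosing function.
struct Closure<T> {
    v: Vec<T>,
}

impl<T> Closure<T> {
    // Plays the role of `Fn::call`.
    fn call(&self) -> usize {
        self.v.len()
    }
}

fn blah<T>(v: Vec<T>) -> Closure<T> {
    Closure { v }
}
```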

@hanna-kruppe

@glaebhoerl

Given this, why couldn't the compiler infer:

I don't know if it absolutely physically couldn't. But I, personally, don't see a general procedure that gives predictable results and always works. Perhaps such a procedure exists, then my point is moot (or rather, the problem becomes a simple complexity trade off).

Separately, it's not quite clear to me why the anonymous closure types need to have a type parameter for each of their upvars...

But this is how the desugaring currently works. A comment goes into some detail as to why. IIUC, it's not strictly necessary but significantly simplifies the implementation.

@glaebhoerl
Contributor

I, personally, don't see a general procedure that gives predictable results and always works. Perhaps such a procedure exists, then my point is moot (or rather, the problem becomes a simple complexity trade off).

I'm not a master of type inference either. We should probably wait for someone like @nikomatsakis to comment.

But this is how the desugaring currently works. A comment goes into some detail as to why.

Thanks. That comment gives a persuasive motivation (chiefly, function-internal lifetimes). That said, it seems to me that a function couldn't ever return a closure object which actually does reference lifetimes internal to that function, because error: borrowed value does not live long enough error: aborting due to previous error. So I should slightly amend my earlier claim such that a closure returned by a function cannot possibly [need to] be generic over more things than the function itself is. But looking at it from the perspective of fundamentals, rather than the particulars of the current implementation, my intuition that you should always be able to model an abstract return type of a function with an abstract type generic over the same things as the function still seems valid to me.

@nikomatsakis nikomatsakis added the T-lang Relevant to the language team, which will review and decide on the RFC. label Oct 5, 2015
@aturon aturon self-assigned this Oct 8, 2015
@critiqjo

Readability concern -- instead of

fn foo<T, U: A>()<V, W: B> -> X<V, W> where T: C, V: D { ... }

how about:

type V, W: B in
fn foo<T, U: A>() -> X<V, W> where T: C, V: D { ... }

(inspired by let ... in syntax of Haskell)

@eddyb
Member

eddyb commented Oct 30, 2015

@critiqjo That's a neat syntax for existentials, I like it!

One caveat, though, it should somehow come after generics if those existentials can capture the generics, e.g:

for<T> type V, W: B in
fn foo<U: A>() -> X<V, W> where T: C, V: D { ... }

This would mean V and W can capture (depend on) T, but not U.

@ticki
Contributor

ticki commented Dec 23, 2015

What is the state of this?

@apasel422
Contributor

How would this interact with object safety?

@eddyb
Member

eddyb commented Jan 8, 2016

@apasel422 Where would there be any interaction? Object safety is about actually being able to construct a vtable, while anything like impl Trait would be statically dispatched.

@nrc
Member

nrc commented Jan 12, 2016

@RalfJung we only have plans to allow specialisation of impls, not functions, so I don't think this becomes an issue. Not sure if you can reframe this issue with impls rather than functions.

On the general question, I think that the abstraction boundary should be treated like the privacy boundary - it's ok for functions to leak information about the abstracted type if they have 'permission' to do so, in the same way that it is ok for a public function to return a private field.

@aturon
Member

aturon commented Jan 12, 2016

@nrc The issue @RalfJung is raising isn't tied to impls vs functions; it's straightforward to rewrite his example using traits/impls instead:

mod foomod {
    abstract type X: Clone = i32;
    fn make_X() -> X { 42 }
}

trait Foo {
    fn foo(self) -> Self;
}

impl<T> Foo for T {
    default fn foo(self) -> T { self }
}
impl Foo for i32 {
    fn foo(self) -> i32 { 42 }
}

fn bar() {
    let x = foomod::make_X();
    x.foo(); // which version of foo gets called?
}

@RalfJung, to answer your question: in the type system, for clients of abstract types (aka impl Trait), the unpacked type already needs to be distinguished from the underlying concrete type to avoid other forms of leakage. As long as that distinct form doesn't unify with i32 in this example -- which it must not for other reasons -- the specialization will automatically fail to apply, without any extra work in the implementation.

Likewise, when you're inside the scope of the abstraction, you know what the concrete type is, and hence the specialization should trigger.

You claim that it's very easy to leak across the abstraction boundary with a specialized function, and you may be right, but your example doesn't quite show how. Can you make an example of the leakage you have in mind?

In terms of the relation to type case, you should consider an unpacked abstract type to be essentially a fresh type about which we have certain assumptions (trait bounds) but is otherwise unanalyzable. That's exactly how it should appear to specialization.
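
The "fresh type" view can be imitated on stable Rust with an explicit newtype, since specialization itself is not available there. In the sketch below, `X` is a hand-written wrapper standing in for `abstract type X: Clone = i32`, the wrapped value `7` is arbitrary, and the separate `impl Foo for X` plays the part of the unspecialized default.

```rust
trait Foo {
    fn foo(&self) -> i32;
}

// The "specialized" behaviour for the concrete type.
impl Foo for i32 {
    fn foo(&self) -> i32 {
        42
    }
}

// Newtype standing in for the abstract type; it is distinct from `i32`
// as far as trait selection is concerned.
#[derive(Clone)]
struct X(i32);

fn make_x() -> X {
    X(7)
}

// Stand-in for the default impl: returns the value unchanged.
impl Foo for X {
    fn foo(&self) -> i32 {
        self.0
    }
}
```

Because `X` never unifies with `i32`, `make_x().foo()` cannot pick up the `i32` impl, which is the behaviour described above for clients outside the abstraction.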

@nikomatsakis
Contributor

@aturon

in the type system, for clients of abstract types (aka impl Trait), the unpacked type already needs to be distinguished from the underlying concrete type to avoid other forms of leakage.

I actually don't think this is as clear as you make it out to be. In particular, which impl is chosen for specialization is a decision that can be made at type-checking time, but will frequently also be made at trans time, and I would have naively assumed that abstract types will be erased at trans time. It seems like you in contrast are describing something more like a newtype -- that is, the abstract X in your selection is a truly different type from i32 even at trans time, in which case it would uniformly affect selection and so forth, but that also implies kind of implicit "pack and unpack" operations at the boundaries of the abstraction, which seems a bit...tricky to get right to me.

I confess though that I, like @RalfJung, have not had time to keep up with this RFC nor its associated discussion thread, so I am commenting somewhat from my own "intuitions" about how I expect abstract type to work. Perhaps this has come up in the discussion thread or RFC text and been addressed more thoroughly. If so, I apologize and would appreciate some pointers!

@RalfJung
Member

You claim that it's very easy to leak across the abstraction boundary with a specialized function, and you may be right, but your example doesn't quite show how. Can you make an example of the leakage you have in mind?

This may be covered by what you said about the type being wrapped, but I thought of something like this (again using function specialization syntax, since that's just shorter :D )

mod foomod {
    abstract type X: Clone = i32;
    pub fn make_X() -> X { 42 }

    pub fn foo<T>(x: T) -> T { x }
    pub fn foo<i32>(x: i32) -> i32 { 42 }

    pub fn foo2<T>(x: T) -> T { foo(x) }
}

fn bar() {
    let x = foomod::make_X();
    foomod::foo(x); // which version of foo gets called?
    foomod::foo2(x); // which version of foo gets called?
}

Naively, when the outside world calls foo2, it looks like the call to foo is actually coming from inside the module and hence it can take the actual type X into account. However, I think it'd be very surprising, given above definitions, if there was ever a difference between calling foo and calling foo2. So somehow, when the outside world calls into the module that knows about X, it has to be the case that the module does not recognize its own type. That, too, seems rather strange to me: Now the same value of the same type X behaves differently depending on whether it was produced by make_X, passed to the outside, and then passed back in - or whether the entire "journey" of the value produced by make_X remains inside the module.
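The question above can be approximated on stable Rust, which has no specialization, by letting a runtime `Any` check stand in for the specialized `foo`. This is only a sketch under that assumption: `make_x`, the `bool` return values, and the use of `impl Clone + 'static` in place of `abstract type X` are illustrative choices, not part of the RFC.

```rust
mod foomod {
    use std::any::Any;

    // Stand-in for `abstract type X: Clone = i32;` (assumption: today's
    // `impl Trait` plays the role of the abstract type).
    pub fn make_x() -> impl Clone + 'static {
        7i32
    }

    // Stand-in for the pair of `foo` definitions: returns true when the
    // "specialized" i32 path would be taken, false for the generic path.
    pub fn foo<T: Any>(x: &T) -> bool {
        (x as &dyn Any).is::<i32>()
    }

    // Calls `foo` from inside the module, as `foo2` does in the example above.
    pub fn foo2<T: Any>(x: &T) -> bool {
        foo(x)
    }
}

// Hypothetical client: reports which path each call takes.
pub fn bar() -> (bool, bool) {
    let x = foomod::make_x();
    (foomod::foo(&x), foomod::foo2(&x))
}
```

With today's `impl Trait`, both calls in `bar` take the `i32` path, so `foo` and `foo2` agree, at the price of the concrete type being observable from outside the module.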

@nikomatsakis
Contributor

@RalfJung

So somehow, when the outside world calls into the module that knows about X, it has to be the case that the module does not recognize its own type.

(Or both of them could call the specialized variant, which is what I would expect.)

@RalfJung
Member

(Or both of them could call the specialized variant, which is what I would expect.)

Well, but then the abstraction is leaky, and we better don't have it.
Or do you mean foo and foo2 should behave differently? Again, I don't think that's desirable.

Existential types and type case fundamentally don't mix well. I don't think we should have both, the result is going to be confusing. Rust generally uses privacy as its abstraction mechanism, so we don't fundamentally need existential types. We can achieve similar effects with newtypes, so I'd be all in favor of adding sugar for supporting those better, making it easy to use just the existing privacy mechanisms to abstract away types -- the existing type system is expressive enough, just not convenient enough to use.

@nikomatsakis
Contributor

@RalfJung

Well, but then the abstraction is leaky, and we better don't have it.

It all depends on your perspective, I guess. I think this is precisely the kind of leak that specialization is meant to enable -- that is, it's sort of the "antiparametricity" feature.

Existential types and type case fundamentally don't mix well

I agree there is no single answer to this question that is always what a user would want or expect.

@RalfJung
Member

The way I think about it, specialization doesn't break privacy, so it is not leaky for the only abstraction mechanism Rust uses so far. (Technically speaking, one could consider the types of closures existential types, but in any case closure types are not nameable and so do not interact with specialization - and it would also be technically possible to let closures rely on privacy only.) But calling a type abstract raises expectations, and it doesn't seem like these expectations can be upheld without causing surprises in other corners of the language.

Having "a bit of an abstraction", IMHO is worse than not having it. Either you have abstraction, and then you can rely on it even in your unsafe code, or you cannot rely on it, and then you better don't have it because people might think they can rely on it. (Note that I would argue the same in a language that doesn't have unsafe. This is not related to unsafe. People rely on abstraction when they reason, intuitively, about why their code is correct. Providing "false abstraction" results in code that the author thinks is correct, but it actually isn't. In Rust, "not correct" + unsafe can mean "the program crashes", but it's not like a creatively mis-behaving but perfectly safe program would be significantly better.)

@colin-kiegel

The problems you are describing seem to arise only if specialised functions behave differently than their generalisation. But wouldn't that be some kind of code smell? At least I would find it confusing and would expect using traits to define different behaviour for different types is better suited, isn't it?..

@RalfJung
Member

I think it is rather dangerous to rely on that. It would mean essentially relying on the client to uphold the module's invariants. Abstraction has been invented to avoid exactly this kind of anti-modularity, because experience shows people will screw it up.

@arielb1
Contributor

arielb1 commented Jan 15, 2016

Output type parameters are not supposed to be used for strong abstraction - the primary reason they are abstract is to avoid cross-function type-inference cycles and all the confusion that causes.

impl-specialization is also at least supposed to only be used for performance reasons and have no observable effect. However judging by how @aturon tries to take that RFC I doubt this will stay that way for long in practice.

@RalfJung
Member

I don't think "weak abstraction" is a thing. I also do not understand what any of this has to do with type inference.

I really like this proposal, in particular how it makes "abstract return types" just syntactic sugar of a more generic principle. However, I think these types should be truly abstract. This would ensure, for example, that the implementation of iterator adapters like map can change the type it uses to implement the resulting iterator silently, without breaking any client code. (Well, safe code. But unsafe code could only blame itself.) And by "ensure", I mean ensure, not "happens to be the case if everybody is careful". I love Rust for how it puts an "ensure" on safety - after all, the abstraction provided by std::vector in C++ is safe, too, if everybody is just careful enough. Why should we aim for less in other parts of the language, without a good reason?

Maybe there are good reasons, so let me re-phrase this into a question: I searched for "newtype" in this discussion and the RFC, and found it mentioned only as something abstract types are not. So which goals of the RFC would not be achieved, if abstract type T: <bounds> = <type> would de-sugar to introducing a new type T with a single field of type <type> (struct T(<type>)), and implementing traits (also conditionally) as given by <bounds>? Since this explains "abstract type" in terms of old concepts, the interaction with everything else is already defined. Code in the same module could construct a T from a <type> by doing T(t) (and we can bikeshed about whether functions that use the abstract return type sugar, are implicitly composed with this function - they probably have to be, since the abstract type does not even have a name). Converting the other direction, without any additional means, would be done by t.0, but of course that's just syntax and there may be better choices. I'm not talking about syntax here. I do believe however that there is value in an explicit coercion, precisely because of the interaction with specialization.
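A minimal sketch of the desugaring described above, written by hand in current Rust. The names `X`, `make_x`, `unpack`, and `is_i32` are hypothetical, and `#[derive(Clone)]` stands in for "implementing traits as given by `<bounds>`"; the RFC itself does not define this expansion.

```rust
use std::any::TypeId;

mod foomod {
    // Hand-written stand-in for `abstract type X: Clone = i32;`:
    // a newtype whose single field is private to this module.
    #[derive(Clone)]
    pub struct X(i32);

    // Explicit "pack": only code in this module can build an `X`.
    pub fn make_x() -> X {
        X(42)
    }

    // Explicit "unpack": an ordinary, visible conversion back to `i32`.
    pub fn unpack(x: &X) -> i32 {
        x.0
    }
}

// Helper used only for illustration: does `T` have the same `TypeId` as `i32`?
pub fn is_i32<T: 'static>(_: &T) -> bool {
    TypeId::of::<T>() == TypeId::of::<i32>()
}
```

Under this reading, `is_i32(&foomod::make_x())` is false because `X` is a distinct type everywhere, including after monomorphization, and the only way back to the `i32` is the explicit `unpack`.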

@aturon
Member

aturon commented Jan 15, 2016

@nikomatsakis

I actually don't think this is as clear as you make it out to be. In particular, which impl is chosen for specialization is a decision that can be made at type-checking time, but will frequently also be made at trans time, and I would have naively assumed that abstract types will be erased at trans time. It seems like you in contrast are describing something more like a newtype -- that is, the abstract X in your selection is a truly different type from i32 even at trans time, in which case it would uniformly affect selection and so forth, but that also implies kind of implicit "pack and unpack" operations at the boundaries of the abstraction, which seems a bit...tricky to get right to me.

I agree with all of this, and you're right that I was probably assuming too much about how this would end up looking from trans's perspective. And I agree that the implicit (un)packs may be tricky -- but, on the other hand, the question of "what is the scope of abstraction for impl Trait" is one of the core questions we have to resolve for the design.

Regarding @RalfJung's second example, I agree that the two calls must behave the same way. (Note: the example is a bit better if make_X returns a different number than 42). I think it's possible to explain this by treating the module itself as the scope of abstraction for X. But at that point it makes sense to start talking about the actual proposals -- via impl Trait and trait specialization -- since the fine details start to matter.

Taking a step back, remember that there are at least two separate motivations for impl Trait:

  • "Return type inference" or otherwise avoiding the need to name types in signatures
  • Type abstraction: actually preventing downstream code from discovering details about the types other than the given bounds.

My earlier blog post goes into more detail, and discusses proposals that attack only one of the goals, as well as ones that attack both.

At this point, I think there is broad consensus that the impl Trait feature, whatever it becomes, needs to tackle both of the above goals to some degree. But it's less clear exactly "how abstract" the type abstraction here needs to be, given that we already have abstraction via privacy (as @RalfJung mentions). The rough consensus has been: about the same level of abstraction you get with a newtype + forwarding trait impls. We could consider a newtype-like semantics as a guiding principle for figuring out the interaction with specialization (or, as @nikomatsakis describes, even an implementation strategy).

All that said, I'm not sure how problematic allowing specialization to "see through" this abstraction really is. Clearly, we would lose the guarantee that you can change the concrete type without breaking client code. But that guarantee is already available by using newtypes. And uses of specialization that meaningfully reveal the type (and would break if you change it) are probably rare. The benefit of making this abstraction transparent to specialization would be that you get a very straightforward semantics for the interaction of the two features.

On balance, I think I'd like to see how far we can get with a newtype-like semantics. Can we provide a good programmer model for the implicit packs/unpacks involved?

@aturon
Member

aturon commented Jan 15, 2016

Hah, I see @RalfJung is heading in roughly the same direction.

I meant to give a concrete example: imagine you have code that returns impl Iterator and you then use some further iterator adapters on the result. In some hypothetical future, we might want to specialize certain combinations of iterators and adapters, for performance. But if we seal off the abstraction in the newtype style, these specializations won't apply for client code. How much do we care?

@glaebhoerl
Contributor

I suspect the "newtype-like semantics" would not be nearly as simple as it sounds like. In general any approach which treats two types as equal in some parts of the type system and as different in others feels fraught with danger. The issue is that not all traits can be extended to a newtype with a simple forwarding impl: in the general case you need to essentially transmute the vtable of the base type to apply to the newtype. This is what GeneralizedNewtypeDeriving in GHC does. The problem is that while this feature treats the base and new types as equal, there are also ways to distinguish them, such as TypeFamilies (a.k.a. associated types). That gives you situations where a particular associated type (not necessarily in the same trait) of the base type is defined as A, while for the newtype it's B, and then if you call a method on the newtype which had been transmuted from the base type and involves this associated type in its signature, it go boom. I'm not sure if the same examples from GHC would carry over, but in any case we should tread very carefully here. The interaction of GeneralizedNewtypeDeriving and TypeFamilies was a gaping soundness hole in GHC's type system for many years. Lately they've closed it by adding TypeRoles as well, but a lot of people feel that this solution is heavy-handed and involves more complexity and awkwardness than it's worth.

Having "a bit of an abstraction", IMHO is worse than not having it. Either you have abstraction, and then you can rely on it even in your unsafe code, or you cannot rely on it, and then you better don't have it because people might think they can rely on it. (Note that I would argue the same in a language that doesn't have unsafe. This is not related to unsafe. People rely on abstraction when they reason, intuitively, about why their code is correct. Providing "false abstraction" results in code that the author thinks is correct, but it actually isn't. In Rust, "not correct" + unsafe can mean "the program crashes", but it's not like a creatively mis-behaving but perfectly safe program would be significantly better.)

I'm so glad that I'm not the only person with this perspective :)

@RalfJung
Member

I suspect the "newtype-like semantics" would not be nearly as simple as it sounds like. In general any approach which treats two types as equal in some parts of the type system and as different in others feels fraught with danger.

I agree, this is exactly what causes the specialization confusion in the proposal at hand. So I assume "newtype-like" refers to pretty much the proposal at hand, where "abstract type" is a newtype but the coercions are implicitly applied in the module that defines it? As opposed to, dunno, "fully newtype" semantics, where the coercions have to be stated explicitly (which is the proposal I made)?

@nikomatsakis
Contributor

This is a pretty interesting conversation. But I want to push back a bit on the assumption that having abstract type have a newtype-like semantics is a good idea. Imagine that I am a client that is producing something that implements IntoIterator:

fn foo() -> impl IntoIterator<&i32> { &[] }

and then I have a client that does:

let mut v = vec![];
v.extend(foo());

If we adopt a strong newtype semantics, then the client is going to lose out on optimization here. This seems like a problem to me.

Like everything else, it's a give and take. Abstraction lets you make more changes, but adds a tax. This is the tax that specialization is supposed to free us from.
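A rough analogy for this tax on stable Rust, where `size_hint` stands in for a specialized fast path (an assumption made purely for illustration, since specialization is not available on stable). `Sealed`, `sealed`, and `transparent` are hypothetical names: the first wraps the iterator in a newtype with a minimal forwarding impl, the second returns it through `impl Iterator`.

```rust
// Newtype-style sealing: only the required method `next` is forwarded.
pub struct Sealed(std::vec::IntoIter<i32>);

impl Iterator for Sealed {
    type Item = i32;

    fn next(&mut self) -> Option<i32> {
        self.0.next()
    }
    // `size_hint` is deliberately not forwarded, so callers see the
    // default `(0, None)` and cannot pre-allocate.
}

pub fn sealed() -> Sealed {
    Sealed(vec![1, 2, 3].into_iter())
}

// Transparent version: the concrete iterator is returned as-is, so its own
// `size_hint` (and any other overridden methods) remain in effect.
pub fn transparent() -> impl Iterator<Item = i32> {
    vec![1, 2, 3].into_iter()
}
```

Both versions produce the same items when passed to `Vec::extend`; the difference is only in what the consumer can learn up front, which is the kind of lost optimization the comment above is concerned about.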

Put another way, one might be surprised that this function can potentially observe and deduce what X is even though it has no bounds:

fn foo<X>(x: X) { ... }

But indeed, via specialization, it certainly can! So I don't think this problem is really specific to abstract type.

@nikomatsakis
Contributor

One final way to put this. I feel like the contract of an impl Trait is that the fn will produce something that implements that trait and that's it (modulo structural/auto/OIBIT traits). In particular, the callee reserves the right to change the details of the implementation. If the caller goes and relies on those details in some way, that's the caller's fault. It doesn't feel any different to me than if you had some hash function and you changed your hashing algorithm, or you had a sort function and you changed your sorting algorithm. You might break some clients that were relying on specific return values for specific inputs, but you'd still call it a minor version bump.

Update: To be clear, I think we can make it so that your client code will continue type-checking and compiling, even when you change the implementation (modulo OIBITS). The only way it can "observe" the type you return is based on the values generated by (and behavior of) specialized traits.
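The "modulo OIBITs" caveat can be seen with the `impl Trait` that later landed in Rust: auto traits such as `Send` are visible on the returned type even though only `Clone` is promised. This is a sketch with hypothetical names (`make_x`, `require_send`), not code from the RFC.

```rust
// Only `Clone` is promised to callers.
pub fn make_x() -> impl Clone {
    42i32
}

// Accepts only `Send` types; returns true so the call can be asserted on.
pub fn require_send<T: Send>(_: &T) -> bool {
    true
}

// Compiles: the auto trait `Send` leaks from the hidden type `i32`.
pub fn check() -> bool {
    require_send(&make_x())
}

// By contrast, this would not compile, because `Add` is not among the
// promised bounds:
// let y = make_x() + 1;
```

If `make_x` were later changed to return a non-`Send` type such as `Rc<i32>`, `check` would stop compiling, which is exactly the exception being carved out.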

@eddyb
Member

eddyb commented Jan 16, 2016

@nikomatsakis or TypeId, I guess? Presumably that can observe the post-monomorphization concrete type.
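A small sketch of that observation, using the `impl Trait` that later shipped in Rust rather than this RFC's syntax. The helper `has_i32_type_id` and the function `make_x` are hypothetical names.

```rust
use std::any::TypeId;

// The caller is promised only `Clone` (plus `'static`, needed for `TypeId`).
pub fn make_x() -> impl Clone + 'static {
    42i32
}

// Compares the `TypeId` of the value's static type with that of `i32`.
pub fn has_i32_type_id<T: 'static>(_: &T) -> bool {
    TypeId::of::<T>() == TypeId::of::<i32>()
}
```

Here `has_i32_type_id(&make_x())` is true: `TypeId` is computed for the concrete type behind the `impl Clone`, so the hidden type is observable without any specialization.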

@RalfJung
Member

I see. This is a design goal I did not take into consideration.

In this case, however, my first inclination is to suggest to not even try to make anything about that type "abstract". I think that will just cause confusion. A function fn foo<...>(...) -> impl Traits would be pretty close to a function fn foo<...>(...) -> _, except that the compiler would check that the return type actually satisfies the trait bound. In let x = foo(...);, x would literally have the actual return type of foo and be type-checked accordingly.

Now, I think I can see a practical difference to the actual proposal at hand, which is that clients could (accidentally) rely on x to actually implement more traits than what was promised. Then changing foo could result in clients breaking, which is bad. I take it that's how you ended up with this half-abstract proposal.

The question of whether a type implements a trait is not changed by adding specialization, right? That's just all about which implementation is selected. So maybe another way to explain what kind of "hiding" is happening is that for an "abstract type", it is hiding whether the type implements some trait, but not which particular implementation is picked. The answer to "what is selected" in #1305 (comment) would then be "always the specialized version". Correct?

This is probably the least-confusing possible behavior of the code I sketched, but I still don't like how this is some kind of semi-abstract type. It's more something of "upper-bounding" the trait implementations that people see for the type, rather than actually hiding anything. I think calling this "newtype-like" is rather confusing. I take it "upperbound" is not acceptable as the keyword here? ;-) I'm not sure I have any good ideas for how to word/design this to be less confusing. (And maybe this is only confusing to people who know about parametricity and are thus tempted to assume that they actually control how the "abstract type" is inhabited?)

@nikomatsakis
Contributor

@RalfJung

A function fn foo<...>(...) -> impl Traits would be pretty close to a function fn foo<...>(...) -> _, except that the compiler would check that the return type actually satisfies the trait bound.

See also http://aturon.github.io/blog/2015/09/28/impl-trait/

So maybe another way to explain what kind of "hiding" is happening is that for an "abstract type", it is hiding whether the type implements some trait, but not which particular implementation is picked. The answer to "what is selected" in #1305 (comment) would then be "always the specialized version". Correct?

Yes.

@RalfJung
Member

See also http://aturon.github.io/blog/2015/09/28/impl-trait/

Yes, I noticed I was getting close to some parts of that post. That's why I added "my first inclination" and immediately went on ;-)

@aturon
Member

aturon commented Mar 10, 2016

@Kimundi, do you want to close this in favor of your new RFC?

@eternaleye eternaleye mentioned this pull request Apr 27, 2016
@icorderi

cc: @icorderi

@aturon
Member

aturon commented Jul 7, 2016

Given that the author's alternative minimal impl Trait RFC has been merged, I'm going to close this one.

Labels
T-lang Relevant to the language team, which will review and decide on the RFC.