Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom self types #2362

Closed

Conversation

withoutboats
Copy link
Contributor

@withoutboats withoutboats commented Mar 14, 2018

Extends support for new self types, such as Rc<Self> and Arc<Self>.

Rendered

@Centril Centril added the T-lang Relevant to the language team, which will review and decide on the RFC. label Mar 14, 2018
@sfackler
Copy link
Member

How are conflicting method names resolved?

pub struct Ptr<T>(T);

impl<T> Deref<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.0
    }
}

impl<T> Ptr<T> {
    pub fn foo(&self) {}
}

pub struct Bar;

impl Bar {
    fn foo(self: &Ptr<Self>) {}
}


However, we also feel that `CoerceUnsized` is not ready to stabilize without
further consideration of the trade offs. For that reason, defining your own
arbitrary method receivers may not be stabilized as quickly.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this division makes sense. It shouldn't even require an extra feature gate, since CoerceUnsized is already unstable, correct? In other words, the delayed stabilization described here doesn't require delaying stabilization of any part of this RFC-- it just specifies that CoerceUnsized, an existing unstable feature, will remain unstable.

@cramertj
Copy link
Member

cramertj commented Mar 14, 2018

How will conflicts between methods on e.g. Arc<T> and self: Arc<MyType> be handled? For example:

// In an upstream crate
impl<T> CleverPtr<T> {
    fn foo(self) { ... }
}

// In a downstream crate:
struct MyType;
impl MyType {
    fn foo(self: CleverPtr<MyType>) { ... }
}

fn main() {
    let x = CleverPtr::new(MyType);
    x.foo(); // ERROR: conflicting methods.
    // What do I write to solve this? UFCs doesn't help:
    <CleverPtr<MyType>>::foo(x) // Which method is this?
}

Edit: whoops, looks like @sfackler beat me to it :)
Edit 2: see below, UFCs does solve the problem

@withoutboats
Copy link
Contributor Author

@sfackler Good question! Its already implemented, so they must be resolved somehow. Let's check... (playpen)

It's currently a name collision.

(@cramertj I think your question is the same.)

@sfackler
Copy link
Member

Ah cool, and you can disambiguate with Ptr::foo and Bar::foo.

@cramertj
Copy link
Member

cramertj commented Mar 14, 2018

Ah, I see-- the type in UFCs is the type of the impl block, not the type of the method receiver.

(In other words, impl Foo { fn bar(self: Arc<Self>) {...} } is called like Foo::bar(x), not <Arc<Foo>>::bar(x).)

@withoutboats
Copy link
Contributor Author

@sfackler Right! It seems analogous to two traits containing the same method name.

I could see us introducing a hierarchy some day so that impls defined where Self is the type of the variable get precedence (so that Ptr<T>'s foo comes first), similar to how inherent methods get precedence over traits. But the name collision is forward compatible with that change, and I think I'd like to get experience using this and seeing how often its a problem before deciding whether or not to do that (I can add it as an unresolved question).

# Summary
[summary]: #summary

Allow types that implement `Deref` targeting `Self` to be the receiver of a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is clearer:

Allow types that implement Deref<Target = Self> ...

feature.

This feature is increasingly relevant because of the role of special pointer
types to constraint self-referential types, such as generators containing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: s/constraint/constrain

[guide-level-explanation]: #guide-level-explanation

When declaring a method, users can declare the type of the `self` receiver to
be any type `T` where `T: Deref<Target = Self>`. Shorthand exists, so that
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Shorthand exists/Shorthands exist/

fn by_rc(self: Rc<Self>);
fn by_arc(self: Arc<Self>);
}
```
Copy link
Contributor

@Centril Centril Mar 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about HKTs in the future (assuming we get HKTs which we might not)?
Consider:

trait Functor<Self : * -> *> { // strawman syntax for encoding the kind of Self : * -> *
    fn fmap<A, B, F>(self: Self<A>, mapper: F) -> Self<B>
    where F: Fn(A) -> B;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have a bound where Self<A> derefs to Self, I don't see why not.

Copy link
Contributor

@Centril Centril Mar 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But Self is not of the kind of values, it has kind * -> *, so you shouldn't be able to construct a value of Selfat all in this case (hence, you can't deref Self<A> to Self).

I guess you could have a separate rule for higher kinded types and say that this rule only applies to *-kinded types.


To support object-safety, we must have a way of obtaining a vtable from the
reference type, and then passing the correct receiver type to the function in
the vtable.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered interactions with multiple trait bounds in the future?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's an interaction. This RFC doesn't introduce any constraint that doesn't already exist today.

@glaebhoerl
Copy link
Contributor

Two sortof-related questions:

  • Are trait objects the reason why Deref is necessary, or is there something else? Writing a free fn taking a Foo<T> as argument doesn't (of course) require it to be Deref. Is it because formulating a "the receiver must [somehow] be related to Self" rule would otherwise require HKT, and/or because we want to support non-generic receivers (e.g. String: Deref<Target=str>)?

  • How come DerefMut isn't relevant? (We don't really care about the methods only the Target type?)

@withoutboats
Copy link
Contributor Author

@glaebhoerl I don't really understand your questions. Its true that object safety requires Deref to find the vtable, and also true that we'd want to keep the constraint anyway. I think the answer to your first question is basically "yes," we don't want something like:

impl String {
    fn this_is_a_string_method(self: i32) { }
}

// These two expressions do the same thing:
0.this_is_a_string_method()
String::this_is_a_string_method(0)

I don't see how DerefMut could come into things at all.

@glaebhoerl
Copy link
Contributor

(I think both of them boil down to trying to clarify for myself why symmetries which typically manifest -- between the requirements of free fns and methods, and between pairs of things which work with Deref resp. DerefMut -- in this case don't.)

@arielb1
Copy link
Contributor

arielb1 commented Mar 14, 2018

@glaebhoerl

At least when I wrote the proposal, Deref is required for the same reason that importing traits is - type-based resolution should not depend on the universe - in other words, adding a new sibling crate while not using it in the current scope should not be able to cause ambiguities that need UFCS to be resolved where they did not exist before.

For this reason, DerefMut does not matter - if we have a Deref-chain, we already know that the method is "in our universe".

I'm not sure how useful is that property, because adding a new dependency can worsen type inference, but it felt like a useful thing to keep.

@arielb1
Copy link
Contributor

arielb1 commented Mar 14, 2018

@withoutboats

When I last looked on this, I remember that the current checking of CoerceUnsized did not quite have the right invariant that we needed for packing vtables.

However, CoerceUnsized is unstable and ill-documented, so I think the right idea is to:

  1. Figure out the right rules for CoerceUnsized to make autosizing and autounsizing sound, while making sure that there are sane rules that don't break anything important.
  2. Document these rules and change the implementation of CoerceUnsized to use them.
  3. Maybe think about stabilizing it.

@clarfonthey
Copy link

I assume that self: &Box<T> will always be the appropriate bound, and not &self: Box<T>?

@F001
Copy link
Contributor

F001 commented Mar 15, 2018

Is it possible to allow trait object as the type of self? I mean:

trait Bar {
    fn baz(self: &dyn Deref<Target=Self>);
}

@F001
Copy link
Contributor

F001 commented Mar 15, 2018

And I have another question which might be off topic. Does this RFC solve the issue rust-lang/rust#28796? My understanding is that it is possible to allow Box<FnOnce()> just work. Or it is an orthogonal problem?

@withoutboats
Copy link
Contributor Author

cc @mikeyhew who will hopefully be spearheading the implementation work on this :)

@nikomatsakis
Copy link
Contributor

@withoutboats

With respect to the trait object rule, there's something a bit implicit in there I want to better understand. The RFC states:

Given a reference type Ptr<dyn Trait>, the compiler must be able to prove that T: Unsize<dyn Trait> implies Ptr<T>: CoerceUnsized<Ptr<dyn Trait>>.

It seems like we would require a "HKT-like" feature here to decide what Ptr is, right? It feels like we should say something about this. Presumably the rule is that, looking at the self type in the trait definition, the Self parameter must appear exactly once, so that we can instantiate it with Self = dyn Trait. Maybe we want to be more restrictive, in fact, and say that it must an ADT with one type parameter (Self)? That would fit e.g. Rc<Self> and friends; we probably want to support Ref<'_, Self> too.

But, for example, I imagine we don't want anyone to write this:

trait Foo {
    fn method(self: Deref1<Self, Self>) { .. }
}

struct Deref1<A: ?Sized, B: ?Sized> { }
impl<A: ?Sized, B: ?Sized> Deref for Deref1<A,B> {
  type Target = A;
}

@nikomatsakis
Copy link
Contributor

Similarly, the way the RFC is written, it is fine (in inherent impls, anyway) to have a receiver type derefs to Self but doesn't mention Self, right?

struct Foo { }
struct Bar { }
impl Deref for Foo {
   type Target = Bar;
}

impl Bar {
    fn on_foo(self: Foo) { } // `Foo: Deref<Bar>` is ok
}

@nikomatsakis
Copy link
Contributor

I think it'd be useful to dig a bit more into how method dispatch on a trait object will work. Maybe I'm just slow, but the text in the RFC is a bit terse. =)

So, let me try to spell it out for my own edification. We start out with a Rc<dyn Trait> and we see an invocation of a method foo, defined like so:

trait Trait {
    fn foo(self: Rc<Self>);
}

We consider Trait to be "dyn capable", as we have previously checked that (in Chalk terms):

forall<T> {
  if (T: Unsize<dyn Trait>) {
    Rc<T>: CoerceUnsized<Rc<dyn Trait>>
  }
}

Then, at runtime, we can deref Rc<Self> to attain &self and extract the vtable. Given the current limitations on the CoerceUnsized trait, this implies that Rc must effectively be just a newtype'd pointer (which, indeed, it is):

For custom types, the coercion here works by coercing Foo<T> to Foo<U> provided an impl of CoerceUnsized<Foo<U>> for Foo<T> exists. Such an impl can only be written if Foo<T> has only a single non-phantomdata field involving T. If the type of that field is Bar<T>, an implementation of CoerceUnsized<Bar<U>> for Bar<T> must exist. The coercion will work by coercing the Bar<T> field into Bar<U> and filling in the rest of the fields from Foo<T> to create a Foo<U>. This will effectively drill down to a pointer field and coerce that.

Assuming these docs are accurate (I should check the code), then it seems like we already know that actually we can just strip the vtable from the Rc<dyn Trait> and pass the pointer -- i.e., we don't really need to extract the type info out of the vtable in particular, right? (Am I missing something?)

@arielb1 I'm curious what constraints you think are insufficient.

(If we were to generalize CoerceUnsized -- e.g., to allow user-specified computation as was originally proposed -- we might have to do more.)

@withoutboats
Copy link
Contributor Author

withoutboats commented Mar 16, 2018

@nikomatsakis You're not slow, I'm just uncertainly cribbing from the notes that you and @mikeyhew gave me. :)

Everything you've said seems accurate. I think we should restrict the self type in traits to having the Self parameter appear exactly once, but no more restrictions (e.g. having other parameters seems fine to me, as long as it can be implemented).

@mikeyhew
Copy link

@nikomatsakis

it seems like we would require a "HKT-like" feature here to decide what Ptr is, right?

I agree with this notion. I'm wary of stabilizing the way that we determine if a self type is object-safe until we have more trait-system features.

I don't think that the CoerceUnsized requirement is the ultimate solution here, I think it's more of a hack that we can use now (and keep unstable) until we have a better solution.

Given the current limitations on the CoerceUnsized trait, this implies that Rc must effectively be just a newtype'd pointer (which, indeed, it is)

It doesn't have to be a newtype'd pointer, it just has to have a field that is a pointer (it can have other fields too).

This is why the `CoerceUnsized` bound is necessary for object-safe receiver
types. Using the type ID stored in the vtable, we can downcast the `Self` type
to the correct concrete type to pass to the function pointer for the method
we're calling, effectively reversing the unsizing coercion.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure that "type ID" and "downcast" is the right terminology here? My understanding was that the compiler looks up the method in the vtable, coerces the Rc<Foo> to Rc<WhateverTypeThatImplementsFoo> without actually knowing what that type is, and calls the method using the coerced Rc as the self argument

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right.

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Mar 17, 2018

@mikeyhew

It doesn't have to be a newtype'd pointer, it just has to have a field that is a pointer (it can have other fields too).

However, I believe those fields have to be phantom-data -- or at least zero-sized? That's what the description says, anyway.

be implemented. Given a reference type `Ptr<dyn Trait>`, the compiler must be
able to prove that `T: Unsize<dyn Trait>` implies `Ptr<T>:
CoerceUnsized<Ptr<dyn Trait>>`. If the compiler can prove this, methods with
these receivers are object safe (how they object safe conversion is implemented
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: typo they -> the

@Nemo157
Copy link
Member

Nemo157 commented Jul 21, 2018

Something not covered in the RFC and I don't recall seeing discussed is lifetime elision, in the futures codebase there is a function signature:

pub fn get_pin_mut(self: PinMut<'a, Self>) -> PinMut<'a, Si>

If arbitrary self types supported the same elision as normal references this could be:

pub fn get_pin_mut(self: PinMut<Self>) -> PinMut<'_, Si>

But that seems a bit terse, and loses out on the "all types have either & or '_ to denote they contain a lifetime" property of the '_ return lifetime. Maybe this could be allowed to be written

pub fn get_pin_mut(self: PinMut<'_, Self>) -> PinMut<'_, Si>

to remove the need to give a name to this very transient lifetime.

(I found this because Clippy's needless_lifetimes lint has a false positive on this line with the current arbitrary_self_types implementation).

@mikeyhew
Copy link

@Nemo157 I think going with '_ makes a lot of sense. As far as I'm concerned, the fact that it's not supported right now is a bug.

@L-as
Copy link

L-as commented Aug 30, 2018

so.... will this be merged?

@mikeyhew
Copy link

Before this gets merged, I think we should either remove the part about object safety, or say that the way object-safety is determined will remain unstable for a while, and will probably need a new RFC for a final decision to be made. Right now, in the implementation, we are using a new, unstable trait called CoerceSized that has to be implemented in order for a receiver type to be considered object-safe. While I think it probably doesn't need to be decoupled from CoerceUnsized, doing so is the more conservative choice, and so that's what we're doing for now.

Another change that I'm mostly in favour of, is calling the self argument the method's receiver, to keep from confusing lowercase self and uppercase Self. Under this logic, "custom method receivers" would be a better name for this feature.

@Centril Centril added A-typesystem Type system related proposals & ideas A-method-call Method call syntax related proposals & ideas labels Nov 22, 2018
@nikomatsakis
Copy link
Contributor

Nominating this for @rust-lang/lang -- I think it just got overlooked? AFAIK we stabilized some subset of this functionality!

@nikomatsakis
Copy link
Contributor

We discussed this briefly in the @rust-lang/lang meeting today and decided to postpone. I was mistaken in saying that we stabilized the functionality in here -- in truth, what we stabilized was extending the set of "recognized self types" to include Pin<&mut Self> and a few other selected types. We didn't really work out the general case (e.g., Rc<Self>) and it seems like it's sufficiently complex that we don't think doing so makes sense at this point (there's nobody with time to concentrate on it). Therefore, I am moving to postpone the RFC until we do have such bandwidth.

(Note that the RFC content would obviously be a useful starting point for future discussion when we do revisit the topic.)

@rfcbot postpone

@rfcbot
Copy link
Collaborator

rfcbot commented Aug 29, 2019

Team member @nikomatsakis has proposed to postpone this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-postpone This RFC is in PFCP or FCP with a disposition to postpone it. labels Aug 29, 2019
@mikeyhew
Copy link

For the record, we stabilized Rc and Arc and Pin, but did not stabilize the mechanism to make user-defined smart pointer types usable as method receivers.

@comex
Copy link

comex commented Sep 3, 2019

That’s is a really unfortunate state to leave things in indefinitely. std is not suitable for all use cases and we should avoid giving it special privileges if at all possible.

@Centril
Copy link
Contributor

Centril commented Sep 3, 2019

That’s is a really unfortunate state to leave things in indefinitely.

This is not intended as the end state of things. We should get back to arbitrary_self_types eventually, but the language and compiler teams have limited bandwidths that I believe should be focused on finishing a smaller set of tasks better.

std is not suitable for all use cases and we should avoid giving it special privileges if at all possible.

I believe we agree overall.

@burdges
Copy link

burdges commented Sep 3, 2019

Are there interesting complexities in the mutable cases like Box, MutexGuard, and RwLockReadGuard? RefCell cannot happen because it does not implement Deref?

@rfcbot rfcbot added the final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. label Sep 19, 2019
@rfcbot
Copy link
Collaborator

rfcbot commented Sep 19, 2019

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot removed the proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. label Sep 19, 2019
@rfcbot
Copy link
Collaborator

rfcbot commented Sep 29, 2019

The final comment period, with a disposition to postpone, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC is now postponed.

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this RFC. postponed RFCs that have been postponed and may be revisited at a later time. and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. disposition-postpone This RFC is in PFCP or FCP with a disposition to postpone it. labels Sep 29, 2019
@rfcbot rfcbot closed this Sep 29, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-method-call Method call syntax related proposals & ideas A-typesystem Type system related proposals & ideas finished-final-comment-period The final comment period is finished for this RFC. postponed RFCs that have been postponed and may be revisited at a later time. T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.