-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC - Add an owning "borrowed" pointer type &move
#1617
Conversation
I like it. The only thing I really disagree with is having to use The other thing is in your unresolved questions, you ask about |
trait DerefMove: DerefMut | ||
{ | ||
/// Return an owned pointer to inner data | ||
fn deref_move(&mut self) -> &move Self::Target; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could very well be wrong, but shouldn’t this take &move self
? Wouldn’t this otherwise allow things like the following:
let x = Box::new(...);
{
drop(*x.deref_move());
}
// use x (even though its contents have been destroyed!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I suppose making it unsafe is the cop-out option (which semi-fits, because you'd need unsafe code to implement it).
It can't really be &own, because that "moves" the container into the deref_move
method, leading to it being dropped before the pointer you return.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the crux of the problem. We need typestate and an associated "empty container" type to do this correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I have an idea where &move Uninit<T>
== &out T
. I am going to fork the RFC to write this. edit nope typestate + move + drop by value + ... would take forever to write.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A teaser:
"type state" in my parlance would allow for something like:
fn foo(arg: {'lifetime, BeforeType, DuringType, AfterType}, ...) -> ...;
e.g.
fn take_ref<'a, T>(self: {'a, T, MutBorrowed<T>, T}) -> &'a mut T;
fn take_mut_ref<'a, T>(self: {'a, T, Aliased<T>, T}) -> &'a T;
Now suppose there is some Uninit<T>
, which has no destructor and can only be written too (in which case it becomes a T
).
fn move_in<'unused, T>(self: {'unused, Uninit<T>, Whatever, T}, T) -> ();
fn move_out<'unused, T>(self: {'unused, T, Whatever, Uninit<T>}) -> T;
Now suppose that T::Toggle = Uninit<T>
, and Uninit<T>::Toggle = T
, so we have a way to go there and back:
fn take_move_ref<'a, T>(self: {'a, T, MutBorrowed<T>, T::Toggle}) -> &'a move T;
// Type App
fn take_move_ref::<T::Toggle>: fn (self: {'a, T::Toggle, MutBorrowed<T::Toggle>, T::Toggle::Toggle}) -> &'a move T::Toggle;
// Eliminate ::Toggle::Toggle
fn take_move_ref::<T::Toggle>: fn (self: {'a, T::Toggle, MutBorrowed<T::Toggle>, T}) -> &'a move T::Toggle;
type OutPtr<'a, T> = & 'a move T::Toggle;
And low and behold we get out-pointers for free! (spelling credit: @eddyb)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For re-borrowing:
fn reborrow_mut<'a: 'b, T>(self: {'b, &'a mut T, &'a mut MutBorrowed<T>, &'a mut T}) -> &'b mut T;
fn reborrow_move<'a: 'b, T>(self: {'b, &'a move T, &'a move MutBorrowed<T>, &'a move T::Toggle}) -> &'b move T;
and for the original motivation:
struct Box<T>(&'static move T);
type DeadBox<T>(&'static move Unwrap<T>);
fn straight_outta_box<'a, T>(self: {'a , Box<T>, Box<MutBorrowed<T>>, DeadBox<T>}) -> &'a move T {
reborrow_move(self.0)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And finally, dropping, which is very easy:
impl<T> Drop for Box<T>
{
fn drop_interior(arg: &move Interior<Box<T>>) {
let x: &move T = move_out(arg).0;
// arg: OutPtr<Box<T>> // no-op to drop
drop(reborrow_move(x));
// x: OutPtr<T> // we can still use it (no-op to drop, too)
let ptr: *u8 = x as _;
unsafe {
heap::deallocate(ptr, mem::size_of::<T>(), mem::align_of::<T>());
}
}
}
Dropping Uninit<T>
is a no-op, so we don't need to worry about dropping DeadBox
es at all!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@glaebhoerl pinging you because of our shared interest in these things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, in hindsight the &move Unint<T>
is an outpointer thing doesn't hold up, because out-pointers are linear / can only be consumed through writes, but &move Unint<T>
needs to be droppable after a move-out, like in my Box
example.
Otherwise I do think this stuff works, though granted its a major addition to the language.
I might just be too new to the language to see what this solves. I'd love to see a simple but meaningful example of what I can express that I couldn't express before. The examples in the RFC aren't explaining why I want it in a way that I can get it. I'm not familiar with the difficulty that arises with Partially this is curiosity. But other people will ask this question if the RFC is approved, so I feel justified in asking it now. |
How do you avoid the unsafety of |
@camlorn With this, you can now owned iterate over an array, for example. |
@ubsan What do you mean by "owned iterate over an array", and how does it differ from |
@bstrie |
How exactly would this RFC allow for iterating over an owned array?
For most usecases, it would be better if we allowed passing DSTs by-value (which under the hood would generate the same code as |
@dgrunwald |
@thepowersgang Make sure to add a section about *move, and how it will change Unique. |
# Summary | ||
[summary]: #summary | ||
|
||
Introduce a new pointer type `&move` that logically owns the pointed-to data, but does not control the backing memory. Also introduces the DerefMove trait to provide access to such a pointer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does 'logically owns' mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That the pointer owns the pointed-to data, and is responsible for calling the destructor on it (and can be moved out of)
@camlorn @bstrie |
- Add a new unary operation `&move <value>`. With a special case such that `&move |x| x` parses as a move closure, requiring parentheses to parse as an owned pointer to a closure. | ||
- This precedence can be implemented by ignoring the `move` when parsing unary operators if it is followed by a `|` or `||` token. | ||
- Add a new operator trait `DerefMove` to allow smart pointer types to return owned pointers to their contained data. | ||
- A type that implements `DerefMove` cannot implement `Drop` (as `DerefMove` provides equivalent functinality) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Arc
and Rc
implement Drop
in order to decrement the refcount and possibly deallocate the contained object. Presumably they will also want to implement DerefMove
. How will this work? Will drop(arc);
cause DerefMove::deallocate(&mut arc);
to be called?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DerefMove
makes no sense for shared ownership IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does DerefMove
provide equivalent functionality to Drop
?
Imagine that Box<T>
implemented DerefMove
. Then it couldn't implement Drop
.
fn main() {
let x = Box::new(0);
// I never moved out of the box, so it gets leaked here?
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It provides the same functionality as Drop. See the example in the RFC.... that said, I'm not sure how dropping of non-moved members would be handled.
I'm not sure if I buy this as a solution for the |
@rphmeier Sure |
@eddyb I don't fundamentally disagree with the concept of a The way we move out of Box today consumes the box and returns the value. To me, that implies a trait like this: trait DerefMove: DerefMut {
fn deref_move(self) -> Self::Target;
} with an implementation for impl<T: Sized> DerefMove for Box<T> {
fn deref_move(self) -> T {
let val = ptr::read(&self.0);
let (size, align) = (mem::size_of::<T>(), mem::align_of::<T>());
if size > 0 {
alloc::deallocate(self.0, size, align);
}
mem::forget(self) // skip destructor
val
}
} The compiler can special-case this trait like it does Although the signature of DerefMove inherently doesn't allow for moving unsized types, We can pass a pointer to These two pieces of information together can enable calling by-value self methods on trait objects by a) marking the unsized field on the pointer type as already dropped (so we don't drop it again) This will cause some interactions with destructors for pointer types which interact with the data they wrap. This is why today we don't allow moving out of types which implement At the same time, I do think that this could be made to work under the hood without any |
@rphmeier The problem with returning by value there is determining when the underlying allocation is released (and allowing a library type to release that memory) |
@rphmeier So you'd be replacing the need for |
Oh, by the way; in my opinion, moving out should not deallocate immediately, but at end of scope. This would allow for people to write {
let v = vec![1, 2, 3]
let x = &move v[..];
x[1] // still allowed, v is not deallocated yet
}
{
let v = vec![1, 2, 3];
let x = &move v[..] as *move _;
unsafe { (*x)[2] } // still allowed, v is not deallocated yet
} This means there's no confusing rules about lets or anything, and you can guarantee that the thing will get dropped at end of scope. This is useful for something like: struct Matrix {
indices: *move [usize],
buffer: *move [u8],
}
impl DerefMove for Matrix {
fn deref_move(&mut self) -> &move [u8] {
&move *self.buffer
}
fn deallocate(&mut self) {
// I know this is silly, but hopefully it gets the point across
Box::from_raw(indices);
Box::from_raw(buffer);
}
} |
@ubsan I agree, the deallocation should be where drops would occur atm, which also makes the borrows simpler to work with (they just have to stop being used at some point before the drop/deallocation). |
@camlorn You could argue that you could have a My experiments in API design led me to this contraption which is pretty close to maximally flexible for my usecases, but not as advanced as @Ericson2314's ideas. I'm really not sure what we should do here. Hence why this is a RFC. |
@camlorn To get some intuition, you can think of it like this. (Warning: intuition. Not deeply technical and precise explanation.) The borrow checker is fundamentally concerned with four things: (a) can I read from this? (b) can I write to this? (c) can I move out of this -- deinitialize it? (d) do I need to initialize this -- move into it? When working with function-local variables (and |
@eddyb @glaebhoerl I've been following the RFCs because it's fun for like 6 months now and this is the first one where I outright couldn't understand it from the RFC text. I only had slight problems with specialization. Consequently I thought it was probably worth being insistent on the this needs explanation part of the discussion. |
@camlorn I do admit that some of us have internalized "we need an owning reference to a borrowed place" for years and the first need I can think of is "being able to move The second that comes to mind, and which probably affects more code in the wild, is having a common interface over by-value iteration of Now if we had generics over constant integers, you could argue that you don't need of this ownership manipulation, you can just make one by-value iterator for each of those 3 categories of types ( There are several options in this area of development of Rust's semantics (including my |
@camlorn #965 has some more examples which may help too. |
#965 does help. Everyone else having my problem should go there too. Thanks. I'm starting to see the why of it, but I think I need to see it in practice before I fully grasp the motivation. One interesting point of the linked issue is that this helps avoid unnecessary copies. Is this still a possible use? If so, I want to go elsewhere and start discussing compiler optimizations with a possible eye to working on this one. |
@camlorn thanks again for bringing this issue up. I've agreed 100% but hadn't had the time to leave a comment saying so. The Ember RFC process has a "how do we teach this" section, which is specifically designed to make sure that this kind of thing is considered, I really like it. |
@camlorn yes indeed, the extra copying / allocation is still a problem. Also, for one more very convoluted example, in https://github.com/RustOS-Fork-Holding-Ground/libfringe/blob/params/src/context.rs#L27 and https://github.com/RustOS-Fork-Holding-Ground/libfringe/blob/params/src/context.rs#L57 I have a thread context (only for suspended threads) "owning" (with With Also, the init closure, a DST, is written to the allocated stack not boxed separately, so the |
@Ericson2314 pls don't use transmute to turn &mut into &move: |
@ubsan Well yeah I do a variation on yours for |
Ah much simpler: https://github.com/RustOS-Fork-Holding-Ground/libfringe/blob/params/src/fat_args.rs#L34-L36 After passing a pointer to args for the incoming thread, the incoming thread is responsible for moving them out not the outgoing thread. If I had |
It certainly is. To fix a stack overflow in one unit test which involves moving large FnOnce closures I had to use pointers/memcpy/forget because LLVM was unable to elide the copies between (inlined!) function calls. |
@Ericson2314 Thanks :), I'm not sure why I even said anything. |
@mahkoh and anyone else who can answer: Going briefly off topic in the interest of saving time, is there any theoretical reason the compiler can't optimize this before LLVM and where should I take the discussion (besides setting up to build Rust on my machine)? I have some vague, ten-thousand-foot view ideas as to how to do it. Sort of. And it could be interesting. |
@ubsan well I should of said "cast" rather than "transmute", no use being imprecise when I'm choosing the longer, more obscure, word :)! |
Obviously this request has a long history (it's also, to some extent, tied in with wanting to move DST by value). We discussed this some in the @rust-lang/lang meeting but before I write that comment I wanted to just give my personal position at the moment. (I've moved around from time to time.) First, as a generate note, I think that using The points raised in the previous paragraph suggest that using a type like And of course you can go further. If you're trying to work with data in the heap, maybe you don't need the lifetime, for example, you just need a way to free the pointer when you're done. This more or less brings you to eddyb's It would be good, I think, to start slower, by assembling the use cases we aim to address. How many of them are concerned with unsafe code aiming to make maximum efficiency? How many of them are things we expect users to use on a common basis? Where is compiler support needed? And so forth. I've definitely felt the need for features like this (e.g., I think Rayon could benefit in some places). But I also don't see this as the most pressing problem that Rust faces right now, personally. I'd like to accumulate some more examples and take our time to think on it. |
OK, now for the more official comment. :) So yes we talked about this in the @rust-lang/lang meeting a bit. The general consensus was that we'd be inclined to postpone this RFC under the (existing) issue, #998, to be revisited at some later point, when more "design bandwidth" is available. Therefore, I'm going to close this RFC. Thanks @thepowersgang! |
(Not sure if I should respond here or #998) @nikomatsakis It is a big extension right at the core of Rust, but I think that's precisely why I believe new syntax is warranted. Moreover, I see big overlap between this (including out pointers), fixing |
I still haven't taken the time to fully understand @eddyb's proposal, but I am suspicious because it looks like unsafe encapsulation tricks (safe interface, unsafe implementation) when features I mention above are about making both implementation and interface safe---so its apples and oranges basically. |
As to ergonomics/learnability my thoughts are: Yes were making the language more complex, but we're also making it more regular. Because nobody is proposing breaking backwards compat, "beginner's Rust" will stay the same. But I think the trade-off of simplicity for regularity will actually benefit programmers moving to the intermediate/advanced stage---like the point where you realize moving out of |
Actually it does. One is left with scratch pointer which, when written to, yields another |
In fact, with moving temporarily out of
All of these fill a "borrowing the location, owning its contents" role, just with the above restrictions on the initialization status at the lifetime boundaries. |
Kinda already said this, but I think it's important to take in account "transitive usage". The code that can be made safe with this is together used ubiquitously, even by beginners. |
Yeah, I don't know :) probably #998 is better, since I think the primary question here is not details of this RFC, but rather how to prioritize this change. To be honest, I'd personally prefer to discuss on internals.rust-lang.org, since I think GH doesn't manage sprawling comment threads very well (and throw a link onto #998).
Hmm. I suppose that's true. It's not how borrowing of other things works, but it could be fit in I imagine. Interesting! (It's still rather different from other borows, in that it leaves the value in a distinct state from how it began, but I agree you could make it fit better than I thought initially.)
I'm not sure that |
So, I've been thinking a lot about this RFC over the last few weeks, and I've decided that I was wrong to close it. I didn't intend to "shut down" conversation on the topic, but I think that this is how it came across. I do have lots of doubts about introducing Anyway, I decided to post this comment last night and at that time I had intended to re-open the RFC (or ask for suggestions about where to move the conversation to). But when I awoke this morning I noticed that in the meantime RFC 1646 has been opened, so perhaps I'll just comment there? (Thoughts?) |
This RFC was pretty badly specified, and to be honest the new one is far better :) (I was feeling a little unwell when I wrote this). @nikomatsakis Comments on the syntax should probably be raised on the new RFC pull (maybe as a summary of IRC/forum discussion?) |
Rendered
Big remaining questions:
&move
is stored somewhere? Ideally when it's not stored, the memory is released as soon as possible (i.e. at the end of the statement), but when stored that would have to be deferred until the end of the original scope.