RFC: Uninitialized Pointers #98
I can see the use for this sort of thing; IMO a similar "&overwrite" pointer would be interesting to further clarify 'input'/'output' information for a function. By default function arguments are inputs; then you annotate "mut" for something changeable, which is both an input and an output. An "overwrite" pointer is purely an output, which completes the range of cases. Some CPU architectures have the potential for optimisation with 'cache-line allocate-zero' instructions: if you know ahead of time that some memory will be completely overwritten, you can avoid reading it into the CPU before it's modified, reducing the memory bandwidth requirement. Even without that, I think it's nice to be able to communicate more about what a function does in its signature (my idea would be better named "&overwrite", since you're saying what you will do with it, not what its existing state is - as you indicate, you would be able to create an "&uninit" from an existing "&mut", even though that may be initialised - but a very common use would be from data that is uninitialised,
IMO this doesn't seem worth the extra complexity of yet another pointer type. Everyone will have to learn what this does but it will be used very rarely. Rust is already getting the reputation (fairly or unfairly, doesn't really matter) of "that language with too many different pointer types," let's not make that worse unless strictly necessary. |
I wouldn't worry about that - there's a difference between necessary and accidental complexity: without a garbage collector to lean on, it's inevitable that this complexity is shifted to the user. If you've reasoned fully about all the possibilities for a pointer, then you already know this case; we'd just be assigning a name to it, which gives the compiler the potential to catch some more errors and gives us more potential to communicate through types. I suppose it's possible that RVO catches some of the need for an overwrite type? But for those of us who've come from C, the ability to state something explicitly can be comforting. It would be a rarely used case, so I wouldn't mind it having a long, verbose name (not clashing with existing vars); you won't be using it often. But if, as the OP claims, it can reduce the number of unsafe blocks, then that makes the case more compelling, beyond simple 'communication' as I thought it would be.
I'm not saying this doesn't address a use-case; it appears that it does. But that same use-case is already being addressed with BTW this issue doesn't seem like a 1.0 blocker, so could be revisited in the future if more experience with Rust shows that this is necessary. I'd just rather not go through the 1.0 gate and then teach people Rust with "and here are the 11 different pointer types; ignore 8 of them for now." |
Fair enough; I would certainly agree there are way more important feature requests, and it could come later.
You could say it's an annotation of an existing type - and look what's happened with @. The truth is, pointers are just complex :) I think it is valid to cover more and just sort by importance when you teach. And the language should sort by frequency when naming/selecting defaults. The way I see it, a reference has 3 potential 'bits' of information: aliasable, writable, readable. So the default is 'readable, aliasable'; then &mut (opposite to C's const) imposes 'writable, non-aliasable'; the whole business of 'Cell' could be addressed by adding an '&alias' (opposite to C's restrict)? And yet another keyword to disable 'readable' goes beyond C in expressiveness and safety for low-level code: what's returned by 'malloc' is not safely readable, but the type system doesn't tell you.
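For anyone reading this thread later, here is a rough sketch of how those three 'bits' map onto types in today's standard library. It is only an approximation: `Cell` stands in for the suggested '&alias', `&mut MaybeUninit<T>` stands in for a write-only pointer, and the function names are made up for illustration. None of this is the syntax proposed in the RFC.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;

// Readable + aliasable: the default shared reference.
fn sum(a: &i32, b: &i32) -> i32 {
    *a + *b
}

// Readable + writable, not aliasable: `&mut`.
fn bump(x: &mut i32) {
    *x += 1;
}

// Readable + writable + aliasable (single-threaded only): `&Cell<T>`.
fn bump_shared(a: &Cell<i32>, b: &Cell<i32>) {
    a.set(a.get() + 1);
    b.set(b.get() + 1);
}

// Writable only: the callee may not assume the slot holds a value yet.
fn init(out: &mut MaybeUninit<i32>) -> &mut i32 {
    out.write(7)
}

fn main() {
    let mut n = 1;
    println!("sum = {}", sum(&n, &n)); // both arguments alias `n`
    bump(&mut n);

    let c = Cell::new(n);
    bump_shared(&c, &c); // both arguments alias `c`, and both may write
    println!("cell = {}", c.get());

    let mut slot = MaybeUninit::uninit();
    println!("init = {}", init(&mut slot));
}
```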
Would more demand for this appear with the potential box() arguments ? (implementing emplace_back and so on..) - would that be a situation where you'll want to pass an 'overwrite' or 'uninit' pointer? |
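As a point of comparison, this is roughly what an emplace-style append looks like with the current standard library, which hands out the unused tail of a `Vec` as `MaybeUninit` slots. The helper name `push_in_place` is hypothetical, and the sketch still needs one `unsafe` call to tell the vector that the slot is now initialised, which is the kind of step a checked 'uninit' pointer would aim to remove.

```rust
/// Hypothetical helper: appends `value` by writing directly into the
/// vector's spare capacity instead of going through `push`.
fn push_in_place(v: &mut Vec<String>, value: String) {
    // Make sure at least one unused slot exists past `len`.
    v.reserve(1);
    // `spare_capacity_mut` exposes the unused tail as `&mut [MaybeUninit<String>]`.
    v.spare_capacity_mut()[0].write(value);
    // SAFETY: the element at index `len` was initialised on the line above,
    // and `reserve(1)` guarantees that `len + 1 <= capacity`.
    unsafe { v.set_len(v.len() + 1) };
}

fn main() {
    let mut v = vec![String::from("a")];
    push_in_place(&mut v, String::from("b"));
    println!("{:?}", v);
}
```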
Mostly a nice feature although it's not a necessity. However, I don't like implicit type states. Destructuring an let &uninit(ref uninit a, ref uninit b) = ptr; |
I would call this |
I was deliberately trying to avoid this connotation, as I really don't like output pointers. However, as @glaebhoerl's placement new formulation shows, this does have a use case, and so I may weaken on this point.
While you do create
This is, unfortunately, a valid point. Although I was more worried about the drop flag, I was hesitant to post this RFC because it has very clear downsides.
As mentioned before, this is why I was explicitly avoiding any examples of using this pointer as a place to output values.
This is definitely not a 1.0 blocker. It is completely backwards compatible (unless we get rid of the drop flag) and could be added at any time.
This type of pointer could subsume the placement new optimization for pointers and add it to a number of other data structures. However, because of this,
To some extent, I agree, and I mentioned this in the drawbacks section. However, this can be fixed with explicit movement functions, as discussed in the issue @glaebhoerl mentioned. I'll add this to the alternatives section.
This sounds like the logical approach, though it might confuse people as to why they can't only partially destructure the value.
As mentioned above, I wanted to avoid the connection with output parameters because that would be two different ways to achieve the same thing. However, your formulation of placement new requires it, so I might change my mind. Regardless, I'll add the possibility to my RFC.
Very interesting - I hadn't seen that discussion before. The point about generic functions is quite worrying, and I'll make sure to add it in. I also like your formulation of placement new - I hadn't figured out a way to return something and a borrowed version of it simultaneously. I think that this cannot be used for tying the knot, but I do think it can do arbitrary permutations, as you mentioned. |
drop(*ptr);

// This drops the whole vector, but the pointer to 2 is not freed twice because it was zeroed when
// ptr was encountered.
References don't get freed. If by "pointer to 2" you mean the box in the vector, we can't rely on zeroing to work, because we'd like to get rid of it and move to precisely tracked destructors instead.
- Yes, I do mean the Box in the vector.
- I mentioned the reliance on a drop flag in the drawbacks section, and this is the most compelling drawback that I see.
Some thoughts, partly borrowed from the earlier thread: There is a use case for this feature, right in Rust's target audience in fact. In low level / embedded code it's a common technique for functions not to allocate memory for their own results, but to return the result into a pointer provided by the caller, which could be to memory allocated by the caller if necessary, but also to the caller's stack, into an existing structure, etc. This feature would allow the pattern to be expressed directly and proved safe by the compiler, which is not currently possible. The current alternative is I agree with @pczarn about destructuring (except of course I would call it
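To make the caller-provided-storage pattern concrete, here is a minimal sketch using `std::mem::MaybeUninit` from today's standard library as a stand-in for the proposed pointer. `Reading`, `Frame` and `read_sensor` are hypothetical names invented for this example, and the "sensor" value is just computed from the channel number.

```rust
use std::mem::MaybeUninit;

#[derive(Debug)]
struct Reading {
    channel: u8,
    value: u16,
}

/// Hypothetical driver routine: it allocates nothing and writes its result
/// into storage chosen by the caller.
fn read_sensor(channel: u8, out: &mut MaybeUninit<Reading>) -> &mut Reading {
    // `write` initialises the slot without reading or dropping old contents.
    out.write(Reading { channel, value: 100 + channel as u16 })
}

/// An existing structure with a slot that starts out uninitialised.
struct Frame {
    latest: MaybeUninit<Reading>,
}

fn main() {
    // Result placed on the caller's stack.
    let mut slot = MaybeUninit::uninit();
    let on_stack = read_sensor(2, &mut slot);
    println!("{:?}", on_stack);

    // Result placed into a field of an existing structure.
    let mut frame = Frame { latest: MaybeUninit::uninit() };
    let in_frame = read_sensor(5, &mut frame.latest);
    println!("{} {}", in_frame.channel, in_frame.value);
}
```

The returned `&mut Reading` is the only safe way to reach the value afterwards; reading `slot` directly would still need `unsafe`, because the type does not record that initialisation happened.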
I don't think we should, or need to, have variables change their types for this to work. Linear types already allow us to express everything that we need within the current system. If you want to turn an You write "[it] points to possibly uninitialized data", but in fact it must be uninitialized data. If it points to initialized data and you overwrite it through the pointer, its destructor should be run, but isn't. In the types-are-propositions sense an EDIT: Now I remember that you were relying on drop flags. In that case read this as how to avoid drop flags. Getting rid of drop flags, and tracking moves statically, is something I believe we're planning to do anyways. Every variable is always either definitely initialized or definitely uninitialized, and the compiler (the borrow checker) knows which. The issue with For the case I was originally concerned about, viz. what happens if you have an So here's the new plan:
LATE EDIT: For what it's worth, we could do without the static restriction in the second point above and just make the destructor for |
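The destructor concern above can be observed with today's `MaybeUninit`, used here only as a stand-in for a pointer that is assumed to be uninitialised. `Tracked` and `overwrite_counts` are hypothetical names; the sketch counts how many destructors run when a value is overwritten through `&mut` versus through `MaybeUninit::write`.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;

/// Increments a shared counter every time a value is dropped.
struct Tracked<'a>(&'a Cell<u32>);

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns (drops caused by assigning through `&mut`,
///          drops caused by `MaybeUninit::write`).
fn overwrite_counts() -> (u32, u32) {
    let drops = Cell::new(0);

    // Assigning through `&mut` runs the old value's destructor first.
    let mut a = Tracked(&drops);
    let r = &mut a;
    *r = Tracked(&drops);
    let after_assign = drops.get();

    // `write` treats the slot as uninitialised, so the old value is
    // overwritten without its destructor ever running.
    let mut slot = MaybeUninit::new(Tracked(&drops));
    slot.write(Tracked(&drops));
    let after_write = drops.get();

    (after_assign, after_write - after_assign)
}

fn main() {
    let (by_assign, by_write) = overwrite_counts();
    println!("assign dropped {by_assign}, write dropped {by_write}");
}
```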
Here's another really cool thing with Array slices. The problem I was thinking about was, how could Rust provide analogues to some of the classic list-based functions in Haskell's
We obviously can't make infinite arrays in the case of
(Or at least, the type signature. Not necessarily the implementation.) But if the size is determined at runtime:
What can we return? We could return a But we could do this:
Here
Edit: While thinking about the |
@glaebhoerl that's interesting. It seems like (FWIW, those two functions would normally be written as iterators, especially |
+1 This is a very interesting proposal, and I can see that it could be very useful, although I’m not so sure about the name - I was also wondering how this would interact with rust-lang/rust#12624, given that it (AIUI) allows moves out of There’s one thing I don’t understand, however: why does the state of the pointer have to be defined at all times? Surely it would be safe to simply assume that, when it’s indeterminate, it’s uninitialised? Example:

    let x = &mut 3;
    if condition {
        drop(*x);
    }
    // x is always assumed to be &uninit here
    *x = 3;

This behaviour is similar to that of regular moves: if it’s indeterminate, it defaults to being moved. Example (that works today):

    let x = box 3;
    if condition {
        drop(x);
    }
    // x is moved here, so referring to it is invalid
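The same rule can be sketched in current Rust syntax, with `Box::new` in place of `box`. After a conditional move the variable counts as moved, so reading it is rejected, but assigning a fresh value is still accepted, which mirrors the `*x = 3` step in the first example. The function name `reinit` is made up for illustration.

```rust
fn reinit(condition: bool) -> Box<i32> {
    let mut x = Box::new(3);
    if condition {
        drop(x); // `x` is moved on this path only
    }
    // `x` is "maybe moved" here: reading it would be a compile error,
    // but assigning a new value makes it definitely initialised again.
    x = Box::new(4);
    x
}

fn main() {
    println!("{} {}", reinit(true), reinit(false));
}
```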
As I pointed out in a line comment, any type with a destructor is linear :). But
Yeah I guess that makes sense. (But you still couldn't initialize an uninitialized array of dynamic size with them in safe code without |
This is definitely a nice feature to have, especially for writing small cases of a simple For now, however, this is a backwards compatible change, so we're going to close this as postponed. We would like to revisit this, however, as this would definitely make fighting with the borrow checker easier in some cases. As always, thank you for the RFC! We're all quite interested in seeing the various alternatives for having a system such as this!