-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
improve #[may_dangle]
for type parameters
#3417
base: master
Are you sure you want to change the base?
Changes from 5 commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,239 @@ | ||
- Feature Name: (`dropck_eyepatch_v3`) | ||
- Start Date: (fill me in with today's date, YYYY-MM-DD) | ||
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) | ||
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
Cleanup the rules for implicit drops by adding an argument for `#[may_dangle]` on type | ||
parameters: `#[may_dangle(droppable)]` and `#[may_dangle(must_not_use)]`. Change `PhantomData` | ||
to get completely ignored by dropck as its current behavior is confusing and inconsistent. | ||
|
||
# Motivation | ||
[motivation]: #motivation | ||
|
||
The current rules around dropck and `#[may_dangle]` are confusing and have even resulted in | ||
unsoundness [multiple](https://github.com/rust-lang/rust/issues/76367) | ||
[times](https://github.com/rust-lang/rust/issues/99408). Even without `#[may_dangle]`, | ||
dropping `PhantomData` is currently quite weird as you get "spooky-dropck-at-a-distance": | ||
|
||
```rust | ||
use std::marker::PhantomData; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Stylistic note: Large blocks of code are hard for me to deal with. Can you add comments or break this into paragraphs to walk me through why each piece matters? |
||
struct PrintOnDrop<'a>(&'a str); // requires `'a` to be live on drop. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. e.g., Consider first a type |
||
impl Drop for PrintOnDrop<'_> { | ||
fn drop(&mut self) { | ||
println!("{}", self.0) | ||
} | ||
} | ||
|
||
fn assign<T>(_: T) -> PhantomData<T> { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. explain what this is for |
||
PhantomData | ||
} | ||
|
||
// The type of `_x` is `AdtNoDrop<'not_live>` which doesn't have drop glue, OK | ||
struct AdtNoDrop<'a>(PhantomData<PrintOnDrop<'a>>, u32); | ||
fn phantom_data_adt_no_drop() { | ||
let _x; | ||
{ | ||
let temp = String::from("temporary"); | ||
_x = AdtNoDrop(assign(PrintOnDrop(&temp)), 0); | ||
} | ||
} | ||
|
||
// The type of `_x` is `AdtNoDrop<'not_live>` which has drop glue, ERROR | ||
struct AdtNeedsDrop<'a>(PhantomData<PrintOnDrop<'a>>, String); | ||
fn phantom_data_adt_needs_drop() { | ||
let _x; | ||
{ | ||
let temp = String::from("temporary"); | ||
_x = AdtNeedsDrop(assign(PrintOnDrop(&temp)), String::new()); | ||
} | ||
} | ||
``` | ||
[playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=9ce9d368d2f13df9ddcbfaf9580721e0) | ||
|
||
# Guide-level explanation | ||
[guide-level-explanation]: #guide-level-explanation | ||
|
||
When a value goes out of scope the compiler adds drop glue for that value, recursively dropping it and all its fields. | ||
Dropping a type containing a lifetime which is no longer live is accepted if that lifetime is never accessed: | ||
```rust | ||
struct MyType<'s> { | ||
reference: &'s str, | ||
needs_drop: String, | ||
} | ||
fn can_drop_dead_reference() { | ||
let _x; | ||
{ | ||
let temp = String::from("I am only temporary"); | ||
_x = MyType { | ||
reference: &temp, | ||
needs_drop: String::from("I have to get dropped"), | ||
}; | ||
} | ||
// We drop `_x` here even though `reference` is no longer live. | ||
// | ||
// This is fine as dropping a reference is a noop and does not | ||
// acess the pointee. | ||
} | ||
``` | ||
The above example will however fail if we add a manual `Drop` impl as the compiler conservatively | ||
assumes that all generic parameters of the `Drop` impl are used: | ||
[playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=e604bcaecb7b2b4cf7fd0440faf165ac). | ||
|
||
In case a manual `Drop` impl does not access a generic parameter, you can add | ||
`#[may_dangle]` to that parameter. This **unsafely** asserts | ||
that the parameter is either completely unused when dropping your type or only | ||
recursively dropped. For type parameters, you have to declare whether you | ||
recursively drop instances of `T`. If so, you should use `#[may_dangle(droppable)]`. | ||
If not, you may use `#[may_dangle(must_not_use)]`. | ||
|
||
```rust | ||
struct MyType<T> { | ||
generic: T, | ||
needs_drop: String, | ||
} | ||
// The impl has to be `unsafe` as the compiler may not check | ||
// that `T` is actually unused. | ||
unsafe impl<#[may_dangle(droppable)] T> Drop for MyType<T> { | ||
fn drop(&mut self) { | ||
// println!("{}", self.generic); // this would be unsound | ||
println!("{}", self.needs_drop); | ||
} | ||
} | ||
fn can_drop_dead_reference() { | ||
let _x; | ||
{ | ||
let temp = String::from("I am only temporary"); | ||
_x = MyType { | ||
generic: &temp, | ||
needs_drop: String::from("I have to get dropped"), | ||
}; | ||
} | ||
// We drop `_x` here even though `reference` is no longer live. | ||
// | ||
// This is accepted as `T` is marked as `#[may_dangle(droppable)]` in the | ||
// `Drop` impl of `MyType`. | ||
} | ||
``` | ||
|
||
`Drop` impls for collections tend to require `#[may_dangle(droppable)]`: | ||
|
||
```rust | ||
pub struct BTreeMap<K, V> { | ||
root: Option<Root<K, V>>, | ||
length: usize, | ||
} | ||
|
||
unsafe impl<#[may_dangle(droppable)] K, <#[may_dangle(droppable)] V> Drop for BTreeMap<K, V> { | ||
fn drop(&mut self) { | ||
// Recursively drops the key-value pairs but doesn't otherwise | ||
// inspect them, so we can use `#[only_dropped]` here. | ||
drop(unsafe {ptr::read(self) }.into_iter()) | ||
} | ||
} | ||
``` | ||
|
||
A type where `#[may_dangle(must_not_use)]` would be useful is a `Weak` pointer for a variant of `Rc` | ||
where the value is dropped when the last `Rc` goes out of scope. Dropping a `Weak` pointer | ||
would never access `T` in this case. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is helpful and avoids the main problem I saw in the previous RFC -- it'd be great to expand a bit on where the two modes should be used in the stdlib to help reader understand the role of them. Also, what is an example of code that compiles because of this but wouldn't otherwise? |
||
|
||
|
||
# Reference-level explanation | ||
[reference-level-explanation]: #reference-level-explanation | ||
|
||
Whenever we use a value of a given type this type has to be **well-formed**, requiring | ||
that all lifetimes in this type are live. An exception to this is the implicit drop when | ||
a variable goes out of scope. While borrowck ensures that dropping the variable is safe, | ||
this does not necessarily require all lifetimes to be live. | ||
|
||
When implicitly dropping a variable of type `T`, liveness requirements are computed as follows: | ||
- If `T` does not have any drop glue, do not add any requirements. | ||
- If `T` is a trait object, `T` has to be live. | ||
- If `T` has an explicit `Drop` impl, require all generic argument to be live, unless | ||
- they are marked with `#[may_dangle]`: | ||
- arguments for lifetime parameters marked `#[may_dangle]` and type parameters | ||
marked `#[may_dangle(must_not_use)]` are ignored, | ||
- we recurse into arguments for type parameters marked `#[may_dangle(droppable)]`. | ||
- Regardless of whether `T` implements `Drop`, recurse into all types *owned* by `T`: | ||
- references, raw pointers, function pointers, function items and scalars do not own | ||
anything. They can be trivially dropped. | ||
- tuples and arrays consider their element types to be owned. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. for arrays this is currently also a bit inconsistent, see rust-lang/rust#110288 and https://rust-lang.zulipchat.com/#narrow/stream/326866-t-types.2Fnominated/topic/.23110288.3A.20.60.5BT.3B.200.5D.60.20adding.20outlives.20requirements.20to.20dropck for details about this. Shouldn't matter for this RFC apart from deciding whether to add " |
||
- all fields (of all variants) of ADTs are considered owned. We consider all variants | ||
for enums. The only exception here is `ManuallyDrop<U>` which is not considered to own `U`. | ||
`PhantomData<U>` does not have any fields and therefore also does not consider | ||
`U` to be owned. | ||
- closures and generators own their captured upvars. | ||
|
||
Checking drop impls may error for generic parameters which are known to be incorrectly marked: | ||
- `#[may_dangle(must_not_use)]` parameters which are recursively owned | ||
- `#[may_dangle(droppable)]` parameters which are required to be live by a recursively owned type | ||
|
||
This cannot catch all misuses, as the parameters can be incorrectly used by the `Drop` impl itself. | ||
We therefore require the impl to be marked as `unsafe`. | ||
|
||
## How this differs from the status quo | ||
|
||
Right now there is only the `#[may_dangle]` attribute which skips the generic parameter. | ||
This is equivalent to the behavior of `#[may_dangle(must_not_use)]` and relies on the recursion | ||
into types owned by `T` to figure out the correct constraints. This is now explicitly annotated | ||
using `#[may_dangle(droppable)]`. | ||
|
||
`PhantomData<U>` currently considers `U` to be owned while not having drop glue itself. This means | ||
that `(PhantomData<PrintOnDrop<'s>>, String)` requires `'s` to be live while | ||
`(PhantomData<PrintOnDrop<'s>>, u32)` does not. This is required for get the | ||
behavior of `#[may_dangle(droppable)]` for parameters otherwise not owned by adding `PhantomData` | ||
as a field. One can easily forget this, which caused the | ||
[unsound](https://github.com/rust-lang/rust/issues/76367) | ||
[issues](https://github.com/rust-lang/rust/issues/99408) mentioned above. | ||
|
||
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
This adds a small amount of implementation complexity to the compiler while not not being | ||
fully checked and therefore requiring `unsafe`. | ||
|
||
This RFC does not explicitly exclude stabilizing these two attributes, as they are clearer and far less | ||
dangerous to use when compared with `#[may_dangle]`. Stabilizing these attributes will make it harder to | ||
stabilize a more general solution like type state. | ||
|
||
# Rationale and alternatives | ||
[rationale-and-alternatives]: #rationale-and-alternatives | ||
|
||
The status quo of `#[may_dangle]` and "spooky-dropck-at-a-distance" is far from ideal and has already | ||
resulted in unsound issues. Documenting the current behavior makes is more difficult to change later | ||
while not officially documenting it is bound to lead to more issues and confusion going forward. | ||
It is therefore quite important to improve the status quo. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't buy this as an argument against documenting the status quo, given that this is all unstable. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The issue is that it is not, or well We have the stable behavior that use std::marker::PhantomData;
struct MyType<T, U> {
// _marker: PhantomData<T>, // errors
_marker: PhantomData<*const T>, // doesn't error
other: U,
}
impl<T, U> MyType<T, U> {
fn mk(_: T, other: U) -> Self {
MyType {
_marker: PhantomData,
other,
}
}
}
struct PrintOnDrop<'s>(&'s str);
impl<'s> Drop for PrintOnDrop<'s> {
fn drop(&mut self) {
println!("dropping `{}`", self.0)
}
}
fn main() {
let x;
{
let temp = String::from("temporary");
x = MyType::mk(PrintOnDrop(temp.as_str()), String::from("needs_drop"));
}
} While it should be fine™ to change this stable behavior even after documenting it as nobody should rely on it, it does make me uncomfortable. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah I see. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
do you mean "we should keep the current behavior because it's a breaking change not to"? Or more "whether the current behavior is documented shouldn't change our decision on whether we should change it"?
wrt to this I pretty clearly feel like this change is a clarification of underspecified language semantics . I don't think we ever intended to have "does not need
I personally disagree with this perspective if that's what you intended to say. By documenting something publicly we declare that the current behavior is intended as is and something users may rely on. I do believe that the current state here is a bug. Dropping fields separately should be equivalent to dropping them as a pair. Once this lands I intend to add a check that There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
The latter. I am totally fine with changing the behavior. But I don't like "we failed to document it" as justification. Anyway this is probably more of a discussion for rust-lang/rust#103413? |
||
|
||
A more general extension to deal with partially invalid types is far from trivial. We currently | ||
assume types to always be well-formed and any approach which generalizes `#[may_dangle]` will | ||
have major consequences for how well-formedness is handled. This impacts many - often implicit - | ||
interactions and assumptions. It is highly unlikely that we will have the capacity for any such change | ||
in the near future. The benefits from such are change are also likely to be fairly limited while | ||
adding significant complexity. | ||
|
||
# Prior art | ||
[prior-art]: #prior-art | ||
|
||
`#[may_dangle]` is already a refinement of the previous `#[unsafe_destructor_blind_to_params]` attribute | ||
([RFC 1327](https://github.com/rust-lang/rfcs/pull/1327)). | ||
|
||
There is also [RFC 3390](https://github.com/rust-lang/rfcs/pull/3390) which attempts to define a more | ||
general extension to replace `#[may_dangle]`. As mentioned in the rationale, such an approach is not | ||
feasable right now. | ||
|
||
|
||
# Unresolved questions | ||
[unresolved-questions]: #unresolved-questions | ||
|
||
Should these attributes remain purely unstable for use in the standard library or do we want | ||
to provide them to stable users? | ||
|
||
|
||
# Future possibilities | ||
[future-possibilities]: #future-possibilities | ||
|
||
Extending or generalizing the dropck eyepatch... something something type state. | ||
|
||
[`IndexVec`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_index/vec/struct.IndexVec.html) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re-phrasing the key point of this RFC at a slightly higher level, the argument seems to be: inferring the exact effect of
may_dangle
through field ownership is subtle and easy to get wrong, we should instead explicitly annotate what we want.I generally agree with this direction. However, this is a departure from what seems to me as a pretty fundamental design decision in Rust: to infer all sorts of properties of a type based on "which other types it owns". We do this for dropck, but also for auto traits and variance. This decisions leads to issues such as rust-lang/rust#99408 where changing the type of a field accidentally also changes this inferred property (that issue was
T
toManuallyDrop<T>
affecting dropck; I think we also had&mut T
to*mut T
affecting auto traits -- lucky enough erring on the safe side -- and it's easy to imagine similar situations with variance).I am not sure how I feel about attacking this fundamental problem in a piecemeal fashion. At the very least, this RFC should state somewhere that the question of inferring may_dangle dropck from field ownership is closely related to inferring auto traits and variance. And long-term, IMO it would be very odd to have this explicit annotation for may_dangle without also having something similar for auto traits and variance. Admittedly that's a lot harder, (a) since those are stable and (b) since we only want this more explicit style when unsafe code is involved -- which for may_dangle is 100% of the cases, but not so for auto traits and variance.
Put differently: if PhantomData is not fit for the purpose of may_dangle, why is it ever fit for any purpose? Given the arguments in this RFC, it seems wrong to keep that type around (well it will stay for backwards compatibility obviously). IMO we should only accept the RFC if we generally agree that PhantomData should be deprecated and replaced by more explicit annotations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we feel like there's no room for error with unsafe. inference isn't inherently wrong, but it should fail-safe if you forget something. that appears to hold for
Send
andSync
but not so much formay_dangle
.an alternative would be to validate the
Drop
impl based on the inferred bounds/outlives requirements. but that seems more awkward to use. and it doesn't solve "spooky-dropck-at-a-distance" (which is not unstable).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so. If you have a
*mut T
field and later decide to make it&mut
, you might be accidentally addingSend
/Sync
to your type without realizing. Similar examples exist for variance.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's not an example, that's an argument.
we can't think of an example where changing
*mut T
to&mut T
causes it to become unsound due toSend
/Sync
, but plenty of examples where it becomes unsound due to aliasing. we would love to see such an example tho.on the other hand variance is a footgun, from experience. at the same time, variance is fairly unintuitive, extremely verbose, and it shows up everywhere, so there's a huge added value from having it inferred by default. this is in contrast to
#[may_dangle]
which (on stable) only shows up on std collections and pointer wrappers.further,
#[may_dangle]
opts-in to inference (impl Drop
is the opt-out), whereas there's no way to opt-out of variance.but the main issue is still the mismatch between
needs_drop
andoutlives
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is such an example: rust-lang/rust#41622
Going to think more about that and will try to come back with an answer later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It wasn't a regression, but it was a case of accidental
Sync
inference:MutexGuard
used&mut
which propagatesSync
; if it had used*mut
(which arguably would be more correct as well since the lifetime of the reference is not accurate) the bug would have been avoided.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, so maybe we should have more opt-outs, but having an opt-out mixed with a partial opt-in seems like a massive footgun, especially if it's not actually checked for correctness.
But also if custom auto traits are ever to be a thing, how do they fit into all this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, here are my thoughts after thinking about this during lunch:
PhantomData<T>
is not appropriate for dropck is because it always implementsCopy
.I think it could be appropriate if it were to only implement
Copy
ifT
isCopy
. In this case for anyT
with drop glue we could require noop drop glue forPhantomData
with the outlives requirements ofT
. However, such aCopy
impl would be limiting other uses ofPhantomData
as it's quite useful to freely copy it around.Because it implements
Copy
we're currently stuck with a type which has liveness requirements when involved in a drop, even though it doesn't get dropped itself. I personally label this behavior to clearly be a bug, hence the effort to change it via this RFC (and rust-lang/rust#110288 for[T; 0]
).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you're basically saying, PhantomData for dropck is 'even weirder' than for the other purposes. I can't disagree with that, for the specific
Copy
case you mention. However that's not the argument the RFC is making, so the RFC should be amended. It also still seems worth mentioning the relationship with auto traits and variance, even if the issue is more pressing for dropck.[T; 0]
isn't copy so I don't follow that argument. I'd rather special-case array length 0 in fewer cases than more. But anyway that's not part of this RFC.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated the motivation and mentioned auto traits + variance in the future possibilities section.