Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PhantomData: fix documentation wrt interaction with dropck #103413

Merged
merged 3 commits into from
May 13, 2023

Conversation

RalfJung
Copy link
Member

@RalfJung RalfJung commented Oct 22, 2022

As far as I could find out, the PhantomData-dropck interaction only affects code using may_dangle. The documentation in the standard library has not been updated for 8 years and thus stems from a time when Rust still used "parametric dropck", before RFC 1238. Back then what the docs said was correct, but with may_dangle dropck it stopped being entirely accurate and these days, with NLL, it is actively misleading.

Fixes #102810
Fixes #70841
Cc @nikomatsakis I hope what I am saying here is right.^^

@rustbot rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Oct 22, 2022
@rustbot
Copy link
Collaborator

rustbot commented Oct 22, 2022

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

@rust-highfive
Copy link
Collaborator

r? @scottmcm

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 22, 2022
@scottmcm scottmcm added T-lang Relevant to the language team, which will review and decide on the PR/issue. T-types Relevant to the types team, which will review and decide on the PR/issue. I-lang-nominated Nominated for discussion during a lang team meeting. I-types-nominated Nominated for discussion during a types team meeting. and removed T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Oct 23, 2022
@RalfJung
Copy link
Member Author

Looks like this was brought up more than 2 years ago already, in #70841. @pnkfelix pointed at the problematic docs paragraph that this PR is changing, and also raised the concern that

I do worry a little bit about the user's mental model for this case, however. The claim "PhantomData<T> is just like a T" doesn't quite hold up. (not that it ever did ...)

@RalfJung
Copy link
Member Author

I think this PR closes #70841. The main question that remains is about subtle types of the form I am sketching here. I should probably try actually coming up with that weird example...

@mu001999
Copy link
Contributor

It seems that the PhantomData-dropck interaction not only affects code using #[may_dangle]. Considering the following demo:

use std::marker::PhantomData;

struct Foo<T>(Box<i32>, PhantomData<T>);

struct Bar<T>(T);
impl<T> Drop for Bar<T> {
    fn drop(&mut self) {}
}

fn main() {
    let _foo;
    let s = String::from("Hello, world!");
    _foo = helper(&*s);
}

fn helper<'a>(_: &'a str) -> Foo<Bar<&'a str>> {
    Foo(Box::new(1), PhantomData)
}

Without impl<T> Drop for Foo<T> and #[may_dangle], the PhantomData introduces the dropck for Bar<&'a str>.

@SoniEx2
Copy link
Contributor

SoniEx2 commented Nov 23, 2022

our proposed MCP for dropck bounds might be useful to explain the PhantomData behaviour.

tho with our MCP's flexibility, we'd argue this should be changed in the future. so, don't rely on it.

(but hey, it is only an experiment so far)

@RalfJung
Copy link
Member Author

@mu001999 Interesting example... I first thought this was not dropck, but this compiles, indicating that it is indeed dropck.

Looks like we need a research project to even figure out our current dropck rules...

@TOETOE55
Copy link

TOETOE55 commented Nov 23, 2022

I outlined the current drop check rules for myself:

  1. Drop check occurs only on "need drop" variables in the current scope when they are dropping.

  2. Denote a dropping variable t and its type T, in the scope of 't.

    dropck(T, 't) :=

    1. if T isn't a "needs drop" type expect T == PhantomData, skip the drop check
    2. if T == PhantomData<P>, check dropck(P, 't)
    3. if T is dyn trait or impl trait, check T: 't
    4. if T is ADT with generic params <..., P, ...>,(P could be lifetime or type)
      1. if T impl Drop, check P: 't, unless P is marked as #[may_dangle]
      2. drop check for all fields of T recursively.

NB: all T: 't in drop check means "strictly outlive".


For the first situation , PhantomData<T>, ManuallyDrop<T> or Pending<T> variables never "need drop", so they won't do the drop check.

@danielhenrymantilla
Copy link
Contributor

danielhenrymantilla commented Nov 23, 2022

From all we've seen, I've been able to gather the following understanding of the dropck behavior.

Given a Wrapper<T> (which could be a struct Wrapper<T>, or any T-dependent type (e.g., (String, T)).

To illustrate, consider Wrapper<T> = Box<T>, and T being &'dangling_in_drop str or DisplayOnDrop<&'dangling_in_drop str>

Code example
struct Box<T> {
    ptr: *const T,
    // … (e.g., should we `PhantomData<T>` or not)
}

impl<T> Drop for Box<T> { /* or:
unsafe impl<#[may_dangle] T> Drop for Box<T> { */
    fn drop(&mut self) {
        unsafe { ptr::drop_in_place::<T>(self.ptr); } // no-op if `T` has no drop glue (so `T` could dangle).
        unsafe { alloc::dealloc(self.ptr,); } // does not use values of type `T`.
    }
}

/* e.g.
impl<'dangling_in_drop> for Box<&'dangling_in_drop str> {
// or
unsafe impl<#[may_dangle] 'dangling_in_drop> for Box<&'dangling_in_drop str> {

// as well as:
impl<'dangling_in_drop> for Box<DisplayOnDrop<&'dangling_in_drop str>> {
// or
unsafe impl<#[may_dangle] 'dangling_in_drop> for Box<DisplayOnDrop<&'dangling_in_drop str>> {
*/

// where
struct DisplayOnDrop<T : Display>(T);
impl<T : Display> Drop for DisplayOnDrop<T> {
    fn drop(&mut self) {
        println!("{}", self.0);
    }
}

fn main() {
    let (b1, b2);
    {
        let dropped_before_the_boxes = String::from("…");
        let thus_dangling_in_drop = &*dropped_before_the_boxes;

        b1 = Box::new(thus_dangling_in_drop);
        b2 = Box::new(DisplayOnDrop(thus_dangling_in_drop));
     } // `drop(dropped_before_the_boxes)`
     /* `thus_dangling_in_drop` is dangling here! */
} // `drop((b1, b2))`
  1. Ideally, we'd want b1 to be accepted since drop_in_place::<&str>() does nothing and thus &str being a &'dangling str is fine.

  2. For soundness, we need b2 to be rejected since drop_in_place::<DisplayOnDrop<&str>>() displays the given string, so it must not dangle.

The practical TL,DR with drop and PhantomData is that:

  • impl<T> Drop for Box<T> { achieves 2., but at the cost of breaking 1.;
  • unsafe impl<#[may_dangle] T> for Box<T> {, alone, achieves 1. but at the cost of breaking 2. (unsound!);
  • unsafe impl<#[may_dangle] T> for Box<T> {, coupled with a "T-owning field" (even a PhantomData<T> one) in Box, preserves 1. while bringing back 2., and thus, soundness.

But there are other edge cases which may require a more detailed exposition of the rules at play, here:


  • When does strict dropck (assume any use of T in drop) apply

    A.k.a., should T : 'drop_point have to hold? (this rejects both T = &'dangling str and T = DisplayOnDrop<&'dangling str>).

    This occurs when:

    • impl<T> Drop for …
    • type Wrapper<T> = dyn Bounds… or type Wrapper<T> = impl Bounds… (except when Bounds : Copy), since type erasure (be it dyn or abstract/impl) means the user of that type no longer knows whether there may be drop glue1

    Since this property is more restrictive than dropck(T, 'drop_point); the following dropck considerations do not matter in practice when we are in this case.

  • When does precise dropck (only use of T in drop is drop_in_place::<T>) apply

    A.k.a. should dropck(T, 'drop_point) have to hold? (this rejects T = DisplayOnDrop<&'dangling str> only, allowing T = &'dangling str).

    This occurs when both:

    • Wrapper<T> may have drop glue;

    • and T is structurally present in an owned fashion (e.g., T, PhantomData<T>, ManuallyDrop<T>) in Wrapper<T>. EDIT: ManuallyDrop<T> does not count as owning T ⚠️

  • These properties become structural, unless we run into any of these "root rules" at any layer.

Footnotes

  1. it could be nice to have a generalization of this Copy opt-out, since this caveat often makes existential types (such as -> impl Iterator) less ergonomic than their named counterparts (struct Iter<'slice, T>).

@SoniEx2
Copy link
Contributor

SoniEx2 commented Nov 23, 2022

nope, ManuallyDrop doesn't have drop glue, while PhantomData does. see also #103318 (comment)

@danielhenrymantilla
Copy link
Contributor

Ah, interesting, let me update the post. It does mean that "PhantomData<T> acts as ManuallyDrop<T>" (which I saw somewhere) is not accurate either 🥴

@SoniEx2
Copy link
Contributor

SoniEx2 commented Nov 23, 2022

we think it's best to describe these things in terms of bounds. it makes it really intuitive that way. here's hoping we get our MCP through and then we have bounds-based dropck.

and then we should no longer need "PhantomData is relevant for Drop" at all

@mu001999
Copy link
Contributor

mu001999 commented Nov 23, 2022

I outlined the current drop check rules for myself:

1. Drop check occurs **only** on  "need drop" variables in the current scope when they are dropping.

2. Denote a dropping variable `t` and its type `T`, in the scope of `'t`.
   `dropck(T, 't)` :=
   
   1. if `T` isn't a "needs drop" type expect `T == PhantomData`, skip the drop check
   2. if `T == PhantomData<P>`,  check `dropck(P, 't)`
   3. if `T` is `dyn trait` or `impl trait`, check `T: 't`
   4. if `T` is ADT with generic params `<..., P, ...>`,(`P` could be lifetime or type)
      
      1. if `T` impl `Drop`, check `P: 't`, unless `P` is marked as `#[may_dangle]`
      2. drop check for all fields of `T` recursively.

NB: all T: 't in drop check means "strictly outlive".

PhantomData<T>, ManuallyDrop<T> or Pending<T> variables never "need drop", so they won't do the drop check

These rules are outlined very clearly!

With these rules, it can be said that PhantomData<P> assumes the field(data) of type P is dropped if T "needs drop", impl<..., P, ...> Drop for T<..., P, ...> assumes data of type P is accessed in the T::drop, and #[may_dangle] P assumes data of type P won't be accessed in the T::drop.

In turn, PhantomData<P> introduces dropck for P and means T "owns" the data of type P if and only if T "needs drop", impl<..., P, ...> Drop for T<..., P, ...> requires P: 't, and #[may_dangle] P discards the requirement P: 't.

Thus, PhantomData<P> only cares if T "needs drop" and does not care about if #[may_dangle] P or not. #[may_dangle] P + PhantomData<P> only means that T::drop does not accesses data of type P but drops data of type P, e.g., T = Vec<P>, PhantomData<P> works fine according to the previous rules.

@TOETOE55
Copy link

This is the “needs drop” what I said above.
https://doc.rust-lang.org/stable/std/mem/fn.needs_drop.html

@TOETOE55
Copy link

TOETOE55 commented Nov 23, 2022

I’m still curious about this question, #70841 (comment)

If the following MyBox unsound or not:

struct MyBox<T>(NonNull<T>);

impl<T> Drop for MyBox<T> {}

From my understanding of dropck, considering MyBox<T> impl no #[may_dangle] Drop:

If it construct

  • without PhantomData<T>, the dropck would be:

    dropck(MyBox<T>, 'scope) := T: 'scope (strictly)

  • with PhantomData<T>:

    dropck(MyBox<T>, 'scope) := T: 'scope & dropck(T, 'scope) where dropck(PhantomData<T>, 'scope) := dropck(T, 'scope)

If you want to construct a ub from the former, it is equivalent to finding a T that satisfies T: 'scope but not dropck(T, 'scope), and causes ub when it destructs.

But I tried to construct such a T and failed. I can't even find a counterexample for T:' scope - > dropck(T, 'scope).

IMO, generic params in Drop impl/#[may_dangle] attribute/PhantomData<T> play a different role in drop checking:

  1. the normal generic param T in Drop impl, means that “I may touch T in drop”
  2. the generic param T marked as #[may_dangle], means that “I won’t touch T in drop”
  3. T wrapping in PhantomData from the field of a dropping value, means that “It may call drop_in_place::<T>() while the value is dropping”.

I think there’s a subtle difference between 1 and 3.(maybe 3 is weaker than 1, I’m not sure). We usually mark T as may_dangle and add a PhantomData<T> in our structure when implementing Drop.(i.e. Vec<T>).

@mu001999
Copy link
Contributor

@TOETOE55T From my understanding, T: 't is stricter than dropck(T, 't), because the latter requires recursively at most all generic params of T outlives 't. This is also why we may need PhantomData if we have used #[may_dangle].

@TOETOE55
Copy link

TOETOE55 commented Nov 26, 2022

I just glanced at NLL's RFC, layer-3-accommodating-dropck section https://rust-lang.github.io/rfcs/2094-nll.html#layer-3-accommodating-dropck

This operation executes the destructor for variable, effectively "de-initializing" the memory in which the value resides (if the variable -- or parts of the variable -- have already been dropped, then drop has no effect; this is not relevant to the current analysis)

But the status quo still do the dropck on variables are dropped: https://play.rust-lang.org/?version=stable&mode=release&edition=2021&gist=936fe411cc2d2eb720223c0fdc74f8f2

use std::marker::PhantomData;

struct Foo<T>(Box<i32>, PhantomData<T>);

struct Bar<T>(T);
impl<T> Drop for Bar<T> {
    fn drop(&mut self) {}
}

fn main() {
    let _foo;
    let s = String::from("Hello, world!");
    _foo = helper(&*s);
    std::mem::drop(_foo); // or `drop(_foo.0)
}

fn helper<'a>(_: &'a str) -> Foo<Bar<&'a str>> {
    Foo(Box::new(1), PhantomData)
}

@nikomatsakis
Copy link
Contributor

I think the correct rules are:

  • If the value's type has drop glue (needs proper definition) and a generic parameter T:
    • It is considered to drop a value of type T if...
      • the struct has a field which owns a T
      • there is phantom-data of type T
      • or the struct implements Drop
  • Note that phantom-data by itself does not have drop-glue, so when dropped it is a no-op to borrow check

Another way to put it:

  • The behavior of the DROP GLUE knows what gets dropped
    • but it also considers PhantomData values to be dropped

But I've not run any experiments, so maybe I'm missing something. Have to pull back in the examples we tried last time.

Per these rules:

  • PhantomData<T> is not considered to drop T, it has no drop-glue
  • a struct Foo<T> { d: T, e: String } is considered to drop T (because String has drop-glue)
  • and so is struct Foo<T> { d: PhantomData<T>, e: String }
  • but struct Foo<'a, T> { d: &'a T, e: String } is not (I think)
  • not sure about struct Foo<T> { d: *mut T, e: String } ..?

If a struct actually implements Drop, then it is always assumed to drop all the things (modulo the unstable feature). So if there is a impl Drop for Foo for any such examples, it would always be assumed to drop T.

@SoniEx2
Copy link
Contributor

SoniEx2 commented Dec 6, 2022

we should probably have an RFC for what to do here.

@RalfJung
Copy link
Member Author

RalfJung commented Dec 7, 2022 via email

@SoniEx2
Copy link
Contributor

SoniEx2 commented Dec 7, 2022

nod

we should probably have an RFC to remove the existing spooky action at a distance instead of documenting it.

@lcnr
Copy link
Contributor

lcnr commented Apr 13, 2023

@rfcbot concern resolve how we handle [T; 0] wrt to dropck

@rfcbot concern want to take another look at rust-lang/rfcs#3390 whether it changes anything about my reasoning here

@lcnr
Copy link
Contributor

lcnr commented Apr 13, 2023

@rfcbot resolved want to take another look at rust-lang/rfcs#3390 whether it changes anything about my reasoning here

rust-lang/rfcs#3390 (comment) still summarizes my feelings about this PR even after feeling like I now have a good understanding of dropck: "Spooky-dropck-at-a-distance" does not feel spooky anymore. It just feels unfortunate. Even if accepted, to my knowledge the RFC is not able to resolve the PhantomData behavior of having outlives constraints but not requiring drop itself 🤔

"having outlives constraints" is necessary for the sound usage of PhantomData.

We could change PhantomData back to requiring noop drop glue but that will also be inconsistent as it is Copy so it would lose the "needs to be live at drop" when used in generic functions. I don't think we want PhantomData to be Copy while having a liveness requirement when going out of scope is something we want.

I guess a potential alternative is to make PhantomData even less impactful wrt to drop and actually only adding outlives requirements when used as a field of a type with a manual drop impl, i.e. PhantomData actually only impacts manual Drop impls when they have #[may_dangle]. I would like to explore that route more.

@SoniEx2, would you be available for a sync meeting about this topic in general?

@rustbot concern potential way to remove "Spooky-dropck-at-a-distance"


@rustbot resolve resolve how we handle [T; 0] wrt to dropck

opened #110288

@SoniEx2
Copy link
Contributor

SoniEx2 commented Apr 13, 2023

@lcnr

I guess a potential alternative is to make PhantomData even less impactful wrt to drop and actually only adding outlives requirements when used as a field of a type with a manual drop impl, i.e. PhantomData actually only impacts manual Drop impls when they have #[may_dangle]. I would like to explore that route more.

this is effectively what rust-lang/rfcs#3390 proposes - tho it tries to do so in a much more formalized way, basically defining some syntax and semantics and a whole system around it, as well as trying to explain how the current and desired behaviours fit into the proposed model (in other words: in a rather roundabout way). but for stable code, yeah, the observable result would be the same. maybe we should've leaned harder on spooky-dropck-at-a-distance as motivation, given that it (and being opposed to this very pull request) is what led us to write the RFC in the first place (indeed, the zulip thread is literally called "deprecating spooky-dropck-at-a-distance")... but anyway!

we can try a sync meeting? uh we've never really done of those tho? so we're not sure what to expect or how to prepare for it? >.<

@lcnr

This comment was marked as duplicate.

@lcnr

This comment was marked as duplicate.

@lcnr
Copy link
Contributor

lcnr commented Apr 14, 2023

we can try a sync meeting? uh we've never really done of those tho? so we're not sure what to expect or how to prepare for it? >.<

wouldn't have needed to prepare anything for it 😀 mostly just wanted to have a quicker conversation because having conversations via github is a horrible experience in my opinion. going to instead use the existing zulip thread. https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/deprecating.20spooky-dropck-at-a-distance

@RalfJung
Copy link
Member Author

There's an argument in rust-lang/rfcs#3417 that we shouldn't stably document this behavior, i.e. we shouldn't land this PR, because we want to change this behavior anyway. I guess alternatively we could add a preface saying that this is not guaranteed? That makes one wonder about the point of the docs though.

Either way I guess I won't spend time on this PR until that discussion is resolved. Seems like a waste of time if we're not going to document this anyway.

Copy link
Contributor

@lcnr lcnr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we finally got to this issue in t-types triage: https://rust-lang.zulipchat.com/#narrow/stream/326132-t-types.2Fmeetings/topic/2023-05-08.20triage.20meeting/near/356728590

we intend to have a deep dive about the dropck rules and how to change them (rust-lang/types-team#92), but are open (some of us were in favor) of merging this documentation until then with the expectation that this does not block us from changing the subtle details of dropck later.

lgtm on the current documentation, but please add a note that this is not a stability guarantee. after that r=me

library/core/src/marker.rs Outdated Show resolved Hide resolved
/// The Nomicon discusses the need for [drop check in more detail][drop check].
///
/// To reject such code, the "drop check" analysis determines which types and lifetimes need to
/// still be live when `T` gets dropped:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please state somewhere that these exact rules may change in the future.

@lcnr lcnr removed the I-types-nominated Nominated for discussion during a types team meeting. label May 9, 2023
Co-authored-by: lcnr <rust@lcnr.de>
@RalfJung
Copy link
Member Author

I made it clear that these rules are not stably guaranteed.

Looks like we can finally land this. Thanks a lot to everyone involved!
@bors r=lcnr

@bors
Copy link
Contributor

bors commented May 12, 2023

📌 Commit b93fd83 has been approved by lcnr

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 12, 2023
@bors
Copy link
Contributor

bors commented May 13, 2023

⌛ Testing commit b93fd83 with merge 9850584...

@bors
Copy link
Contributor

bors commented May 13, 2023

☀️ Test successful - checks-actions
Approved by: lcnr
Pushing 9850584 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label May 13, 2023
@bors bors merged commit 9850584 into rust-lang:master May 13, 2023
@rustbot rustbot added this to the 1.71.0 milestone May 13, 2023
@rfcbot rfcbot removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels May 13, 2023
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (9850584): comparison URL.

Overall result: ❌ regressions - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
0.6% [0.5%, 0.6%] 4
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.3% [1.3%, 1.3%] 1
Regressions ❌
(secondary)
3.8% [2.4%, 5.2%] 2
Improvements ✅
(primary)
-3.0% [-3.0%, -3.0%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -0.9% [-3.0%, 1.3%] 2

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 659.199s -> 659.629s (0.07%)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-types Relevant to the types team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PhantomData behavior is very surprising w.r.t. lack of Drop implications PhantomData<T> no longer dropck?