Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NLL] prohibit "two-phase borrows" with existing borrows? #56254

Closed
nikomatsakis opened this issue Nov 26, 2018 · 39 comments · Fixed by #96268
Closed

[NLL] prohibit "two-phase borrows" with existing borrows? #56254

nikomatsakis opened this issue Nov 26, 2018 · 39 comments · Fixed by #96268
Assignees
Labels
A-NLL Area: Non-lexical lifetimes (NLL) NLL-sound Working towards the "invalid code does not compile" goal P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@nikomatsakis
Copy link
Contributor

@RalfJung raised this example in which the "two-phase" borrow of x is compatible with a pre-existing share:

fn two_phase_overlapping1() {
    let mut x = vec![];
    let p = &x;
    x.push(p.len());
}

This poses a problem for stacked borrows, as well as for the potential refactoring of moving stacked borrows into MIR lowering (#53198) -- roughly for the same reason. It might be nice to change this, but -- if so -- we've got to move quick!

cc @arielb1 @pnkfelix

@nikomatsakis nikomatsakis added P-high High priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Nov 26, 2018
@nikomatsakis nikomatsakis self-assigned this Nov 26, 2018
@nikomatsakis nikomatsakis added the A-NLL Area: Non-lexical lifetimes (NLL) label Nov 26, 2018
@nikomatsakis nikomatsakis added this to the Rust 2018 Release milestone Nov 26, 2018
@nikomatsakis
Copy link
Contributor Author

(It's actually not clear if we would want to backport this -- ideally we would, but it's probably a corner case.)

@Centril Centril added I-nominated T-lang Relevant to the language team, which will review and decide on the PR/issue. labels Nov 27, 2018
@Centril
Copy link
Contributor

Centril commented Nov 27, 2018

Nominated for discussion on the next T-lang meeting since this seems to a affect the type system in observable ways and because I'd like to understand this better... provided that we can wait until Thursday... ;)

@nagisa
Copy link
Member

nagisa commented Nov 27, 2018

I only have theoretical knowledge of NLL’s implementation but it seems extremely hard to forbid this…?

@RalfJung
Copy link
Member

From what I hear it's actually easy, we just have an additional constraint that such that when the two-phase borrow starts, all existing loans for that ref get killed (like they usually would for a mutable ref).


The problem is the "fake read" desugaring we do to make sure that match arms cannot mutate the discriminee:

fn foo(x: Option<String>) {
  match x {
    Some(mut ref s) if s.starts_with("hello") => s.push_str(" world!"),
    _ => {},
  }
}

Becomes something like

_fake1 = &shallow x;
_fake2 = &(x as Some).0;
// switch on discriminant
s_for_guard = &mut2phase (x as Some).0;
s_for_guard_ref = &s_for_guard;
// guard, using *s_for_guard_ref instead of s
FakeRead(_fake1);
FakeRead (_fake2);
s = s_for_guard;
// Arm as usual

When s_for_guard is created, we create a new mutable ref to something that has outstanding shared refs.

@RalfJung
Copy link
Member

RalfJung commented Nov 28, 2018

I once proposed an alternative to this desugaring that avoids 2-phase-borrows. I was told (by @arielb1 and probably others) it doesn't work because it doesn't preserve pointer identity (if the guard compares addresses it'd notice), but I actually don't see why: I think all pointers are the same as in the desugaring above. Namely, we should do:

_fake1 = &shallow x;
_fake2 = &(x as Some).0;
// switch on discriminant
s_for_guard = &(x as Some).0;
s_for_guard_ref = fake_mut(&s_for_guard);
// guard, using *s_for_guard_ref instead of s
FakeRead(_fake1);
FakeRead (_fake2);
s = &mut (x as Some).0;
// Arm as usual

where fake_mut is

fn fake_mut<'a, 'b, T>(x: &'a &'b T) -> &'a &'b mut T {
  std::mem::transmute(x)
}

fake_mut is actually safe to call with any possible x. And the pointers are exactly the same as in the desugaring above. So why does this not work?

@arielb1
Copy link
Contributor

arielb1 commented Nov 28, 2018

@RalfJung

In this translation, addr_of(s_for_guard) != addr_of(s), while in the previous translation it can be. However, I'm not sure how important this property is, and in any case, addr_of(s_for_guard) != addr(s) today.

And if we really wanted to preserve this property, we could have s be a union between &T and &mut T.

@RalfJung
Copy link
Member

However, I'm not sure how important this property is, and in any case, addr_of(s_for_guard) != addr(s) today.

Okay, so we agree that my proposal doesn't break more than what we currently do -- but it might be harder to fix (if we care).

And if we really wanted to preserve this property, we could have s be a union between &T and &mut T

It would however still be the case that the mutable reference was created after the guard runs, which could be observable in terms of Stacked Borrows / LLVM noalias.

@arielb1
Copy link
Contributor

arielb1 commented Nov 28, 2018

Okay, so we agree that my proposal doesn't break more than what we currently do -- but it might be harder to fix (if we care).

Sure enough. So I think @RalfJung's solution (having an &&mut T -> &&T transmute, 2 addresses for ref/ref mut bindings in guards, and 2-phase borrows rejecting existing borrows) is actually fine.

@pnkfelix
Copy link
Member

pnkfelix commented Nov 28, 2018

seems best to be conservative (in terms of erring on the side of rejecting a larger set of sound programs) and do this now, if we can land a backportable patch in time for the edition.

(In other words, I'm in favor of moving forward on this proposal)

@pnkfelix pnkfelix added the NLL-sound Working towards the "invalid code does not compile" goal label Nov 28, 2018
@pnkfelix
Copy link
Member

pnkfelix commented Nov 29, 2018

triage: We discussed this in the NLL team meeting last night, and essentially decided that we think we will make a fix for this in the master branch without backporting it.

The main risk implied by that decision is that there may be 6 weeks of 2018 edition code that writes code similar to that of fn two_phase_overlapping that is subsequently outlawed by the subsequent version of Rust.

  • However, we also think there will be some pressure to issue a point release after the initial release anyway. So it may not be a full 6 weeks of code that we have to deal with.
  • Also, in case you thought that this case is rare in practice ... we discovered on PR [WIP] Make 2PB reservations conflict with shared borrows #56301 the cruel irony that the new code to test the Stacked Borrows model was, unbeknownst to its author (@RalfJung), actually "fundamentally relies on the thing @RalfJung was suggesting to not allow"

@pnkfelix
Copy link
Member

pnkfelix commented Nov 29, 2018

also, in terms of triage: I don't think we need to discuss this at the T-compiler meeting. It may or may not merit discussion at the T-lang meeting, given that WG-compiler-nll is planning to plug this hole; just not in a manner that we'd backport to beta...

@alexcrichton
Copy link
Member

Removing from the 2018 edition milestone due to #56254 (comment)

@alexcrichton alexcrichton removed this from the Rust 2018 Release milestone Nov 29, 2018
@scottmcm
Copy link
Member

scottmcm commented Nov 29, 2018

Lang team discussion:

  • Everyone seems fine with prohibiting this for now, though generally the the code was considered reasonable.
  • There was some concern about details of the stability promise here, and thus whether it's justifyable to disable it once it's in a stable release.

@joshtriplett
Copy link
Member

Is there some documentation of why exactly the model prohibits this, and what it would take to have a model that doesn't? Because in particular, it seems like you could have a region for p that ends after p.len() and then call x.push after that region ends.

@nikomatsakis
Copy link
Contributor Author

I feel a bit torn. The fact that we see various bits of code using this makes me a bit reluctant to rule it out, and a bit more inclined to consider extending stacked borrows to account for it, though it depends on how big of a mess results. It'd be nice however to do that evaluation without time pressure, which is what gives me some incentive to want to rule it out -- and then see what it takes to allow it again.

@joshtriplett
Copy link
Member

@nikomatsakis Likewise. I don't want to put anyone under pressure to cope with this, but I do think we ought to accept this code.

@RalfJung
Copy link
Member

RalfJung commented Dec 5, 2018

Is there some documentation of why exactly the model prohibits this, and what it would take to have a model that doesn't?

The basic model is described in this long blog-post of mine.

Model for Limited Two-Phase Borrows

The changes required to that model to support "limited two-phase borrows" (i.e., two-phase borrows that kill existing shared borrows when they are created) are trivial: When retagging for a two-phase borrow, we follow all the usual steps of retagging for a mutable borrow, and then we re-borrow the new borrow for a shared borrow. Done.

After let w = &2phase *v;, this means that I can read through both w and v (i.e., no "rewriting" that replaces v by w is required). The stack, roughly speaking, looks like [...; Uniq(v); Uniq(w); Shr]; frozen (I am abusing notation to avoid explicit timestamps). Reading through a mutable reference first checks if that reference still has a matching item on the stack (both of them do), and then it does not pop because reading from a frozen location does not require popping.

However, creating a mutable reference still acts like a "virtual write" to this location, which of course means that outstanding, existing borrows get killed. The reasoning for this is that after creating a mutable reference, we usually want to be sure that there are no aliases.

I have implemented this model (it is waiting for a PR to land), and it accepts this test-case. (The entire rest of the miri test suite does not seem to rely on two-phase borrows.)

Model for Unlimited Two-Phase Borrows

The problem with not killing existing shared borrows is that when the two-phase borrows is created, we cannot yet push it to the stack. It would have to "go below" the existing shared borrows so as to not invalidate them.

We could maybe put it directly above the item matching the borrow we re-borrow from. I don't like this for two reasons: First, it kills the stack discipline. Second, one can imagine several sound extensions of two-phase borrows (and indeed @arielb1 has imagined all of them^^) that this cannot scale to. For example, in this model, creating a new two-phase borrow would still have to kill outstanding two-phase borrows. Today it seems like you don't want to support this, but given that "it doesn't introduce UB in a naive translation to LLVM" seems to be a sufficient justification, I wouldn't be surprised if a year from now y'all come to me and said "uh, now we'd really rather like to support these other things as well" and then we have to re-design two-phase stacked borrows. ;)

I had another idea that should fix these problems. But I haven't had the time to implement it yet, so there may be unforeseen consequences. In this model, we add a new kind of "tag" that a pointer can carry: Beyond Uniq(Timestamp) and Shr(Timestamp), it can also carry TwoPhase { base: Timestamp, this: Timestamp }. Now we proceed as follows:

  • When creating a two-phase borrow, we tag it with TwoPhase and record the timestamp of the borrow this is created from, as well as the current timestamp. We do not push a new item to the stack, and we act like a read through the mutable borrow we are created from. This keeps existing shared borrows alive.
  • When reading through a two-phase borrow, we are happy finding a match for either of our two timestamps (or a Shr/frozen, that also always works).
  • When writing through a two-phase borrow, we pop until the first of our two timestamps appears. If that is this, we are good. If that is base, we just found the activation point, and we push this.

This means that two-phase borrows delay the pushing of "their" item (the activation) to their first write (where a re-borrow to a "full" &mut counts as a write), which I think is the intention.

The downsides of this model are:

  • Tags just got almost twice as big. If we ever want to have a valgrind implementation of this, that could become a serious issue. Valgrind currently supports up to twice as many bits of "metadata" attached to a value than the size of the value, so two ptr-sizes worth of a tag per ptr. We would have entirely used up this budget. From what I understand, this could be increased, but of course that comes at a performance cost. And even in miri, making pointers larger will increase the size of pretty much every data structure. There is a limit to how much complexity we can add to the model before checking it becomes infeasible, and I think checkability is a crucial requirement for any kind of UB we introduce (we should learn from C's mistakes instead of repeating them).
  • While this model supports situations such as "creating two 2-phase-reborrows and only deciding at run-time which one to write through (invalidating the other)", it may not always behave in the most permissive way when creating a 2-phase-borrow from a 2-phase-borrow: In that case, we have to either require that the older 2-phase borrow be activated first (killing outstanding shared references at that point), or the younger 2-phase-borrow would actually "borrow" from the original mutable reference the older one comes from (meaning that after activating the younger 2-phase borrow, the older one may not ever be written to again), or we do something dynamic depending on whether the older 2-phase borrow has already been activated (which would likely mean, if it hasn't been activated yet it may never be activated). Mitigating this would require adding a "list of parents" to the tag of a 2-phase-borrow or storing that somewhere out-of-band. Keep in mind that the more complicated we make the extra state we add for this, the more difficult it will be to reason about it and to keep it in your head as you are writing unsafe code / compiler optimizations.

Extreme proposal

One issue I have here is that the only constraint for 2-phase-borrows seems to be "it must be sound in a naive memory model". This is in direct conflict with the expressed desire to have stronger invariants for references, that unsafe code must also follow. Are there any constraints in terms of optimizations you still want to perform, or invariants you still want to uphold, that would put another limit on how far 2-phase-borrows are supposed to go?

Just to map out the design space a bit, here's a really extreme proposal that essentially gives up on tracking for 2-phase borrows: A 2-phase-borrow is just a copy of the mutable reference it is created from. This means that after let w = &2phase *v;, Stacked Borrows would treat v and w as equivalent, and unsafe could would be allowed to perform interleaved accesses through these pointers:

let w = &2phase *v;
let vraw = v as *mut _; // let's assume this wouldn't reborrow, but just cast
let wraw = w as *mut _; // let's assume this wouldn't reborrow, but just cast
// Now this is okay, as both pointers carry the same tag
*vraw = 4;
*wraw = 5;
*vraw = 6;
*wraw = 7;

The borrow checker would still have to impose some restrictions (writing to w could invalidate v if that points into an enum variant, for example), but 2-phase-borrows (better called "aliasing mutable borrows" at this point) would not be subject to any restrictions from Stacked Borrows.

I have not implemented this model, but I think it would be feasible. Of course, this would maximally pessimize compiler transformations as the two pointers would now indeed be allowed to alias. But that's what you get if you don't want to impose any limitations on how far 2-phase borrows should go :)

@pnkfelix pnkfelix self-assigned this Dec 12, 2018
@pnkfelix
Copy link
Member

T-compiler triage: I am planning on taking point on this during this week.

bors added a commit that referenced this issue Feb 25, 2019
Use normal mutable borrows in matches

`ref mut` borrows are currently two-phase with NLL enabled. This changes them to be proper mutable borrows. To accommodate this, first the position of fake borrows is changed:

```text
[ 1. Pre-match ]
       |
[ (old create fake borrows) ]
[ 2. Discriminant testing -- check discriminants ] <-+
       |                                             |
       | (once a specific arm is chosen)             |
       |                                             |
[ (old read fake borrows) ]                          |
[ 3. Create "guard bindings" for arm ]               |
[ (create fake borrows) ]                            |
       |                                             |
[ 4. Execute guard code ]                            |
[ (read fake borrows) ] --(guard is false)-----------+
       |
       | (guard results in true)
       |
[ 5. Create real bindings and execute arm ]
       |
[ Exit match ]
```

The following additional changes are made to accommodate `ref mut` bindings:

* We no longer create fake `Shared` borrows. These borrows are no longer needed for soundness, just to avoid some arguably strange cases.
* `Shallow` borrows no longer conflict with existing borrows, avoiding conflicting access between the guard borrow access and the `ref mut` borrow.

There is some further clean up done in this PR:

* Avoid the "later used here" note for Shallow borrows (since it's not relevant with the message provided)
* Make any use of a two-phase borrow activate it.
* Simplify the cleanup_post_borrowck passes into a single pass.

cc #56254

r? @nikomatsakis
bors added a commit that referenced this issue Feb 26, 2019
More restrictive 2 phase borrows - take 2

Another try at this. Currently changes to a hard error, but we probably want to change it to a lint instead.

cc #56254

r? @pnkfelix

cc @RalfJung @nikomatsakis
bors added a commit that referenced this issue Mar 5, 2019
More restrictive 2 phase borrows - take 2

Another try at this. Currently changes to a hard error, but we probably want to change it to a lint instead.

cc #56254

r? @pnkfelix

cc @RalfJung @nikomatsakis
@RalfJung
Copy link
Member

Also see the tracking issue at #59159

bors added a commit that referenced this issue Apr 7, 2019
More restrictive 2 phase borrows - take 2

Signal lint diagnostic `mutable_borrow_reservation_conflict` when borrow-check finds a 2-phase borrow's reservation overlapping with a shared borrow.

(pnkfelix updated description)

cc #56254 , #59159

blocks PR #59114

r? @pnkfelix

cc @RalfJung @nikomatsakis
@pnkfelix
Copy link
Member

Now that PR #58739 has landed, this is no longer P-high priority.

(I wouldn't call it "resolved" until we settle the question of what model we will end up using. But we may want to still close this issue and open a different one to track that work?)

In any case, re-prioritizing this as P-medium.

@pnkfelix pnkfelix added P-medium Medium priority and removed P-high High priority labels Apr 11, 2019
@RalfJung
Copy link
Member

RalfJung commented Apr 15, 2019

EDIT: Nvm, someone added the #[allow] exactly because of this and I am blind.^^ Srroy for the noise.

Are they really all prohibited? The following function is accepted, and this does look like an overlapping 2PB to me:

fn with_interior_mutability() {
    use std::cell::Cell;

    trait Thing: Sized {
        fn do_the_thing(&mut self, _s: i32) {}
    }

    impl<T> Thing for Cell<T> {}

    let mut x = Cell::new(1);
    let l = &x;

    #[allow(unknown_lints, mutable_borrow_reservation_conflict)]
    x
        .do_the_thing({
            x.set(3);
            l.set(4); // This is an example of overlapping 2PB! l was created before the implicit mutable borrow of x as receiver of this call.
            x.get() + l.get()
        })
    ;
}

@RalfJung
Copy link
Member

RalfJung commented Apr 17, 2019

FWIW, I was able to support the two-phase-borrows test cases with outstanding lones that @matthewjasper wrote in a refurbished version of Stacked Borrows (Miri PR: rust-lang/miri#695).

There'll be a blog post about this soon (TM). I am not sure if this is the model we want (in fact I think it is not, but for reasons mostly unrelated to two-phase borrows), and I have little idea of the consequences this will have on optimizations.

EDIT: The blog post is out

@RalfJung
Copy link
Member

@Manishearth discovered an intersting piece of code that was not accepted by Stacked Borrows 2.0:

fn unsafe_cell_2phase() { unsafe {
    let x = &UnsafeCell::new(vec![]); // Tag(0)
    // Stack: [0: SharedReadWrite]

    let x2 = &*x; // Tag(1)
    // Stack: [0: SharedReadWrite, 1: SharedReadWrite]

    (*x.get()).push(0); // The raw ptr returned by get: Tag(2)
    // Stack: [0: SharedReadWrite, 2: SharedReadWrite, 1: SharedReadWrite]
    // Then it gets two-phase-reborrows as a unique with Tag(3):
    // Stack: [0: SharedReadWrite, 2: SharedReadWrite, 3: Unique, 1: SharedReadWrite]
    // And then 3 gets reborrows as a "proper" unique with Tag(4)
    // Stack: [0: SharedReadWrite, 2: SharedReadWrite, 3: Unique, 4: Unique]

    // Now this is UB because x2's tag is not in the stack any more.
    let _val = (*x2.get()).get(0);
} }

I have annotated what happens with the stack. The issue is that we add the Unique tag "in the middle" of a bunch of existing SharedRW, and that's bad -- a block of consecutive SharedRW should be treated together, almost as if it was a single item.

I tried to fix this by supporting two-phase borrows "for real": I added a new TwoPhase kind of permission that, on the first write, turns into a Unique. I changed creating two-phase borrows such that the item gets added on top of a block of consecutive SharedReadWrite. That makes the above example pass. Unfortunately, it breaks another example involving two-phase borrows of interior mutable data (example courtesy of @matthewjasper):

fn with_interior_mutability() {
    use std::cell::Cell;

    trait Thing: Sized {
        fn do_the_thing(&mut self, _s: i32) {}
    }

    impl<T> Thing for Cell<T> {}

    let mut x = Cell::new(1);
    let l = &x;

    x
        .do_the_thing({
            x.set(3);
            l.set(4);
            x.get() + l.get()
        })
    ;
}

What happens here is that after creating the two-phase borrow, we end up with a stack like [x: Unique, x_for_do_the_thing: TwoPhase, l: SharedRW]. Then we do another (anonymous) shared reborrow for x (for x.set(3)), which gets added just above x, and we write to it, which kills x_for_do_the_thing and l because neither are part of the same block of consecutive SharedRW. I think @matthewjasper's scheme described above has the same problem.

The issue here is (and this kind of came up before already) that the stack just does not encode enough information about which pointer is derived from which. We'd want to know that l and x_for_do_the_thing are both direct children of x, such that adding more SharedRW-children does not affect any of the existing children (unless they are Unique).

So, for now I basically gave up and made two-phase borrows carry SharedRW permission. That's slightly better than the Raw proposal I made back with Stacked Borrows 1, because at least we still properly track the pointer and do not confuse it with raw pointers, but it does mean that you can read from the parent pointer even after the "activation point" (which is not a special point any more really) and still use the child two-phase pointer afterwards.

@Manishearth
Copy link
Member

Nightly miri now passes all of the elsa tests except for sync.rs (which uses threads, which miri doesn't like because of the dynamic loading)

@pnkfelix
Copy link
Member

Just a quick note: If we are still interested in gathering data about how much code was affected by adding the restriction to two-phase borrows, one source of data I had not considered is the set of commits that land that reference the lint's tracking issue #59159

(I count thirteen commits as of today that reference the lint issue.)

It won't be a complete list, of course, but it is a different data source than say crater.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-NLL Area: Non-lexical lifetimes (NLL) NLL-sound Working towards the "invalid code does not compile" goal P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.