-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Immutably and implicitly borrow all pattern ids for their guards (NLL only) #49870
Immutably and implicitly borrow all pattern ids for their guards (NLL only) #49870
Conversation
r? @cramertj (rust_highfive has picked a reviewer for you, use r? to override) |
cc @arielb1 (normally I wouldn't ping someone who I think is quite busy with other parts of their life, but I've seen ariel chiming in on other PR's and I think he may be interested in doing so on this one.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did a first pass -- looking good -- but we need some tests, I suppose? (Or maybe I just didn't get to that commit, actually)
src/librustc_mir/build/block.rs
Outdated
@@ -113,7 +114,8 @@ impl<'a, 'gcx, 'tcx> Builder<'a, 'gcx, 'tcx> { | |||
// Declare the bindings, which may create a visibility scope. | |||
let remainder_span = remainder_scope.span(this.hir.tcx(), | |||
&this.hir.region_scope_tree); | |||
let scope = this.declare_bindings(None, remainder_span, lint_level, &pattern); | |||
let scope = this.declare_bindings(None, remainder_span, lint_level, &pattern, | |||
false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: a plain false
like this is always a bit unclear to me as a reader. Maybe we could put a newtype here, like declare_bindings(..., InGuard(false))
(and struct InGuard(bool)
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or I should probably just reuse the enum
I introduced somewhere else in this PR...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(well, technically this boolean flag has a different meaning currently than the WithinGuard
/OutsideGuard
enum... but I'll figure something out. I too dislike plain booleans.)
block, binding.var_id, binding.span, OutsideGuard); | ||
self.schedule_drop_for_binding(binding.var_id, binding.span, OutsideGuard); | ||
let rvalue = Rvalue::Ref(region, borrow_kind, binding.source.clone()); | ||
self.cfg.push_assign(block, source_info, &local_for_arm_body, rvalue); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So before I thought we were going to need some sort of cast here -- I was trying to remember why -- I believe it was looking forward to the point where we try to maintain a shared borrow the value being matched over the whole match, in which case this first &mut
borrow would be illegal...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay let me look into this, see if I can make a test case that exposes the scenario you describe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it was looking forward to the point where we try to maintain a shared borrow the value being matched over the whole match, in which case this first
&mut
borrow would be illegal...
Isn't the answer here is "this is why we have 2-phase borrows"? There's this cute example, which compiles today, is not a soundness issue, and works in the 2phi-borrow interpretation, but does not work if 2phi is not used correctly.
EDIT: now example is a bit stronger.
use std::cell::Cell;
fn main() {
let mut b = 4;
let a : Cell<Option<&u32>> = Cell::new(None);
match b {
ref mut btag if { a.set(Some(&*btag)); false } => {
println!("[guard] a={:?} b={}", a, btag);
}
ref mut btag2 => {
println!("[rest] a={:?} b={}", a, btag2);
}
}
}
match Some(&4) { | ||
None => {}, | ||
mut foo if foo.is_some() && false => { foo.take(); () } | ||
Some(s) => std::process::exit(*s), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
never mind I see test now :)
Does this solve rust-lang/rfcs#2036? |
☔ The latest upstream changes (presumably #49836) made this pull request unmergeable. Please resolve the merge conflicts. |
Ping from triage! Can @nikomatsakis (or someone else from @rust-lang/compiler) review this? |
r=me once nits are addressed |
// MIR-borrowck did not complain with earlier (universally | ||
// eager) MIR codegen, laziness might not be *necessary*. | ||
|
||
let autoref = self.hir.tcx().nll(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we generate different MIR in NLL and non-NLL modes? Is this because we don't want to inject semantic changes in stable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was my assumption -- that said, thinking more about this, if we are going to try and merge NLL down into a warning period, we are going to need the MIR generation to be the same both with and without NLL enabled.
// MIR-borrowck did not complain with earlier (universally | ||
// eager) MIR codegen, laziness might not be *necessary*. | ||
|
||
let autoref = self.hir.tcx().nll(); | ||
if let Some(guard) = candidate.guard { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for being late with this, but I feel that the "guard bit" mapping - figuring out whether a local should access the "arm" version or the dereference of the "guard" version - belongs more in HAIR than in MIR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. I don't have a strong opinion, but that seems reasonable. How do you envision it working? I could imagine having e.g. distinct HAIR nodes for "reference variable from guard" or something...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I was thinking of was having variables in the HAIR refer to an enum
#[derive(Hash, PartialEq, Eq, Debug)]
enum HairVar {
Binding(ast::NodeId),
GuardRef(ast::NodeId)
}
MIR construction does not quite care about the exact NodeId
of variables, only the scope of their definitions etc., so this should work out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
distinct HIR nodes for "reference variable from guard"
The above option would mean that we put it right in HAIR construction. Of course, we could also put it in resolution (using 2 separate NodeId
s), but I understand the implications of that less.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(In case it wasn't clear, I meant "distinct HAIR nodes", I edited my comment -- but it sounds like we were thinking of similar things)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(at this point I would prefer to land this PR in the short term and come back later to the question of whether the guard bit mapping should part of the HAIR rather than the MIR... I will open an issue for that task once this has landed though...)
|
||
// Here is arielb1's basic example from rust-lang/rust#27282 | ||
// that AST borrowck is flummoxed by: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not fix the core of the problem (which is that the match discriminant is not properly protected). For example, I don't think that this variant is prevented:
fn should_reject_destructive_mutate_in_guard() {
let v = &mut Some(&4);
match *v {
None => {},
_ if {
(|| { let bar = v; bar.take() })();
false } => { },
Some(s) => std::process::exit(*s),
}
}
The "discriminant protection" probably belongs in a separate PR, so I would just remove the "27282" tag from this test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this was not intended to fix everything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just an FYI: I agree that the this did not put in the full fix to the core problem. However, the example outlined by @arielb1 is not accepted by the compiler (under NLL), even without this PR in place. In particular, you get these errors (playground):
error[E0382]: use of moved value: `*v`
--> ../../ariel-example-2-27282.rs:11:9
|
8 | (|| { let bar = v; bar.take() })();
| -------------------------------- value moved here
...
11 | Some(s) => std::process::exit(*s),
| ^^^^^^^ value used here after move
error[E0382]: use of moved value: `v.0`
--> ../../ariel-example-2-27282.rs:11:14
|
8 | (|| { let bar = v; bar.take() })();
| -------------------------------- value moved here
...
11 | Some(s) => std::process::exit(*s),
| ^ value used here after move
|
= note: move occurs because `v.0` has type `&i32`, which does not implement the `Copy` trait
I'm assuming that niko's r=me is retracted given arielb1's feedback. (I will try to address that aforementioned feedback after I finish rebasing.) |
…und in a mir-opt test.
…ed_value_does_not_live_long_enough`.
deref'ing such borrows within that guard. Review feedback: Add comment noting a point where we may or may not need to add a cast when we finish the work on rust-lang#27282. Review feedback: Pass a newtype'd `ArmHasGuard` rather than a raw boolean. Review feedback: toggle "ref binding in guards" semantics via specific method. (This should ease a follow-up PR that just unconditionally adopts the new semantics.)
guard expressions of matches (activated only when using new NLL mode). Review feedback: removed 27282 from filename. (The test still references it in a relevant comment in the file itself so that seemed like a reasonable compromise.)
1d37f66
to
28d18fa
Compare
okay I've addressed the crucial feedback from arielb1, and I think I've done so in a way that does not perturb the PR too significantly... so I think I'll re-add niko's r+ once this passes travis... (famous last words...) |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
@bors r=nikomatsakis |
📌 Commit 930e76e has been approved by |
…in-guards, r=nikomatsakis Immutably and implicitly borrow all pattern ids for their guards (NLL only) This is an important piece of #27282. It applies only to NLL mode. It is a change to MIR codegen that is currently toggled on only when NLL is turned on. It thus affect MIR-borrowck but not the earlier static analyses (such as the type checker). This change makes it so that any pattern bindings of type T for a match arm will map to a `&T` within the context of the guard expression for that arm, but will continue to map to a `T` in the context of the arm body. To avoid surfacing this type distinction in the user source code (which would be a severe change to the language and would also require far more revision to the compiler internals), any occurrence of such an identifier in the guard expression will automatically get a deref op applied to it. So an input like: ```rust let place = (1, Foo::new()); match place { (1, foo) if inspect(foo) => feed(foo), ... } ``` will be treated as if it were really something like: ```rust let place = (1, Foo::new()); match place { (1, Foo { .. }) if { let tmp1 = &place.1; inspect(*tmp1) } => { let tmp2 = place.1; feed(tmp2) }, ... } ``` And an input like: ```rust let place = (2, Foo::new()); match place { (2, ref mut foo) if inspect(foo) => feed(foo), ... } ``` will be treated as if it were really something like: ```rust let place = (2, Foo::new()); match place { (2, Foo { .. }) if { let tmp1 = & &mut place.1; inspect(*tmp1) } => { let tmp2 = &mut place.1; feed(tmp2) }, ... } ``` In short, any pattern binding will always look like *some* kind of `&T` within the guard at least in terms of how the MIR-borrowck views it, and this will ensure that guard expressions cannot mutate their the match inputs via such bindings. (It also ensures that guard expressions can at most *copy* values from such bindings; non-Copy things cannot be moved via these pattern bindings in guard expressions, since one cannot move out of a `&T`.)
☀️ Test successful - status-appveyor, status-travis |
I am now realizing that this change is probably very relevant to RFC 107 i.e. rust-lang/rfcs#107. |
For some reason, allowing restricted mutation in match arms exposed an obvious case where a unique borrow can indeed fail, namely something like: ```rust match b { ... ref mut r if { (|| { let bar = &mut *r; **bar = false; })(); false } => { &mut *r } // ~~~~~~~ // | // This ends up holding a `&unique` borrow of `r`, but there ends up being an // implicit shared borrow in the guard thanks to rust-lang#49870 ... } ```
…ake-three, r=nikomatsakis every match arm reads the match's borrowed input This PR changes the `match` codegen under NLL (and just NLL, at least for now) to make the following adjustments: * It adds a `-Z disable-ast-check-for-mutation-in-guard` which, as described, turns off the naive (conservative but also not 100% sound) check for mutation in guards of match arms. * We now borrow the match input at the outset and emit a special `ReadForMatch` statement (that, according to the *static* semantics, reads that borrowed match input) at the start of each match arm. The intent here is to catch cases where the match guard mutates the match input, either via an independent borrow or via `ref mut` borrows in that arm's pattern. * In order to ensure that `ref mut` borrows do not actually conflict with the emitted `ReadForMatch` statements, I expanded the two-phase-borrow system slightly, and also changed the MIR code gen so that under NLL, when there is a guard on a match arm, then each pattern variable ends up having *three* temporaries associated with it: 1. The first temporary will hold the substructure being matched; this is what we will move the (substructural) value into *if* the guard succeeds. 2. The second temporary also corresponds to the same value as the first, but we are just constructing this temporarily for use during the scope of the guard; it is unaliased and its sole referrer is the third temporary. 3. The third temporary is a reference to the second temporary. * (This sounds complicated, I know, but its actually *simpler* than what I was doing before and had checked into the repo, which was to sometimes construct the final value and then take a reference to it before evaluating the guard. See also PR #49870.) Fix #27282 This also provides a path towards resolving #24535 aka rust-lang/rfcs#1006, at least once the `-Z disable-ast-check-for-mutation-in-guard` is just turned on by default (under NLL, that is. It is not sound under AST-borrowck). * But I did not want to make `#![feature(nll)]` imply that as part of this PR; that seemed like too drastic a change to me.
This is an important piece of #27282.
It applies only to NLL mode. It is a change to MIR codegen that is currently toggled on only when NLL is turned on. It thus affect MIR-borrowck but not the earlier static analyses (such as the type checker).
This change makes it so that any pattern bindings of type T for a match arm will map to a
&T
within the context of the guard expression for that arm, but will continue to map to aT
in the context of the arm body.To avoid surfacing this type distinction in the user source code (which would be a severe change to the language and would also require far more revision to the compiler internals), any occurrence of such an identifier in the guard expression will automatically get a deref op applied to it.
So an input like:
will be treated as if it were really something like:
And an input like:
will be treated as if it were really something like:
In short, any pattern binding will always look like some kind of
&T
within the guard at least in terms of how the MIR-borrowck views it, and this will ensure that guard expressions cannot mutate their the match inputs via such bindings. (It also ensures that guard expressions can at most copy values from such bindings; non-Copy things cannot be moved via these pattern bindings in guard expressions, since one cannot move out of a&T
.)