Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve HRTB error span when -Zno-leak-check is used #63678

Merged
merged 4 commits into from
Oct 3, 2019

Conversation

Aaron1011
Copy link
Member

As described in #57374, NLL currently produces unhelpful higher-ranked
trait bound (HRTB) errors when '-Zno-leak-check' is enabled.

This PR tackles one half of this issue - making the error message point
at the proper span. The error message itself is still the very generic
"higher-ranked subtype error", but this can be improved in a follow-up
PR.

The root cause of the bad spans lies in how NLL attempts to compute the
'blamed' region, for which it will retrieve a span for.
Consider the following code, which (correctly) does not compile:

let my_val: u8 = 25;
let a: &u8 = &my_val;
let b = a;
let c = b;
let d: &'static u8 = c;

This will cause NLL to generate the following subtype constraints:

d :< c
c :< b
b <: a

Since normal Rust lifetimes are covariant, this results in the following
region constraints (I'm using 'd to denote the lifetime of 'd',
'c to denote the lifetime of 'c, etc.):

'c: 'd
'b: 'c
'a: 'b

From this, we can derive that 'a: 'd holds, which implies that 'a: 'static
must hold. However, this is not the case, since 'a refers to 'my_val',
which does not outlive the current function.

When NLL attempts to infer regions for this code, it will see that the
region 'a has grown 'too large' - it will be inferred to outlive
'static, despite the fact that is not declared as outliving 'static
We can find the region responsible, 'd, by starting at the end of
the 'constraint chain' we generated above. This works because for normal
(non-higher-ranked) lifetimes, we generally build up a 'chain' of
lifetime constraints away from the original variable/lifetime.
That is, our original lifetime 'a is required to outlive progressively
more regions. If it ends up living for too long, we can look at the
'end' of this chain to determine the 'most recent' usage that caused
the lifetime to grow too large.

However, this logic does not work correctly when higher-ranked trait
bounds (HRTBs) come into play. This is because HRTBs have
contravariance with respect to their bound regions. For example,
this code snippet compiles:

let a: for<'a> fn(&'a ()) = |_| {};
let b: fn(&'static ()) = a;

Here, we require that 'a' is a subtype of 'b'. Because of
contravariance, we end up with the region constraint 'static: 'a,
not 'a: 'static

This means that our 'constraint chains' grow in the opposite direction
of 'normal lifetime' constraint chains. As we introduce subtypes, our
lifetime ends up being outlived by other lifetimes, rather than
outliving other lifetimes. Therefore, starting at the end of the
'constraint chain' will cause us to 'blame' a lifetime close to the original
definition of a variable, instead of close to where the bad lifetime
constraint is introduced.

This PR improves how we select the region to blame for 'too large'
universal lifetimes, when bound lifetimes are involved. If the region
we're checking is a 'placeholder' region (e.g. the region 'a' in
for<'a>, or the implicit region in fn(&())), we start traversing the
constraint chain from the beginning, rather than the end.

There are two (maybe more) different ways we generate region constraints for NLL:
requirements generated from trait queries, and requirements generated
from MIR subtype constraints. While the former always use explicit
placeholder regions, the latter is more tricky. In order to implement
contravariance for HRTBs, TypeRelating replaces placeholder regions with
existential regions. This requires us to keep track of whether or not an
existential region was originally a placeholder region. When we look for
a region to blame, we check if our starting region is either a
placeholder region or is an existential region created from a
placeholder region. If so, we start iterating from the beginning of the
constraint chain, rather than the end.

@rust-highfive
Copy link
Collaborator

r? @eddyb

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 18, 2019
@Aaron1011
Copy link
Member Author

r? @nikomatsakis

@rust-highfive rust-highfive assigned nikomatsakis and unassigned eddyb Aug 18, 2019
@Aaron1011
Copy link
Member Author

When I wrote tis PR description, I incorrectly believed that contravariance applied to all HRTBs - it actually only applys to functions. However, I believe that my reasoning about 'constraint chains' is still correct, due to the way that we generate outlives requirements when HRTBs are involved.

@Centril
Copy link
Contributor

Centril commented Aug 22, 2019

@Aaron1011 That's because the terminology is misleading here. for<'a> fn(&'a u8) does not involve so called "Higher Rank Trait Bounds" because there are no trait bounds involved. Rather, for<'a> fn(&'a u8) is rank-1 polymorphic type and fn(for<'a> fn(&'a u8)) is a rank-2 polymorphic type (making it higher ranked).

@nikomatsakis
Copy link
Contributor

@Aaron1011 thanks for the detailed PR description. Very helpful. I do have a few minor nits or clarifications regarding terminology.

As you noted, variance and higher-ranked aren't really connected. Variance really depends on the structure of a type: so if you have fn(T), it is contravariant with respect to T, even if no higher-ranked things are involved at all.

Second, you use placeholders here to refer to multiple things. I try to use placeholders in one specific way: to refer to a free (not bound) lifetime that is meant to represent "any lifetime". Indeed when you have a type for<'a> fn(&'a u32), we will (at times) use placeholders to represent 'a, but the 'a in that type is not a placeholder -- it's a bound lifetime.

Part of the reason to make that distinction is precisely because, in some cases, we replace bound lifetimes with existential lifetimes (as you noted) and in some cases we replace them with placeholders. Which one we use will depend on where the binder appears. For example here, where the binder is in the subtype:

for<'a> fn(&'a u32) <: fn(&'static u32)

we would replace 'a with an existential region. This is because the function has said it accepts a parameter with any lifetime -- therefore, we can convert it to a function fn(&'b u32) for any single lifetime 'b. So we use an inference variable to mean "find the correct lifetime to replace the bound lifetime 'a with".

In contrast, a binder in the supertype position means something else:

fn(&'static u32) <: for<'a>fn(&'a u32)

Here we are converting into a type with a binder. In that case, we replace 'a with a placeholder, indicating that the type we are converting in must work with any lifetime we could pick (which, in this case, is not true, as it requires specifically 'static).

If you combine those things you get that (for example):

for<'a> fn(&'a u32) <: for<'b> fn(&'b u32)

because the variable we create for 'a is inferred to be equal to the placeholder we created for 'b (and universes, in turn, tell us that the variable for 'a can indeed be equal to that placeholder).

@nikomatsakis
Copy link
Contributor

One other thing:

d :< c

the inverse of <: is typically written d :> c, though usually I prefer to just write c <: d =)

In any case I think either :> or >: would be ok, but it's important to use >, since the idea is that the "subtype" is "smaller" than the "supertype" (as a subtype represents a narrower range of values).

@nikomatsakis
Copy link
Contributor

Anyway, I've scheduled this PR review for September 4th, since I don't think I'll get to it today and it looks like it's going to take some concentration. @Aaron1011 I'll try to answer your questions on Zulip then =)

@Aaron1011
Copy link
Member Author

Aaron1011 commented Aug 31, 2019

@nikomatsakis: Thanks for the feedback! I still don't have a very strong grasp of higher-ranked lifetimes, so I'm sorry if my terminology is unclear.

When I was referring to 'placeholders', I meant to refer specifically to NLLRegionVariableOrigin::Placeholder. As you mentioned, these can get replaced with existential lifetimes, but (I think) it makes sense for us to treat them like placeholder regions for the purposes of determining which span to blame.

Copy link
Contributor

@nikomatsakis nikomatsakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @Aaron1011! This is good stuff. I left some comments and some questions. I'm pretty happy and mostly looking to leave a few comments and try to tweak the names of things. I think I see why this change results in better results, but I realized as I was trying to write it out that I didn't quite see, and hence I'm doing a local build to try and dig a bit through the logs.

Thanks for looking at this!

@@ -410,15 +410,17 @@ pub enum NLLRegionVariableOrigin {
/// from a `for<'a> T` binder). Meant to represent "any region".
Placeholder(ty::PlaceholderRegion),

Existential,
Existential {
was_placeholder: bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is quite the right name, per my prior comments. This existential didn't come from a placeholder, or at least not placeholder in the (rather narrow) sense that I aim to use the term. I'm trying to decide what name I would prefer, though. Perhaps something like from_hr_binders? I guess that's pretty obscure. Perhaps from_forall or from_for_binder or just from_for? It would then include a comment like:

Existential {
    /// If this is true, then this variable was created to represent a lifetime
    /// bound in a `for` binder. For example, it might have been created to
    /// represent the lifetime `'a` in a type like `for<'a> fn(&'a u32)`.
    /// Such variables are created when we are trying to figure out if there
    /// is any valid instantiation of `'a` that could fit into some scenario.
    ///
    /// This is used to inform error reporting: in the case that we are trying to
    /// determine whether there is any valid instantiation of a `'a` variable that meets
    /// some constraint C, we want to blame the "source" of that `for` type, 
    /// rather than blaming the source of the constraint C.
    from_forall: bool,
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for me being persnickety here is that there is a subtle distinction between the region in a for<'a> binder and the placeholder that we (sometimes) instantiate it with, and I'd prefer not to blur those lines, since it can be hard enough to explain as is.

let constraint = path[i];
let mut range = 0..path.len();

let should_reverse = match from_region_origin {
Copy link
Contributor

@nikomatsakis nikomatsakis Sep 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should call this blame_source, and we should add a comment like this:


As noted above, when reporting an error, there is typically a chain of constraints leading from some "source" region which must outlive some "target" region. In most cases, we prefer to "blame" the constraints closer to the target -- but there is one exception. When constraints arise from higher-ranked subtyping, we generally prefer to blame the source value, as the "target" in this case tends to be some type annotation that the user gave. Therefore, if we find that the region origin is some instantiation of a higher-ranked region, we start our search from the "source" point rather than the "target", and we also tweak a few other things.

An example might be this bit of Rust code:

let x: fn(&'static ()) = |_| {};
let y: for<'a> fn(&'a ()) = x;

In MIR, this will be converted into a combination of assignments and type ascriptions. In particular, the 'static is imposed through a type ascription:

x = ...;
AscribeUserType(x, fn(&'static ())
y = x;

We wind up ultimately with constraints like

!a: 'temp1 // from the `y = x` statement
'temp1: 'temp2
'temp2: 'static // from the AscribeUserType

and here we prefer to blame the source (the y = x statement).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I dug into what's going on. I'm not 100% convinced this heuristic of reversing source/target will apply super broadly -- it seems like we will ultimately find a more base way to express, but this explanation of what is happening should help us when trying to find that.

true
}
NLLRegionVariableOrigin::Placeholder(_)
| NLLRegionVariableOrigin::Existential { was_placeholder: true } => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to test this myself, but maybe you know the answer -- what changes when was_placeholder is ignored here? which tests?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the new tests should start failing. I can't remember which of them actually triggered this logic, but any test that causes us to replace a NLLRegionVariableOrigin::Placeholder with a NLLRegionVariableOrigin::Existential will have an incorrect error span.

@hdhoang hdhoang added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 6, 2019
@JohnCSimon
Copy link
Member

Ping from triage
@Aaron1011 Just letting you know this hasn't been touched in 10 days.
cc: @nikomatsakis

Thanks!

@bors
Copy link
Contributor

bors commented Sep 17, 2019

☔ The latest upstream changes (presumably #64535) made this pull request unmergeable. Please resolve the merge conflicts.

@nikomatsakis
Copy link
Contributor

FYI: I scheduled some time for me to do follow-up work on this tomorrow, September 19th.

@hdhoang
Copy link
Contributor

hdhoang commented Sep 27, 2019

ping from triage @Aaron1011 & @nikomatsakis, any progress update? Thanks!

@nikomatsakis
Copy link
Contributor

Ugh. I think I never made it to my scheduled time, sorry @Aaron1011

@nikomatsakis
Copy link
Contributor

I see that what I had wanted to do was mostly to look through the logs. Let me see if I can do that.

@nikomatsakis
Copy link
Contributor

r=me if rebased and with comments suggested

As described in rust-lang#57374, NLL currently produces unhelpful higher-ranked
trait bound (HRTB) errors when '-Zno-leak-check' is enabled.

This PR tackles one half of this issue - making the error message point
at the proper span. The error message itself is still the very generic
"higher-ranked subtype error", but this can be improved in a follow-up
PR.

The root cause of the bad spans lies in how NLL attempts to compute the
'blamed' region, for which it will retrieve a span for.
Consider the following code, which (correctly) does not compile:

```rust
let my_val: u8 = 25;
let a: &u8 = &my_val;
let b = a;
let c = b;
let d: &'static u8 = c;
```

This will cause NLL to generate the following subtype constraints:

d :< c
c :< b
b <: a

Since normal Rust lifetimes are covariant, this results in the following
region constraints (I'm using 'd to denote the lifetime of 'd',
'c to denote the lifetime of 'c, etc.):

'c: 'd
'b: 'c
'a: 'b

From this, we can derive that 'a: 'd holds, which implies that 'a: 'static
must hold. However, this is not the case, since 'a refers to 'my_val',
which does not outlive the current function.

When NLL attempts to infer regions for this code, it will see that the
region 'a has grown 'too large' - it will be inferred to outlive
'static, despite the fact that is not declared as outliving 'static
We can find the region responsible, 'd, by starting at the *end* of
the 'constraint chain' we generated above. This works because for normal
(non-higher-ranked) lifetimes, we generally build up a 'chain' of
lifetime constraints *away* from the original variable/lifetime.
That is, our original lifetime 'a is required to outlive progressively
more regions. If it ends up living for too long, we can look at the
'end' of this chain to determine the 'most recent' usage that caused
the lifetime to grow too large.

However, this logic does not work correctly when higher-ranked trait
bounds (HRTBs) come into play. This is because HRTBs have
*contravariance* with respect to their bound regions. For example,
this code snippet compiles:

```rust
let a: for<'a> fn(&'a ()) = |_| {};
let b: fn(&'static ()) = a;
```

Here, we require that 'a' is a subtype of 'b'. Because of
contravariance, we end up with the region constraint 'static: 'a,
*not* 'a: 'static

This means that our 'constraint chains' grow in the opposite direction
of 'normal lifetime' constraint chains. As we introduce subtypes, our
lifetime ends up being outlived by other lifetimes, rather than
outliving other lifetimes. Therefore, starting at the end of the
'constraint chain' will cause us to 'blame' a lifetime close to the original
definition of a variable, instead of close to where the bad lifetime
constraint is introduced.

This PR improves how we select the region to blame for 'too large'
universal lifetimes, when bound lifetimes are involved. If the region
we're checking is a 'placeholder' region (e.g. the region 'a' in
for<'a>, or the implicit region in fn(&())), we start traversing the
constraint chain from the beginning, rather than the end.

There are two (maybe more) different ways we generate region constraints for NLL:
requirements generated from trait queries, and requirements generated
from MIR subtype constraints. While the former always use explicit
placeholder regions, the latter is more tricky. In order to implement
contravariance for HRTBs, TypeRelating replaces placeholder regions with
existential regions. This requires us to keep track of whether or not an
existential region was originally a placeholder region. When we look for
a region to blame, we check if our starting region is either a
placeholder region or is an existential region created from a
placeholder region. If so, we start iterating from the beginning of the
constraint chain, rather than the end.
@Aaron1011
Copy link
Member Author

@nikomatsakis: Updated

@nikomatsakis
Copy link
Contributor

@bors r+

@bors
Copy link
Contributor

bors commented Oct 3, 2019

📌 Commit ba54ef8 has been approved by nikomatsakis

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 3, 2019
@bors
Copy link
Contributor

bors commented Oct 3, 2019

⌛ Testing commit ba54ef8 with merge b5623b34edc2a29a7e2e5b25ccfa8c27e5994d61...

Centril added a commit to Centril/rust that referenced this pull request Oct 3, 2019
…akis

Improve HRTB error span when -Zno-leak-check is used

As described in rust-lang#57374, NLL currently produces unhelpful higher-ranked
trait bound (HRTB) errors when '-Zno-leak-check' is enabled.

This PR tackles one half of this issue - making the error message point
at the proper span. The error message itself is still the very generic
"higher-ranked subtype error", but this can be improved in a follow-up
PR.

The root cause of the bad spans lies in how NLL attempts to compute the
'blamed' region, for which it will retrieve a span for.
Consider the following code, which (correctly) does not compile:

```rust
let my_val: u8 = 25;
let a: &u8 = &my_val;
let b = a;
let c = b;
let d: &'static u8 = c;
```

This will cause NLL to generate the following subtype constraints:

d :< c
c :< b
b <: a

Since normal Rust lifetimes are covariant, this results in the following
region constraints (I'm using 'd to denote the lifetime of 'd',
'c to denote the lifetime of 'c, etc.):

'c: 'd
'b: 'c
'a: 'b

From this, we can derive that 'a: 'd holds, which implies that 'a: 'static
must hold. However, this is not the case, since 'a refers to 'my_val',
which does not outlive the current function.

When NLL attempts to infer regions for this code, it will see that the
region 'a has grown 'too large' - it will be inferred to outlive
'static, despite the fact that is not declared as outliving 'static
We can find the region responsible, 'd, by starting at the *end* of
the 'constraint chain' we generated above. This works because for normal
(non-higher-ranked) lifetimes, we generally build up a 'chain' of
lifetime constraints *away* from the original variable/lifetime.
That is, our original lifetime 'a is required to outlive progressively
more regions. If it ends up living for too long, we can look at the
'end' of this chain to determine the 'most recent' usage that caused
the lifetime to grow too large.

However, this logic does not work correctly when higher-ranked trait
bounds (HRTBs) come into play. This is because HRTBs have
*contravariance* with respect to their bound regions. For example,
this code snippet compiles:

```rust
let a: for<'a> fn(&'a ()) = |_| {};
let b: fn(&'static ()) = a;
```

Here, we require that 'a' is a subtype of 'b'. Because of
contravariance, we end up with the region constraint 'static: 'a,
*not* 'a: 'static

This means that our 'constraint chains' grow in the opposite direction
of 'normal lifetime' constraint chains. As we introduce subtypes, our
lifetime ends up being outlived by other lifetimes, rather than
outliving other lifetimes. Therefore, starting at the end of the
'constraint chain' will cause us to 'blame' a lifetime close to the original
definition of a variable, instead of close to where the bad lifetime
constraint is introduced.

This PR improves how we select the region to blame for 'too large'
universal lifetimes, when bound lifetimes are involved. If the region
we're checking is a 'placeholder' region (e.g. the region 'a' in
for<'a>, or the implicit region in fn(&())), we start traversing the
constraint chain from the beginning, rather than the end.

There are two (maybe more) different ways we generate region constraints for NLL:
requirements generated from trait queries, and requirements generated
from MIR subtype constraints. While the former always use explicit
placeholder regions, the latter is more tricky. In order to implement
contravariance for HRTBs, TypeRelating replaces placeholder regions with
existential regions. This requires us to keep track of whether or not an
existential region was originally a placeholder region. When we look for
a region to blame, we check if our starting region is either a
placeholder region or is an existential region created from a
placeholder region. If so, we start iterating from the beginning of the
constraint chain, rather than the end.
@Centril
Copy link
Contributor

Centril commented Oct 3, 2019

@bors retry rolled up.

bors added a commit that referenced this pull request Oct 3, 2019
Rollup of 7 pull requests

Successful merges:

 - #63678 (Improve HRTB error span when -Zno-leak-check is used)
 - #64931 (Reword E0392 slightly)
 - #64959 (syntax: improve parameter without type suggestions)
 - #64975 (Implement Clone::clone_from for LinkedList)
 - #64993 (BacktraceStatus: add Eq impl)
 - #64998 (Filter out RLS output directories on tidy runs)
 - #65010 (Compare `primary` with maximum of `children`s' line num instead of dropping it)

Failed merges:

r? @ghost
@bors
Copy link
Contributor

bors commented Oct 3, 2019

⌛ Testing commit ba54ef8 with merge c6293e3...

@bors bors merged commit ba54ef8 into rust-lang:master Oct 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants