ch10-03 Explanation of lifetime bounds is confusing/incorrect #3235

Open
2 tasks done
d0sboots opened this issue Jun 22, 2022 · 33 comments

@d0sboots

  • I have checked the latest main branch to see if this has already been fixed
  • I have searched existing issues and pull requests for duplicates

This overlaps a bit with #1710, but IMO it's a separate issue.

URL to the section(s) of the book with this problem:
https://github.com/rust-lang/book/blob/main/src/ch10-03-lifetime-syntax.md

Description of the problem:

The function signature now tells Rust that for some lifetime 'a, the function takes two parameters, both of which are string slices that live at least as long as lifetime 'a. The function signature also tells Rust that the string slice returned from the function will live at least as long as lifetime 'a.

This isn't true as I understand it. For instance, suppose the lifetime 'a only lasts until immediately after program start. Then (vacuously), both of the function's parameters will live at least as long as that lifetime (they definitely live longer), and the slice returned from the function will also live at least as long as 'a. By the description given, this should be a valid lifetime for the parameter 'a, but it's clearly not - the semantics given might be necessary, but are nowhere near sufficient.

Suggested fix:
I'm a Rust newbie, so I have no idea what the correct lifetime is. I tried finding the wording in the spec, to no avail.

@carols10cents
Member

For instance, suppose the lifetime 'a only lasts until immediately after program start.

Can you construct an example that illustrates this?

@carols10cents carols10cents added this to the ch10 milestone Jun 28, 2022
@d0sboots
Author

The example was listed in the book itself:

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

Obviously the lifetime annotations work correctly in practice. But given the description in the book, they don't: 'a could last only until program start. This is because of the 2nd sentence:

The function signature also tells Rust that the string slice returned from the function will live at least as long as lifetime 'a.

If 'a only lasts until program start, this condition is satisfied, despite it not guaranteeing anything useful about the lifetimes of x, y, or the return value. I think maybe this sentence was supposed to be: "The function signature also tells Rust that lifetime 'a will live at least as long as the string slice returned from the function."

@carols10cents
Member

carols10cents commented Jun 28, 2022

No, can you give me an example of a main function where the lifetime of the value only lasts until program start? That is, I don't understand what you mean by "If 'a only lasts until program start", could you please construct that scenario in concrete code for me?

@d0sboots
Author

I feel like we're talking past each other here.

I can't construct that scenario for you in code, because such a scenario can't exist. That's because there's nothing wrong with lifetimes in terms of how they actually work, so any code example would work just fine (because the only way you could interpret it would be according to the actual rules that Rust uses.)

What this bug is about is the explanation of lifetimes, as given in the book. In other words, if we take the book's explanation as canonical for how lifetimes work, where does that lead, logically? And I'm saying that the explanation is flawed.

The example I gave has "the lifetime of 'a only lasting until program start." I don't know what that would mean, concretely. That's because the book hasn't given any formal semantics to the meaning of the lifetime parameters; up until this point it has just given rules determining how the length of them relate to the lifetime of actual variables and return values. All I know is that the scenario I laid out ("the lifetime of 'a only lasting until program start") is logically consistent with the book's explanation, despite being nonsensical in terms of practical meaning.

@d0sboots
Author

If you're still hung up on the "only until program start" bit, here's another way of looking at it:

'a could also have a lifetime that ends immediately before the function is called. That would also be logically consistent with the explanation, and is also nonsense in terms of actually making sense for lifetimes.

@carols10cents
Member

'a could also have a lifetime that ends immediately before the function is called. That would also be logically consistent with the explanation, and is also nonsense in terms of actually making sense for lifetimes.

It can't, though, because if it ends then you don't have a value to pass to the function. https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=b6cb1ecf171f98d721d8fcbfccbe98b6

fn main() {
    let s1 = String::from("hi");
    let s2 = String::from("bye");
    drop(s1); // This ends the lifetime of s1
    drop(s2); // This ends the lifetime of s2
    let s3 = longest(&s1, &s2); // You can try to make 'a be a lifetime that has ended, but it won't compile
}

results in

error[E0382]: borrow of moved value: `s1`
 --> src/main.rs:6:22
  |
2 |     let s1 = String::from("hi");
  |         -- move occurs because `s1` has type `String`, which does not implement the `Copy` trait
3 |     let s2 = String::from("bye");
4 |     drop(s1);
  |          -- value moved here
5 |     drop(s2);
6 |     let s3 = longest(&s1, &s2);
  |                      ^^^ value borrowed here after move

error[E0382]: borrow of moved value: `s2`
 --> src/main.rs:6:27
  |
3 |     let s2 = String::from("bye");
  |         -- move occurs because `s2` has type `String`, which does not implement the `Copy` trait
4 |     drop(s1);
5 |     drop(s2);
  |          -- value moved here
6 |     let s3 = longest(&s1, &s2);
  |                           ^^^ value borrowed here after move

I don't understand why you're trying to understand the formal semantics of lifetimes by using situations that can't exist?

@carols10cents
Member

That's because the book hasn't given any formal semantics to the meaning of the lifetime parameters; up until this point it has just given rules determining how the length of them relate to the lifetime of actual variables and return values.

The formal semantics of the meaning of the lifetime parameters is that they describe the relationship of the input parameters' lifetimes to the output's lifetime, as explained here:

Lifetime annotations don’t change how long any of the references live. Rather,
they describe the relationships of the lifetimes of multiple references to each
other without affecting the lifetimes. Just as functions can accept any type
when the signature specifies a generic type parameter, functions can accept
references with any lifetime by specifying a generic lifetime parameter.

I don't understand what you're looking for, exactly, and why this paragraph isn't it.

@d0sboots
Author

'a could also have a lifetime that ends immediately before the function is called. That would also be logically consistent with the explanation, and is also nonsense in terms of actually making sense for lifetimes.

It can't, though, because if it ends then you don't have a value to pass to the function. https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=b6cb1ecf171f98d721d8fcbfccbe98b6

fn main() {
    let s1 = String::from("hi");
    let s2 = String::from("bye");
    drop(s1); // This ends the lifetime of s1
    drop(s2); // This ends the lifetime of s2
    let s3 = longest(&s1, &s2); // You can try to make 'a be a lifetime that has ended, but it won't compile
}

This example doesn't show anything (directly) about the lifetime parameter 'a. It shows that the lifetimes of s1 and s2 can't end before the function, but the lifetimes of the input arguments are a different thing than 'a. There is no way to prove (or disprove) a particular value for a lifetime parameter with code, because AFAIK Rust never emits diagnostics that mention the value of lifetime parameters, only the lifetimes of concrete references.

The formal semantics of the meaning of the lifetime parameters is that they describe the relationship of the input parameters' lifetimes to the output's lifetime, as explained here:

Lifetime annotations don’t change how long any of the references live. Rather,
they describe the relationships of the lifetimes of multiple references to each
other without affecting the lifetimes. Just as functions can accept any type
when the signature specifies a generic type parameter, functions can accept
references with any lifetime by specifying a generic lifetime parameter.

This gets to the heart of the issue. Lifetime parameters don't have any intrinsic meaning on their own; the only meaning they have is in relating the lifetimes of concrete references. So, it is perfectly sensible to talk about hypotheticals where "'a has a lifetime that ends immediately after the program starts," or "'a has a lifetime that ends before the function is called," because that's not a claim that any specific reference has that property.

In the context of how Rust actually works, these hypothetical lifetimes would be impossible for 'a, because they would imply lifetimes for the concrete references that are also impossible. But I'm not talking about how Rust actually works, but the language in the book. If you go by the semantics as described in the book, these lifetimes are possible.

@carols10cents
Member

carols10cents commented Jul 1, 2022

Let's try a different direction, just with generic type parameters for a moment. Given this example:

fn something<T: Copy>(input: T) -> T {
    // ...
}

This says the function accepts some type T as the parameter input and returns that same type T, and that whatever the type T ends up being, it must implement the Copy trait.

When you say:

So, it is perfectly sensible to talk about hypotheticals where "'a has a lifetime that ends immediately after the program starts," or "'a has a lifetime that ends before the function is called,"

It's akin to saying "let's talk about a hypothetical where the something function returns a String" (a type that doesn't implement the Copy trait), and I'm saying there's no reason to talk about that because it's impossible, given the constraints that the something function specifies.

Lifetime parameters get filled in with concrete lifetimes when the functions are used, and those lifetimes must be valid ones. I wonder if maybe that's the problem? That the book doesn't specifically say in this spot that all references must be valid? (It's said elsewhere).
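
A minimal sketch of the generic-type analogy above, assuming a body for `something` that simply returns its input (the thread leaves the body elided, so that body is an assumption). It shows `T` being filled in with concrete `Copy` types at each call site, while a non-`Copy` type is rejected.

```rust
// Assumed body: the thread elides it, so returning `input` is illustrative only.
fn something<T: Copy>(input: T) -> T {
    input
}

fn main() {
    // `i32` implements `Copy`, so `T` can be filled in with `i32`.
    let n = something(5_i32);
    // `char` implements `Copy` as well, so `T` becomes `char` here.
    let c = something('x');
    println!("{n} {c}");

    // `String` does not implement `Copy`, so this call is rejected
    // at compile time if uncommented:
    // let s = something(String::from("hi"));
}
```

The bound `T: Copy` constrains every concrete type that may be substituted, which is the sense in which the bound "describes" the argument.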

@carols10cents
Member

carols10cents commented Jul 1, 2022

but the lifetimes of the input arguments are a different thing than 'a

What makes you say this, exactly? The lifetime parameter 'a is absolutely describing the lifetimes of the input arguments.

Making an analogy with generics again, it sounds like you're saying that because, say, i32 also implements the PartialEq trait, the traits implemented by the input argument when you pass an i32 value to the something function are a different thing than the generic type parameter T that implements the Copy trait. What I'm trying to say is that T: Copy is still describing the i32 argument even if they're "different".

@carols10cents
Member

because AFAIK Rust never emits diagnostics that mention the value of lifetime parameters, only the lifetimes of concrete references.

I have an example of diagnostics that mention the value of lifetime parameters, here:

fn longest<'a, 'b>(x: &'a str, y: &'b str) -> &'a str {
    y
}

results in

error[E0623]: lifetime mismatch
 --> src/lib.rs:2:5
  |
1 | fn longest<'a, 'b>(x: &'a str, y: &'b str) -> &'a str {
  |                                   -------     -------
  |                                   |
  |                                   this parameter and the return type are declared with different lifetimes...
2 |     y
  |     ^ ...but data from `y` is returned here

@d0sboots
Author

d0sboots commented Jul 1, 2022

I agree, let's try a different direction. What does this sentence (from the book) mean, to you?

The function signature also tells Rust that the string slice returned from the function will live at least as long as lifetime 'a.

I think we might be closing in on the confusion with your example of generic types, though. You seem to think generic types and lifetime parameters work the same way. I don't. I suspect you're correct (since I'm new to Rust), so let me try to explain how I see the difference, which comes from the explanation in the book.

fn something<T: Copy>(input: T) -> T {
    // ...
}

As you said, in this case the type parameter T and the type of input must be the same. Also, T (and thus input) is bounded by needing to implement Copy.

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    // ...
}

The function signature now tells Rust that for some lifetime 'a, the function takes two parameters, both of which are string slices that live at least as long as lifetime 'a. The function signature also tells Rust that the string slice returned from the function will live at least as long as lifetime 'a.

There is a crucial difference here: x and y do not have lifetimes that are the same as 'a, but rather have lifetimes that are at least as long as lifetime 'a. Or at least, that is what the words from the book seem to literally say.

Similarly, right now the book says the return type will live at least as long as lifetime 'a. It's not the same as 'a.

Does this clear things up?

@d0sboots
Author

d0sboots commented Jul 1, 2022

I have an example of diagnostics that mention the value of lifetime parameters, here:

fn longest<'a, 'b>(x: &'a str, y: &'b str) -> &'a str {
    y
}

results in

error[E0623]: lifetime mismatch
 --> src/lib.rs:2:5
  |
1 | fn longest<'a, 'b>(x: &'a str, y: &'b str) -> &'a str {
  |                                   -------     -------
  |                                   |
  |                                   this parameter and the return type are declared with different lifetimes...
2 |     y
  |     ^ ...but data from `y` is returned here

This still doesn't mention the value of lifetime parameters. It highlights that the lifetime parameters are different, but it doesn't say what their values are (i.e. what the actual lifetimes of those parameters are). This might seem like a minor quibble, but if I'm right about how lifetime parameters work it's essentially impossible for diagnostics to emit a concrete value for a lifetime parameter, because they don't actually have one. (I.e. they're only used to bound the lifetimes of concrete references, which do have concrete lifetimes.)

@d0sboots
Author

d0sboots commented Jul 1, 2022

Actually, I know lifetimes can't work the same as generics, because of this example:

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

If 'a were the same lifetime as x, it would also have to be the same lifetime as y. So transitively, x and y would have to be the same lifetime. But we know that it's possible to pass references with different lifetimes to this function.
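
The point above can be checked with a small sketch. The helper name `longest_len_across_scopes` is hypothetical; `longest` is the function from the book. The two arguments borrow from values in different scopes, yet a single `'a` is accepted for both.

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

// Hypothetical helper, used only for illustration.
fn longest_len_across_scopes() -> usize {
    let outer = String::from("long string is long"); // valid until the function ends
    let len;
    {
        let inner = String::from("xyz"); // valid only inside this block
        // Both borrows are passed under the same `'a`, even though the
        // values they borrow from are valid for different stretches of code.
        let result = longest(outer.as_str(), inner.as_str());
        len = result.len();
    }
    len
}

fn main() {
    println!("{}", longest_len_across_scopes());
}
```

That this compiles suggests `'a` is not "the lifetime of `x`" or "the lifetime of `y`", but some region that both borrows can cover.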

@olalonde

olalonde commented Jul 14, 2022

FWIW, I agree that those sentences are confusing. For me the more confusing sentence is:

The function signature also tells Rust that the string slice returned from the function will live at least as long as lifetime 'a.

Let's take this example (which doesn't compile due to a borrow checker error):

fn main() {
    let string1 = String::from("long string is long");
    let result;
    {
        let string2 = String::from("xyz");
        result = longest(string1.as_str(), string2.as_str());
    }
    println!("The longest string is {}", result);
}

It is explained later in the book that the concrete value of 'a is the lifetime where the lifetimes of the two arguments overlap.

So the concrete lifetime of 'a ends at the end of the inner scope. The lifetime of result starts at its declaration and ends at the end of the outer scope.

Now if that sentence is correct, then result will live at least as long as lifetime 'a, which it does. But this is actually an invalid program, so some constraint is missing. The truth is that result can live at most as long as lifetime 'a, which is why it fails to compile. I feel like this sentence really should be changed to:

The function signature also tells Rust that the string slice returned from the function will live at most as long as lifetime 'a.

@carols10cents
Member

There is a crucial difference here: x and y do not have lifetimes that are the same as 'a, but rather have lifetimes that are at least as long as lifetime 'a. Or at least, that is what the words from the book seem to literally say.

Similarly, right now the book says the return type will live at least as long as lifetime 'a. It's not the same as 'a.

Aha, I think I see a missing piece here. Lifetimes are similar to generics, but they aren't exactly the same, that's true. The 'a generic lifetime doesn't get filled in with the exact concrete lifetime as any of the references it's associated with: 'a becomes the overlap of all the associated references' lifetimes. That's what this is attempting to say:

In practice, it means that the lifetime of the reference returned by the
longest function is the same as the smaller of the lifetimes of the values
referred to by the function arguments.

And this:

Note that the longest function doesn’t need to
know exactly how long x and y will live, only that some scope can be
substituted for 'a that will satisfy this signature.

So take this example, listing 10-23:

fn main() {
    let string1 = String::from("long string is long");  // Lifetime of string1 starts here

    {
        let string2 = String::from("xyz"); // Lifetime of string2 starts here
        let result = longest(string1.as_str(), string2.as_str());
        println!("The longest string is {}", result);
    } // Lifetime of string2 ends here
} // Lifetime of string1 ends here

When this main calls the longest function, the specific lifetime that gets substituted in for 'a will be the places where both string1 and string2 are valid. Because the lifetime of string1 completely encompasses the lifetime of string2, that means the concrete value for 'a ends up being string2's lifetime.

Another way to say this is that result (the output from longest) will be valid as long as string2 is valid -- result can live at least as long as string2.

This still doesn't mention the value of lifetime parameters.

Ah, I was confused. I thought you were looking for diagnostics that used a literal 'a. I see now that you're looking for an error message that shows the concrete lifetime that gets substituted in for 'a. And you're correct there, I can't think of error messages off the top of my head that will point out what 'a ends up resolving to.

There have been some projects attempting to provide visualization of lifetimes; as you can imagine, it gets complex. Here are some examples:

I feel like this sentence really should be changed to:

The function signature also tells Rust that the string slice returned from the function will live at most as long as lifetime 'a.

No, "at least" is correct and "at most" is incorrect. There's a long discussion in this PR, but the short story is that 'static can substitute in for 'a and then the returned reference's lifetime will last longer than what 'a guarantees, so "at least as long as 'a" is valid. Consider this valid implementation of longest:

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x == "hello" { // this condition is arbitrary
        "surprise!" // the lifetime of string literals is 'static
    } else if x.len() > y.len() {
        x
    } else {
        y
    }
}
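
A stripped-down sketch of the same idea, assuming a hypothetical function `fallback` that always returns a string literal. Inside the function, a `&'static str` is accepted where `&'a str` is promised; at the call site, the compiler still only knows the result as `&'a str`.

```rust
// Hypothetical helper: always returns a `'static` literal.
fn fallback<'a>(_input: &'a str) -> &'a str {
    // A `&'static str` is valid for every possible `'a`, so it satisfies
    // "valid for at least `'a`" and is accepted as the return value.
    "surprise!"
}

fn main() {
    let result;
    {
        let short_lived = String::from("hello");
        result = fallback(short_lived.as_str());
        println!("{result}"); // fine: `short_lived` is still alive here
    }
    // Rejected if uncommented, even though the data is really `'static`,
    // because the signature ties the result to the borrow of `short_lived`:
    // println!("{result}");
}
```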

@olalonde

olalonde commented Jul 14, 2022

No, "at least" is correct and "at most" is incorrect. There's a long discussion in #1875, but the short story is that 'static can substitute in for 'a and then the returned reference's lifetime will last longer than what 'a guarantees, so "at least as long as 'a" is valid.

This is confusing. I mean your interpretation is trivially true, why even mention it? Of course, the return value lives at least as long as its reference parameters... it's basically impossible to construct a function where this isn't true (afaik). At least I can't imagine how it would be possible. In let x = fn(y), how can x possibly have a shorter lifetime than y?

And that sentence doesn't help the reader understand why the following code fails the borrow checking:

fn main() {
    let x: String = String::from("foo");
    let y: String = String::from("bazz");
    let z: &str = longest(&x, &y);
    drop(x);
    // borrow checker complains, z's lifetime exceeded 'a
    dbg!(z);
}

My understanding of the function signature is that the lifetime of the variable which gets assigned the return value (z) must be at most 'a. This explains why the above code fails. But the two sentences in the book do not explain that.

Anyways, I'm giving my perspective as someone who is reading the book for the first time. I will try to read the thread you linked to.

@olalonde

Well, I'm still not convinced... Tend to agree with @mulkieran

@carols10cents
Member

it's basically impossible to construct a function where this isn't true (afaik). At least I can't imagine how it would be possible. In let x = fn(y), how can x possibly have a shorter lifetime than y?

There's a lot going on here. In let x = fn(y), there's no indication of the types of x and y, whether they're references or not, whether they implement Copy or not, so I don't really understand the point you're trying to make? And the example implementation of longest that I gave where in one case it returns a string literal-- the string literal absolutely lives longer than the arguments, because the lifetime 'static is the longest lifetime, so it always lives longer than any arbitrary 'a.

Here are the lifetimes annotated in the example you provided:

fn main() {
    let x: String = String::from("foo"); // Lifetime of x starts here
    let y: String = String::from("bazz"); // Lifetime of y starts here, 'a starts here
    let z: &str = longest(&x, &y);
    drop(x); // Lifetime of x ends here, 'a ends here
    // borrow checker complains, z's lifetime exceeded 'a
    dbg!(z);
} // Lifetime of y ends here

So what the signature of longest says is that the return value is only guaranteed to live as long as 'a, where 'a is the lifetime where all of the arguments annotated with 'a are valid. The return value might live longer than 'a! But the compiler only sees that the function signature guarantees it lives as long as 'a, so anything outside of 'a must be marked as invalid.

I'm still not sure what to change about the book because I'm not going to change it to something incorrect. I am interested in making it more clear, but I'm not sure what that is yet.
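
For contrast with the failing example above, here is a sketch of a variant that compiles, assuming the last use of `z` is moved before `drop(x)`. The helper name `pick_then_release` is hypothetical.

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

// Hypothetical helper, used only for illustration.
fn pick_then_release() -> String {
    let x = String::from("foo");
    let y = String::from("bazz");
    let z: &str = longest(&x, &y);
    // Copy the data out; this is the last use of `z`.
    let owned = z.to_string();
    // Accepted: `z` is never used again, so the borrow of `x` has ended.
    drop(x);
    owned
}

fn main() {
    println!("{}", pick_then_release());
}
```

The only difference from the rejected version is where the last use of `z` falls relative to `drop(x)`, which matches the reading that the result may only be used while every argument annotated with `'a` is still valid.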

@d0sboots
Author

d0sboots commented Jul 15, 2022

Yes! I feel like we're finally getting through to each other. Especially now that we're on the same page that generics and lifetimes don't work the same.

There have been some projects attempting to provide visualization of lifetimes; as you can imagine, it gets complex. Here are some examples:
...

To be clear, I'm not looking to visualize or even get concrete values for the lifetime of 'a, because as I understand it, 'a isn't even a "thing" that has a concrete lifetime. Rather, it's a set of inequalities or constraints on the lifetimes of real objects, and we give that collection of constraints the name 'a.

What I'm after is understanding the rules for determining the constraints, because I haven't seen them specified anywhere. This bug is about how the rules that are mentioned are definitely incomplete:

The function signature now tells Rust that for some lifetime 'a, the function takes two parameters, both of which are string slices that live at least as long as lifetime 'a. The function signature also tells Rust that the string slice returned from the function will live at least as long as lifetime 'a.

Specified formally:
For each parameter x, the end of x's lifetime >= the end of 'a.
The end of the return value's lifetime >= the end of 'a.

...And that's it. Those are all the rules we are given so far. You'll notice that there are no rules that bound the beginning of anything's lifetime, and there are no rules that bound the end of lifetimes from above. Without those rules, we (and the borrow checker) don't have enough information to rule programs as valid or invalid. I guess my point is, there are clearly additional (formal) rules, and I want to know what they are. :)
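
One way to write the "at least as long as" constraints down in Rust itself is with explicit outlives bounds. The sketch below is an illustration, not the book's formulation; the name `longest_explicit` and the extra parameters `'x` and `'y` are hypothetical. Each argument gets its own lifetime, and the `where` clause states that each must outlive `'a`.

```rust
// Hypothetical variant of `longest` with the constraints spelled out.
fn longest_explicit<'x, 'y, 'a>(x: &'x str, y: &'y str) -> &'a str
where
    'x: 'a, // `x` is valid for at least `'a`
    'y: 'a, // `y` is valid for at least `'a`
{
    // Either argument may be returned as `&'a str` because of the bounds above.
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

fn main() {
    let a = String::from("ab");
    let b = String::from("c");
    println!("{}", longest_explicit(&a, &b));
}
```

The upper bound asked about above does not appear in the signature. It comes from the caller's side: the result has type `&'a str`, so the borrow checker permits its use only within a region that both argument borrows cover.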

@d0sboots
Author

Here are the lifetimes annotated in the example you provided:

fn main() {
    let x: String = String::from("foo"); // Lifetime of x starts here
    let y: String = String::from("bazz"); // Lifetime of y starts here, 'a starts here
    let z: &str = longest(&x, &y);
    drop(x); // Lifetime of x ends here, 'a ends here
    // borrow checker complains, z's lifetime exceeded 'a
    dbg!(z);
} // Lifetime of y ends here

How did you determine that 'a started and ended at those particular lines? That is the information that I think is currently missing from the book.

@olalonde

olalonde commented Jul 15, 2022

Agree with @d0sboots comments.

So what the signature of longest says is that the return value is only guaranteed to live as long as 'a, where 'a is the lifetime where all of the arguments annotated with 'a are valid. The return value might live longer than 'a!

Well, I think we both agree on that. My contention is that the book doesn't do a good job of explaining that. That sentence you wrote is already better in my opinion.

Let's back up a bit.

The point of lifetimes (and this section of the book) is to determine when it is safe to use a reference. As the caller of a function, a return lifetime 'a, tells me that I cannot safely use the return value beyond 'a. That's what the function signature tells me. I don't care whether the value it points to could potentially live longer than 'a (although it's true that it could). I have to assume that it doesn't and that's what the borrow checker assumes as well.

In other words, even if the referenced value can live longer than 'a, it must be assumed that it lives at most 'a.

That is why I found the current phrasing confusing. While what it says is technically true (e.g. yes, the returned value could live longer than 'a), we can't assume that it does.

The function signature now tells Rust that for some lifetime 'a, the function takes two parameters, both of which are string slices that live at least as long as lifetime 'a. The function signature also tells Rust that the string slice returned from the function will live at least as long as lifetime 'a.

Those sentences are technically true. But when you read them for the first time, you are left to wonder: ok so what? This doesn't help me know when the reference will become invalid. The value can live beyond 'a, but how long? Those lifetimes don't seem very useful... scratches head

After all this discussion I now understand that "lives at least as long as 'a" is the same as "lives at least as long as 'a (but actually don't use it beyond 'a because there's no guarantee it will live beyond it)", but when I first read the book, this wasn't clear.

I maintain that this should be rephrased, the sentence you wrote is already better IMO.

@olalonde

olalonde commented Jul 15, 2022

How did you determine that 'a started and ended at those particular lines? That is the information that I think is currently missing from the book.

It's explained a bit later in the chapter (perhaps not very formally) as being the lifetime where the lifetimes of the parameters overlap (in practice, the lifetime of the parameter with the shortest lifetime).

When we pass concrete references to longest, the concrete lifetime that is substituted for 'a is the part of the scope of x that overlaps with the scope of y. In other words, the generic lifetime 'a will get the concrete lifetime that is equal to the smaller of the lifetimes of x and y. Because we’ve annotated the returned reference with the same lifetime parameter 'a, the returned reference will also be valid for the length of the smaller of the lifetimes of x and y.

@d0sboots
Author

How did you determine that 'a started and ended at those particular lines? That is the information that I think is currently missing from the book.

It's explained a bit later in the chapter (perhaps not very formally) as being the lifetime where the lifetimes of the parameters overlap (in practice, the lifetime of the parameter with the shortest lifetime).

In that case, can we just say that the lifetime of 'a will be the intersection of the lifetimes of all the parameters? It's short, unambiguous, and formally well-defined. If it also happens to be correct, then it seems perfect.

@fubupc

fubupc commented Sep 1, 2022

I think the confusion comes from the term "lifetime" having two meanings in the book:

  1. Scope of the reference itself (from the reference's declaration to the last line where it is used). For example, the Borrow Checker section says:
fn main() {
    let r;                // ---------+-- 'a
                          //          |
    {                     //          |
        let x = 5;        // -+-- 'b  |
        r = &x;           //  |       |
    }                     // -+       |
                          //          |
    println!("r: {}", r); //          |
}                         // ---------+

Here, we’ve annotated the lifetime of r with 'a and the lifetime of x with 'b

Here lifetime 'a obviously means scope of reference itself (r).

  2. Scope of the referent. For example:

The function signature also tells Rust that the string slice returned from the function will live at least as long as lifetime 'a.

At first glance, I thought that "the string slice" meant the reference itself returned from the function; in that case, "the reference itself lives at least as long as lifetime 'a" does not make sense (should it be "at most as long"?). But if it means the scope/lifetime of the referent of the returned reference, then "at least as long" makes sense.
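
The two meanings can be seen side by side in a small sketch. The helper name `reference_scope_vs_referent_scope` is hypothetical. The reference `r` exists only inside the inner block, while the `String` it points to stays valid until the end of the function.

```rust
// Hypothetical helper, used only for illustration.
fn reference_scope_vs_referent_scope() -> usize {
    // The referent: valid until the end of this function.
    let owner = String::from("referent");
    let len = {
        // The reference itself: exists only inside this block.
        let r: &str = owner.as_str();
        r.len()
    };
    // `r` is gone here, but the referent is still alive and usable.
    let still_valid = owner.len();
    len + still_valid
}

fn main() {
    println!("{}", reference_scope_vs_referent_scope());
}
```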

@ilyvion
Contributor

ilyvion commented Mar 12, 2023

No, "at least" is correct and "at most" is incorrect. There's a long discussion in this PR, but the short story is that 'static can substitute in for 'a and then the returned reference's lifetime will last longer than what 'a guarantees, so "at least as long as 'a" is valid. Consider this valid implementation of longest:

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x == "hello" { // this condition is arbitrary
        "surprise!" // the lifetime of string literals is 'static
    } else if x.len() > y.len() {
        x
    } else {
        y
    }
}

I think this whole confusion stems from what viewpoint you're taking.

From inside the function as a return value, you care about whether a return value lives "at least as long as 'a," which is why returning something that is 'static (or any 'x where 'x : 'a, really) works out, as seen in the quoted code above.

But from the outside, the opposite is true. You care about whether a value lives "at most as long as 'a," because once you try using 'a for longer than that, you get a compilation error.

But the compiler only sees that the function signature guarantees it lives as long as 'a, so anything outside of 'a must be marked as invalid.

Substitute "longer" for "outside" and we're all on the same page here.

Another way to say this is that result (the output from longest) will be valid as long as string2 is valid -- result can live at least as long as string2.

This implies that result can live longer than string2 (that's what "at least as long" means!) but it can't. It can live at most as long as string2.
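As a rough sketch of the "any 'x where 'x: 'a" remark above, the following spells that bound out explicitly. The function `pick_or_fallback` and its parameter names are hypothetical, chosen only for this illustration.

```rust
// Hypothetical helper, not from the book.
// The bound `'x: 'a` reads "'x outlives 'a".
fn pick_or_fallback<'a, 'x: 'a>(preferred: &'a str, fallback: &'x str) -> &'a str {
    if preferred.is_empty() {
        fallback // a &'x str is accepted where &'a str is expected because 'x: 'a
    } else {
        preferred
    }
}

fn main() {
    let fallback = String::from("default");
    {
        let preferred = String::new();
        // The result is only promised to be valid for 'a, the shorter borrow,
        // even though the data it points to here lives longer.
        let chosen = pick_or_fallback(&preferred, &fallback);
        println!("{chosen}");
    }
}
```

Inside the function, returning the longer-lived reference is fine; outside, the caller can only rely on 'a.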

@olalonde

olalonde commented Mar 12, 2023

So what the signature of longest says is that the return value is only guaranteed to live as long as 'a, where 'a is the lifetime where all of the arguments annotated with 'a are valid. The return value might live longer than 'a! But the compiler only sees that the function signature guarantees it lives as long as 'a, so anything outside of 'a must be marked as invalid.

I'm still not sure what to change about the book because I'm not going to change it to something incorrect. I am interested in making it more clear, but I'm not sure what that is yet.

@carols10cents Correct me if I'm wrong, but I think the confusion is that we're discussing two different things:

  1. Verifying the implementation of the function (with regards to its lifetimes)

Honestly, I find that part really hard to get wrong unless you are really trying, e.g.

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    &String::from("foo")
}

  2. Verifying callers of the function

Or more precisely, what is the valid lifetime of the return value of the function (not inside the function implementation but where the function is called).

That's the part I find more interesting and more important to understand.

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        "x is the longest"
    } else {
        "y is the longest"
    }
}

fn main() {
    let x = String::from("abc");
    let s_longest;
    {
        let y = String::from("defg");
        s_longest = longest(&x, &y);
    }
    println!("{}", s_longest);
}

In this example, the lifetime annotations longest<'a>(x: &'a str, y: &'a str) -> &'a str tell us that s_longest can be used for at most 'a (the shorter of the lifetimes of x and y, which is y's in this case), and that is why the borrow checker complains when s_longest is used past the lifetime of y. It doesn't matter that the function implementation returns a 'static reference (which, by the way, does indeed live longer than 'a).

In that chapter of the book, I feel like we should be explaining how lifetime annotations work from the perspective of a function caller, not a function implementer. Or at least we should clearly delineate the two perspectives.

I hope this clarifies the confusion a bit.
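For comparison, here is a sketch of two ways the example above could be made to compile. It assumes the same function bodies; `longest_label` is a hypothetical variant introduced only to show that the signature, not the body, decides what the caller may do.

```rust
// Same signature as in the example: the result is tied to 'a.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        "x is the longest"
    } else {
        "y is the longest"
    }
}

// Hypothetical variant: same body, but the signature promises 'static.
fn longest_label(x: &str, y: &str) -> &'static str {
    if x.len() > y.len() {
        "x is the longest"
    } else {
        "y is the longest"
    }
}

fn main() {
    let x = String::from("abc");

    // Fix 1: keep every use of the result inside the scope of `y`.
    {
        let y = String::from("defg");
        let s_longest = longest(&x, &y);
        println!("{}", s_longest);
    }

    // Fix 2: with a 'static return type, the result is not tied to `y`.
    let label;
    {
        let y = String::from("defg");
        label = longest_label(&x, &y);
    }
    println!("{}", label);
}
```

Both functions return the same string literal, yet only the second lets the caller use the result after `y` is dropped.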

@kevyuu

kevyuu commented Mar 12, 2023

I just read the book, and I was confused by this part as well. Now I finally understand what "at least as long" means.

The part I think isn't clear is that, for the input parameters, the caller has to guarantee that the items x and y refer to live at least as long as some lifetime 'a, and the callee guarantees that it returns a reference to an item that lives at least as long as that lifetime 'a (in other words, the caller is guaranteed that the returned reference will be valid at least as long as lifetime 'a, so it is safe to use it within that lifetime). So "at least as long as" has a different guarantor for the input and the output. Maybe including this in the book (what the caller must guarantee, what the callee guarantees, and that the lifetime is that of the item itself rather than of the variable that holds the reference) would make it clearer. When I first read this section, I thought "at least as long as" was guaranteed entirely by the caller, hence the confusion.

And I am not sure why it is the string slices themselves that must live at least as long as lifetime 'a, rather than the strings that the slices refer to.

@arnabc1984

arnabc1984 commented Jun 26, 2024

Rust noob alert. I do think the book's explanation is correct.
But, when I read through that section, I found it easier to say in my head (for example for the code below):
"All this means is that the reference return type of the function, here &'a str, doesn't know where it gets its value from; it can be from x or from y.
So both references, x and y, must live at least as long as the returned reference is in use outside this method/function."


fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

I think, the wording in the book was a bit confusing, but correct nonetheless.
I found the compiler's wording in its error messages much more accurate.
If you remove the lifetime annotation 'a from one of the parameters, say from x, the compiler clearly says almost the same thing I mentioned above.
I think it would help if we reworded a couple of lines in that section explaining the lifetime annotation's meaning.
Thanks for the book. On chapter 18 today and loving the journey!

@fubupc

fubupc commented Aug 11, 2024

I'm still not sure what to change about the book because I'm not going to change it to something incorrect. I am interested in making it more clear, but I'm not sure what that is yet.

@carols10cents How about adding some supplementary explanation to "bridge the understanding gap", like:

The function signature now tells Rust that for some lifetime 'a, the function takes two parameters, both of which are string slices that live at least as long as lifetime 'a. The function signature also tells Rust that the string slice returned from the function will live at least as long as lifetime 'a. However, as the caller, you are only guaranteed the returned reference is valid for at most 'a.

What's confusing me is that there seems to be a gap between the intention:

We want the signature to express the following constraint: the returned reference will be valid as long as both the parameters are valid.

and what the lifetime annotation provides (according to this explanation):

The function signature now tells Rust that for some lifetime 'a, the function takes two parameters, both of which are string slices that live at least as long as lifetime 'a. The function signature also tells Rust that the string slice returned from the function will live at least as long as lifetime 'a.

The phrase "some lifetime 'a" seems to suggest that it can be any arbitrary lifetime 'a, as long as it fulfills the constraint that the lifetimes of the parameters and the returned reference are >= 'a. But this means it doesn't impose any constraints at all, because we can always choose a very small lifetime for 'a to fulfill the constraint.
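One way to probe this, as a sketch using the book's version of longest: the caller is the one who picks 'a, and a small 'a is always allowed, but the result is then only usable for that small 'a. Asking for a longer-lived result forces a larger 'a, which the arguments must then satisfy.

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

fn main() {
    // Both arguments are string literals, so the caller may pick 'a = 'static
    // and the result can be stored as a &'static str.
    let forever: &'static str = longest("apple", "fig");

    let owned = String::from("banana");
    // Here 'a can be no longer than the borrow of `owned`, so the result
    // cannot be annotated as &'static str at this call site.
    let shorter: &str = longest(&owned, "fig");

    println!("{forever} {shorter}");
}
```

So the constraint is not vacuous: the uses of the returned reference push 'a up, and the arguments cap it from above.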

@fubupc

fubupc commented Aug 12, 2024

Another explanation confusing me is this definition/description:

... every reference in Rust has a lifetime, which is the scope for which that reference is valid.

It seems to imply an underlying assumption: that the program has already passed the borrow check. However, when we are in the process of borrow checking (a program which may or may not be correct), we don't yet know if the reference is valid at some point, for example if the subject of the reference might have been moved or dropped. In this context, what does "lifetime of a reference" mean conceptually? Is it "the span of code/time before its last use" (ignoring more complex NLL cases)?

This is an example to show what I mean:

fn main() {
    let r;                // ---------+-- 'a --+-- 'c
                          //          |        |
    {                     //          |        |
        let x = 5;        // -+-- 'b  |        |
        r = &x;           //  |       |        |
    }                     // -+ ------|--------+
                          //          |
    println!("r: {}", r); //          |
}                         // ---------+

According to definition/description of "the scope for which that reference is valid", reference r's lifetime should be the span of code annotated by 'c, right? But according to the later section The Borrow Checker, it's 'a instead.

@fubupc

fubupc commented Aug 13, 2024

Hi, I finally have (hopefully) a better understanding of lifetime annotations after reading 1, 2, 3, 4, 5, 6, 7, 8, 9, etc.

Preliminaries:

  1. "xxx (reference) lives/outlives 'a" means that the subject of reference xxx lives/outlives 'a, rather than xxx itself (scope) or its use (see 4).

  2. Slice is an ambiguous term; e.g., &str and str are sometimes both referred to as string slices. In "... the string slice returned from the function will live ...", it likely means &str, but due to point 1, this distinction does not really matter here.

  3. Given "xxx (reference) lives at least 'a", we can safely use xxx for at most 'a.

  4. A function signature is a contract that contains both constraints and guarantees, varying based on whether you're the caller or the callee. Specifically, parameters are a guarantee to the callee but a constraint to the caller, while the return value is a guarantee to the caller but a constraint to the callee.

  5. Like generic type parameters, lifetime parameters are part of the function signature; they are unknown at the definition and determined only at the call site.

Now let's apply these concepts to longest. Hereafter, we denote the returned reference as r. The lifetime annotations represent a contract: 'a is an unknown lifetime determined at the function call. For every such 'a, if x and y outlive 'a, then r will also outlive 'a. (PS: As mentioned here, this implies that r will always live at least as long as x or y.)

This contract can be viewed from two perspectives:

  • Callee's perspective: It's guaranteed that x and y outlive 'a. The callee itself must ensure that r will also outlive 'a.

  • Caller’s perspective: It's guaranteed that r outlives 'a. The caller itself must ensure that the actual parameters x and y outlive 'a.

Then the borrow checker can verify it from two perspectives:

  • For the callee: It verifies that the reference returned by the function body indeed outlives the abstract lifetime parameter 'a. Note that this needs to hold for all possible concrete 'a, which in practice might be ensured through subtyping.

  • For the caller: It tries to find some (not necessarily all) concrete lifetime for 'a, which encompasses all the uses of the returned reference, while ensuring that the passed-in x and y both outlive 'a.
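The two perspectives above could be sketched as follows. This is only an illustration under the assumptions stated in the comments: the "both empty" branch is added here to show subtyping on the callee side and is not part of the book's longest.

```rust
// Callee side: the body must be valid for every possible 'a.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.is_empty() && y.is_empty() {
        "both empty" // a &'static str is accepted as &'a str for any 'a
    } else if x.len() > y.len() {
        x
    } else {
        y
    }
}

// Caller side: some 'a must exist that covers every use of the result
// while both arguments are still borrowed.
fn main() {
    let outer = String::from("outer value");
    let owned_copy: String;
    {
        let inner = String::from("in");
        // 'a can extend no further than the scope of `inner`...
        let r = longest(&outer, &inner);
        // ...so to keep the text around longer, copy it into an owned String.
        owned_copy = r.to_owned();
    }
    println!("{owned_copy}");
}
```

Using `r` itself after the inner block would be rejected, because no single 'a could both cover that use and stay within the borrow of `inner`.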

@mulkieran
Contributor

Just noting, though, that there's a whole new formulation for lifetimes working its way into reality in the Rust compiler, here is a fairly recent link: https://blog.rust-lang.org/inside-rust/2023/10/06/polonius-update.html .
