Values, variables, pointers, and references #821

Closed
wants to merge 13 commits into from
@chandlerc chandlerc commented Sep 9, 2021

Flesh out and solidify the design around values, variables, and
pointers. Explicitly discuss the use cases of references in C++ and
propose specific approaches to address those use cases.

This is something that we've been discussing across the team for a long
time, and while there are definitely still challenges in this space we
will need to address going forward, I want to try to codify where we are
at and provide for a few fundamentals that haven't really been spelled
out previously.

That said, I've been staring at this document for far too long in
a draft, and so I may be missing parts that are confusing or need work,
so any help from folks to make this a coherent story is definitely
appreciated. The current structure and wording is heavily informed by
several reviews and suggestions from @zygoloid, @josh11b, and @wolffg
with much appreciation. =]

Some core examples of the consequence of this proposal:

  • Using let where we currently use var to declare a locally scoped
    immutable view of a value:

    let index: i32 = 42;
    
  • Specifying that parameters are, by default, immutable views of
    values, like let. These should behave like C++ const references,
    but with copies allowed under as-if.

  • Specifying that var creates an L-value and binds names to it.

  • Defining that var patterns are allowed to nest within let to
    mark a part of a pattern as an L-value:

    let (x: i64, var y: i64) = (1, 2);
    
    // Ok to mutate `y`:
    y += F();
    

    When the entire declaration is a var the let can be omitted.

    This works with function parameters as well to mark consuming an
    input into a locally mutable L-value:

    fn RegisterName(var name: String) {
      // `name` is a local L-value in this function and can be mutated.
    }
    
  • Implementing operators by rewriting into method calls through an
    interface, which can then use [addr me: Self*] to implicitly obtain
    a mutable pointer to an object for mutating operators.

  • Providing user-defined pointer-like types and the implementation of
    both the *-operator and -> member access in terms of rewriting
    into member calls through an interface and then forming L-values.

  • Providing indexed access through rewrites into method calls as well.

Beyond these use cases, thread-safe interfaces and more complex lifetime
based dispatch are deferred for future work.

See the proposal for details here, and looking forward to feedback!

@chandlerc chandlerc added the proposal A proposal label Sep 9, 2021
@chandlerc chandlerc requested a review from a team September 9, 2021 09:42
@google-cla google-cla bot added the cla: yes PR meets CLA requirements according to bot. label Sep 9, 2021
@chandlerc chandlerc closed this Sep 9, 2021
@chandlerc chandlerc reopened this Sep 9, 2021
@chandlerc chandlerc marked this pull request as ready for review September 9, 2021 10:43
@github-actions github-actions bot added the proposal rfc Proposal with request-for-comment sent out label Sep 9, 2021
today. These overlap, but are meaningfully distinct.

1. An _immutable view_ of a value
2. The _thread-safe interface_ of a
Contributor

I don't think you should use "safe" or "thread-safe" in contexts where you are talking about the "thread-compatible" contract. Those are explicitly different things. Thread-compatible types don't have thread-safe interfaces; they have const and non-const interfaces and a contract that says what safe usage of those interfaces is. Those interfaces are only ever conditionally safe, not safe in the sense of the normal usage of that term.

Contributor

Referring to this as the thread-safe interface is common, and I think was popularized by this talk by Herb Sutter: https://channel9.msdn.com/posts/C-and-Beyond-2012-Herb-Sutter-You-dont-know-blank-and-blank (summary at 27:30 onwards). I think the viewpoint is that if the const interface is the only interface you use on an object, then it is (or should be!) thread-safe, even if the non-const interface is not.

If you want to say that an interface is only thread-safe if concurrent usage of other (non-thread-safe) interfaces on the same object would be safe, then even code that protected all member accesses with a mutex wouldn't be thread-safe (because you might concurrently destroy the object), so I think it's more useful to say that an interface is thread-safe if concurrent use of that interface (and no others) is safe, even though this is non-composable (an object might have two interfaces that are individually thread-safe but that can't be used safely at the same time, such as a getter that doesn't take a lock and a setter that does).

Contributor Author

There was at least one place where I was using "safe" in a much more dubious way that I've removed. It could easily imply that it was checked to be safe by ensuring that no other interface was being concurrently used. I agree that usage was confusing, and I'll try to see if I repeated this anywhere else.

I largely agree with @zygoloid about "thread-safe interface" being an OK term, but if it is really confusing folks, I can try to come up with different terms...

performing this refactoring, there is a need to translate between local
variables and parameters in both directions. In order to ensure these
translations are unsurprising and don't face significant expressive gaps or
behavioral differences, it is important to have strong conceptual integrity
Contributor

Is this "conceptual integrity" or "semantic consistency"?

Contributor Author

I think someone suggested "conceptual integrity" here. I'm not sure what the difference between these two terms would be -- one indicates a consistent set of concepts, the other a consistent set of semantics (likely through a consistent set of concepts)?

Anyways, if semantic consistency reads better to you, happy to use it.

Comment on lines 77 to 80
evolutionary space for safe primitives to be added. There is no specific goal to
radically change the overarching patterns that emerge currently in C++ API
design. At most, the hope is to simplify and address their shortcomings, not to
shift to a completely new model. For example, moving to a model of everything
Contributor

I feel like this design is a significant divergence from C++. Which is not to say I disagree with it or that it is as different as the Java model, but I think it would be more honest to describe this as an experiment with doing things differently in the hope that it is sufficiently better to be worth the interop and retraining costs.

Contributor Author

Yeah, I was trying to talk about the patterns of API design, but you're absolutely right that the building blocks used within those APIs are very different. I've tried to be more explicit now, does this help?


This _immutable view_ can be thought of as requiring that the semantics of the
program be exactly the same whether it is implemented in terms of a view of the
Contributor

I'm not sure what "view" means here. You are defining the phrase "immutable view" and I feel like I understand "immutable" already pretty well so I was reading this text to understand the "view" part.

Contributor Author

I think this is really an issue of my example. I've changed it to more clearly model a view. Does this help?

- The view must not be used to mutate the value, or those mutations would be
lost if made to a copy.

Put differently, it makes a copy valid under the as-if rules of C++.
Contributor

I feel like this statement is a bit subtle with respect to causation. I think you are saying: "we want an immutable view to enforce the conditions of the as-if rules, so that a copy would be valid in C++" not "the as-if rules are applicable, so a copy is valid in C++", but you might also mean "we want to use an immutable view in the cases where the as-if rules would allow a copy in C++". I was unsure what "it" refers to here and that made me unsure of how to interpret the word "makes".

Contributor Author

The "C++" here was just to identify what I meant by "the as-if" rule. I've used a link instead to make this less ambiguous.

And I've removed the "it" which was also confusing. Is this better?
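
To make the aliasing concern in this thread concrete, here is a small C++ sketch contrasting a `const` reference parameter with a by-value parameter. The function names are hypothetical and not taken from the proposal. The two forms are only interchangeable when nothing mutates the referenced object during the call, which is the condition under which a copy would be valid under as-if.

```cpp
#include <cassert>

// Hypothetical helper: reads `value`, writes through `out`, then reads
// `value` again. With a reference parameter, the second read observes the
// write whenever `value` aliases `*out`.
int SumAroundWriteByRef(const int& value, int* out) {
  int first = value;
  *out = 100;
  return first + value;
}

// Same body, but `value` is a private copy, so the write cannot affect it.
int SumAroundWriteByCopy(int value, int* out) {
  int first = value;
  *out = 100;
  return first + value;
}
```

With distinct objects both functions return the same result. When the first argument aliases `*out`, only the by-reference version sees the write, so a copy would no longer be a valid substitute for the reference.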

Comment on lines 93 to 103
- _immutable views_
- _thread-safe interfaces_
- _smart pointers_
- _consuming input_
- _lifetime overloading_
- _mutable operands_
- _user-defined dereference_
- _indexed access syntax_
- _member and subobject accessors_
- _non-null pointers_
- _syntax-free dereferencing_
Contributor

I'm not sure that listing the use cases up front is providing much value. Perhaps the section headings can be tweaked so that the table of contents is filling this role?

Contributor Author

I'm open to suggestions here, listing them was one from @wolffg.

While some correspond nicely to section headings, others don't. I spent some time thinking about how to restructure the sections to arrange for there to always be a section heading that matched and wasn't able to come up with anything satisfying. I'm not at all opposed to such a structure if you or others see a good way to get from here to there though.

Comment on lines 116 to 125
```
void SomeFunction(...) {
  // ...

  constinit const int id = ...;

  // Cannot mutate `id` here accidentally.
  // ...
}
```
Contributor

I expect that you are correct that this example technically has the immutable view semantics, but it is unfamiliar, at least to me, and so does not serve the expository function that it is meant to serve. Const references and pass-by-copy/value are where I expect users to be looking for immutable view semantics, even though C++ doesn't deliver these specific semantics. I think it might be more honest to say this is a place where we are diverging from C++ because this is what we think users want, rather than asking them to choose between const reference and pass-by-copy/value.

Anchoring on C++ here is awkward because Rust, for types without interior mutability, has immutable borrows which are a bit closer I think.


fn Example(a: Point, b: Point, dest: Point*) -> Float {
if (...) {
// Rewritten to: (*dest).(Assignable.Assign)(a);
Contributor

Technically this should be rewritten in such a way as to support different RHS types.

Suggested change
// Rewritten to: (*dest).(Assignable.Assign)(a);
// Rewritten to: (*dest).(Assignable(typeof(a)).Assign)(a);

Contributor Author

My hope was to just use simpler homogeneous interfaces for exposition here, and let the actual operator proposal fully dig into this. Does that make sense to you?
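
As a rough C++ analogy for the rewrite discussed in this thread, a mutating operator can be thought of as a call to a named function that receives the address of its left operand. This is only a sketch: `Point`, `Assign`, and `AddAssign` are illustrative stand-ins, not the proposal's actual interfaces, and it uses the simpler homogeneous form rather than one parameterized on the right-hand type.

```cpp
#include <cassert>

struct Point {
  double x;
  double y;
};

// Stand-in for a homogeneous `Assignable.Assign`: the receiver arrives as a
// mutable pointer, mirroring the `addr me: Self*` form in the description.
void Assign(Point* self, const Point& value) { *self = value; }

// Stand-in for a compound-assignment interface method.
void AddAssign(Point* self, const Point& rhs) {
  self->x += rhs.x;
  self->y += rhs.y;
}

// The operator is defined purely as a rewrite into the named function.
Point& operator+=(Point& lhs, const Point& rhs) {
  AddAssign(&lhs, rhs);
  return lhs;
}
```

Under these assumptions, `dest += a` and `AddAssign(&dest, a)` have the same effect, and assigning through a pointer corresponds to `Assign(dest_ptr, a)`.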

Contributor

@zygoloid zygoloid left a comment

The direction of this proposal looks good to me.

I'm somewhat nervous about conflating mutability and addressability. I'm also a little worried that not having const as an immutable view of mutable data will create migration complexity. But I think we can use this as a basis for exploring those questions.

Detailed review on the assumption that the bulk of the proposal wording will end up in the design documents largely unchanged.

Comment on lines +107 to +108
There are two different semantic models that underpin how `const` is used in C++
today. These overlap, but are meaningfully distinct.
Contributor

These are both about a const view of an object (eg, a pointer or reference to a const-qualified type), rather than about an actually-immutable const object. It might be worth teasing those apart so it's clear you're only talking about the former here.

Contributor Author

Actually-immutable const objects are also mostly used to have an immutable view of some value?

I tried separating this out anyways, and it made it a bit more complex. Not sure what change you're thinking would help here?

Contributor

Yes, I think you're right; const objects fit the immutable view model. I think the cases I'm trying to tease apart are an immutable view of a value, and a handle to an object where we cannot modify the object via that handle. The latter isn't a view of a value, though, because the value of the object may change over time, and because we do have pointer identity.

I wonder if perhaps "thread-safe" is too narrow of a term for the second category. That is, I wonder if the two models are:

  • immutable view of an unchanging value: const is a promise that the value will not change, neither by being modified through this handle nor by being modified by another handle.
  • immutable interface to a potentially changing value: const is an interface guarantee by which a client of a handle promises that they will not change the value through that handle, but the value may change in other ways. Often the const interface is thread-safe, but this is also used to model immutability in single-threaded situations. This is a special case of a more general desire for a type to provide different interfaces to different clients.
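
The difference between the two models above can be sketched in C++ as follows. `Counter`, `Observed`, and `ObserveAfterIncrement` are hypothetical names used only for illustration: a `const` reference acts as a handle that cannot modify the object but still observes changes made elsewhere, while a `const` copy is an unchanging value.

```cpp
#include <cassert>

class Counter {
 public:
  int value() const { return value_; }
  void Increment() { ++value_; }

 private:
  int value_ = 0;
};

// What each kind of const access observes after the object is mutated.
struct Observed {
  int through_handle;
  int through_snapshot;
};

Observed ObserveAfterIncrement(Counter& counter) {
  const Counter& handle = counter;   // Cannot modify via `handle`, but aliases.
  const Counter snapshot = counter;  // A separate, unchanging value.
  counter.Increment();               // Mutation through another handle.
  return {handle.value(), snapshot.value()};
}
```

Here the handle reports the incremented value while the snapshot keeps the original one, which is the distinction the second bullet draws.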

Comment on lines +768 to +770
the function body. There is no such easy assurance for return values. As a
consequence, this proposal suggests return values are copied initially for
safety and predictability.
Contributor

I've not thought this all the way through, bear with me... do we want to guarantee that a copy happens, or should we give a weaker guarantee that there is no lifetime issue, but that we may or may not make a copy?

What I'm thinking is: we could say that a function that returns by value may return a reference to anything that it knows lives at least as long as any of its parameters (including me). Then at the call site, we can check whether the returned value outlives any of the function parameters and make a copy if so. Then:

// F doesn't make a copy; returned value lives at least as long as parameter s
fn F(s: String) -> String { return s; }
// G makes a copy: v doesn't live long enough
fn G(s: String) -> String {
  let v: String = "foo" + s + "bar";
  return v;
}
fn Print(s: String);
fn H() {
  // Parameter of `F` can be kept alive until `;`, so we know
  // the return value lives at least that long and don't need 
  // to make any copies.
  Print(F("hello"));
  // `x` outlives the function argument, so we make a copy here.
  let x: String = F("hello");
  Print(x);
}

I suppose we can avoid making a copy even in the second case in H by lifetime-extending the "hello" temporary. I'm not sure that's a good idea; it might be too unpredictable.

One problem with this is that the function return is creating an immutable view, and we need the callee to know how long it's promising that returned value will remain immutable for. I suppose this is nothing new; this is analogous to a classic C++ issue:

const string &s = v[i];
v.push_back("x");
use(s);

... where this either works or fails depending on whether v[i] produces a reference to an existing object (eg, v is vector<string>) or ends up binding s to a temporary (eg, v is vector<const char*>). We might want some simple syntax to force a copy and end a chain of immutable views. (You could use var for that, but that also implies mutability, which might be undesirable.)

There's also a calling convention complexity issue with this kind of approach: if F can either return a handle to some existing object or copy to some caller-provided storage, then the caller always needs to provide the storage and may need to perform a branch to tell whether it should provide a copy. That seems like something we could handle but I'm not sure whether it'll be worthwhile unless we get to avoid a lot of copies.
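
The classic C++ issue mentioned above can be sketched without triggering undefined behavior by looking at what the reference binds to in each case. The function names below are hypothetical. With `vector<string>` the reference aliases the stored element (so it would dangle if `push_back` reallocated), whereas with `vector<const char*>` it binds to a lifetime-extended temporary that is independent of the vector.

```cpp
#include <cassert>
#include <string>
#include <vector>

// With `vector<string>`, `s` refers to the existing element, so it shares
// the element's address.
bool RefAliasesElement(const std::vector<std::string>& v) {
  const std::string& s = v[0];
  return &s == &v[0];
}

// With `vector<const char*>`, `v[0]` converts to a temporary `std::string`
// whose lifetime is extended to that of `s`, so growing `v` cannot affect it.
std::string ReadAfterPushBack(std::vector<const char*>& v) {
  const std::string& s = v[0];
  v.push_back("x");
  return s;
}
```

The source text of the two bodies is nearly identical, yet only the second is safe to use after the `push_back`, which is the unpredictability the comment is pointing at.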

Contributor Author

Basically, yes.

This is exactly the direction I'd like to explore here, as an incremental performance improvement by reducing copies. This is what I'm alluding to in the paragraph below around "lifetime tracking" system. Ideally such a system would let the cases such as the ones you mention Just Work by letting the caller copy only when necessary, and the API use annotations to give it the maximum knowledge of how late it can wait to create such a copy.

I also agree about forcing a copy at some point. A var is, I think, going to work out reasonably well in practice, but we could also have a copy operation (in the expression space) to make it easier to chain into something that doesn't need mutability and to avoid creating a statement. Technically, I think we can already do this:

fn Copy[T:! Type](var v: T) -> T { return v; }

But it seems likely better to give the language more visibility into it rather than doing it like this.

Anyways, all of this is for future work IMO. How much of this should I record as ideas so we don't lose them?
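
For readers more familiar with C++, a loose analogue of the `Copy` function above is a template that takes its parameter by value. This is only a sketch of the idea, not a claim about how Carbon would implement it: the by-value parameter forces a fresh object at the call boundary, so the result no longer depends on the lifetime or later mutation of the argument.

```cpp
#include <cassert>
#include <string>

// Taking `v` by value copies an lvalue argument (or moves from a temporary),
// so the returned object is independent of the caller's original.
template <typename T>
T Copy(T v) {
  return v;
}
```

A call such as `Copy(original)` can then appear inside a larger expression without introducing a separate statement or a mutable local.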

Contributor

Filed as #828. I think filing the issue is enough for now, but a link to it from here might be useful I suppose.

Comment on lines +837 to +841
One use case not obviously or fully addressed by the tools proposed here is
overloading function calls by observing the lifetime of arguments. The use case
here would be selecting different implementation strategies for the same
function or operation based on whether an argument lifetime happens to be ending
and viable to move-from.
Contributor

The example of this that immediately springs to mind is overloading operator+ for strings to reuse storage where possible. But I think that's because that was the example from the C++ paper that introduced the facility. I can't say that I've ever seen this kind of overloading actually be done in practice outside of that example -- and even in the case of that example it's not obvious to me that the optimization is worth the complexity, and maybe the problem is that we're providing the wrong interface. Do we have any more such examples?

Contributor

That optimization becomes more important as the size of the data increases. TensorFlow provides this optimization transparently for its (typically large) tensor values, but not at the C++ API layer.

Contributor Author

The few places where I have seen this optimization in C++ APIs where it was very important, it was exactly as Josh said -- due to the size of the data being large.

However, in those cases (in LLVM mostly), relying on operator overloading was too subtle and the APIs evolved to be very explicit even at the call site, and forced separate code paths when needed for different scenarios. It wasn't a big ergonomic burden given the importance of not adding a heap allocation.

This is part of why I'm a bit skeptical about addressing this with overloading. But I also don't want to absolutely preclude revisiting it -- I think this too is part of the experiment I'm suggesting.
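
A minimal C++ sketch of the two styles discussed in this thread, using hypothetical function names: an overload pair that picks a strategy from whether the argument's lifetime is ending, and an explicit API where the call site names the buffer being reused.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Overload pair that selects a strategy from the argument's value category.
std::string Concat(const std::string& lhs, const std::string& rhs) {
  std::string result = lhs;  // Must copy: `lhs` is still in use by the caller.
  result += rhs;
  return result;
}

std::string Concat(std::string&& lhs, const std::string& rhs) {
  lhs += rhs;  // May reuse the storage of the expiring `lhs`.
  return std::move(lhs);
}

// Explicit alternative: the call site states which buffer is being reused.
void AppendTo(std::string* dest, const std::string& rhs) { *dest += rhs; }
```

With the overloads, `Concat(a, b)` and `Concat(std::move(a), b)` look almost the same but take different code paths. With `AppendTo(&a, b)` the reuse is visible at the call site, which is closer to the explicit style described above.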

Comment on lines 887 to 889
Should we immediately provide the escape hatch of an unsafe address-of operation
on immutable views? Even if "no", we can always revisit this later and add the
operation.
Contributor

What's the alternative right now? You'd presumably either change a let to var, or add a new local var, and then use the address of the variable, I suppose. I think we should not add this for now, but once we are writing Carbon code we should be on the lookout for that pattern and take it as evidence to reconsider.

Contributor Author

That is the alternative, and I agree. I've moved this to an alternative considered.

Comment on lines +904 to +905
- Pointers are expected to be deeply familiar to C++ programmers and easily
[interoperate with C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code).
Contributor

For C++ interoperability and migration, I don't have a great picture of how we'll map C++ pointer and reference types, and especially pointers and references to const-qualified types, into Carbon. In some cases we'll want an immutable view, but in other cases we'll want a weaker "can't be modified through this handle but can change while this handle exists" view.

Contributor Author

My idea is that the default migration and interop of references (including const references) in C++ would be to pointers in Carbon. To migrate the shift to a const-qualified interface, we could use something like a facet type while still using pointers.

Then we can look for specific patterns that can be reliably recognized and instead migrated to the immutable value views. For example, by-value parameters that are clearly never mutated. Or const-reference parameters without const_casts. Maybe some others. But the fallback for references would always be pointers here.

Would it be useful to write this up in the proposal? In how much detail?

Contributor

I think including something like your above comment in the proposal would be helpful from an anchoring perspective, even if it's marked as provisional.

Contributor Author

@chandlerc chandlerc left a comment

Just responding to threads where I have questions or there is more discussion, and landing easy suggestions. I'll then make a second pass trying to make the more substantial changes needed and responding to other threads.

with pointers. While this has some ergonomic cost, it seems minimal and isolated
to a relatively rare use case.

### Indexed access syntax
Contributor Author

FWIW, I agree this is an important use case.

I would definitely consider a design which allowed those kinds of different variations to all be expressed. Not sure if it would be through prioritization or letting the type choose which style of interface it wants to expose for its indexed syntax.

I'm very interested in all three of the options you mention. I'd at a minimum like for types to be able to choose between the first and last options. I think the middle option would be very interesting to investigate to understand the value compared to the third.

How much should this happen in this proposal? (Also happy to grab some open discussion time to dive deep here.)

Comment on lines +837 to +841
One use case not obviously or fully addressed by the tools proposed here is
overloading function calls by observing the lifetime of arguments. The use case
here would be selecting different implementation strategies for the same
function or operation based on whether an argument lifetime happens to be ending
and viable to move-from.
Contributor Author

In the few places where I have seen this optimization be very important in C++ APIs, it was exactly as Josh said -- due to the data being large.

However, in those cases (in LLVM mostly), relying on operator overloading was too subtle and the APIs evolved to be very explicit even at the call site, and forced separate code paths when needed for different scenarios. It wasn't a big ergonomic burden given the importance of not adding a heap allocation.

This is part of why I'm a bit skeptical about addressing this with overloading. But I also don't want to absolutely preclude revisiting it -- I think this too is part of the experiment I'm suggesting.
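For readers less familiar with the C++ idiom being discussed, lifetime overloading can be sketched roughly as below. The `Sink` class and `CountAppends` function are hypothetical names used only for illustration; they are not taken from LLVM or from the proposal.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container used only to illustrate lifetime overloading.
class Sink {
 public:
  // Selected for lvalues: the caller keeps its string, so we must copy.
  void Append(const std::string& s) {
    items_.push_back(s);
    ++copies_;
  }
  // Selected for rvalues: the argument's lifetime is ending, so we can move.
  void Append(std::string&& s) {
    items_.push_back(std::move(s));
    ++moves_;
  }
  int copies() const { return copies_; }
  int moves() const { return moves_; }

 private:
  std::vector<std::string> items_;
  int copies_ = 0;
  int moves_ = 0;
};

// Returns {copies, moves} after one lvalue call and two rvalue calls.
inline std::pair<int, int> CountAppends() {
  Sink sink;
  std::string kept = "kept by the caller";
  sink.Append(kept);                      // lvalue: copying overload
  sink.Append(std::string("temporary"));  // prvalue: moving overload
  sink.Append(std::move(kept));           // xvalue: moving overload
  assert(sink.copies() + sink.moves() == 3);
  return {sink.copies(), sink.moves()};
}
```

Nothing at the first two call sites says which overload runs; the choice follows from the value category of the argument, which is the subtlety described above.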

Contributor Author

@chandlerc chandlerc left a comment

I think I've either responded or responded-and-updated-proposal for all the comment threads now. If I missed anything, let me know!

performing this refactoring, there is a need to translate between local
variables and parameters in both directions. In order to ensure these
translations are unsurprising and don't face significant expressive gaps or
behavioral differences, it is important to have strong conceptual integrity
Contributor Author

I think someone suggested "conceptual integrity" here. I'm not sure what the difference between these two terms would be -- one indicates a consistent set of concepts, the other a consistent set of semantics (likely through a consistent set of concepts)?

Anyways, if semantic consistency reads better to you, happy to use it.

Comment on lines 77 to 80
evolutionary space for safe primitives to be added. There is no specific goal to
radically change the overarching patterns that emerge currently in C++ API
design. At most, the hope is to simplify and address their shortcomings, not to
shift to a completely new model. For example, moving to a model of everything
Contributor Author

Yeah, I was trying to talk about the patterns of API design, but you're absolutely right that the building blocks used within those APIs are very different. I've tried to be more explicit now, does this help?

Comment on lines +107 to +108
There are two different semantic models that underpin how `const` is used in C++
today. These overlap, but are meaningfully distinct.
Contributor Author

Actually-immutable const objects are also mostly used to have an immutable view of some value?

I tried separating this out anyways, and it made it a bit more complex. Not sure what change you're thinking would help here?

```

This _immutable view_ can be thought of as requiring that the semantics of the
program be exactly the same whether it is implemented in terms of a view of the
Contributor Author

I think this is really an issue of my example. I've changed it to more clearly model a view. Does this help?

- The view must not be used to mutate the value, or those mutations would be
lost if made to a copy.

Put differently, it makes a copy valid under the as-if rules of C++.
Contributor Author

The "C++" here was just to identify what I meant by "the as-if" rule. I've used a link instead to make this less ambiguous.

And I've removed the "it" which was also confusing. Is this better?

*dest = b;
}

// Rewritten to: return a.(Subtractable(Float).Subtract)(b);
Contributor Author

Hopefully fixed now?

fundamental scaling problem in this style of overloading: it creates a
combinatorial explosion of possible overloads. Consider a function with N
parameters that would benefit from lifetime overloading. If each one benefits
_independently_ from the others, we would need N\*N overloads to express all the
Contributor Author

Done.
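To make the scaling problem concrete, here is a small C++ sketch with two parameters that each independently benefit from a copy (`const&`) or move (`&&`) form. `Join` and `OverloadCount` are hypothetical names introduced for illustration. With two forms per parameter, covering every combination takes two choices per parameter multiplied together, which is 4 overloads here and 2^N in general.

```cpp
#include <cassert>
#include <string>
#include <utility>

// One overload per combination of copy/move forms. Each returns how many of
// its arguments it would be allowed to move from.
inline int Join(const std::string&, const std::string&) { return 0; }
inline int Join(std::string&&, const std::string&) { return 1; }
inline int Join(const std::string&, std::string&&) { return 1; }
inline int Join(std::string&&, std::string&&) { return 2; }

// Overloads needed to cover every combination for n parameters when each
// parameter has two forms: 2^n.
constexpr unsigned long long OverloadCount(unsigned n) { return 1ULL << n; }

// Calls Join with both arguments expiring and reports the selected overload.
inline int MovableWhenBothExpire() {
  std::string a = "a";
  std::string b = "b";
  int movable = Join(std::move(a), std::move(b));
  assert(movable == 2);
  return movable;
}
```

Adding a third such parameter would already require eight overloads to keep every call site on its best path.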

proposals/p0821.md
Comment on lines 887 to 889
Should we immediately provide the escape hatch of an unsafe address-of operation
on immutable views? Even if "no", we can always revisit this later and add the
operation.
Contributor Author

That is the alternative, and I agree. I've moved this to an alternative considered.

Comment on lines +904 to +905
- Pointers are expected to be deeply familiar to C++ programmers and easily
[interoperate with C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code).
Contributor Author

My idea is that the default migration and interop of references (including const references) in C++ would be to pointers in Carbon. For migrating the const-qualified interface shift, something like a facet type but still using pointers.

Then we can look for specific patterns that can be reliably recognized and instead migrated to the immutable value views. For example, by-value parameters that are clearly never mutated. Or const-reference parameters without const_casts. Maybe some others. But the fallback for references would always be pointers here.

Would it be useful to write this up in the proposal? In how much detail?

- Pointers are expected to be deeply familiar to C++ programmers and easily
[interoperate with C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code).

## Alternatives considered
Contributor

@josh11b josh11b Sep 13, 2021

I feel like this document does only a little to explain what is wrong with the C++ model for our purposes. This document is mostly "see how this other thing addresses many of the same use cases" but with a model that is different enough that it is definitely going to be an interop and training issue (for example, what happens to types with const instance variables?). The only explanation I recall is: we don't want both pointer and references since that introduces complexity in the same place we are going to want to add safety. I feel like there are a lot more changes that deserve some explanation of how this is an improvement for our purposes.

Comment on lines +777 to +781
Beyond properties, Carbon is expected to explore some lifetime tracking system,
and when that happens it should be considered for enabling non-copy returns of
immutable values with a tracked lifetime. These might provide for more general
or complex forms of read-only _member and subobject accessors_ than can be
represented through properties.
Contributor

Why do non-copy returns need to be coupled to lifetime tracking? In particular, suppose I want to expose mutable access to some complex sub-object that's part of my logical state (i.e. that's copied when I am copied), but I also want clients to be able to read it when I'm immutable. IIUC, under this proposal that would look something like:

fn Foo[me: Self]() -> FooType;
fn MutableFoo[addr me: Self*]() -> FooType*;

This paragraph suggests that in the future there could be a way to modify Foo so that it doesn't copy the underlying object, so long as its lifetime is properly annotated. But why is lifetime-tracking more important for Foo than for MutableFoo?

Contributor

@zygoloid zygoloid Sep 17, 2021

When you write MutableFoo it's clear that there's a lifetime issue: you're returning a pointer, and that pointer needs to still point to something when it's used. But when you write Foo, you're notionally returning by value, and the fact that we may choose to do that without making a copy is an implementation detail. While that implementation detail is exposed to the programmer (there are -- presumably -- restrictions on concurrent mutation of the FooType object just like there are when passing a parameter by value), the fact that it's up to the implementation to make this decision to some extent shifts the burden for checking the lifetime rules from the programmer to the implementation.

As an extreme example:

fn Foo[me: Self]() -> FooType {
  var x: FooType = ...;
  return x;
}

... obviously should not return a non-copied handle to a stack variable whose lifetime ends when the function returns.

interface that it implements, and for APIs to use this narrow interface to only
interact with the underlying type in particular ways. While they are presented
as a way to make code _generic_ over multiple types, they can also be used to
simply enforce constraints on the interface exposed.
Contributor

There's a problem with this approach, though: generic interfaces can carry a nontrivial performance penalty (e.g. the time costs of dynamic dispatch, and possibly the storage costs of witness table pointers) that isn't needed when we're merely subsetting the API of a specific, known type. For example, in C++ if a class contains an array of pointers to mutable T, it can define an accessor that exposes it as an array of pointers to read-only T, with zero space or time overhead. I don't see how we can achieve that in Carbon if we're modeling "read-only T" as a generic interface.
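The C++ pattern referred to here can be sketched as follows. The `Registry` class and its member names are hypothetical; the point is only that the read-only accessor hands out the same storage through an implicit qualification conversion, so there is no copy, wrapper, or dispatch involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class that stores pointers to mutable ints it does not own.
class Registry {
 public:
  void Add(int* p) { items_.push_back(p); }
  std::size_t size() const { return items_.size(); }

  // Mutable view: an array of pointers to mutable int.
  int* const* mutable_items() { return items_.data(); }

  // Read-only view of the same array. In a const member, data() yields
  // `int* const*`, which converts implicitly to `const int* const*`.
  const int* const* items() const { return items_.data(); }

 private:
  std::vector<int*> items_;
};

// Writes through the mutable view, then reads through the read-only view.
inline int SumThroughReadOnlyView() {
  int a = 1;
  int b = 2;
  Registry r;
  r.Add(&a);
  r.Add(&b);
  *r.mutable_items()[0] = 10;          // allowed through the mutable view
  const int* const* view = r.items();  // read-only view
  // *view[0] = 5;                     // would not compile
  // Both views alias the same underlying array.
  assert(static_cast<const void*>(view) ==
         static_cast<const void*>(r.mutable_items()));
  return *view[0] + *view[1];
}
```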

Contributor

Only using dynamic trait objects involves dynamic dispatch and witness table pointers, not generics generally. Normally you would expect generics to act more like templates for code generation purposes.

Contributor

But wouldn't we need something like a dynamic trait object to support use cases like the one I mentioned?

Comment on lines +138 to +145
OtherFunction(other_id);

// We can also pass ephemeral values:
OtherFunction(other_id + 2);

// Or values that may be backed by read-only memory:
static const int fixed_id = 42;
OtherFunction(fixed_id);
Contributor

Suggested change
- OtherFunction(other_id);
- // We can also pass ephemeral values:
- OtherFunction(other_id + 2);
- // Or values that may be backed by read-only memory:
- static const int fixed_id = 42;
- OtherFunction(fixed_id);
+ SomeFunction(other_id);
+ // We can also pass ephemeral values:
+ SomeFunction(other_id + 2);
+ // Or values that may be backed by read-only memory:
+ static const int fixed_id = 42;
+ SomeFunction(fixed_id);

Comment on lines +716 to +718
Pointer traversal sometimes has a noticeable cost. Having explicit pointers will
show developers in the codebase where they are explicitly traversing memory and
allow them to optimize them when necessary.
Contributor

There's a tension between this paragraph and the preceding one, because this argument seems like it applies just as much when the data is immutable, whereas the previous paragraph suggests that pointers can only point to mutable data.

@DarshalShetty DarshalShetty mentioned this pull request Feb 23, 2022
proves this is an important pattern to support without the contortions of
manually creating a local copy (or changing to pointers).

### References in addition to pointers
Contributor

Here's a doc I wrote up a long time ago: https://docs.google.com/document/d/1grW9NXTZl1UsdytoE-Q2N3WwRQEDQUPg9HWOzyiQcjA/edit#

(No conclusions or deep analysis, really just writing down some alternatives.)

@github-actions github-actions bot added the inactive Issues and PRs which have been inactive for at least 90 days. label May 25, 2022
@github-actions github-actions bot closed this Jun 9, 2022
@chandlerc chandlerc reopened this Jun 29, 2022
@github-actions github-actions bot closed this Jul 13, 2022
@zygoloid zygoloid reopened this Jul 18, 2022
@carbon-language carbon-language deleted a comment from github-actions bot Jul 18, 2022
@carbon-language carbon-language deleted a comment from github-actions bot Jul 18, 2022
@carbon-language carbon-language deleted a comment from github-actions bot Jul 18, 2022
@jonmeow jonmeow added long term Issues expected to take over 90 days to resolve. and removed inactive Issues and PRs which have been inactive for at least 90 days. labels Jul 18, 2022
temporary. However, the rules for parameters and locals are the same in C++ and
so this would create serious lifetime bugs. This is fixed in C++ by applying
_lifetime extension_ to the temporary. The result is that `const` references are
quite different from other references, but they are also quite useful: they are

I don't understand why const& is "quite different from other references" - all declarations of reference type do lifetime extension of prvalues, excepting non-const lvalue since that's an error.

int&& x = 0;

also does lifetime extension.
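A small C++ sketch of this point, using a hypothetical `Probe` type that counts live objects so the moment of destruction can be observed. Both the `const&` binding and the `&&` binding keep their temporary alive until the reference goes out of scope.

```cpp
#include <cassert>

// Number of Probe objects currently alive.
inline int& LiveCount() {
  static int count = 0;
  return count;
}

// Hypothetical type that tracks construction and destruction.
struct Probe {
  Probe() { ++LiveCount(); }
  Probe(const Probe&) { ++LiveCount(); }
  ~Probe() { --LiveCount(); }
};

inline Probe MakeProbe() { return Probe(); }

// Returns true if both reference forms kept their temporaries alive for the
// whole enclosing scope.
inline bool BothFormsExtend() {
  bool extended = false;
  {
    const Probe& by_const_ref = MakeProbe();  // lifetime extended
    Probe&& by_rvalue_ref = MakeProbe();      // lifetime extended as well
    (void)by_const_ref;
    (void)by_rvalue_ref;
    extended = (LiveCount() == 2);
  }
  // Both temporaries are destroyed once the references leave scope.
  assert(LiveCount() == 0);
  return extended;
}
```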

would become Carbon code such as:

```
fn LogSize(large_data: Container) {
@strega-nil-ms strega-nil-ms Jul 22, 2022

This seems to take Container by value; how does this not require the caller to copy a Container (or, alternatively, how does a function take ownership of a Container)?

It seems specifically here that Carbon is avoiding ownership types a la Rust/C++, which I don't personally love... How do you plan on doing ownership?

@github-actions

We triage inactive PRs and issues in order to make it easier to find active work. If this PR should remain active, please comment or remove the inactive label.
This PR is labeled inactive because the last activity was over 90 days ago. This PR will be closed and archived after 14 additional days without activity.

@github-actions github-actions bot added the inactive Issues and PRs which have been inactive for at least 90 days. label Oct 21, 2022
@github-actions

github-actions bot commented Nov 4, 2022

We triage inactive PRs and issues in order to make it easier to find active work. If this PR should remain active or becomes active again, please reopen it.
This PR was closed and archived because there has been no new activity in the 14 days since the inactive label was added.

@github-actions github-actions bot closed this Nov 4, 2022
@chandlerc chandlerc removed proposal rfc Proposal with request-for-comment sent out inactive Issues and PRs which have been inactive for at least 90 days. long term Issues expected to take over 90 days to resolve. labels Nov 15, 2022
chandlerc added a commit to chandlerc/carbon-lang that referenced this pull request Dec 24, 2022
@Pixep
Contributor

Pixep commented Mar 29, 2023

Migrated to #2006

Pixep added a commit to Pixep/carbon-lang that referenced this pull request Mar 29, 2023
jonmeow pushed a commit that referenced this pull request Mar 30, 2023
Replaces references to PR #821 (closed) by #2006 (draft) which tracks the same feature definition.
Labels
cla: yes PR meets CLA requirements according to bot. proposal A proposal
9 participants