Values, variables, pointers, and references (#2006)
Introduce a concrete design for how Carbon values, objects, storage, variables,
and pointers will work. This includes fleshing out the design for:

-   The expression categories used in Carbon to represent values and objects,
    how they interact, and terminology that anchors on their expression nature.
-   An expression category model for readonly, abstract values that can
    efficiently support function inputs.
-   A customization system for value expression representations, especially as
    seen on function boundaries in the calling convention.
-   An expression category model for references instead of a type system model.
-   How patterns match different expression categories.
-   How initialization works in conjunction with function returns.
-   Specific pointer syntax, semantics, and library customization mechanisms.
-   A `const` type qualifier for use when the value expression category system
    is too abstracted from the underlying objects in storage.

---------

Co-authored-by: Geoff Romer <gromer@google.com>
Co-authored-by: josh11b <josh11b@users.noreply.github.com>
Co-authored-by: Adrien Leravat <Pixep@users.noreply.github.com>
Co-authored-by: Jon Ross-Perkins <jperkins@google.com>
Co-authored-by: Richard Smith <richard@metafoo.co.uk>
6 people authored Aug 8, 2023
1 parent 049fbc1 commit 0d1e6bd
Showing 12 changed files with 2,446 additions and 198 deletions.
208 changes: 134 additions & 74 deletions docs/design/README.md

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions docs/design/control_flow/return.md
Expand Up @@ -80,8 +80,8 @@ fn MaybeDraw(should_draw: bool) -> () {

### `returned var`

[Variables](../variables.md) may be declared with a `returned` statement. Its
syntax is:
[Local variables](../values.md#binding-patterns-and-local-variables-with-let-and-var)
may be declared with a `returned` statement. Its syntax is:

> `returned` _var statement_
103 changes: 70 additions & 33 deletions docs/design/expressions/README.md
Expand Up @@ -61,10 +61,27 @@ graph BT
unqualifiedName["x"]
click unqualifiedName "https://github.com/carbon-language/carbon-lang/blob/trunk/docs/design/expressions/README.md#unqualified-names"
top((" "))
memberAccess>"x.y<br>
x.(...)"]
x.(...)<br>
x->y<br>
x->(...)"]
click memberAccess "https://github.com/carbon-language/carbon-lang/blob/trunk/docs/design/expressions/member_access.md"
constType["const T"]
click constType "https://github.com/carbon-language/carbon-lang/blob/trunk/docs/design/expressions/type_operators.md"
pointerType>"T*"]
click pointerType "https://github.com/carbon-language/carbon-lang/blob/trunk/docs/design/expressions/type_operators.md"
%% FIXME: Need to switch unary operators from a left/right associativity to
%% a "repeated" marker, as we only have one direction for associativity and
%% that is wrong in this specific case.
pointer>"*x<br>
&x<br>"]
click pointer "https://github.com/carbon-language/carbon-lang/blob/trunk/docs/design/expressions/pointer.md"
negation["-x"]
click negation "https://github.com/carbon-language/carbon-lang/blob/trunk/docs/design/expressions/arithmetic.md"
Expand Down Expand Up @@ -124,15 +141,22 @@ graph BT
expressionEnd["x;"]
memberAccess --> parens & braces & unqualifiedName
negation --> memberAccess
complement --> memberAccess
top --> parens & braces & unqualifiedName
constType --> top
pointerType --> constType
as --> pointerType
memberAccess --> top
pointer --> memberAccess
negation --> pointer
complement --> pointer
unary --> negation & complement
%% Use a longer arrow here to put `not` next to `and` and `or`.
not -----> memberAccess
multiplication & modulo & as & bitwise_and & bitwise_or & bitwise_xor & shift --> unary
not -------> memberAccess
as & multiplication & modulo & bitwise_and & bitwise_or & bitwise_xor & shift --> unary
addition --> multiplication
comparison --> modulo & addition & as & bitwise_and & bitwise_or & bitwise_xor & shift
comparison --> as & addition & modulo & bitwise_and & bitwise_or & bitwise_xor & shift
logicalOperand --> comparison & not
and & or --> logicalOperand
logicalExpression --> and & or
Expand Down Expand Up @@ -179,12 +203,14 @@ keyword and is not preceded by a period (`.`).

### Qualified names and member access

A _qualified name_ is a word that appears immediately after a period. Qualified
names appear in the following contexts:
A _qualified name_ is a word that appears immediately after a period or
rightward arrow. Qualified names appear in the following contexts:

- [Designators](/docs/design/classes.md#literals): `.` _word_
- [Simple member access expressions](member_access.md): _expression_ `.`
_word_
- [Simple pointer member access expressions](member_access.md): _expression_
`->` _word_

```
var x: auto = {.hello = 1, .world = 2};
Expand All @@ -194,6 +220,10 @@ var x: auto = {.hello = 1, .world = 2};
x.hello = x.world;
^^^^^ ^^^^^ qualified name
^^^^^^^ ^^^^^^^ member access expression
x.hello = (&x)->world;
^^^^^ qualified name
^^^^^^^^^^^ pointer member access expression
```

Qualified names refer to members of an entity determined by the context in which
Expand Down Expand Up @@ -231,6 +261,7 @@ complex than a single _word_, a compound member access expression can be used,
with parentheses around the member name:

- _expression_ `.` `(` _expression_ `)`
- _expression_ `->` `(` _expression_ `)`

```
interface I { fn F[self: Self](); }
Expand All @@ -241,34 +272,40 @@ impl X as I { fn F[self: Self]() {} }
fn Q(x: X) { x.(I.F)(); }
```

Either simple or compound member access can be part of a _pointer_ member access
expression when an `->` is used instead of a `.`, where _expression_ `->` _..._
is syntactic sugar for `(` `*` _expression_ `)` `.` _..._.
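
As a minimal sketch of that sugar applied to a compound member access, the
following reuses `I` and `X` from the example above. The empty `class X {}`
declaration and the function `R` are assumptions added for illustration, and the
snippet has not been checked against a toolchain:

```carbon
interface I { fn F[self: Self](); }
class X {}
impl X as I { fn F[self: Self]() {} }

// Hypothetical function: `p` is a pointer to an `X`.
fn R(p: X*) {
  // Compound pointer member access...
  p->(I.F)();
  // ...is sugar for dereferencing first and then using `.`.
  (*p).(I.F)();
}
```

Both calls in `R` are intended to name the same member of the same object; only
the spelling differs.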

## Operators

Most expressions are modeled as operators:

| Category | Operator | Syntax | Function |
| ---------- | ------------------------------- | --------- | --------------------------------------------------------------------- |
| Arithmetic | [`-`](arithmetic.md) (unary) | `-x` | The negation of `x`. |
| Bitwise | [`^`](bitwise.md) (unary) | `^x` | The bitwise complement of `x`. |
| Arithmetic | [`+`](arithmetic.md) | `x + y` | The sum of `x` and `y`. |
| Arithmetic | [`-`](arithmetic.md) (binary) | `x - y` | The difference of `x` and `y`. |
| Arithmetic | [`*`](arithmetic.md) | `x * y` | The product of `x` and `y`. |
| Arithmetic | [`/`](arithmetic.md) | `x / y` | `x` divided by `y`, or the quotient thereof. |
| Arithmetic | [`%`](arithmetic.md) | `x % y` | `x` modulo `y`. |
| Bitwise | [`&`](bitwise.md) | `x & y` | The bitwise AND of `x` and `y`. |
| Bitwise | [`\|`](bitwise.md) | `x \| y` | The bitwise OR of `x` and `y`. |
| Bitwise | [`^`](bitwise.md) (binary) | `x ^ y` | The bitwise XOR of `x` and `y`. |
| Bitwise | [`<<`](bitwise.md) | `x << y` | `x` bit-shifted left `y` places. |
| Bitwise | [`>>`](bitwise.md) | `x >> y` | `x` bit-shifted right `y` places. |
| Conversion | [`as`](as_expressions.md) | `x as T` | Converts the value `x` to the type `T`. |
| Comparison | [`==`](comparison_operators.md) | `x == y` | Equality: `true` if `x` is equal to `y`. |
| Comparison | [`!=`](comparison_operators.md) | `x != y` | Inequality: `true` if `x` is not equal to `y`. |
| Comparison | [`<`](comparison_operators.md) | `x < y` | Less than: `true` if `x` is less than `y`. |
| Comparison | [`<=`](comparison_operators.md) | `x <= y` | Less than or equal: `true` if `x` is less than or equal to `y`. |
| Comparison | [`>`](comparison_operators.md)  | `x > y`   | Greater than: `true` if `x` is greater than `y`.                      |
| Comparison | [`>=`](comparison_operators.md) | `x >= y` | Greater than or equal: `true` if `x` is greater than or equal to `y`. |
| Logical | [`and`](logical_operators.md) | `x and y` | A short-circuiting logical AND: `true` if both operands are `true`. |
| Logical | [`or`](logical_operators.md) | `x or y` | A short-circuiting logical OR: `true` if either operand is `true`. |
| Logical | [`not`](logical_operators.md) | `not x` | Logical NOT: `true` if the operand is `false`. |
| Category | Operator | Syntax | Function |
| ---------- | ----------------------------------- | --------- | --------------------------------------------------------------------- |
| Pointer | [`*`](pointer_operators.md) (unary) | `*x` | Pointer dereference: the object pointed to by `x`. |
| Pointer | [`&`](pointer_operators.md) (unary) | `&x` | Address-of: a pointer to the object `x`. |
| Arithmetic | [`-`](arithmetic.md) (unary) | `-x` | The negation of `x`. |
| Bitwise | [`^`](bitwise.md) (unary) | `^x` | The bitwise complement of `x`. |
| Arithmetic | [`+`](arithmetic.md) | `x + y` | The sum of `x` and `y`. |
| Arithmetic | [`-`](arithmetic.md) (binary) | `x - y` | The difference of `x` and `y`. |
| Arithmetic | [`*`](arithmetic.md) | `x * y` | The product of `x` and `y`. |
| Arithmetic | [`/`](arithmetic.md) | `x / y` | `x` divided by `y`, or the quotient thereof. |
| Arithmetic | [`%`](arithmetic.md) | `x % y` | `x` modulo `y`. |
| Bitwise | [`&`](bitwise.md) | `x & y` | The bitwise AND of `x` and `y`. |
| Bitwise | [`\|`](bitwise.md) | `x \| y` | The bitwise OR of `x` and `y`. |
| Bitwise | [`^`](bitwise.md) (binary) | `x ^ y` | The bitwise XOR of `x` and `y`. |
| Bitwise | [`<<`](bitwise.md) | `x << y` | `x` bit-shifted left `y` places. |
| Bitwise | [`>>`](bitwise.md) | `x >> y` | `x` bit-shifted right `y` places. |
| Conversion | [`as`](as_expressions.md) | `x as T` | Converts the value `x` to the type `T`. |
| Comparison | [`==`](comparison_operators.md) | `x == y` | Equality: `true` if `x` is equal to `y`. |
| Comparison | [`!=`](comparison_operators.md) | `x != y` | Inequality: `true` if `x` is not equal to `y`. |
| Comparison | [`<`](comparison_operators.md) | `x < y` | Less than: `true` if `x` is less than `y`. |
| Comparison | [`<=`](comparison_operators.md) | `x <= y` | Less than or equal: `true` if `x` is less than or equal to `y`. |
| Comparison | [`>`](comparison_operators.md)      | `x > y`   | Greater than: `true` if `x` is greater than `y`.                      |
| Comparison | [`>=`](comparison_operators.md) | `x >= y` | Greater than or equal: `true` if `x` is greater than or equal to `y`. |
| Logical | [`and`](logical_operators.md) | `x and y` | A short-circuiting logical AND: `true` if both operands are `true`. |
| Logical | [`or`](logical_operators.md) | `x or y` | A short-circuiting logical OR: `true` if either operand is `true`. |
| Logical | [`not`](logical_operators.md) | `not x` | Logical NOT: `true` if the operand is `false`. |

The binary arithmetic and bitwise operators also have
[compound assignment](/docs/design/assignment.md) forms. These are statements
38 changes: 29 additions & 9 deletions docs/design/expressions/indexing.md
Expand Up @@ -23,14 +23,17 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
## Overview

Carbon supports indexing using the conventional `a[i]` subscript syntax. When
`a` is an l-value, the result of subscripting is always an l-value, but when `a`
is an r-value, the result can be an l-value or an r-value, depending on which
`a` is a
[durable reference expression](/docs/design/values.md#durable-reference-expressions),
the result of subscripting is also a durable reference expression, but when `a`
is a [value expression](/docs/design/values.md#value-expressions), the result
can be a durable reference expression or a value expression, depending on which
interface the type implements:

- If subscripting an r-value produces an r-value result, as with an array, the
type should implement `IndexWith`.
- If subscripting an r-value produces an l-value result, as with C++'s
`std::span`, the type should implement `IndirectIndexWith`.
- If subscripting a value expression produces a value expression, as with an
array, the type should implement `IndexWith`.
- If subscripting a value expression produces a durable reference expression,
as with C++'s `std::span`, the type should implement `IndirectIndexWith`.

`IndirectIndexWith` is a subtype of `IndexWith`, and subscript expressions are
rewritten to method calls on `IndirectIndexWith` if the type is known to
Expand All @@ -39,6 +42,19 @@ implement that interface, or to method calls on `IndexWith` otherwise.
`IndirectIndexWith` provides a final blanket `impl` of `IndexWith`, so a type
can implement at most one of those two interfaces.

The `Addr` methods of these interfaces, which are used to form durable reference
expressions on indexing, must return a pointer and work similarly to the
[pointer dereference customization interface](/docs/design/values.md#dereferencing-customization).
The returned pointer is then dereferenced by the language to form the reference
expression referring to the pointed-to object. These methods must return a raw
pointer, and do not automatically chain with customized dereference interfaces.

**Open question:** It's not clear that the lack of chaining is necessary. It
might be more expressive for the pointer type returned by the `Addr` methods to
be an associated type with a default. That would allow types to produce custom
pointer-like types on their indexing boundary and still have them automatically
dereferenced.

## Details

A subscript expression has the form "_lhs_ `[` _index_ `]`". As in C++, this
Expand All @@ -61,13 +77,15 @@ interface IndirectIndexWith(SubscriptType:! type) {
```

A subscript expression where _lhs_ has type `T` and _index_ has type `I` is
rewritten based on the value category of _lhs_ and whether `T` is known to
rewritten based on the expression category of _lhs_ and whether `T` is known to
implement `IndirectIndexWith(I)`:

- If `T` implements `IndirectIndexWith(I)`, the expression is rewritten to
"`*((` _lhs_ `).(IndirectIndexWith(I).Addr)(` _index_ `))`".
- Otherwise, if _lhs_ is an l-value, the expression is rewritten to "`*((`
_lhs_ `).(IndexWith(I).Addr)(` _index_ `))`".
- Otherwise, if _lhs_ is a
[_durable reference expression_](/docs/design/values.md#durable-reference-expressions),
the expression is rewritten to "`*((` _lhs_ `).(IndexWith(I).Addr)(` _index_
`))`".
- Otherwise, the expression is rewritten to "`(` _lhs_ `).(IndexWith(I).At)(`
_index_ `)`".
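
The three rewrite rules above can be sketched side by side. In this
illustration, `MySpan`, `MyArray`, and `Sketch` are hypothetical names, and the
snippet assumes that `MySpan` implements `IndirectIndexWith(i32)`, that
`MyArray` implements only `IndexWith(i32)`, and that both use `i32` as their
element type. It has not been checked against a toolchain:

```carbon
// Assumed to be declared elsewhere (hypothetical):
//   `MySpan` implements `IndirectIndexWith(i32)` with `i32` elements.
//   `MyArray` implements only `IndexWith(i32)` with `i32` elements.
fn Sketch(span: MySpan, array_ptr: MyArray*, array_value: MyArray, i: i32) {
  // Rule 1: `MySpan` implements `IndirectIndexWith(i32)`, so the expression
  // category of `span` does not matter.
  let a: i32 = span[i];
  let a_rewritten: i32 = *((span).(IndirectIndexWith(i32).Addr)(i));

  // Rule 2: `*array_ptr` is a durable reference expression and `MyArray`
  // does not implement `IndirectIndexWith(i32)`.
  let b: i32 = (*array_ptr)[i];
  let b_rewritten: i32 = *((*array_ptr).(IndexWith(i32).Addr)(i));

  // Rule 3: `array_value` is a value expression.
  let c: i32 = array_value[i];
  let c_rewritten: i32 = (array_value).(IndexWith(i32).At)(i);
}
```

Each `_rewritten` line spells out the method call that the subscript expression
on the preceding line is expected to become under the corresponding rule.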

Expand Down Expand Up @@ -136,3 +154,5 @@ Carbon API.

- Proposal
[#2274: Subscript syntax and semantics](https://github.com/carbon-language/carbon-lang/pull/2274)
- Proposal
[#2006: Values, variables, and pointers](https://github.com/carbon-language/carbon-lang/pull/2006)
19 changes: 18 additions & 1 deletion docs/design/expressions/member_access.md
Expand Up @@ -29,9 +29,12 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
## Overview

A _qualified name_ is a [word](../lexical_conventions/words.md) that is preceded
by a period. The name is found within a contextually determined entity:
by a period or a rightward arrow. The name is found within a contextually
determined entity:

- In a member access expression, this is the entity preceding the period.
- In a pointer member access expression, this is the entity pointed to by the
pointer preceding the rightward arrow.
- For a designator in a struct literal, the name is introduced as a member of
the struct type.

Expand All @@ -43,10 +46,12 @@ A member access expression is either a _simple_ member access expression of the
form:

- _member-access-expression_ ::= _expression_ `.` _word_
- _member-access-expression_ ::= _expression_ `->` _word_

or a _compound_ member access of the form:

- _member-access-expression_ ::= _expression_ `.` `(` _expression_ `)`
- _member-access-expression_ ::= _expression_ `->` `(` _expression_ `)`

Compound member accesses allow specifying a qualified member name.

Expand All @@ -66,14 +71,26 @@ class Cog {
fn GrowSomeCogs() {
var cog1: Cog = Cog.Make(1);
var cog2: Cog = cog1.Make(2);
var cog_pointer: Cog* = &cog2;
let cog1_size: i32 = cog1.size;
cog1.Grow(1.5);
cog2.(Cog.Grow)(cog1_size as f64);
cog1.(Widget.Grow)(1.1);
cog2.(Widgets.Cog.(Widgets.Widget.Grow))(1.9);
cog_pointer->Grow(0.75);
cog_pointer->(Widget.Grow)(1.2);
}
```

Pointer member access expressions are those using a `->` instead of a `.` and
their semantics are exactly what would result from first dereferencing the
expression preceding the `->` and then forming a member access expression using
a `.`. For example, a simple pointer member access expression _expression_ `->`
_word_ becomes `(` `*` _expression_ `)` `.` _word_. More details on this syntax
and semantics can be found in the [pointers](/docs/design/values.md#pointers)
design. The rest of this document describes the semantics using `.` alone for
simplicity.
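
A minimal sketch of this equivalence for a simple pointer member access follows.
The `Point` class and the `Sketch` function are hypothetical names introduced
only for this illustration, and the snippet has not been checked against a
toolchain:

```carbon
// Hypothetical class used only for this illustration.
class Point {
  var x: i32;
  var y: i32;
}

fn Sketch() {
  var p: Point = {.x = 1, .y = 2};
  var ptr: Point* = &p;

  // Simple pointer member access...
  ptr->x = 10;
  // ...has the same meaning as dereferencing and then using `.`.
  (*ptr).x = 10;
}
```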

A member access expression is processed using the following steps:

- First, the word or parenthesized expression to the right of the `.` is
57 changes: 57 additions & 0 deletions docs/design/expressions/pointer_operators.md
@@ -0,0 +1,57 @@
# Pointer operators

<!--
Part of the Carbon Language project, under the Apache License v2.0 with LLVM
Exceptions. See /LICENSE for license information.
SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-->

<!-- toc -->

## Table of contents

- [Overview](#overview)
- [Details](#details)
- [Precedence](#precedence)
- [Alternatives considered](#alternatives-considered)
- [References](#references)

<!-- tocstop -->

## Overview

Carbon provides the following operators related to pointers:

- `&` as a prefix unary operator takes the address of an object, forming a
pointer to it.
- `*` as a prefix unary operator dereferences a pointer.

Note that [member access expressions](member_access.md) include an `->` form
that implicitly performs a dereference in the same way as the `*` operator.

## Details

The semantic details of pointer operators are collected in the main
[pointers](/docs/design/values.md#pointers) design. The syntax and precedence
details are covered here.

The syntax tries to remain as similar as possible to C++ pointer types as they
are commonly written in code. Pointers are expected to be extremely common and a
key anchor of syntactic similarity between the languages.

### Precedence

These operators have high precedence. Only [member access](member_access.md)
expressions can be used as an unparenthesized operand to them.

The two prefix operators `&` and `*` generally have higher precedence than the
other unary and binary operators, so they can appear inside those operators as
unparenthesized operands. For the full details, see the
[precedence graph](README.md#precedence).
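
As an illustrative sketch of these precedence rules, the following shows which
operands need no parentheses. The `Pair` class and the `Sketch` function are
hypothetical names, and the snippet has not been checked against a toolchain:

```carbon
// Hypothetical class used only for this illustration.
class Pair {
  var first: i32;
  var second: i32;
}

fn Sketch() {
  var pair: Pair = {.first = 1, .second = 2};
  var n: i32 = 3;

  // Member access binds more tightly than `&`, so this takes the address of
  // `pair.first`, as if written `&(pair.first)`.
  var first_ptr: i32* = &pair.first;
  var n_ptr: i32* = &n;

  // A dereference can be an unparenthesized operand of another unary
  // operator, as if written `-(*n_ptr)`.
  let negated: i32 = -*n_ptr;

  // It can also be an unparenthesized operand of a binary operator, as if
  // written `(*first_ptr) + (*n_ptr)`.
  let sum: i32 = *first_ptr + *n_ptr;
}
```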

## Alternatives considered

- [Alternative pointer syntaxes](/proposals/p2006.md#alternative-pointer-syntaxes)

## References

- [Proposal #2006: Values, variables, and pointers](/proposals/p2006.md)