Extension methods #1122

Closed
wants to merge 8 commits into from
82 changes: 81 additions & 1 deletion docs/design/classes.md
@@ -37,6 +37,7 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
- [Member functions](#member-functions)
- [Class functions](#class-functions)
- [Methods](#methods)
- [Extension methods](#extension-methods)
- [Name lookup in member function definitions](#name-lookup-in-member-function-definitions)
- [Nominal data classes](#nominal-data-classes)
- [Member type](#member-type)
@@ -74,6 +75,8 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
- [No `static` variables](#no-static-variables)
- [Computed properties](#computed-properties)
- [Interfaces implemented for data classes](#interfaces-implemented-for-data-classes)
- [Alternatives considered](#alternatives-considered)
- [References](#references)

<!-- tocstop -->

@@ -889,10 +892,12 @@ var c: Circle = {.center = Point.Origin(), .radius = 1.5 };
Assert(Math.Abs(c.Diameter() - 3.0) < 0.001);
c.Expand(0.5);
Assert(Math.Abs(c.Diameter() - 4.0) < 0.001);
// ❌ Cannot call a method directly.
Circle.Expand(&c, 1.1);
```

- Methods are called using the dot `.` member syntax, `c.Diameter()` and
`c.Expand(`...`)`.
`c.Expand(`...`)`, and cannot be called directly.
- `Diameter` computes and returns the diameter of the circle without modifying
the `Circle` instance. This is signified using `[me: Self]` in the method
declaration.
@@ -910,6 +915,71 @@ the `me` parameter must be in the same list in square brackets `[`...`]`. The
`me` parameter may appear in any position in that list, as long as it appears
after any names needed to describe its type.

#### Extension methods

A function can have a `me` parameter in its implicit parameter list regardless
of the scope in which it is declared. Any function with a `me` parameter is a
[method](#methods), but only those functions declared inside a class are member
functions. Methods that are neither member functions nor
[`impl` members](generics/details.md#qualified-member-names) nor adapter
members, and methods whose `me` parameter does not accept an argument of type
`Self` or, for `addr me`, `Self*`, are called
[_extension methods_](https://en.wikipedia.org/wiki/Extension_method).

Extension methods are not found by the name lookup used in
[simple member access](expressions/member_access.md), so compound member access
notation must be used to call an extension method:

```
class Rectangle {
  var width: i32;
  var height: i32;
}
fn Area[me: Rectangle]() -> i32 { return me.width * me.height; }

var r: Rectangle = {.width = 6, .height = 9};
// ✅ Finds function `Area` defined above.
var answer: i32 = r.(Area)();
// ❌ No function `Area` declared in `Rectangle`.
var bad_answer: i32 = r.Area();
```

**Note:** It's also possible to alias an extension method into a class and call
it using simple member access notation.

The type of `me` is not required to be a class type:
Contributor:

I think `i32` is going to be a class type though...

Maybe "There are no restrictions on the types that can be used with `me` parameters."?


```
namespace PairUtils;
fn PairUtils.Sum[U:! Type, T:! AddableWith(U), me: (T, U)]() -> T.Result {
  let (t: T, u: U) = me;
  return t + u;
}
var five: i32 = (2, 3).(PairUtils.Sum)();
```

The presence of a `me` parameter does not affect whether a `Self` type is
available, and the type of `me` does not affect the type of `Self` nor which
members are accessible:

```
class Counter {
  var count: i32;
}
class Extender {
  // ✅ `Self*` here is `Extender*`, even though `me` is of type `Counter*`.
  fn Add[addr me: Counter*](inc: i32, ext: Self*) {
    me->count += inc;
    // ✅ Can access private member of `Extender`.
    ++ext->times_added_to_counter;
  }
  private var times_added_to_counter: i32 = 0;
}
fn Run(c: Counter, e: Extender) {
  c.(Extender.Add)(5, &e);
}
```

#### Name lookup in member function definitions

When defining a member function lexically inline, we delay type checking of the
@@ -1923,3 +1993,13 @@ comparable to `{.x = 3.14, .y = 2}`. The trick is how to declare the criteria
that "`T` is comparable to `U` if they have the same field names in the same
order, and for every field `x`, the type of `T.x` implements `ComparableTo` for
the type of `U.x`."

## Alternatives considered

- **TODO:** Fill this in.
- [Disallow extension methods](/proposals/p1122.md#alternatives-considered)

## References

- Proposal
[#1122: Extension methods](https://github.com/carbon-language/carbon-lang/pull/1122).
12 changes: 6 additions & 6 deletions docs/design/expressions/member_access.md
@@ -543,16 +543,16 @@ access or as the target of an `alias` declaration.

```carbon
 class C {
-  fn StaticMethod();
+  fn StaticMemberFunction();
   var field: i32;
   class Nested {}
 }
-fn CallStaticMethod(c: C) {
-  // ✅ OK, calls `C.StaticMethod`.
-  C.StaticMethod();
+fn CallStaticMemberFunction(c: C) {
+  // ✅ OK, calls `C.StaticMemberFunction`.
+  C.StaticMemberFunction();

-  // ✅ OK, evaluates expression `c` then calls `C.StaticMethod`.
-  c.StaticMethod();
+  // ✅ OK, evaluates expression `c` then calls `C.StaticMemberFunction`.
+  c.StaticMemberFunction();

   // ❌ Error: name of instance member `C.field` can only be used in a
   // member access or alias.
240 changes: 240 additions & 0 deletions proposals/p1122.md
@@ -0,0 +1,240 @@
# Extension methods

<!--
Part of the Carbon Language project, under the Apache License v2.0 with LLVM
Exceptions. See /LICENSE for license information.
SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-->

[Pull request](https://github.com/carbon-language/carbon-lang/pull/1122)

<!-- toc -->

## Table of contents

- [Problem](#problem)
- [Background](#background)
- [Problems with extension methods](#problems-with-extension-methods)
- [Proposal](#proposal)
- [Details](#details)
- [`me` versus `Self`](#me-versus-self)
- [Rationale based on Carbon's goals](#rationale-based-on-carbons-goals)
- [Alternatives considered](#alternatives-considered)
- [Require `me` to have a type involving `Self`](#require-me-to-have-a-type-involving-self)
- [Disallow `me` where `Self` is not in scope](#disallow-me-where-self-is-not-in-scope)

<!-- tocstop -->

## Problem

[Extension methods](https://en.wikipedia.org/wiki/Extension_method) provide a
convenient way for one part of a program to extend an interface provided by
another part of the same program. The feature is useful but not essential.

In most languages with extension methods, they are invoked with a syntax that
exactly matches normal method call syntax. This creates serious
[problems](#problems-with-extension-methods). However, Carbon's notations for
`me` parameters and for compound member access can provide the functionality
of extension methods without those costs.

It is currently unclear whether Carbon's design permits extension methods: while
there is an obvious syntax that would provide them, we have no explicit rule or
decision that says whether they are permitted.

## Background

[Wikipedia](https://en.wikipedia.org/wiki/Extension_method) provides a good
description of the state of extension methods in popular languages. C++ has no
corresponding feature. In Rust, all methods, whether in an inherent `impl` or a
trait `impl`, effectively declare extension methods, as they are found by member
access and aren't provided as part of the definition of the type.

### Problems with extension methods

The major problem with extension methods is that they hinder library
evolution: if library A provides a type, library B provides an extension
method for that type, and a consumer C of libraries A and B calls that extension
method, then the pure addition of a matching method in library A may result in
ambiguities. Alternatively, if the extension method is preferred over the method
in the type, then a pure addition of an extension method may result in
ambiguities.

Worse, if two separate libraries provide an extension method for the same type
with the same name, any consumer of both libraries will have problems accessing
either extension method, and a pure addition of an extension method in one
library risks introducing ambiguity with an extension method in a different
library.

A separate but related concern is that lookup for extension methods may in
general need to look in a large number of unrelated places. If the receiver type
can be generic, non-trivial inference and checking steps may be required for
each potential candidate to determine which operation should be used. This is
reminiscent of the high compile-time costs of ADL for overloaded operators in
C++, and with good reason: ad-hoc out-of-class operator overloads have a lot of
the same properties as extension methods.

## Proposal

Carbon does not restrict which functions can have a `me` parameter, nor what
types `me` can have.

## Details

We give the name "extension method" to a method that can't ever be called using
a simple member access notation `a.Function()`. Whether a method is an extension
method or not makes no difference to the rules, but the term is expected to be
useful conversationally.

No changes are made to the language rules, but for clarity, here are the
consequences of the current rules:

- The terms "method" and "member function" are now completely orthogonal: we
have both member functions that are not methods and methods that are not
member functions.
```
fn NotMethodNotMemberFunction();
fn MethodButNotMemberFunction[me: i32]();
class ClassWithMemberFunctions {
  fn NotMethodButMemberFunction();
  fn MethodAndMemberFunctionToo[me: Self]();
}
```
- Any method can be called using compound member access notation
`a.(Function)()`, where the name `Function` may be qualified if necessary.
- Methods can only be called using member access notation. For example,
`MethodButNotMemberFunction(5)` is an error, as is
`ClassWithMemberFunctions.MethodAndMemberFunctionToo(class_instance)`.
- The existence and value of the name `Self` depends on the lexical context in
which `Self` appears. Whether `me` is declared, and what type it has, are
not relevant.
- If the type of `me` is parameterized, a method, including one declared
inside a class, can be used for any type satisfying the parameters:
```
class SomethingMachine {
  fn DoSomething[template T:! Type, me: T]();
}
// ✅ Calls the above method with `T` deduced as `IntLiteral(47)`.
47.(SomethingMachine.DoSomething)();
```
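
As a sketch of how these consequences combine (written in this proposal's notation and not checked against any toolchain), the snippet below repeats the four declarations from the first bullet and shows which call forms the rules above accept. `CallEach` and `instance` are hypothetical names introduced only for illustration.

```carbon
// Declarations repeated from the list above so the sketch is self-contained.
fn NotMethodNotMemberFunction();
fn MethodButNotMemberFunction[me: i32]();
class ClassWithMemberFunctions {
  fn NotMethodButMemberFunction();
  fn MethodAndMemberFunctionToo[me: Self]();
}

// Hypothetical caller, used only to show the call forms.
fn CallEach(instance: ClassWithMemberFunctions) {
  // ✅ Not a method: ordinary call syntax.
  NotMethodNotMemberFunction();
  // ✅ Member function without `me`: named through the class.
  ClassWithMemberFunctions.NotMethodButMemberFunction();
  // ✅ Method and member function: simple member access finds it.
  instance.MethodAndMemberFunctionToo();
  // ✅ Any method can also be called with compound member access.
  instance.(ClassWithMemberFunctions.MethodAndMemberFunctionToo)();
  // ✅ Extension method: compound member access is the only way to call it.
  5.(MethodButNotMemberFunction)();
  // ❌ Error: methods can only be called using member access notation.
  MethodButNotMemberFunction(5);
  // ❌ Error: the same rule applies to a qualified name.
  ClassWithMemberFunctions.MethodAndMemberFunctionToo(instance);
}
```

The only form that distinguishes an extension method from other methods is simple member access, which never finds it.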

Note that this approach avoids both the evolutionary problems and the
compile-time performance problems by using normal name lookup to resolve the
name of extension methods, rather than augmenting member access to additionally
look for non-member names.
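
To illustrate the evolution argument, here is a minimal sketch in this proposal's notation (not checked against any toolchain). `Widget`, `LibB`, `LibC`, `Describe`, and `Use` are hypothetical names; the two namespaces stand in for two independent libraries that each add a same-named extension method.

```carbon
class Widget {
  var id: i32;
}

// Hypothetical stand-ins for two unrelated libraries.
namespace LibB;
fn LibB.Describe[me: Widget]() -> i32 { return me.id; }

namespace LibC;
fn LibC.Describe[me: Widget]() -> i32 { return me.id + 1; }

fn Use(w: Widget) -> i32 {
  // ✅ Each name is resolved by ordinary qualified name lookup, so the two
  // `Describe` functions never compete with each other.
  var b: i32 = w.(LibB.Describe)();
  var c: i32 = w.(LibC.Describe)();
  // Simple member access looks only inside `Widget`. If `Widget` later gains
  // a `Describe` member, `w.Describe()` would find that member, and the two
  // calls above would keep their current meaning.
  return b + c;
}
```

Because the caller always names the function it wants, adding a method to `Widget` or to either namespace cannot silently redirect or break an existing call.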

### `me` versus `Self`

The two most common declarations of the `me` parameter in methods are expected to be
`me: Self` and `addr me: Self*`. As a result, at least in cases where `Self` is
in scope, there is likely to be a strong association between `me` and `Self` in
the minds of readers. This association would be even stronger if we chose to
rename `me` to `self`, which is likely to be something we consider going
forward.

However, because `me` is sometimes a value and sometimes a pointer, it is
already the case that a Carbon developer cannot reason about the basic nature of
`me` without reference to its declaration, so allowing additional variations in
the type of `me` doesn't introduce as high a cost as introducing more variations
in the type of `this` would in C++.

This proposal does not attempt to provide any guarantees that `me` and `Self`
are similar or even compatible, though we expect that in practice people will
avoid defining methods where the types are substantially different. There are
various anticipated use cases that seem valuable to permit:

- `me: T`
- `addr me: PointerLike(T)`
- `addr me: P`

Where:

- `T` is `Self`, a base class of `Self`, a derived class of `Self`, an adapter
for one of these, or a type parameter with a constraint that restricts to
types like these.
- `PointerLike(T)` is `T*` or some other pointer-like type wrapper, such as
`SharedPtr(T)` or `NonNull(T*)` or `ImmutablePtr(T)`.
- `P` is some parameterized type with a constraint that restricts it to
forming types like `PointerLike(T)`.

We expect well-written Carbon code will not stray outside these bounds.

## Rationale based on Carbon's goals

- [Software and language evolution](/docs/project/goals.md#software-and-language-evolution)
- Avoids creating problems for library evolution by not introducing any
new name lookup rules.
- [Code that is easy to read, understand, and write](/docs/project/goals.md#code-that-is-easy-to-read-understand-and-write)
- Improves ergonomics by allowing local, ad-hoc introduction of extension
methods without the syntactic overhead of an adapter.

Contributor (comment on lines +168 to +169):

I'm unconvinced by the ergonomic argument.

This is noted by the example at https://en.wikipedia.org/wiki/Extension_method#Current_C#_solutions, where they compare string y = Utility.Reverse(x); with the more desirable string y = x.Reverse();. It seems like this proposal would provide string y = x.(Utility.Reverse)();, but the key advantage noted in the article is eliding the Utility container class (and also, unless qualified function calls turn out to be common, I think the extra parens will be awkward in practice).

Put differently, I feel like the main advantage of extension methods in other languages would be that they're seamless. However, the whole argument against that here is that you want to avoid new name lookup rules. In which case, I don't think extension methods are a reasonable comparison (and you might also want to consider the alternative: provide extension methods similar to how they exist elsewhere).

Contributor:

An ergonomic disadvantage is also that, assuming I'm understanding correctly that this only moves the location of Utility.Reverse in the call, I think there'll be limited adoption. i.e., it's not so unambiguous an advantage that developers will flock to it (I don't understand why I would use this, but I assume other developers would think the opposite), and there'll just be multiple ways of offering member-method-like functions for classes.

Contributor Author:

You can get pretty close to the C# syntax, but it requires an explicit opt-in:

alias Reverse = Utility.Reverse;
// ...
var y: String = x.(Reverse)();

And yeah, I also don't expect developers to flock to this. But we need some rule in this space, and the only question seems to be how much of a barrier and what amount of language complication we want to put in front of people -- even the most restrictive of the alternatives below still permits writing extension methods, with more cumbersome syntax. The simple, orthogonal thing to do is to not restrict which functions can have a `me`.

There is one way in which this isn't just moving the location of `Utility.Reverse` in the call: only `me` has the special `addr me` behavior that implicitly takes the address of the value before the `.`.

Contributor:

One ergonomic advantage of method call syntax is that, if the codebase uses a natural language like English that has SVO word order, the method can sometimes be named so that the call syntax matches that order, which can make the code clearer. For example, I find x.Contains(y) substantially clearer than Contains(x, y), because it matches the order of the English phrase "x contains y". That not only reduces my mental effort in parsing the code, it makes me more confident in the meaning of the code: x.Contains(y) is very unlikely to mean "y contains x", whereas Contains(x, y) plausibly could.

Extension methods make that option available even when the "subject" and "verb" are defined by different libraries. However, that may be undermined by the need to qualify the method name in most cases: x.(Utility.Contains)(y) doesn't read very naturally, although it's probably still less ambiguous about the roles of x and y. And of course this advantage doesn't apply at all if the codebase's natural language doesn't use SVO order.

Contributor (@jonmeow, Mar 11, 2022):

To be clear, I'm not objecting to extension methods (as a concept), I'm objecting that the syntactic overhead undermines the ergonomic benefit of this particular approach. Some of the question may be from the intuition: is compound member access going to be something that's really common in Carbon, or something that developers rarely see? My assumption is the latter, but maybe others differ.

If compound member access is common, then it reduces the cognitive costs of the feature by making it something people are accustomed to; if rare, then developers will be learning the feature when they see it (and perhaps forgetting it quickly due to disuse), and leading to slower understanding of code and likely more bugs.

Contains(x, y) may not read as great, but it's going to be familiar to C++ developers as a function call, even versus x.(Contains)(y).

In other words -- maybe similar things would exist regardless, as noted in the alternatives, and determined developers would surely find them. But pushing developers towards a smaller set of common approaches has readability benefits too; I'd argue it's lower cognitive load for developers.

- Allowing methods to be declared anywhere is a simpler rule to remember
and understand than introducing some kind of restriction that the
receiver type must be in some way related to the `class` or `impl` in
which it is declared, or restricting to only classes but not restricting
to related classes. However, this is a departure from the model in C++
where methods must be member functions, which may introduce some initial
cognitive cost.
- A uniform name lookup rule is easier to understand than other approaches
towards extension methods.
- [Interoperability with and migration from existing C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code)
- Calling extension methods from C++ may present a challenge, as there is
no C++ syntax to do so, but this challenge already exists when calling
methods of an external impl from C++, so this cost is likely minimal.

This approach is also motivated by
[MacLennan's](https://csis.pace.edu/~bergin/slides/Maclennan.html) principles of
orthogonality, regularity, and simplicity.

## Alternatives considered

We could disallow extension methods, in various different ways.

### Require `me` to have a type involving `Self`

We could restrict `me` parameters to only locations where `Self` is in scope,
and require its type to be compatible with `Self` in some sense.

This rule can be worked around with an adapter:

```
adapter ExtensionMethodWrapper for i32 {
  fn ExtensionMethod[me: Self]();
}
alias ExtensionMethod = ExtensionMethodWrapper.ExtensionMethod;
// ✅ OK, unless we add some other rule to prevent it.
5.(ExtensionMethod)();
```

Advantages:

- Every method is a method on the current `Self` type, making the receiver
easier to reason about.

Disadvantages:

- Gets in the way of some use cases, such as providing a method whose receiver
type is a derived-class type.
- Requires additional syntactic overhead to customize what names can appear
after a `.`, such as defining an `adapter`.
- Unclear in exactly what sense we would require type compatibility. This
especially applies when the type of `me` is parameterized.

### Disallow `me` where `Self` is not in scope

We could restrict `me` parameters to only functions that are defined in the
scope of a type.

With this restriction, one can still write extension methods as described in
this proposal with a local syntactic workaround:

```
class ExtensionMethodWrapper {
  fn ExtensionMethod[me: i32]();
}
alias ExtensionMethod = ExtensionMethodWrapper.ExtensionMethod;
// ✅ OK, unless we add some other rule to prevent it.
5.(ExtensionMethod)();
```

As a consequence, this doesn't provide any real guarantees about the receiver
type of a method nor where a method called on an object may be declared.