Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add first-class span types proposal #7904

Merged
merged 2 commits into from
Feb 20, 2024
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
285 changes: 285 additions & 0 deletions proposals/first-class-span-types.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,285 @@
# First-class Span Types

## Summary

We introduce first-class support for `Span<T>` and `ReadOnlySpan<T>` in the language, including new implicit conversion types and consider them in more places,
allowing more natural programming with these integral types.

## Motivation

Since their introduction in C# 7.2, `Span<T>` and `ReadOnlySpan<T>` have worked their way into the language and base class library (BCL) in many key ways. This is great for
developers, as their introduction improves performance without costing developer safety. However, the language has held these types at arm's length in a few key ways,
which makes it hard to express the intent of APIs and leads to a significant amount of surface area duplication for new APIs. For example, the BCL has added a number of new
[tensor primitive APIs](https://github.com/dotnet/runtime/issues/94553) in .NET 9, but these APIs are all offered on `ReadOnlySpan<T>`. Because C# doesn't recognize the
relationship between `ReadOnlySpan<T>`, `Span<T>`, and `T[]`, it means that any developers looking to use those APIs with anything other than a `ReadOnlySpan<T>` have to explicitly
convert to a `ReadOnlySpan<T>`. Further, it also means that they don't have IDE tooling guiding them to use these APIs, since nothing will indicate to the IDE that it is valid
to pass them after conversion. There are also issues with generic inference in these scenarios. In order to provide maximum usability for this style of API, the BCL will have to
define an entire set of `Span<T>` and `T[]` overloads, which is a lot of duplicate surface area to maintain for no real gain. This proposal seeks to address the problem by
having the language more directly recognize these types and conversions.

## Detailed Design

### Implicit Span Conversions

We add a new type of implicit conversion to the list in [§10.2.1](https://github.com/dotnet/csharpstandard/blob/draft-v8/standard/conversions.md#1021-general), an
_implicit span conversion_. This conversion is defined as follows:

------

An implicit span conversion permits `array_types`, `System.Span<T>`, `System.ReadOnlySpan<T>`, and `string` to be converted between each other as follows:
* From any `array_type` with element type `Ei` to `System.Span<Ei>`
333fred marked this conversation as resolved.
Show resolved Hide resolved
* From any `array_type` with element type `Ei` to `System.ReadOnlySpan<Ui>`, provided that `Ei` is covariance-convertible ([§18.2.3.3](https://github.com/dotnet/csharpstandard/blob/draft-v8/standard/interfaces.md#18233-variance-conversion)) to `Ui`
* From `System.Span<Ti>` to `System.ReadOnlySpan<Ui>`, provided that `Ti` is covariance-convertible ([§18.2.3.3](https://github.com/dotnet/csharpstandard/blob/draft-v8/standard/interfaces.md#18233-variance-conversion)) to `Ui`
* From `System.ReadOnlySpan<Ti>` to `System.ReadOnlySpan<Ui>`, provided that `Ti` is covariance-convertible ([§18.2.3.3](https://github.com/dotnet/csharpstandard/blob/draft-v8/standard/interfaces.md#18233-variance-conversion)) to `Ui`
* From `string` to `System.ReadOnlySpan<char>`

------

We also add _implicit span conversion_ to the list of standard implicit conversions
([§10.5.4](https://github.com/dotnet/csharpstandard/blob/draft-v8/standard/conversions.md#1054-user-defined-implicit-conversions)). This allows overload resolution to consider
them when performing argument resolution, as in the previously-linked API proposal.

We also add _implicit span conversion_ to the list of acceptable implicit conversions on the first parameter of an extension method when determining applicability
([12.8.9.3](https://github.com/dotnet/csharpstandard/blob/draft-v8/standard/expressions.md#12893-extension-method-invocations)) (change in bold):

> An extension method `Cᵢ.Mₑ` is ***eligible*** if:
>
> - `Cᵢ` is a non-generic, non-nested class
> - The name of `Mₑ` is *identifier*
> - `Mₑ` is accessible and applicable when applied to the arguments as a static method as shown above
> - An implicit identity, reference ~~or boxing~~**, boxing, or span** conversion exists from *expr* to the type of the first parameter of `Mₑ`.
333fred marked this conversation as resolved.
Show resolved Hide resolved

#### Variance

The goal of the variance section in _implicit span conversion_ is to replicate some amount of covariance for `System.ReadOnlySpan<T>`. Runtime changes would be required to fully
implement variance through generics here (see https://github.com/dotnet/csharplang/blob/main/proposals/ref-struct-interfaces.md for using `ref struct` types in generics), but we can
allow a limited amount of covariance through use of a proposed .NET 9 API: https://github.com/dotnet/runtime/issues/96952. This will allow the language to treat `System.ReadOnlySpan<T>`
as if the `T` was declared as `out T` in some scenarios. We do not, however, plumb this variant conversion through _all_ variance scenarios, and do not add it to the definition of
variance-convertible in [§18.2.3.3](https://github.com/dotnet/csharpstandard/blob/draft-v8/standard/interfaces.md#18233-variance-conversion). If in the future, we change the runtime
to more deeply understand the variance here, we can take the minor breaking change to fully recognize it in the language.

Practically, this will also mean that in pattern matching for generic scenarios, we'd have behavior as follows:

```cs
using System;

M<object[]>(["0"]); // Does not print
M<ReadOnlySpan<string>>(["1"]); // Does not print
M<Span<object>>(["2"]); // Does not print
M<ReadOnlySpan<object>>(["3"]); // Prints

void M<T>(T t) where T : allows ref struct
{
if (t is ReadOnlySpan<object> r) Console.WriteLine(r[0]);
}
```

In array variance scenarios, this pattern would return true for all reference type arrays:

```cs
using System;

M<object[]>(["0"]); // Prints
M<string[]>(["1"]); // Prints

void M<T>(T t)
{
if (t is object[] r) Console.WriteLine(r[0]);
}
```

There is also an open question below about participation in delegate signature matching.

### Type inference

We update the type inferences section of the specification as follows (changes in **bold**).

> #### 12.6.3.9 Exact inferences
>
> An *exact inference* *from* a type `U` *to* a type `V` is made as follows:
>
> - If `V` is one of the *unfixed* `Xᵢ` then `U` is added to the set of exact bounds for `Xᵢ`.
> - Otherwise, sets `V₁...Vₑ` and `U₁...Uₑ` are determined by checking if any of the following cases apply:
> - `V` is an array type `V₁[...]` and `U` is an array type `U₁[...]` of the same rank
> - **`V` is an array type `V₁[]` and `U` is a `Span<U₁>` or `ReadOnlySpan<U₁>`**
> - **`V` is a `Span<V₁>` and `U` is a `Span<U₁>` or `ReadOnlySpan<U₁>`**
> - **`V` is a `ReadOnlySpan<V₁>` and `U` is a `ReadOnlySpan<U₁>`**
> - `V` is the type `V₁?` and `U` is the type `U₁`
> - `V` is a constructed type `C<V₁...Vₑ>` and `U` is a constructed type `C<U₁...Uₑ>`
> If any of these cases apply then an *exact inference* is made from each `Uᵢ` to the corresponding `Vᵢ`.
> - Otherwise, no inferences are made.
>
> #### 12.6.3.10 Lower-bound inferences
>
> A *lower-bound inference from* a type `U` *to* a type `V` is made as follows:
>
> - If `V` is one of the *unfixed* `Xᵢ` then `U` is added to the set of lower bounds for `Xᵢ`.
> - Otherwise, if `V` is the type `V₁?` and `U` is the type `U₁?` then a lower bound inference is made from `U₁` to `V₁`.
> - Otherwise, sets `U₁...Uₑ` and `V₁...Vₑ` are determined by checking if any of the following cases apply:
> - `V` is an array type `V₁[...]` and `U` is an array type `U₁[...]`of the same rank
> - **`V` is an array type `V₁[]` and `U` is a `Span<U₁>` or `ReadOnlySpan<U₁>`**
> - **`V` is a `Span<V₁>` and `U` is a `Span<U₁>` or `ReadOnlySpan<U₁>`**
> - **`V` is a `ReadOnlySpan<V₁>` and `U` is a `ReadOnlySpan<U₁>`**
> - `V` is one of `IEnumerable<V₁>`, `ICollection<V₁>`, `IReadOnlyList<V₁>>`, `IReadOnlyCollection<V₁>` or `IList<V₁>` and `U` is a single-dimensional array type `U₁[]`
> - `V` is a constructed `class`, `struct`, `interface` or `delegate` type `C<V₁...Vₑ>` and there is a unique type `C<U₁...Uₑ>` such that `U` (or, if `U` is a type `parameter`, its effective base class or any member of its effective interface set) is identical to, `inherits` from (directly or indirectly), or implements (directly or indirectly) `C<U₁...Uₑ>`.
> - (The “uniqueness” restriction means that in the case interface `C<T>{} class U: C<X>, C<Y>{}`, then no inference is made when inferring from `U` to `C<T>` because `U₁` could be `X` or `Y`.)
> If any of these cases apply then an inference is made from each `Uᵢ` to the corresponding `Vᵢ` as follows:
> - If `Uᵢ` is not known to be a reference type then an *exact inference* is made
> - Otherwise, if `U` is an array type then ~~a *lower-bound inference* is made~~ **inference depends on the type of `V`**:
> - **If `V` is a `Span<Vᵢ>`, then an *exact inference* is made**
> - **If `V` is an array type or a `ReadOnlySpan<Vᵢ>`, then a *lower-bound inference* is made**
> - **Otherwise, if `U` is a `Span<Uᵢ>` then inference depends on the type of `V`**:
> - **If `V` is a `Span<Vᵢ>`, then an *exact inference* is made**
> - **If `V` is a `ReadOnlySpan<Vᵢ>`, then a *lower-bound inference* is made**
> - **Otherwise, if `U` is a `ReadOnlySpan<Uᵢ>` and `V` is a `ReadOnlySpan<Vᵢ>` a *lower-bound inference* is made**:
> - Otherwise, if `V` is `C<V₁...Vₑ>` then inference depends on the `i-th` type parameter of `C`:
> - If it is covariant then a *lower-bound inference* is made.
> - If it is contravariant then an *upper-bound inference* is made.
> - If it is invariant then an *exact inference* is made.
> - Otherwise, no inferences are made.
>
> #### 12.6.3.11 Upper-bound inferences
>
> An *upper-bound inference from* a type `U` *to* a type `V` is made as follows:
>
> - If `V` is one of the *unfixed* `Xᵢ` then `U` is added to the set of upper bounds for `Xᵢ`.
> - Otherwise, sets `V₁...Vₑ` and `U₁...Uₑ` are determined by checking if any of the following cases apply:
> - `U` is an array type `U₁[...]` and `V` is an array type `V₁[...]` of the same rank
> - **`U` is an array type `U₁[]` and `V` is a `Span<V₁>` or `ReadOnlySpan<V₁>`**
> - **`U` is a `Span<V₁>` and `V` is a `Span<V₁>` or `ReadOnlySpan<V₁>`**
> - **`U` is a `ReadOnlySpan<V₁>` and `V` is a `ReadOnlySpan<V₁>`**
> - `U` is one of `IEnumerable<Uₑ>`, `ICollection<Uₑ>`, `IReadOnlyList<Uₑ>`, `IReadOnlyCollection<Uₑ>` or `IList<Uₑ>` and `V` is a single-dimensional array type `Vₑ[]`
> - `U` is the type `U1?` and `V` is the type `V1?`
> - `U` is constructed class, struct, interface or delegate type `C<U₁...Uₑ>` and `V` is a `class, struct, interface` or `delegate` type which is `identical` to, `inherits` from (directly or indirectly), or implements (directly or indirectly) a unique type `C<V₁...Vₑ>`
> - (The “uniqueness” restriction means that given an interface `C<T>{} class V<Z>: C<X<Z>>, C<Y<Z>>{}`, then no inference is made when inferring from `C<U₁>` to `V<Q>`. Inferences are not made from `U₁` to either `X<Q>` or `Y<Q>`.)
> If any of these cases apply then an inference is made from each `Uᵢ` to the corresponding `Vᵢ` as follows:
> - If `Uᵢ` is not known to be a reference type then an *exact inference* is made
> - Otherwise, if `V` is an array type then ~~an *upper-bound inference* is made~~ **inference depends on the type of `U`**:
> - **If `U` is a `Span<Uᵢ>`, then an *exact inference* is made**
> - **If `U` is an array type or a `ReadOnlySpan<Uᵢ>`, then a *upper-bound inference* is made**
> - **Otherwise, if `V` is a `Span<Vᵢ>` then inference depends on the type of `U`**:
> - **If `U` is a `Span<Uᵢ>`, then an *exact inference* is made**
> - **If `U` is a `ReadOnlySpan<Uᵢ>`, then an *upper-bound inference* is made**
> - **Otherwise, if `V` is a `ReadOnlySpan<Vᵢ>` and `U` is a `ReadOnlySpan<Uᵢ>` an *upper-bound inference* is made**:
> - Otherwise, if `U` is `C<U₁...Uₑ>` then inference depends on the `i-th` type parameter of `C`:
> - If it is covariant then an *upper-bound inference* is made.
> - If it is contravariant then a *lower-bound inference* is made.
> - If it is invariant then an *exact inference* is made.
> - Otherwise, no inferences are made.

### Breaking changes

As any proposal that changes conversions of existing scenarios, this proposal does introduce some new breaking changes. Here's a few examples:

#### User-defined conversions through inheritance

By adding _implicit span conversions_ to the list of standard implicit conversions, we can potentially change behavior when user-defined conversions are involved in a type hierarchy.
This example shows that change, in comparison to an integer scenario that already behaves as the new C# 13 behavior will.

```cs
Span<string> span = [];
var d = new Derived();
d.M(span); // Base today, Derived tomorrow
int i = 1;
d.M(i); // Derived today, demonstrates new behavior

class Base
{
public void M(Span<string> s)
{
Console.WriteLine("Base");
}

public void M(int i)
{
Console.WriteLine("Base");
}
}

class Derived : Base
{
public static implicit operator Derived(ReadOnlySpan<string> r) => new Derived();
public static implicit operator Derived(long l) => new Derived();

public void M(Derived s)
{
Console.WriteLine("Derived");
}
}
```

#### Extension method lookup

By allowing _implicit span conversions_ in extension method lookup, we can potentially change what extension method is resolved by overload resolution.

```cs
namespace N1
{
using N2;

public class C
{
public static void M()
{
Span<string> span = new string[0];
span.Test(); // Prints N2 today, N1 tomorrow
}
}

public static class N1Ext
{
public static void Test(this ReadOnlySpan<string> span)
{
Console.WriteLine("N1");
}
}
}

namespace N2
{
public static class N2Ext
{
public static void Test(this Span<string> span)
{
Console.WriteLine("N2");
}
}
}
```

## Open questions

### Delegate signature matching

Should we allow variance conversion in delegate signature matching? For example:

```cs
using System;

Span<string> M1() => throw null!;
void M2(ReadOnlySpan<object> r) {}

delegate ReadOnlySpan<string> D1();
delegate void D2(ReadOnlySpan<string> r);

// Should these work?
D1 d1 = M1; // Convert Span<string>() to ReadOnlySpan<string>()
D2 d2 = M2; // Convert void(ReadOnlySpan<object>) to void(ReadOnlySpan<string>)

// These work today
string[] M3() => throw null!;
void M4(object[] a) {}

delegate object[] D3();
delegate void D4(string[] a);

D3 d3 = M3; // Convert string[]() to object[]()
D4 d4 = M4; // Convert void(object[]) to void(string[])
```

These conversions may not be possible to do without creating a wrapper lambda without runtime changes; the existing variant delegate conversions are possible to emit
without needing to create wrappers. We don't have precedent in the language for silent wrappers like this, and generally require users to create such wrapper lambdas themselves.

## Alternatives

Keep things as they are.