Add first-class span types proposal (#7904)
* Add first-class span types proposal * PR Feedback
# First-class Span Types

## Summary

We introduce first-class support for `Span<T>` and `ReadOnlySpan<T>` in the language, including new implicit conversion types, and consider them in more places,
allowing more natural programming with these fundamental types.

## Motivation

Since their introduction in C# 7.2, `Span<T>` and `ReadOnlySpan<T>` have worked their way into the language and base class library (BCL) in many key ways. This is great for
developers, as their introduction improves performance without costing developer safety. However, the language has held these types at arm's length in a few key ways,
which makes it hard to express the intent of APIs and leads to a significant amount of surface area duplication for new APIs. For example, the BCL has added a number of new
[tensor primitive APIs](https://github.com/dotnet/runtime/issues/94553) in .NET 9, but these APIs are all offered on `ReadOnlySpan<T>`. Because C# doesn't recognize the
relationship between `ReadOnlySpan<T>`, `Span<T>`, and `T[]`, developers who want to use those APIs with anything other than a `ReadOnlySpan<T>` have to explicitly
convert to a `ReadOnlySpan<T>`. It also means that they don't have IDE tooling guiding them to use these APIs, since nothing indicates to the IDE that it is valid
to pass them after conversion. There are also issues with generic inference in these scenarios. In order to provide maximum usability for this style of API, the BCL would have to
define an entire set of `Span<T>` and `T[]` overloads, which is a lot of duplicate surface area to maintain for no real gain. This proposal seeks to address the problem by
having the language more directly recognize these types and conversions.

## Detailed Design

### Implicit Span Conversions

We add a new type of implicit conversion to the list in [§10.2.1](https://github.com/dotnet/csharpstandard/blob/draft-v8/standard/conversions.md#1021-general), an
_implicit span conversion_. This conversion is defined as follows:

------

An implicit span conversion permits `array_types`, `System.Span<T>`, `System.ReadOnlySpan<T>`, and `string` to be converted between each other as follows:
* From any single-dimensional `array_type` with element type `Ei` to `System.Span<Ei>`
* From any single-dimensional `array_type` with element type `Ei` to `System.ReadOnlySpan<Ui>`, provided that `Ei` is covariance-convertible ([§18.2.3.3](https://github.com/dotnet/csharpstandard/blob/draft-v8/standard/interfaces.md#18233-variance-conversion)) to `Ui`
* From `System.Span<Ti>` to `System.ReadOnlySpan<Ui>`, provided that `Ti` is covariance-convertible ([§18.2.3.3](https://github.com/dotnet/csharpstandard/blob/draft-v8/standard/interfaces.md#18233-variance-conversion)) to `Ui`
* From `System.ReadOnlySpan<Ti>` to `System.ReadOnlySpan<Ui>`, provided that `Ti` is covariance-convertible ([§18.2.3.3](https://github.com/dotnet/csharpstandard/blob/draft-v8/standard/interfaces.md#18233-variance-conversion)) to `Ui`
* From `string` to `System.ReadOnlySpan<char>`

------
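
The list above permits covariance only when the target is `System.ReadOnlySpan<Ui>`; the array-to-`System.Span<Ei>` rule keeps the element type fixed. This mirrors how the runtime types already behave. The following is a minimal sketch, assuming the existing BCL behavior in which creating a `Span<T>` over a covariantly-typed array throws `ArrayTypeMismatchException` while `ReadOnlySpan<T>` accepts the same array. The variable names are illustrative and not part of the proposal.

```cs
using System;

string[] strings = { "a", "b" };
object[] asObjects = strings; // existing array covariance: really a string[]

// A read-only view can never store a non-string into the array, so this is safe.
ReadOnlySpan<object> readOnly = asObjects;
Console.WriteLine(readOnly.Length);

try
{
    // A writable view could store any object into a string[], so the
    // existing array-to-Span<T> conversion checks the array type at run time.
    Span<object> writable = asObjects;
    Console.WriteLine(writable.Length);
}
catch (ArrayTypeMismatchException)
{
    Console.WriteLine("Span<object> rejected a string[]");
}
```

The sketch uses only conversions that exist today; it is meant to motivate the shape of the rules, not to demonstrate the new conversion itself.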

We also add _implicit span conversion_ to the list of standard implicit conversions
([§10.4.2](https://github.com/dotnet/csharpstandard/blob/draft-v8/standard/conversions.md#1042-standard-implicit-conversions)). This allows overload resolution to consider
them when performing argument resolution, as in the previously-linked API proposal.
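
As a rough illustration of the argument-resolution point, the sketch below shows what callers of a `ReadOnlySpan<T>`-only generic API can write today. `Count<T>` is a hypothetical helper standing in for such an API, not a BCL method. The commented-out calls are the forms this proposal is expected to permit; whether they compile depends on the compiler version in use.

```cs
using System;

int[] array = { 1, 2, 3 };
Span<int> span = array;

// Today the caller has to name the element type or the target span type.
Console.WriteLine(Count<int>(array));
Console.WriteLine(Count((ReadOnlySpan<int>)span));

// With implicit span conversions taking part in overload resolution and
// type inference, these shorter forms are the expected result:
// Console.WriteLine(Count(array));
// Console.WriteLine(Count(span));

// Hypothetical stand-in for an API that is only offered on ReadOnlySpan<T>.
static int Count<T>(ReadOnlySpan<T> values) => values.Length;
```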

We also add _implicit span conversion_ to the list of acceptable implicit conversions on the first parameter of an extension method when determining applicability
([§12.8.9.3](https://github.com/dotnet/csharpstandard/blob/draft-v8/standard/expressions.md#12893-extension-method-invocations)) (change in bold):

> An extension method `Cᵢ.Mₑ` is ***eligible*** if:
>
> - `Cᵢ` is a non-generic, non-nested class
> - The name of `Mₑ` is *identifier*
> - `Mₑ` is accessible and applicable when applied to the arguments as a static method as shown above
> - An implicit identity, reference ~~or boxing~~ **, boxing, or span** conversion exists from *expr* to the type of the first parameter of `Mₑ`.

#### Variance

The goal of the variance section in _implicit span conversion_ is to replicate some amount of covariance for `System.ReadOnlySpan<T>`. Runtime changes would be required to fully
implement variance through generics here (see https://github.com/dotnet/csharplang/blob/main/proposals/ref-struct-interfaces.md for using `ref struct` types in generics), but we can
allow a limited amount of covariance through use of a proposed .NET 9 API: https://github.com/dotnet/runtime/issues/96952. This will allow the language to treat `System.ReadOnlySpan<T>`
as if the `T` was declared as `out T` in some scenarios. We do not, however, plumb this variant conversion through _all_ variance scenarios, and do not add it to the definition of
variance-convertible in [§18.2.3.3](https://github.com/dotnet/csharpstandard/blob/draft-v8/standard/interfaces.md#18233-variance-conversion). If, in the future, we change the runtime
to more deeply understand the variance here, we can take the minor breaking change to fully recognize it in the language.

Practically, this will also mean that in pattern matching for generic scenarios, we'd have behavior as follows:

```cs
using System;

M<object[]>(["0"]); // Does not print
M<ReadOnlySpan<string>>(["1"]); // Does not print
M<Span<object>>(["2"]); // Does not print
M<ReadOnlySpan<object>>(["3"]); // Prints

void M<T>(T t) where T : allows ref struct
{
    if (t is ReadOnlySpan<object> r) Console.WriteLine(r[0]);
}
```

In array variance scenarios, by contrast, the equivalent pattern matches for all reference type arrays:

```cs
using System;

M<object[]>(["0"]); // Prints
M<string[]>(["1"]); // Prints

void M<T>(T t)
{
    if (t is object[] r) Console.WriteLine(r[0]);
}
```

There is also an open question below about participation in delegate signature matching.

### Type inference

We update the type inference section of the specification as follows (changes in **bold**).

> #### 12.6.3.9 Exact inferences
>
> An *exact inference* *from* a type `U` *to* a type `V` is made as follows:
>
> - If `V` is one of the *unfixed* `Xᵢ` then `U` is added to the set of exact bounds for `Xᵢ`.
> - Otherwise, sets `V₁...Vₑ` and `U₁...Uₑ` are determined by checking if any of the following cases apply:
>   - `V` is an array type `V₁[...]` and `U` is an array type `U₁[...]` of the same rank
>   - **`V` is an array type `V₁[]` and `U` is a `Span<U₁>` or `ReadOnlySpan<U₁>`**
>   - **`V` is a `Span<V₁>` and `U` is a `Span<U₁>` or `ReadOnlySpan<U₁>`**
>   - **`V` is a `ReadOnlySpan<V₁>` and `U` is a `ReadOnlySpan<U₁>`**
>   - `V` is the type `V₁?` and `U` is the type `U₁`
>   - `V` is a constructed type `C<V₁...Vₑ>` and `U` is a constructed type `C<U₁...Uₑ>`
>
>   If any of these cases apply then an *exact inference* is made from each `Uᵢ` to the corresponding `Vᵢ`.
> - Otherwise, no inferences are made.
>
> #### 12.6.3.10 Lower-bound inferences
>
> A *lower-bound inference from* a type `U` *to* a type `V` is made as follows:
>
> - If `V` is one of the *unfixed* `Xᵢ` then `U` is added to the set of lower bounds for `Xᵢ`.
> - Otherwise, if `V` is the type `V₁?` and `U` is the type `U₁?` then a lower bound inference is made from `U₁` to `V₁`.
> - Otherwise, sets `U₁...Uₑ` and `V₁...Vₑ` are determined by checking if any of the following cases apply:
>   - `V` is an array type `V₁[...]` and `U` is an array type `U₁[...]` of the same rank
>   - **`V` is an array type `V₁[]` and `U` is a `Span<U₁>` or `ReadOnlySpan<U₁>`**
>   - **`V` is a `Span<V₁>` and `U` is a `Span<U₁>` or `ReadOnlySpan<U₁>`**
>   - **`V` is a `ReadOnlySpan<V₁>` and `U` is a `ReadOnlySpan<U₁>`**
>   - `V` is one of `IEnumerable<V₁>`, `ICollection<V₁>`, `IReadOnlyList<V₁>`, `IReadOnlyCollection<V₁>` or `IList<V₁>` and `U` is a single-dimensional array type `U₁[]`
>   - `V` is a constructed `class`, `struct`, `interface` or `delegate` type `C<V₁...Vₑ>` and there is a unique type `C<U₁...Uₑ>` such that `U` (or, if `U` is a type parameter, its effective base class or any member of its effective interface set) is identical to, inherits from (directly or indirectly), or implements (directly or indirectly) `C<U₁...Uₑ>`.
>   - (The “uniqueness” restriction means that in the case `interface C<T>{} class U: C<X>, C<Y>{}`, no inference is made when inferring from `U` to `C<T>` because `U₁` could be `X` or `Y`.)
>
>   If any of these cases apply then an inference is made from each `Uᵢ` to the corresponding `Vᵢ` as follows:
>   - If `Uᵢ` is not known to be a reference type then an *exact inference* is made
>   - Otherwise, if `U` is an array type then ~~a *lower-bound inference* is made~~ **inference depends on the type of `V`**:
>     - **If `V` is a `Span<Vᵢ>`, then an *exact inference* is made**
>     - **If `V` is an array type or a `ReadOnlySpan<Vᵢ>`, then a *lower-bound inference* is made**
>   - **Otherwise, if `U` is a `Span<Uᵢ>` then inference depends on the type of `V`**:
>     - **If `V` is a `Span<Vᵢ>`, then an *exact inference* is made**
>     - **If `V` is a `ReadOnlySpan<Vᵢ>`, then a *lower-bound inference* is made**
>   - **Otherwise, if `U` is a `ReadOnlySpan<Uᵢ>` and `V` is a `ReadOnlySpan<Vᵢ>` a *lower-bound inference* is made**
>   - Otherwise, if `V` is `C<V₁...Vₑ>` then inference depends on the `i-th` type parameter of `C`:
>     - If it is covariant then a *lower-bound inference* is made.
>     - If it is contravariant then an *upper-bound inference* is made.
>     - If it is invariant then an *exact inference* is made.
> - Otherwise, no inferences are made.
>
> #### 12.6.3.11 Upper-bound inferences
>
> An *upper-bound inference from* a type `U` *to* a type `V` is made as follows:
>
> - If `V` is one of the *unfixed* `Xᵢ` then `U` is added to the set of upper bounds for `Xᵢ`.
> - Otherwise, sets `V₁...Vₑ` and `U₁...Uₑ` are determined by checking if any of the following cases apply:
>   - `U` is an array type `U₁[...]` and `V` is an array type `V₁[...]` of the same rank
>   - **`U` is an array type `U₁[]` and `V` is a `Span<V₁>` or `ReadOnlySpan<V₁>`**
>   - **`U` is a `Span<U₁>` and `V` is a `Span<V₁>` or `ReadOnlySpan<V₁>`**
>   - **`U` is a `ReadOnlySpan<U₁>` and `V` is a `ReadOnlySpan<V₁>`**
>   - `U` is one of `IEnumerable<Uₑ>`, `ICollection<Uₑ>`, `IReadOnlyList<Uₑ>`, `IReadOnlyCollection<Uₑ>` or `IList<Uₑ>` and `V` is a single-dimensional array type `Vₑ[]`
>   - `U` is the type `U₁?` and `V` is the type `V₁?`
>   - `U` is a constructed `class`, `struct`, `interface` or `delegate` type `C<U₁...Uₑ>` and `V` is a `class`, `struct`, `interface` or `delegate` type which is identical to, inherits from (directly or indirectly), or implements (directly or indirectly) a unique type `C<V₁...Vₑ>`
>   - (The “uniqueness” restriction means that given `interface C<T>{} class V<Z>: C<X<Z>>, C<Y<Z>>{}`, no inference is made when inferring from `C<U₁>` to `V<Q>`. Inferences are not made from `U₁` to either `X<Q>` or `Y<Q>`.)
>
>   If any of these cases apply then an inference is made from each `Uᵢ` to the corresponding `Vᵢ` as follows:
>   - If `Uᵢ` is not known to be a reference type then an *exact inference* is made
>   - Otherwise, if `V` is an array type then ~~an *upper-bound inference* is made~~ **inference depends on the type of `U`**:
>     - **If `U` is a `Span<Uᵢ>`, then an *exact inference* is made**
>     - **If `U` is an array type or a `ReadOnlySpan<Uᵢ>`, then an *upper-bound inference* is made**
>   - **Otherwise, if `V` is a `Span<Vᵢ>` then inference depends on the type of `U`**:
>     - **If `U` is a `Span<Uᵢ>`, then an *exact inference* is made**
>     - **If `U` is a `ReadOnlySpan<Uᵢ>`, then an *upper-bound inference* is made**
>   - **Otherwise, if `V` is a `ReadOnlySpan<Vᵢ>` and `U` is a `ReadOnlySpan<Uᵢ>` an *upper-bound inference* is made**
>   - Otherwise, if `U` is `C<U₁...Uₑ>` then inference depends on the `i-th` type parameter of `C`:
>     - If it is covariant then an *upper-bound inference* is made.
>     - If it is contravariant then a *lower-bound inference* is made.
>     - If it is invariant then an *exact inference* is made.
> - Otherwise, no inferences are made.
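
The lower-bound rules above can be related to what array arguments already do today. The following is a minimal sketch, not part of the proposal: `InferFromArrays` and `InferFromSpans` are hypothetical helpers that simply report the type argument chosen for `T`. The commented-out call shows the inference that the bolded array-to-`ReadOnlySpan<Vᵢ>` rule is intended to enable.

```cs
using System;

string[] strings = { "a" };
object[] objects = { new object() };

// Existing rule: array arguments with reference-type elements contribute
// lower bounds, so T gets the lower bounds string and object and is fixed to object.
Console.WriteLine(InferFromArrays(strings, objects));

// Today the span-based method needs T spelled out; the string[] argument is
// converted to object[] and then to ReadOnlySpan<object>.
Console.WriteLine(InferFromSpans<object>(strings, objects));

// Under the proposed rules, array-to-ReadOnlySpan inference is also a
// lower-bound inference, so this call would be expected to infer T = object:
// Console.WriteLine(InferFromSpans(strings, objects));

// Hypothetical helpers that only report which T was chosen.
static Type InferFromArrays<T>(T[] first, T[] second) => typeof(T);
static Type InferFromSpans<T>(ReadOnlySpan<T> first, ReadOnlySpan<T> second) => typeof(T);
```

If the parameters were `Span<T>` instead, the rules above call for an *exact inference*, so `string` and `object` would conflict and no type argument would be inferred.
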

### Breaking changes

As with any proposal that changes conversions in existing scenarios, this proposal introduces some new breaking changes. Here are a few examples:

#### User-defined conversions through inheritance

By adding _implicit span conversions_ to the list of standard implicit conversions, we can potentially change behavior when user-defined conversions are involved in a type hierarchy.
This example shows that change, in comparison to an integer scenario that already behaves the way the new C# 13 behavior will.

```cs
using System;

Span<string> span = [];
var d = new Derived();
d.M(span); // Base today, Derived tomorrow
int i = 1;
d.M(i); // Derived today, demonstrates new behavior

class Base
{
    public void M(Span<string> s)
    {
        Console.WriteLine("Base");
    }

    public void M(int i)
    {
        Console.WriteLine("Base");
    }
}

class Derived : Base
{
    public static implicit operator Derived(ReadOnlySpan<string> r) => new Derived();
    public static implicit operator Derived(long l) => new Derived();

    public void M(Derived s)
    {
        Console.WriteLine("Derived");
    }
}
```

#### Extension method lookup

By allowing _implicit span conversions_ in extension method lookup, we can potentially change which extension method is resolved by overload resolution.

```cs
using System;

namespace N1
{
    using N2;

    public class C
    {
        public static void M()
        {
            Span<string> span = new string[0];
            span.Test(); // Prints N2 today, N1 tomorrow
        }
    }

    public static class N1Ext
    {
        public static void Test(this ReadOnlySpan<string> span)
        {
            Console.WriteLine("N1");
        }
    }
}

namespace N2
{
    public static class N2Ext
    {
        public static void Test(this Span<string> span)
        {
            Console.WriteLine("N2");
        }
    }
}
```

## Open questions

### Delegate signature matching

Should we allow variance conversion in delegate signature matching? For example:

```cs
using System;

Span<string> M1() => throw null!;
void M2(ReadOnlySpan<object> r) {}

delegate ReadOnlySpan<string> D1();
delegate void D2(ReadOnlySpan<string> r);

// Should these work?
D1 d1 = M1; // Convert Span<string>() to ReadOnlySpan<string>()
D2 d2 = M2; // Convert void(ReadOnlySpan<object>) to void(ReadOnlySpan<string>)

// These work today
string[] M3() => throw null!;
void M4(object[] a) {}

delegate object[] D3();
delegate void D4(string[] a);

D3 d3 = M3; // Convert string[]() to object[]()
D4 d4 = M4; // Convert void(object[]) to void(string[])
```

Without runtime changes, these conversions may not be possible without creating a wrapper lambda; the existing variant delegate conversions, by contrast, can be emitted
without needing to create wrappers. We don't have precedent in the language for silent wrappers like this, and generally require users to create such wrapper lambdas themselves.

## Alternatives

Keep things as they are.