Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion thread for pattern-matching #10153

Closed
gafter opened this issue Mar 28, 2016 · 147 comments
Closed

Discussion thread for pattern-matching #10153

gafter opened this issue Mar 28, 2016 · 147 comments

Comments

@gafter
Copy link
Member

gafter commented Mar 28, 2016

This is a discussion thread for pattern-matching, as specified in https://github.com/dotnet/roslyn/blob/features/patterns/docs/features/patterns.md

Please check the spec before adding your comment - it might have been addressed in the latest draft.

@orthoxerox
Copy link
Contributor

Open questions:
Pattern matching with arrays.
Pattern matching with List, Dictionary

Every time I try to come up with a reasonably expressive syntax for this, any sample pattern explodes into a huge mess. On one hand, every extracted element must be assignable to an identifier, but on the other hand, we might want to apply a pattern to it which might not allow us to get one. I feel like restricting the pattern to extracting head(s) and a tail is a reasonable solution.

For example, what would "the token stream should contain a reserved word class, then an identifier" pattern look like? reservedWordClass::className::rest? This is missing all the checks, but adding them turns it into

ReservedWordToken{Keyword is "class"}::IdentifierToken className::rest

We're lucky that we can abandon the reserved word token and that identifiers are identified using inheritance and not a tag field. Otherwise this could look like

Token{Kind is Kind.Reserved, Keyword is "class"} :: Token{Kind is Kind.Identifier} identifier :: rest

At this point it still doesn't look too bad, but any further features turn the patterns into an undecipherable mess worse than programs by someone trying to use every single feature of scalaz, e.g. adding regular expression-like features or recursive matches.

The changes to the grammar look like this:

complex-pattern
    : type identifier
    | recursive-pattern
    | recursive-pattern identifier
    | property-pattern
    | property-pattern identifier
    | list-pattern
    ;

list-pattern
    : cons-pattern
    | empty-list-pattern
    ;

cons-pattern
    : car-pattern '::' cdr-pattern
    ;

car-pattern
    : complex-pattern
    | identifier
    ;

cdr-pattern
    : list-pattern
    | identifier
    ;

empty-list-pattern
    : '[]'
    ;

P.S. Does anyone else experience semantic satiation after reading the word 'pattern' thousands of times?

@HaloFour
Copy link

@orthoxerox I know that the "cons" operator :: is common among functional languages but I am quite concerned about opening the operator floodgates for the C# language.

I do think that "cons" patterns would apply well to C#, particularly for IEnumerable<T>, but I also like the idea of a collection-or-array initializer-like syntax, something like:

if (list is { ReservedWordToken{ Keyword is "class" }, IdentifierToken className }) { }
if (list is [ ReservedWordToken{ Keyword is "class" }, IdentifierToken className ]) { }

I like the former because it feels like a collection initializer, but I think it would preclude the option of having an implicitly typed property pattern. The latter smells like it has something to do with arrays and it is the list pattern in F# but it's not the initializer syntax.

As for the verbosity of any of these, I think that's just inherent with subpatterns, whether you're dealing with property patterns or list/array/cons/whatever patterns.

@Miista
Copy link

Miista commented Mar 29, 2016

Every time I try to come up with a reasonably expressive syntax for this, any sample pattern explodes into a huge mess.

This is a lot how I feel when looking at the current spec.

In my opinion, it feels like the spec is trying too hard to do too much. It tries to deal with both how the object was constructed, what the object looks like internally, and conditionals.

is operator

The usage of is feels weird. The user-defined operator given in the spec would allow:

var c = Cartesian(3, 4);
if (c is Polar(var R, *)) Console.WriteLine(R);

However, the usage of is implies some sort of relationship between c, Cartesian, and Polar.
But Polar is not of type Cartesian nor is it an instance of Cartesian, and c is not of type Polar nor is it an instance of Polar.
Polar isn't even a subclass of Cartesian so this logically makes no sense.

What this tries to achieve has nothing to do with matching on the structure of c. What you really want to know here is whether the expression c.X != 0 || c.Y != 0 is true. So put that in a match expression (as below). This would also avoid the whole thing with user-defined is operator.

var c = Cartesian(3, 4);
c match {
    case Cartesian(var x, var y) when x != 0 || y != 0 => {
      var r = Math.Sqrt(x*x + y*y);
      Console.WriteLine(r);
    }
}

If Polar instead was a subclass of Cartesian that exposed R as a property it would make more sense to use the is operator but in that case the following would be a more clear way of achieving the same thing:

public class Polar : Cartesian
{
    public double R { get; }
    public double Theta { get; }

    public Polar(int x, int y) : base(x, y) { ... }
}

var c = Polar(3, 4);
if (c is Polar p)
{
    Console.WriteLine(p.R);
}

So to sum up; pattern matching with a user-defined is operator is just extremely confusing.

As for ReservedWordToken{Keyword is "class"}::IdentifierToken className::rest; again, the is operator is being abused. Now using is means to check for equality. What's next?!

@HaloFour
Copy link

I think allowing "property patterns" to omit the type name when the compiler already knows the type would be useful both in keeping the verbosity down as well as allowing the property pattern to work with anonymous types:

Rectangle rectangle = new Rectangle(100, 100);
var anonymous = new { Width = 100, Height = 100 };

if (rectangle is Rectangle { Width is 100, Height is var height }) { }
if (rectangle is { Width is 100, Height is var height }) { }
if (anonymous is { Width is 100, Height is var height }) { }

object obj = rectangle;
if (obj is Rectangle { Width is 100, Height is var height }) { }
if (obj is { Width is 100, Height is var height }) { } // compiler error

This syntax might also be able to apply to dynamic types. What I don't know is whether the compiler can emit code to attempt to bind to the properties to test for their existence:

dynamic dyn = rectangle;
if (dyn is { Width is 100, Height is var height }) {
    // succeeds if both the Width and Height properties exist
    // Width must also be successfully compared to the constant pattern 100
    // height is of type dynamic and could be any value
}

@Miista
Copy link

Miista commented Mar 29, 2016

I do think that "cons" patterns would apply well to C#, particularly for IEnumerable, but I also like the idea of a collection-or-array initializer-like syntax, something like:

if (list is { ReservedWordToken{ Keyword is "class" }, IdentifierToken className }) { }
if (list is [ ReservedWordToken{ Keyword is "class" }, IdentifierToken className ]) { }

In that case I would much rather see people use a match/switch expression and put "Keyword is "class"" in the guard clause.

@alrz
Copy link
Contributor

alrz commented Mar 29, 2016

but I think it would preclude the option of having an implicitly typed property pattern

@HaloFour That couldn't be ambiguous. I'm assuming the syntax proposed in #5811 works,

list-subpattern:
pattern

property-subpattern:
identifier is pattern

o is { 1, 2 }
o is { P is 1 }

@orthoxerox I don't know why there is a need for "cons" operator, because IEnumerable<T> or List<T> are not linked lists unlike FSharpList<T>. I think you should probably use slice patterns (#120), e.g.

list is { int first, .. int[:] rest }

or something like that.

@HaloFour
Copy link

@alrz

If that's not considered ambiguous either visually or by the parser then it works for me. Would it be worth having different patterns for arrays, lists, enumerables, etc. or just have the compiler emit what it thinks would be best given what it knows about the type of the operand?

Also, 🍝:

// list must contain at least two elements
// remaining elements will be copied/sliced to rest, which can be empty
if (list is { 1, 2, params int[] rest }) { ... }

Perhaps that could be params IEnumerable<int> also.

@alrz
Copy link
Contributor

alrz commented Mar 29, 2016

@HaloFour What difference does it make? All you want is to match the target with this pattern, the type which the pattern will be matched against doesn't matter (to user) — analogous to array and collection initializers.

In your example, you should use int[:] as it is still a slice of the whole array (or any other collection types that slices would support). It would be equivalent to:

// matches an array that begins with values `1` and `2` with Length >= 2
if(arr is { 1, 2, *** }) { 
  int[:] rest = arr[2:]; // skips the first two items
}

It has nothing to do with parameters though. A syntax similar to this .. is used in ES6 I guess.

@orthoxerox
Copy link
Contributor

@HaloFour @alrz perhaps operator :: should be called "uncons" instead. I am afraid it (or slice patterns) won't work well with IEnumerable<T>, though. If you use multiple patterns in a switch, will they spawn multiple IEnumerator<T>s? What will be the type of rest in them? It can't be an IEnumerable<T>, because IEnumerator<T> cannot produce one. It can't be an IEnumerator<T>, either, because you cannot apply a list pattern to it, and if you could, it would mutate it.

Lists, on the other hand, can be pattern-matched reasonably efficiently without side effects (if you have side effects on the indexer, blame no one but yourself) by introducing an array slice/list view/etc type that can be the type of rest, e.g.:

    public class ListView<T> : IReadOnlyList<T>
    {
        IReadOnlyList<T> innerList;
        int offset;

        public ListView(IReadOnlyList<T> innerList, int offset)
        {
            var innerView = innerList as ListView<T>;

            if (innerView != null) {
                this.innerList = innerView.InnerList;
                this.offset = offset + innerView.Offset;
            } else {
                this.innerList = innerList;
                this.offset = offset;
            }
        }

        public T this[int index] {
            get { return innerList[index + offset]; }
        }

        public int Count {
            get { return innerList.Count - offset; }
        }

        public IEnumerator<T> GetEnumerator() {
            if (offset == 0)
                return innerList.GetEnumerator();

            for (int i = offset; i < innerList.Count; i++) {
                yield return innerList[i];
            }
        }

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator() {
            return (System.Collections.IEnumerator)GetEnumerator();
        }
    }

@alrz
Copy link
Contributor

alrz commented Mar 29, 2016

I am afraid it (or slice patterns) won't work well with IEnumerable<T>, though.

@orthoxerox In F# :: pattern is only allowed for FSharpList<T>, as it has Head and Tail. That woudn't make sense for any other types, of course.

@HaloFour
Copy link

@alrz

In your example, you should use int[:] as it is still a slice of the whole array (or any other collection types that slices would support)

Slices aren't on the list of features to be implemented for C# 7.0, so I doubt that a syntax that involves that feature would be considered for list patterns in this time frame.

@orthoxerox

While there could definitely be mutation side effects the compiler could minimize exposure to them by memoizing the results of the IEnumerable<T> for the duration of the match. The compiler could also emit an implementation of IEnumerable<T> that wraps a partially enumerated IEnumerator<T>.

I'm not saying that this is a good idea, just that it's possible. Mutability issues aside an IEnumerable<T> does seem pretty similar to a sequence in functional languages given that they are traversed like a linked-list.

@orthoxerox
Copy link
Contributor

@HaloFour well, everything's possible, but I think this kind of matching obscures the concept of IEnumerable/IEnumerator, which is something that emits items on demand and forgets about them. Implicitly wrapping one in a partially rewindable cache of items is in my opinion worse than wrapping a list in a list view.

I understand the rationale. I've had to write algorithms that chained together functions that transformed enumerables (e.g., "not a number"::"minus"::"positive number" -> "not a number"::"negative number") and I really wanted to have a built-in declarative way to express it, but I now think having an explicit EnumerableProcessor that has a small cache inside is a better option.

@DavidArno
Copy link

This spec still makes little sense. It starts with:

relational-expression
    : relational-expression 'is' complex-pattern
    | relational-expression 'is' type
    ;

Which is a recursive definition (relational-expression always contains another relational-expression, with no terminal definition). Then straight after is the statement:

It is a compile-time error if e does not designate a value or does not have a type.

Yet e isn't defined in the definition.

@gafter
Copy link
Member Author

gafter commented Mar 29, 2016

static class Sequence
{
    public static operator is(IEnumerable data, out object first, out IEnumerable rest)
    {
        if (data.Any())
        {
            first = data.FirstOrDefault();
            rest = data.Skip(1);
            return true;
        }
        else
        {
            first = null;
            rest = null;
            return false;
        }
    }
}

@HaloFour
Copy link

@gafter Ick, that could be made a lot better. But algorithmic issues aside, I hope that is operators can be made generic, and I think this is another example where having to define a new type just to be a home for this pattern is a real waste.

public static class SequenceExtensions {
    public static operator bool is Sequence<T>(IEnumerable<T> data, out F first, out IEnumerable<T> rest) {
        IEnumerator<T> enumerator = data.GetEnumerator();
        if (enumerator.MoveNext()) {
            first = enumerator.Current;
            rest = new EnumeratorEnumerable(enumerator);
            return true;
        }
        enumerator.Dispose();
        first = default(T);
        rest = null;
        return false;
    }
}

@gafter
Copy link
Member Author

gafter commented Mar 29, 2016

I wouldn't put just operator is in there. I'd also provide Sequence.Of(first, rest).

@HaloFour
Copy link

@gafter Well thank goodness you guys didn't go that route when designing LINQ, otherwise Enumerable would instead be how many static classes? 😉

Are you also suggesting that an operator like this would obviate the need for a separate form of list/cons pattern? I'd be really worried that any custom operator couldn't be as efficient or careful at stepping into the enumerable without causing side effects or other issues.

@alrz
Copy link
Contributor

alrz commented Mar 29, 2016

I'm still wondering what would be the benefit of head-tail deconstruction of an IEnumerable<T> (and not an immutable list). If you use this pattern in a recursive function, there are chances that you blow up the GC.

@HaloFour
Copy link

@alrz Because it's an incredibly common type? I think that it could be done as efficiently as walking an enumerable, particularly if the compiler can keep track of how many elements it has already retrieved in order to prevent re-enumeration. Given the following (using array/list pattern, but this is just for illustration):

IEnumerable<int> GetInfiniteZeros() {
    while (true) {
        Console.Write("zero ");
        yield return 0;
    }
}

var numbers = GetInfiniteZeroes();
switch (numbers) {
    case { 1, 2, 3, 4, 5 }:
        Console.WriteLine("foo");
        break;
    case { 9, 8, 7, 6, 5, 4, 3, 2, 1 }:
        Console.WriteLine("bar");
        break;
    case { 0, 0, 0 }:
        Console.WriteLine("baz");
        break;
}

should output:

zero zero zero baz

I think for any of these patterns to be worthwhile that they should work with the common existing types. Even if IEnumerable<T> is a bit too extreme I think that standard arrays and anything that implements IList<T> needs to be supported.

Update: Changed the output, I think that the compiler could eliminate potential patterns earlier on as soon as a single element doesn't match the subpattern.

@alrz
Copy link
Contributor

alrz commented Mar 29, 2016

@HaloFour That is not a head-tail deconstruction. I'm not against a pattern that could be used for anIEnumerable<T> like #5811, but I don't see any reason to do a head-tail deconstruction on it.

F# :: operator, simply use Head and Tail properties so the implementation is straightforward. I do think that slices (#120) also can be useful for deconstructing arrays but not a general IEnumerable<T>.

@HaloFour
Copy link

@alrz

You mean like the following (again, syntax is strictly illustrative)?

int Sum(IEnumerable<int> numbers) {
    return numbers match (
        case { }: 0,
        case { var value, ** IEnumerable<int> rest }: value + Sum(rest)
    );
}

@alrz
Copy link
Contributor

alrz commented Mar 29, 2016

@HaloFour No, I'm not talking about the syntax (that is what you changed here). If you want to skip the rest that is ok to use sequence wildcards. But when you want another IEnumerable for the rest that doesn't make sense anymore (using any syntax). IEnumerable is not designed for functional style programming. If you had an immutable list then you might define your Sum as a recursive function and use head-tail deconstructions.

let sum (source : int list) : int =
  match source with
  | [] -> 0
  | head::tail -> head + sum tail

let sum (source: int seq) : int = 
  use e = source.GetEnumerator() 
  let mutable acc = 0
  while e.MoveNext() do
     acc <- acc + e.Current
  acc

But when your type is not suited for this style you should consider using imperative version.

@orthoxerox
Copy link
Contributor

@gafter @HaloFour

Using data.Skip(1) in a loop is O(N^2), isn't it? And EnumeratorEnumerable will either have to cache the elements or it won't be able to spawn multiple IEnumerators.

@alrz IEnumerable is well-suited for functional-style programming, you just have to use higher order functions and not recursive functions with pattern matching. (And your functional sum will blow up the stack, unless F# has tail recursion modulo cons now.)

@gafter
Copy link
Member Author

gafter commented Apr 20, 2016

@DavidArno Those "separate types" are the types that you would use to hold the data in a discriminated union, not merely placeholders for the pattern-matching operation. A typical DU language feature defines the union'ed types as part of the union:

using static DU;

class DU
{
    public class Option1 : DU
    {
        public string Value1, Value2;
        public static void operator is(Option1 self, out string Value1, out string Value2)
        {
            Value1 = self.Value1;
            Value2 = self.Value2;
        }
    }
    public class Option2 : DU
    {
        public int Value1, Value2;
        public static void operator is(Option2 self, out int Value1, out int Value2)
        {
            Value1 = self.Value1;
            Value2 = self.Value2;
        }
    }
}
class Program
{
    static void Main(string[] args)
    {
        Dispatch(new Option1 { Value1 = "hello", Value2 = "there" });
        Dispatch(new Option2 { Value1 = 1, Value2 = 2 });
    }
    static void Dispatch(DU du)
    {
        switch (du)
        {
            case Option1(string s1, string s2):
                Console.WriteLine("option1");
                break;
            case Option2(int s1, int s2):
                Console.WriteLine("option2");
                break;
        }
    }
}

@DavidArno
Copy link

@gafter

A typical DU language feature defines the union'ed types as part of the union:

Surely that's the complete opposite of F#? For example on F# for fun and profit's DU page, it states:

Other named types (such as Person or IntOrBool) must be pre-defined outside the union type. You can't define them "inline" and write something like this:

type MixedType = 
  | P of  {first:string; last:string}  // error

or

type MixedType = 
  | U of (I of int | B of bool)  // error

@DavidArno
Copy link

@gafter

Having said that, with code generators, I can define something like:

[union] partial struct IntOrBool
{
    class Int { int i; }
    class Bool { bool b; }
}

It then makes sense to put the type declarations inside the union. That can then be turned into some form of union type declaration by a code generator. The Int and Bool classes can have their own is definitions, which then allows pattern matching.

@gafter
Copy link
Member Author

gafter commented Apr 20, 2016

@DavidArno From https://msdn.microsoft.com/en-us/library/dd233226.aspx we see an example in F#

type Shape =
    | Rectangle of width : float * length : float
    | Circle of radius : float
    | Prism of width : float * float * height : float

As you can see, the discriminated union has three "cases" named Rectangle, Circle, and Prism. The equivalent thing in C# would be separate named types, but ideally with a better syntax than would be possible without language support, perhaps something like:

enum class Shape
{
    Rectangle(float Width, float Length),
    Circle(float Radius),
    Prism((float, float) width, float height)
}

@DavidArno
Copy link

@gafter,

Yep, I think you are right: whilst those "cases" in F# aren't real types (as far as I can work out), types would be the way to go with C#. Using the code generation feature planned for C# 7, it'll be possible to define something like:

[union] public partial class Shape
{
    public partial class Square { double sideLength; }
    public partial class Circle { double radius; }
}

and generate the DU from that, so that it can be pattern-matched in a similar way to eg F#.

The only concern I then have around that, for C# 7, the issue of completeness comes into it. For example, you or I could examine the following to know it's complete, but I don't see how the compiler could do so:

double AreaOfShape(Shape shape) =>
    shape match (
        case Circle(var radius) : radius * radius * Math.Pi
        case Square(var side) : side * side
    );

Having the compiler complain that it's not complete will be a pain. Not having it complain though could be a source of bugs. I guess that's why what to do about completeness is an open question as there's no easy answer.

@alrz
Copy link
Contributor

alrz commented Apr 21, 2016

@DavidArno No, in case of DUs, they are still derived types of a common base in F#. But active patterns are just currying functions.

@DavidArno
Copy link

DavidArno commented Apr 22, 2016

@alrz,

A poke around with ILSpy has shown me you are completely correct. Interestingly, F# not only generates subtypes of the union for each of the cases, it also generates a static Tags class with int constants of the same names as that types, which presumably is the labels that F# for fun and profit's' website talks about. I am now that little bit wiser 😀

@timgoodman
Copy link

timgoodman commented May 9, 2016

I really like the pattern matching spec for the most part, but I think I agree that the "user-defined operator is" is a bit confusing. Am I correct in understanding that for the Cartesian c, "c is Polar" is false but "c is Polar(*, *)" is true?

If so, that seems surprising to me. My intuition would have been that "c is Polar" would be equivalent to "c is Polar(*, *)", and then that "c is Polar(5, *)" would be true for a subset of those cases. (That said, maybe my intuition here is lacking due to insufficient experience. In F# or Scala is it the case that adding only wildcard arguments causes a match to succeed that would otherwise fail?)

In other words, I would have thought the current rule -- "is" only considers reference conversions and boxing/unboxing -- would still be true when there's a type to the right of "is", and that the positional pattern syntax would just be adding additional conditions on top of the reference-conversion check. But it seems instead the proposal only retains the above rule when the positional arguments are omitted.

@gafter gafter modified the milestones: 3.0, 2.0 (RC) Jun 26, 2016
@Unknown6656
Copy link

I have (independently from the current feature progress) opened an issue about more uses of the is-operator ( #12212 ) to hint some features, I personally would like to see implemented.

@ngbrown
Copy link

ngbrown commented Aug 25, 2016

The changes to switch seem to have made it essentially no different than a sequence of if, else if, else if, else statements.

What would be the advantage over if statements? The benefits that I see for if is that it's clear that the order matters, it has better scoping for any variables introduced in the block, and it doesn't have to deal with the break; statement (which the IDE doesn't autocomplete for you).

I really hope that the C# team reconsiders and leaves switch statements the way they were. switch was also supposed to have speed advantages over if by using jumptables or dictionary lookups.

@orthoxerox
Copy link
Contributor

@ngbrown Completeness check is one possible future benefit for switches. I think non-pattern switches are still optimized as jumptables.

@Pzixel
Copy link

Pzixel commented Aug 26, 2016

How does pattern matching works with structs? For classes, okay, we can just write as and check if value is not null, but for structs you never can do it, you have to write following

if (boxedValue is MyCoolStruct)
{
   var value = (MyCoolStruct) boxedValue;
}

especially when we don't know if we have here struct or class:

if (boxedValue is TStructOrClass)
{
   var value = (TStructOrClass) boxedValue;
}

How is it handled in new language version? I mean when I write
if (boxedValue is TStructOrClass value)
how does compiler tranforms it?

@RuguKi
Copy link

RuguKi commented Oct 20, 2016

Hi

While testing I came across 2 things related with is

  1. Although if (i is int j) ; is valid and works fine

I get a CS0642 Possible mistaken empty statement warning.

It makes sense as there is nothing to do if true but I actually do what I want within the if block itself. I wonder if this warning could be avoided somehow.

  1. Considering a simple function like this:
    void TestFunc(int i)
    {
        if (i is int j) ;
        j *= 5;
        Console.WriteLine(j);
    }

It works fine. j still can be used. But when I change the input type to object:

    void TestFunc(object i)
    {
        if (i is int j) ;
        j *= 5;
        Console.WriteLine(j);
    }

I get: CS0165 Use of unassigned local variable 'j' error.

I thought that even if if block evaluates to false, variable j would still be declared and could be used later. (Even if the value is 0 this way)
Interestingly, trying to define int j; before j *= 5; causes CS0128 A local variable named 'j' is already defined in this scope

I don't know if these are known issues or even issues at all but just wanted to share.

@Pzixel
Copy link

Pzixel commented Oct 20, 2016

@uTeisT I think I can answer to your questions:

  1. any conditional operator with empty body should be warned by compiler. If you want to avoid it, use empty braces {}. It's a common way to do it.
  2. in your example I assume that compilers implicitly uses i (becuase it is declared in if so its scope is limited by scope of if operator so it doesn't exists in the rest of the method. Thus, in first example it just doesn't perform converstion at all, becuase int-to-int conerstion cannot fail.
    When you are passing object, compiler cannot gurarantee that object is int, and it tries to work with int j, which is outscoped at line j *= 5.

I think, in both cases compiler should error Use of unassigned local variable 'j'

@RuguKi
Copy link

RuguKi commented Oct 20, 2016

@Pzixel 1) Using {} would deal with the warning that's true but in this case we don't need a body anyways. While simplifying the usage, such warnings could also be disabled maybe, idk.
2) I see your point (Btw, I think you meant j instead of i) so maybe that's a bug and it should also give a compile error. Because I can still use j out of the if scope.

Using VS15P5 btw, if that helps.

@HaloFour
Copy link

@uTeisT

The scoping rules were changed, j would remain in scope following the if statement. However it would not be definitely assigned which is why you can't use it as you are trying to. It'd be no different than the following:

int j;
if (o is int) {
    j = (int)i;
}
j *= 5; // compiler error, j is not definitely assigned

@Joe4evr
Copy link

Joe4evr commented Oct 20, 2016

@uTeisT Another way to deal with that is to just invert the check: if (!(i is int j)) return;

@Pzixel
Copy link

Pzixel commented Oct 20, 2016

@uTeisT Eric treats itself the empty statement as design mistake, so I anyway recommend to not use it at all.

@HaloFour I think this is mistake. I understand why they decided do to such thing, but IMHO use-after-use and use-out-of-scope should be disallowed at all costs.

@gafter
Copy link
Member Author

gafter commented Oct 20, 2016

In your example, the pattern-matching operation must always succeed:

    void TestFunc(int i)
    {
        if (i is int j) { }
        Console.WriteLine(j);
    }

But this results in a definite assignment error on the use of j. That is because the current spec says that pattern variables are definitely assigned when true for the pattern-matching operation. The implication is that they are not definitely assigned when then pattern match fails (even though that cannot happen here), and so they are not definitely assigned after the if statement.

We could modify the specification so that a pattern-matching operation that always succeeds results in the pattern variables being definitely assigned. That would allow this code to compile without error.

I've filed this as a separate issue #14651

@andrew-vdb
Copy link

//Old style

object value = getValue();
var x = value.ToString(); //or x = (string) value;
x = x.Replace("abc","xyz");

//New style

object value = getValue();
switch(value)
{
case string x:
x = x.Replace("abc","xyz");
break;
}

Is the new style more efficient because there is no unboxing?
or unboxing is hidden by compiler?

@bbarry
Copy link

bbarry commented Mar 13, 2017

@andrew-vandenbrink

Neither of those is boxing.

  • var x = value.ToString(); - This is a virtual method call.
  • var x = (string) value; - This is a "reference conversion." This particular one is called a "cast" operation or an "explicit conversion". Some people (incorrectly) call this "direct" casting because it looks syntactically like that operation in C++. C# doesn't make that particular distinction anywhere. I've also heard it called (incorrect and potentially confusing) an "unsafe" cast because in the case where it fails it would throw an InvalidCastException (or in the case of user defined conversions potentially some other type of exception).
  • case string x: - This is also a "reference conversion." In this case it is called a "type pattern." In this form of conversion, if the conversion succeeds, the labeled block will execute; otherwise it will be skipped.

There are several other reference conversion operations available between object and string:

  • var x = value as string; - This is an "as" operation. Some call this (incorrectly) an "indirect" cast (because it is different than the C++ style "direct" cast). Others call it (incorrectly) a "safe" cast (because it doesn't throw). Like a type pattern, if this succeeds, the variable contains the instance, but otherwise (since there is no conditional block around here) the value is null.
  • var c = value is string; - This is an "is" operation, or a "type check." It returns true or false depending on if there is a reference conversion (or when value types are involved if there is a boxing or unboxing conversion).

the new style:

object value = getValue();
switch(value)
{
case string x:
// use x here
break;
}

is roughly equivalent to the old style:

object value = getValue();
var x = value as string;
if (x != null)  { // use x here

If a type pattern would need to box/unbox:

object value = getValue();
switch(value)
{
case int y: // unboxing is the conversion from the reference type `object` to the value type `int`
// use y here
break;
}

then it would perform a [un]boxing conversion. Roughly equivalent old style:

object value = getValue();
if (value is int)  { 
  var y = (int) value;
  // use y here

@andrew-vdb
Copy link

Using string is bad example of intended question that I would ask, thank you for the example with int. Looks like there is still unboxing, too bad, why the compiler would not just read it as int as at that point you are sure its int (like in dynamic type)

@gafter
Copy link
Member Author

gafter commented Aug 8, 2018

Further discussion can be moved to the csharplang repo, please.

@gafter gafter closed this as completed Aug 8, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests