Discussion: Scope of pattern variables, and tuple decomposition #4781

gafter · 2015-08-25T18:27:28Z

I’m doing a draft reimplementation of pattern matching (#206) now, and trying to figure out the right scoping of pattern variables. I had expected to use whatever we came up with for declaration expressions when those were under development, but it turns out that doesn’t make sense in many situations. For example, given the statement

M(x is int i ? i : 12);

There is no point in the variable i being in scope outside this statement, as it won’t be definitely assigned anywhere else. My current thinking is that a pattern variable introduced in an expression would be limited in scope to the nearest enclosing statement, with the single exception that a pattern variable introduced in the condition of an if statement would not be in scope in the else clause.

For a field initializer, the scope would be limited to that declaration. For an expression-bodied property or method, limited to that declaration. For a ctor-initializer, limited to that ctor-initializer. The scope of a pattern variable introduced in a catch clause is the catch block.

For decomposing a tuple (#347), I anticipate introducing a separate statement form, something like

pattern := expression ;

For example

var tuple = (12, "foo");
(int x, string y) := tuple;

This has the same semantics as the expression

expression is pattern

With the exceptions that

It is a statement form, not an expression form;
The compiler must be able to prove that the match always succeeds (otherwise it is a compile-time error); and
Variables introduced in the pattern remain in scope within the enclosing block

I welcome feedback and comments on this while is it fresh on the drawing board.
@MadsTorgersen @ljw1004 @semihokur @AnthonyDGreen @terrajobst @jaredpar @Pilchie @mattwar @stephentoub @vancem

The text was updated successfully, but these errors were encountered:

svick · 2015-08-25T19:42:47Z

What's the reasoning behind using :=? Wouldn't just = (e.g. (int x, string y) = tuple;) work?

Or is it meant to differentiate between pattern decomposition and normal assignment?

Also does this mean that (using examples from #206) Polar(var R, *) := Cartesian(3, 4); would work? Because that would be awesome.

gafter · 2015-08-25T19:54:55Z

@svick Yes, this is different from "normal" assignment. You're suggesting it could be thought of as an assignment expression inside an expression-statement, but the left-hand-side isn't an lvalue, the assignment itself doesn't appear to have a well-defined type, and the left-hand-side appears to be a type expression (from the tuples proposal #3913) rather than a valid syntactic form for an (lvalue) expression. This would fill in the "Deconstruction syntax" for #3913.

No, Polar(var R, *) := Cartesian(3, 4); would not work because the kinds of things the compiler would need to know to prove it will succeed are outside the scope of what we are likely to specify. Anything that depends on what user-definable code inside the constructor and/or decomposition operator (e.g. Polar.operator is) do would not be assumed to always succeed.

MgSam · 2015-08-25T20:25:04Z

Speaking purely from a usability perspective, adding := to the language just to deconstruct tuples seems odd and likely confusing. You would almost certainly have to add a compiler error/warning for the extremely common case when people type (int x, string y) = tuple; - at which point one would ask why isn't that the default anyway.

I understand it doesn't behave like a normal assignment, but it should still re-use the same = operator.

dsaf · 2015-08-25T20:27:21Z

Looking at current version of the language, this is similar to foreach. Reusing for's variable is a bad practice. Passing a variable into using is also not recommended.

Keeping things local reduces cognitive load. Even in current C# I sometime use curly brackets to make a local variable even more local:

public float BigCalculationThatShouldntBeSplit() 
{
    float result = 0;

    {
        float variableWithSomeMeaning = 0;
        //first leg of calculation
    }

    {
        float variableWithOtherMeaning = 0;
       //second leg of calculation
    }

    return result;
}

dsaf · 2015-08-25T20:51:46Z

@MgSam

Speaking purely from a usability perspective, adding := to the language just to deconstruct tuples seems odd and likely confusing.

What if it will be allowed to define custom operators like in F# and then := simply becomes a variadic generic extension operator on the new ValueTuple<> type (no idea if this is possible)?

public static ..T := (this (..T args) tuple)
{
    foreach (let arg in args)
    {
        yield return arg;
    }
}

gafter · 2015-08-25T20:59:42Z

There is also the question of how to handle the scoping of pattern variables appearing in the various parts of a query expression. I think they remain local to that expression.

MgSam · 2015-08-25T21:07:33Z

@dsaf Adding custom operators seems outside the scope of the discussion on pattern matching. It also hasn't even been seriously considered for C# 7.0 at this point so I don't think any proposal should incorporate that.

@gafter I think some more examples on how the proposed scoping rules would behave (especially in contrast to the discarded declaration expressions proposal) would be helpful.

gafter · 2015-08-25T21:14:33Z

@MgSam It is very simple - the variable simply would not be in scope outside the immediately enclosing statement.

M(x is int i ? i : 12); // i is scoped to this statement
Console.WriteLine(i); // error: i not found

In an if statement, it isn't even available in the else clause

if (x is int i)
{
    // int i is in scope
}
else if (x is string i) // int i not in scope here, so we can reuse the name
{
    // string i is in scope
}

Other cases

In a field initializer, it is scoped to the field initializer
In an expression-bodied method or property, scoped to the body
In an expression-bodied lambda, scoped to the body
In a ctor-initializer, scoped to the ctor-initializer
In an attribute, scope to the attribute (but an error because not constant)
In an expression part of a query expression, scoped to that expression part

HaloFour · 2015-08-25T21:16:46Z

@gafter

Paren in the wrong spot there?

I recall the argument that erupted from the scoping of declaration expressions. I was under the impression that was part of the reason that the feature was dropped/postponed.

AdamSpeight2008 · 2015-08-25T21:17:24Z

:= is used in named parameters, so to be consistent the identifier should be on the left of :=

gafter · 2015-08-25T21:19:35Z

@HaloFour The scoping rules for declaration expressions didn't make sense for pattern matching. That was part of the reason we put the feature off. I'm trying to restart the discussion in the context of pattern matching.

gafter · 2015-08-25T21:20:34Z

@AdamSpeight2008 There is no identifier in the syntax for the decomposition operator. The syntax is

pattern := expression ;

AdamSpeight2008 · 2015-08-25T21:22:57Z

There is also the question of how to handle the scoping of pattern variables appearing in the various parts of a query expression.
They are different variables, but have the same name, since it gets expanded out into function calls.
Would they just be considered as Lets

@gafter So I couldn't decompose a tuple into different variables.

( x, y, z ) = someTuple // x, y, z arerr already defined.
( x:= , y:= , z:= ) = someTuple // x,y,z are defined here

gafter · 2015-08-25T21:31:30Z

@AdamSpeight2008 I think you're talking about range variables, which have nothing to do with the current discussion.

AdamSpeight2008 · 2015-08-25T21:46:53Z

@gafter

M(x is int i ? i : 12); // i is scoped to this statement
This looks messy and confusing.
x is int a boolean expression
i ? i : 12 a inline condition
How can that work? I'd be spending more of my time trying to understand the syntax, rather doing work.

It needs to be a lot clearer, in its intent.

For example

case ( int, int ) into x, y when ( ( x > 0 ) && ( y > 0 ) ) :
  // foo

This checks the types, decomposes the value into variable named x and y (both of type int).
Then checks the values of them (a guard expression). If it evaluates to true then it proceed to run the code after // foo
Or in code.

If ( (t.Item1 is int) && (t.Item2 is int) )
{
  // (into) section 
  int x = t.Item1;
  int y = t.item2;
  // (when) section
    if( ( x > 0 ) && { y > 0 )
  {
  }
}

gafter · 2015-08-25T22:18:57Z

@AdamSpeight2008

I'd be spending more of my time trying to understand the syntax, rather doing work.

Yes, when you start seeing a new syntax you have to take time to learn it instead of forging ahead without understanding. Hopefully you would not have to learn a new syntax every time you see it for the rest of your life. This particular new syntax can always be distinguished by the identifier appearing after the type in an is.

The syntax you propose does not extend naturally to recursive pattern matching. I have no idea how it would be used in a method call per the original example. It also detaches the type names used to declare the pattern variables from the appearance of the pattern variables, making the syntax more difficult to comprehend, especially as the use cases get more complex.

dsaf · 2015-08-25T22:37:08Z

@AdamSpeight2008 In my opinion the syntax is as clear as it gets. But I guess you will be able to use the expression form of the switch to achieve the type of readability you are looking for:

M(switch(x) {
    case int i:
        return i;
    default:
        return 12; });

AdamSpeight2008 · 2015-08-25T22:59:44Z

In Nemerle is would be something like

M( match(x)
   { 
     | ( int ) into i => i;
     | _ => 12;
   } )

@dsaf The syntax is ambiguous in terms of possible meaning. if (x is int i)

it looks like a syntax error. eg a missing operator; maybe ==
Now something that didn't compile before, does compile.
Why isn't i a boolean? Since the type signature is (obj,int)->bool.

gafter · 2015-08-25T23:20:39Z

@AdamSpeight2008

it looks like a syntax error

Every new feature uses syntax that used to be an error. That's how we manage to extend the language without changing the meaning of existing programs.

In Nemerle is would be something like

M( match(x)
   { 
     | ( int ) into i => i;
     | _ => 12;
   } )

No, it would not. It would be more like

M( match(x)
   { 
     | ( i is int ) => i
     | _ => 12
   } )

mikedn · 2015-08-26T06:16:32Z

@gafter

My current thinking is that a pattern variable introduced in an expression would be limited in scope to the nearest enclosing statement, with the single exception that a pattern variable introduced in the condition of an if statement would not be in scope in the else clause.

If the opposite is done and i is introduced in the enclosing scope then I suppose the following would fail to compile:

M(x is int i ? i : 12); 
M(x is int i ? i : 12);

Or would the second statement attempt to use the existing i?

If not then I'm sure that there will be cases where it would be useful to have i declared in the enclosing scope but most of the time it would be highly annoying as you have to come up with new variable names for every such statement. So, short of using special syntax to select where i is declared (and having that would be quite weird IMO), the current scoping choice looks good to me.

gafter · 2015-08-26T14:42:52Z

@mikedn No, it won't reuse an existing variable. If there is already a variable of that name in scope it is an error

If not then I'm sure that there will be cases where it would be useful to have i declared in the enclosing scope

Such cases are rare, as the variable is only definitely assigned when true for the is-expression. Here is one realistic case:

if (!(o is int i)) throw new ArgumentException("o");
// now i is definitely assigned

The proposed scoping rules don't make this legal. On the plus side, they do make this legal:

if (o is string s) { /* code using string s */ }
else if (o is Something s) { /* code using Something s */ }

whoisj · 2015-08-26T16:05:34Z

Just so those of us with a slower uptake understand this.

This (x is int i ? i : 12) is functionally identical to this var i = x is int ? x : 12, ignoring current type issues?

HaloFour · 2015-08-26T17:24:43Z

@whoisj

Sorta. If I read this right the variable i would be out of scope outside of the expression, so you'd have to assign the result to another variable with a different name. Since the example is passing it as a parameter to a method called M that is not really a concern. Other than that you're right, the x is int i expression returns a bool so it can be used with ?: and the variable i would be in scope for the remainder of that expression.

HaloFour · 2015-08-27T12:34:02Z

@orthoxerox That's right, there are different boxing/unboxing rules for Nullable<T>. Either way, the important bit of this conversation is regarding the scope of those parameters, not necessarily which pattern is being matched. Type checking is just a simple pattern.

orthoxerox · 2015-08-27T12:48:42Z

@HaloFour And that's why I questioned allowing universal use of pattern matching's is operator, since there's no compelling use case where it is superior to existing language features outside of actual pattern matching.

MrJul · 2015-08-27T13:01:50Z

@orthoxerox I don't think the point of this issue is to question pattern matching. If you don't see its use, just don't use it :)

For more complex cases, it becomes superior IMHO:

M(something is Point(int x, int y) ? x * y : 12)

VS

var point = something as Point?;
M(point != null ? point.Value.X * point.Value.Y : 12)

And it really shines in a switch construct, as seen in the proposal.

HaloFour · 2015-08-27T13:13:09Z

Might be an argument to use a new keyword like match instead of is, but that probably belongs under #206.

MrJul · 2015-08-27T13:15:59Z

@gafter

if (!(o is int i)) throw new ArgumentException("o");
The proposed scoping rules don't make this legal.

I think this could be solved by allowing to match into an existing variable:

int i;
if (!(o is int into i)) throw new ArgumentException("o");
// i is definitely assigned here.

(Please don't focus on the chosen syntax (into): I simply want to discuss using existing variables with pattern matching, even if it won't be the main use case).

orthoxerox · 2015-08-27T14:00:47Z

@MrJul I didn't say pattern matching was bad, it's very good where it belongs, inside switches. Your more complex example could be rewritten using my proposal to autocast variables inside ifs:

if (something is Point)
    M(something.X * something.Y);
else
    M(12);

or actually using switch:

switch (something) {
    case Point(int x, int y):
        M(x, y);
        break;
    default:
        M(12);
        break;
}

~~The only use for a compound is+as operator the new is is is the need to get a downcast variable inside the switch, since it has no case-level scopes, e.g.~~

switch (token) {
    case Literal litToken:
        return litToken.ParseValue(); //can't use token automatically cast as Literal here
    ...
}

~~And this is once again a special case.~~

EDIT: I forgot that the pattern-matching switch has case-level scoping. There's absolutely no need to have x is type y constructs.

HaloFour · 2015-08-27T14:01:38Z

@MrJul Not a bad idea but I can see how it might be awkward. With the other forms of pattern matching I think it would be a little clearer, the type specific syntax above it would be a little weird.

object o = ...;
int x;
if (o is Point(out x, var y)) { } // reuse x, declare y

gafter · 2015-08-27T14:55:23Z

@AdamSpeight2008

The recursive pattern you suggest, doesn't indicate how knows each part of the pattern match
is referring to the .Left and .Right properties on the object..
Or that you are match ng against types ( or more abstractly the structure of object) and not against object values.

I suggest you read the whole specification (#206) for the details, not just the example. Or try the prototype. I'm working on reintegrating the prototype into Roslyn now, so it will be easier to experiment.

alrz · 2015-08-29T13:08:48Z

could KeyValuePair<,> be treated as a tuple and make this possible?

foreach ((var key, var value) in dictionary)
{

}

gafter · 2015-08-29T22:12:33Z

@alrz Treating KeyValuePair as a tuple doesn't get that behavior. We'd have to change the foreach statement to do decomposition.

alrz · 2015-08-30T03:49:09Z

@gafter in F#, patterns (which do the decomposition) are everywhere, as in let pattern = expression in ... , fun pattern -> expression and for pattern in enumerable-expression -> ... but still you can't have for key,value in dictionary where dictionary is a Dictionary<,> because KeyValuePair<,> is not a tuple.

gafter · 2015-08-30T05:00:09Z

@alrz If we had started with pattern matching we might have decomposition everywhere too. But we didn't so we don't.

whoisj · 2015-08-31T14:55:54Z

Treating KeyValuePair as a tuple doesn't get that behavior. We'd have to change the foreach statement to do decomposition.

@gafter can't you just add an implicit conversion to the KeyValuePair type?

paulomorgado · 2015-08-31T15:12:37Z

@whoisj, what would be the gain in doing that? Do you want the tuple or the tuple items?

alrz · 2015-08-31T15:24:29Z

@whoisj it would be a good idea but C# doesn't support generic operators

public static implicit operator (TKey,TValue)<TKey,TValue>(KeyValuePair<TKey,TValue> p)
    => (p.Key,p.Value);

which can be another feature request. also read the latest comments on #98.

gafter · 2015-08-31T15:30:35Z

@alrz If declared inside the KeyValuePair struct, this conversion operator would not have to be generic.

public static implicit operator (TKey,TValue)(KeyValuePair<TKey,TValue> p) => (p.Key,p.Value);

However, that still would not cause the foreach loop to perform decomposition.

Eyas · 2015-08-31T16:00:18Z

@gafter Since we are talking about an entirely new feature anyways (pattern matching/decomposition), what would be the best place to talk about the possibilities of decomposition in a foreach loop? Or has it been discussed and determined to be impractical due to some deep language/compat change that the team is not willing to make?

whoisj · 2015-08-31T16:38:12Z

@whoisj, what would be the gain in doing that? Do you want the tuple or the tuple items?

Cleaner code, which expresses intent and not so much boiler plate.

@whoisj it would be a good idea but C# doesn't support generic operators

Touche.

gafter · 2015-08-31T19:17:10Z

@Eyas This is a fine place to discuss it, or in the tuple issue (#347). I'm just pointing out that you won't get decomposition in the foreach loop "for free" even if you get both tuples and a decomposition statement.

alrz · 2015-09-01T13:41:15Z

@gafter since foreach loop uses the assignment statement var current = enumerator.Current; I think it would.

gafter · 2015-09-01T15:02:28Z

@alrz Assignment and decomposition are different. For example, the pattern object o only matches a non-null value, but an assignment to it always succeeds. Since the elements that a foreach loop operates over may contain null, it must continue to use assignment (not decomposition) semantics.

KrisVandermotten · 2015-09-01T15:19:38Z

I really dislike the idea the the if branch and the else branch of an if statement do not live in the same scope.

In C# 6, the statement

if ( expr )
{
    statement1;
}
else
{
    statement2;
}

can always legally be transformed into

if ( ! ( expr ) )
{
    statement2;
}
else
{
    statement1;
}

no matter wat expr, statement1 or statement2 really are. There is elegance in this symmetry.

Under the proposed scoping rules, that would no longer be the case.

That looks very weird to me.

Like many others, I like to test my inputs as soon as possible, throwing an exception when they are wrong, afterwards writing the code processing them when they are ok. So I would like to be able to write something like:

if (!(x is int i))
{
     throw new Exception("...")
}
else
{
    // use i
}

The alternative today works, but requires a redundant cast:

if (!(x is int))
{
     throw new Exception("...")
}
else
{
    int i = (int) x;
    // use i
}

alrz · 2015-09-01T15:30:43Z

@gafter thanks for explanation. but will tuples continue to use the underlying reference type System.Tuple (which is also used by F# and as discussed here it wasn't a good decision)? otherwise, being null wouldn't be the problem.

gafter · 2015-09-01T15:46:43Z

@alrz No, tuples are likely to be value types. But since the semantics of the foreach loop will continue to use assignment semantics for the reasons discussed, that won't make any difference for that use case.

gafter · 2015-09-01T15:51:31Z

@KrisVandermotten We are discussing a decomposition statement, something like

let Pattern = expression;

in which the match must be statically verifiable (the compiler can prove it always succeeds), and variables from the pattern are introduced into the enclosing scope. If the match can fail you'd use this form:

let Pattern = expression else statement

In the latter form there is an optional where expression clause after the expression.

This is perfect for the early error detection cases. Your example would be written

let int i = x else throw new Exception("...");

or, if you get paid by the line of code

let
    int i = x
else
{
    throw new Exception("...");
}

gafter added Discussion Area-Language Design Language-C# labels Aug 25, 2015

gafter added this to the C# 7 and VB 15 milestone Aug 25, 2015

gafter self-assigned this Aug 25, 2015

gafter closed this as completed Aug 25, 2015

gafter reopened this Aug 25, 2015

gafter mentioned this issue Sep 3, 2015

Proposal: Pattern matching and record types #206

Closed

HaloFour mentioned this issue Sep 4, 2015

[Discussion]: match construct for pattern matching #5016

Closed

gafter closed this as completed Sep 6, 2015

gafter modified the milestone: C# 7 and VB 15 Nov 26, 2015

Discussion: Scope of pattern variables, and tuple decomposition #4781

Discussion: Scope of pattern variables, and tuple decomposition #4781

Comments

gafter commented Aug 25, 2015

svick commented Aug 25, 2015

gafter commented Aug 25, 2015

MgSam commented Aug 25, 2015

dsaf commented Aug 25, 2015

dsaf commented Aug 25, 2015

gafter commented Aug 25, 2015

MgSam commented Aug 25, 2015

gafter commented Aug 25, 2015

HaloFour commented Aug 25, 2015

AdamSpeight2008 commented Aug 25, 2015

gafter commented Aug 25, 2015

gafter commented Aug 25, 2015

AdamSpeight2008 commented Aug 25, 2015

gafter commented Aug 25, 2015

AdamSpeight2008 commented Aug 25, 2015

gafter commented Aug 25, 2015

dsaf commented Aug 25, 2015

AdamSpeight2008 commented Aug 25, 2015

gafter commented Aug 25, 2015

mikedn commented Aug 26, 2015

gafter commented Aug 26, 2015

whoisj commented Aug 26, 2015

HaloFour commented Aug 26, 2015

HaloFour commented Aug 27, 2015

orthoxerox commented Aug 27, 2015

MrJul commented Aug 27, 2015

HaloFour commented Aug 27, 2015

MrJul commented Aug 27, 2015

orthoxerox commented Aug 27, 2015

HaloFour commented Aug 27, 2015

gafter commented Aug 27, 2015

alrz commented Aug 29, 2015

gafter commented Aug 29, 2015

alrz commented Aug 30, 2015

gafter commented Aug 30, 2015

whoisj commented Aug 31, 2015

paulomorgado commented Aug 31, 2015

alrz commented Aug 31, 2015

gafter commented Aug 31, 2015

Eyas commented Aug 31, 2015

whoisj commented Aug 31, 2015

gafter commented Aug 31, 2015

alrz commented Sep 1, 2015

gafter commented Sep 1, 2015

KrisVandermotten commented Sep 1, 2015

alrz commented Sep 1, 2015

gafter commented Sep 1, 2015

gafter commented Sep 1, 2015