Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Extension Patterns #9005

Closed
HaloFour opened this issue Feb 22, 2016 · 45 comments
Closed

Proposal: Extension Patterns #9005

HaloFour opened this issue Feb 22, 2016 · 45 comments

Comments

@HaloFour
Copy link

Problem:

The current pattern matching proposal spec defines "User-defined operator is" as being an operator defined on a type which defines how a value may be matched to that type, optionally deconstructing values which may be used in recursive patterns.

public static class Polar {
    public static bool operator is(this Coordinates coordinates, out double r, out double theta) {
        r = Math.Sqrt(coordinates.X * coordinates.X + coordinates.Y * coordinates.Y);
        theta = Math.Atan2(coordinates.Y, coordinates.X);
        return coordinates.X != 0 || coordinates.Y != 0;
    }
}

In my opinion the problem with this is that it requires defining a new type for each custom pattern. This might be fine for patterns to translate between types, however for utility/helper patterns it creates an odd syntax of individual classes named after their operations.

Solution

This proposal suggests that in addition to the is operators above that there also be named extension is operators that can be defined in other static classes and resolved in the same manner as extension methods.

public static class CoordinateExtensions {
    public static bool operator is Polar(this Coordinates coordinates, out double r, out double theta) {
        r = Math.Sqrt(coordinates.X * coordinates.X + coordinates.Y * coordinates.Y);
        theta = Math.Atan2(coordinates.Y, coordinates.X);
        return coordinates.X != 0 || coordinates.Y != 0;
    }
}

The return type of the operator may be either bool or void. If the return type is bool a return value of true indicates that the pattern matched successfully; otherwise a return value of false indicates that the pattern did not match. If the return type is void then the pattern is guaranteed to match.

The first parameter of the pattern is the type of the value to be matched. I am using the keyword this in the examples to indicate a similarity with extension methods.

In addition to permitting deconstruction of the value through out parameters I also propose allowing for input parameters that allow passing values to the extension pattern which may be used at affect the behavior of that pattern. As with any method the argument passed to an input parameter must be any expression of the type of the parameter. The out parameters indicate the deconstructed values from the pattern which may be used with recursive sub-patterns.

public static class PatternExamples {
    public static bool operator is RegexMatch(this string input, string pattern, out string match) {
        var m = Regex.Match(input, pattern);
        if (m.Success) {
            match = value;
            return true;
        }
        match = null;
        return false;
    }

    public static bool operator is Between(this int input, int minimum, int maximum) {
        return input >= minimum && input <= maximum;
    }
}
string value = "...";
string pattern = ".";
if (value is RegexMatches(pattern, var match)) {
    Console.WriteLine($"Matched {match}");
}

int value = ...;
string result = value match (
    case Between(0, 10): "Small",
    case Between(11, 100): "Medium",
    case Between(101, 1000): "Big"
    case *: "Other"
);

Generic extension patterns would expand the utility of helper patterns:

public static class ComparisonPatterns {
    public static bool operator is Between<T>(this T value, T minimum, T maximum) where T : IComparable<T> {
        return (value.CompareTo(minimum) >= 0) && (value.CompareTo(maximum) <= 0);
    }

    public static bool operator is Between<T>(this T value, T minimum, T maximum, Comparer<T> comparer) {
        return (comparer.Compare(value, minimum) >= 0) && (value.Compare(value, maximum) <= 0);
    }
}

Concerns:

  1. Combining input and output parameters may be confusing, but I think that the ability to provide arguments to the pattern can be very useful in allowing for generic and reusable helper patterns. While I don't propose any specific limitations as to which order the parameters may be defined (aside the first this argument) I do think that it would likely be a good convention that the out parameters be last.
  2. I understand that out parameters may be deprecated from user-defined patterns in favor of tuples. That would certainly affect this proposed syntax. I've not seen any examples of the new syntax or how it may be used. re: Outline of C# 7 demo at MVP Summit 2015-11-02 #6505 (comment)
@alrz
Copy link
Contributor

alrz commented Feb 22, 2016

Using out parameters in this example has some unfortunate consequences:

public static bool operator is Polar(this Coordinates coordinates, out double r, out double theta) {
  r = Math.Sqrt(coordinates.X * coordinates.X + coordinates.Y * coordinates.Y);
  theta = Math.Atan2(coordinates.Y, coordinates.X);
  return coordinates.X != 0 || coordinates.Y != 0;
}
  1. You have to assign them all (though, a default would do) even if the pattern is not successful.
  2. Specifically in this example, you first calculate all the outputs and then check if pattern is even successful. Therefore, considering (1) it becomes more complicated if you want to assign defaults if the pattern doesn't match.
// complete patterns
public static (double, double, double) operator is HSB(this Color color) {
  return (color.GetHue(), color.GetSaturation(), color.GetBrightness());
}

// partial patterns
public static (double r, double theta)? operator is Polar(this Coordinates coordinates) {
  return coordinates.X != 0 || coordinates.Y != 0 ?
      (Math.Sqrt(coordinates.X * coordinates.X + coordinates.Y * coordinates.Y),
        Math.Atan2(coordinates.Y, coordinates.X)) : null;
}

In this case, we might need oneples,

public static (double)? operator is Double(this string str) {
  return double.TryParse(str, out var value) ? (value,) : null;
}

And if this "helper pattern" has more than one possible outputs,

enum class Parity { Odd, Even } 

public static Parity operator is(this int num) {
  return num % 2 == 0 ? Even() : Odd();
}

// complete
n match(
  case Odd: ...
  case Even: ...
)

The last one is more of a conversion operator though.

@HaloFour
Copy link
Author

@alrz

I mention this in my second concern at the bottom of my proposal and I do expect that the user-defined is operators will move to tuples in some form like that. However, without seeing what the team has proposed I'm not going to make any guesses and I'm simply going to work with what they've already provided.

The other problem is that I don't really know how I would combine a tuple return type with other input parameters without the syntax having to get weird in one way or another:

public static class ConversionPatterns {
    public static (int)? operator is Integer(this string input, NumberStyles style) {
        int value;
        return int.TryParse(input, style, null, out value) ? (value,) : null;
    }
}
...

string input = "...";
// accept parameters first?
if (input is Integer(NumberStyles.AllowHexSpecifier, var result)) { ... } 
// accept parameter list, then deconstruct?
if (input is Integer(NumberStyles.AllowHexSpecifier)(var result)) { ... }

Not to mention the last time "oneples" was mentioned it was more or less dismissed, so honestly I don't know how the team would expect to define these operators if they were to deconstruct a single value, short of using a bool as the first element in a tuple to indicate success rather than having it be nullable.

@DavidArno
Copy link

This whole issue could be neatly addressed by first ensuring the Option<T> discriminated union is added to the framework:

public abstract sealed struct Option<T>
{
    struct None;
    struct Some<T>(T Value);
}

Then @HaloFour's Polar example becomes:

public static Option<(double r, double theta)> operator is Polar(this Coordinates coordinates) 
{
    return coordinates.X != 0 || coordinates.Y != 0 
        ? new Some<(double r, double theta)>(
            Math.Sqrt(coordinates.X * coordinates.X + coordinates.Y * coordinates.Y),
            Math.Atan2(coordinates.Y, coordinates.X))
        : new None();
}

That gets rid of both the out parameter and Nullable<T> code smells.

@HaloFour
Copy link
Author

@DavidArno

It doesn't get rid of any smells, it just replaces one for another.

I will note that changing the user-defined is operators to use tuples, whether they be Nullable<T>, Option<T>, or whatever, would involve locking this feature to a yet-to-be-released BCL. I think for that reason alone it's worthwhile to keep support for out parameters so that pattern matching can be used with older frameworks.

@alrz
Copy link
Contributor

alrz commented Feb 22, 2016

We already have Nullable<T>, and since tuples will be value types, you'd have ? syntax anyways, and it doesn't seem to be a "smell". Hmpf.

@DavidArno
Copy link

@HaloFour,

I guess it's a toss up between being purist and "doing it right" and being pragmatic and "doing it now". I've no idea how the various Microsoft teams involved with future releases of Visual Studio, C# and .NET coordinate things and therefore how feasible the "doing it right" option is.

@paulomorgado
Copy link

@alrz,

  1. You have to assign them all (though, a default would do) even if the pattern is not successful.
  2. Specifically in this example, you first calculate all the outputs and then check if pattern is even successful. Therefore, considering (1) it becomes more complicated if you want to assign defaults if the pattern doesn't match.

You still have to assign a value to the tuple items. So assigning a value to the out parameters has no extra cost or burdon. At all!

@paulomorgado
Copy link

A small (in size) but meaningful addition to the proposal.

@alrz
Copy link
Contributor

alrz commented Feb 22, 2016

@paulomorgado Nope, if the pattern is not successful you don't need to assign every tuple items, just return null.

@paulomorgado
Copy link

@alrz, assuming null is an accepting return value when everyone is trying to get rid of them.

@alrz
Copy link
Contributor

alrz commented Feb 22, 2016

@paulomorgado "everyone is trying to get rid of them" as it turns out, no. I personally prefer an Option<T> type for these scenarios, but since optional/nullable types are implemented around the null value (both value types and reference types), so, here we go. It seems that null became idiomatic C#. For example GetMethod returns null if failed. And nothing is going to change that.

@DavidArno
Copy link

@paulomorgado,

"everyone is trying to get rid of them". If only that were true. There's a strong movement to minimise the "evils" of null, but we are far from getting everyone signed up just yet.

@alrz,
null is idiomatic C#, at the moment. Sure, eg GetMethod returns null, but once we have options, there's nothing to stop us creating a TryGetMethod extension method that returns Option<MethodInfo> and you can be pretty sure it will get created by someone.

@HaloFour
Copy link
Author

With all of the obsession over out parameters and tuples many of the comments here have completely missed the mark of this proposal.

Let me reiterate, the signature of these pattern methods is based on what is currently in the pattern matching specification. If that spec changes then I will certainly address those changes in this proposal.

There are two tenets of this proposal:

  1. Provide a syntax for defining custom is operators in static types without having to define and name a new type for each operator.
  2. Specify how input parameters may be defined on custom is operators and evaluate syntax as to how the consumer of said pattern may pass expressions to satisfy those parameters.

@jveselka
Copy link

Reminds me of F# active patterns.
Since I dont think there is a reason to have both syntaxes, I would happily take more flexible extension-based approach over class-based approach.
Btw I believe you are missing this keyword on line public static bool operator is Between(int input, int minimum, int maximum) {

@aluanhaddad
Copy link

I think this is a great idea.

I'm not too keen on the use of void to indicate that a pattern will always be matched, but there is a certain logic to it.

I particularly like the idea of generic extension patterns.

This is definitely one of the best proposals related to pattern matching and I think it would provide enormous value.

@alrz
Copy link
Contributor

alrz commented Feb 24, 2016

To implement the is operator via pattern-matching itself, you often need to assign out parameters inside of the pattern. In that case I think out patterns can be really helpful, for example:

public static bool operator is Member(this Expression @this, out Expression expr, out string memberName) {
  return @this is MemberExpression { Expression is out expr, Member is MemberInfo { Name is out memberName } };
}

As I suggested in #8457 to use var instead of is var, it can be applied to out patterns as well,

public static bool operator is Member(this Expression @this, out Expression expr, out string memberName) {
  return @this is MemberExpression { Expression out expr, Member is MemberInfo { Name out memberName } };
}

The target will be matched against static type of the out parameter, in this case string.

This requirement probably make out parameters actually useful for is operators.

@HaloFour
Copy link
Author

@alrz If anything it's definitely a good argument for having the custom is operators return tuples, it would certainly play better with match expressions.

If I recall the question about pattern matching populating existing variables has been brought up before and was dismissed. After finally finding the comment on the tuplification of custom is operators I'm not terribly in the mood to try to track that comment down. 😀

@alrz
Copy link
Contributor

alrz commented Feb 24, 2016

@HaloFour Yeah I saw your response and that seems a pretty reasonable argument. The thing is that out parameters are part of idiomatic C# and I don't think that tuples as a functional construct could be considered as the de facto alternative alone, you might need option types etc as well and with the direction that nullable reference types are taking I don't see any elegant solution to this particular problem.

I don't know why it's dismissed, but for this specific scenario, returning tuples for is operators doesn't make anyone's life easier.

@DavidArno
Copy link

@alrz,

The thing is that out parameters are part of idiomatic C#

I disagree with that assertion in the strongest possible terms. Methods should obey the single responsibility principle; they should do only one thing. Ergo, they should only return one value (accepting that that might be a composite value, such as a tuple, or other struct or class). out parameters are a code smell that indicates a method is likely doing more than one thing and thus needs to return more than one value.

So either C# is a fundamentally flawed language that has code-smell idioms; or out parameters most definitely are not idiomatic C#. I believe the latter to be the case.

@HaloFour
Copy link
Author

@alrz

public static (Expression expr, string memberName)? operator is Member(this Expression @this) {
  return @this match (
    case MemberExpression {
      Expression is var expr,
      Member is MemberInfo {
        Name is var memberName
      }
    }: (expr, memberName)
    case *: null
  );
}

@DavidArno

There is a third-option: C# is a language that has features of which you disapprove, but are fundamental features nonetheless.

@alrz
Copy link
Contributor

alrz commented Feb 24, 2016

@DavidArno Returning multiple values doesn't have anything to do with the method doing more than one thing. If that was the case you shouldn't return anything more than primitives, otherwise you are practically returning more than one thing. Although, I believe primitives also contains multiple bits, which according to your assertion, violates the single responsibility principle, indeed.

@HaloFour I don't know what's your side! First you say it's a shame to make a feature dependant on a specific .NET framework version and now you are defending it. I didn't say it wouldn't be possible, I'm saying that it doesn't necessarily ease the procedure as a common practice and built-it functionality (defining helper patterns). My first reason to prefer tuples over out parameters was that they make it relatively hard in those particular examples, but at the same time out parameters do seem like a more straightforward solution both in the implementation and usage.

Related: #1757

@HaloFour
Copy link
Author

@alrz

I'd like to think/hope that I could be pragmatic when it comes to language design discussions like this. I am of the belief that a language feature shouldn't require a BCL upgrade unless there is a compelling benefit or requirement. Being able to use simple match expressions in these custom patterns may be benefit enough.

I still see there being some issues with that approach, however, specifically any results containing less that two values. If there isn't a "one-ple" syntax then you may be forced to use the ValueTuple<T1> type directly, assuming it exists. If it doesn't the return type could always just be some scalar value like int or string but for the latter (and all reference types) it would be impossible to have a custom pattern that succeeds but happens to return a value of null. Then, of course, given a lack of Unit or the like in the BCL or C# language, how could you express a custom pattern that can match but not return any values? Given that @gafter 's comment about this was back in November I can only assume that the team has already batted these ideas and concerns around and potentially made some decisions.

I do still have my own reasons for preferring the out variables in that it makes for an easy and I think intuitive way to have both input and output parameters representing values passed to the operator and deconstructed results returned from the pattern, especially from the perspective of consumption. If the operators returned tuples instead I have no idea how or where input parameters could be specified.

@aluanhaddad
Copy link

@DavidArno C# is a language which continues to evolve and grow rapidly. This implies that many different idioms will be born and survive to coexist for many years. Out parameters have a set of use cases for which they work very well, but many of these use cases will likely be superseded by tuples. There are however certain performance critical scenarios for which out parameters will remain a preferred option.

@DavidArno
Copy link

@aluanhaddad,

Sure there are edge-case uses for out parameters, such as around performance. They are edge-cases only though, not normal cases and thus they are not idiomatic.

Care must be taken too that such optimisations are the result of genuine performance bottlenecks, rather than misguided premature or micro-optimising. That's why it's a code smell feature: in most cases (though not all), they are a sign of bad design.

@alrz

@DavidArno Returning multiple values doesn't have anything to do with the method doing more than one thing. If that was the case you shouldn't return anything more than primitives, otherwise you are practically returning more than one thing. Although, I believe primitives also contains multiple bits, which according to your assertion, violates the single responsibility principle, indeed.

You missed the bit where I talked about compound values.

@aluanhaddad
Copy link

@DavidArno the fact that they are uncommon cases, does not make them antipatterns or code smells.

Premature, useless, or unnecessary optimizations are indicative of poor programming.

Language features that facilitate such optimizations are not indicative of poor language design.

I fundamentally disagree with the assertion that a feature which should only be used in rare circumstances should be considered a "code smell feature," whatever that means, for no other reason. Do you consider pointers to be a "code smell feature"?

@HaloFour
Copy link
Author

There's also the entire family of Try* methods, for which the team has already recently expressed preference for their current signature rather than a "tuplified" signature since the bool return value can be directly used in a conditional statement.

Anyway, this proposal isn't about the benefits (or lack-thereof) of out parameters. The only existing example code of a user-defined is operator in the pattern matching spec uses out parameters and I based this proposal on that. If the spec changes, this proposal will be updated to reflect those changes.

@DavidArno
Copy link

@aluanhaddad,

I don't think I explained myself very well, as your summary doesn't reflect what I was trying to say at all! :)

the fact that they are uncommon cases, does not make them antipatterns or code smells.
I agree. Take contravariance, for example. This is a niche feature, that's very uncommon. It's therefore not idiomatic. It certainly isn't a code smell or anti-pattern though.

Idiomatic code is the "good practice" way of doing common tasks in a language. For example, using events to achieve listeners is idiomatic in C#; implementing listeners via the GoF Observer pattern isn't.

I fundamentally disagree with the assertion that a feature which should only be used in rare circumstances should be considered a "code smell feature,"

That's actually back to front to what I was trying to say. "Code smell features" are features that should only be used in rare circumstances. out parameters are one such feature. They force you down the route of variance and they result in values being emitted from a method in multiple places, often breaking the single responsibility principle in the process. Code smells are signs that the code is likely to contain bad design. You cite an example of where their use is valid (addressing an identified, real, performance bottleneck). Such uses are rare though, thus why most of the time their use is bad and thus why they are a code smell.

@HaloFour,
If true, then, IMO, that's a really bad decision on the part of the team. "the bool return value can be directly used in a conditional statement" is a very poor justification for forcing variance in code that need not have it if a better approach were adopted.

int x;
if (int.TryParseInt(someString, out x))
{
    DoSomethingWithX(x);
}
else
{
    Console.WriteLIne($"{someString} isn't an int.");
}

versus

if (someString.TryParseInt() is Some(var x))
{
    DoSomethingWithX(x);
}
else
{
    Console.WriteLIne($"{someString} isn't an int.");
}

The latter case has less code and no forced variable variance (x is only assigned to once). This clearly ought to become the idiomatic use of TryParse and if in C# 7.

Even better would be:

match using (someString.TryParseInt())
{
    case Some(var x): DoSomethingWithX(x);
    default: Console.WriteLIne($"{someString} isn't an int.");
}

and that really should become the one true idiomatic use in C# 7. Of course there are two potential blockers to this: your suggestion that the messy out parameter version of TryParse is favoured by the team and I've no idea yet if the team have decided to avoid extending switch and instead offer a more concise syntax for pattern statements.

@HaloFour
Copy link
Author

@DavidArno

You can opt-out of using every single one of those features in any new program. However, you're always required to create at least one class, even for "Hello World".

Anyhow, thanks for turning this proposal-that-has-nothing-to-do-with-out-parameters into your personal tirade against out parameters.

@DavidArno
Copy link

@alrz,

By itself, I guess it doesn't make much sense. I should have kept going and included static classes and methods, primitives and structs as well as they are all examples of features that some OO purists argue made even early versions of C# a hybrid, rather than a strictly OO, language.

@DavidArno
Copy link

@HaloFour,

Anyhow, thanks for turning this proposal-that-has-nothing-to-do-with-out-parameters into your personal tirade against out parameters.

I agree it is unfortunate that this proposal ended up as a long debate over out parameters. It's highly disingenuous of you though to attempt to lay the blame for that at my feet. Others could have looked up for themselves as to why a growing number of people view such things as a bad idea. Instead, they chose to ask me here and I answered them.

@HaloFour
Copy link
Author

@DavidArno

I agree it is unfortunate that this proposal ended up as a long debate over out parameters. It's highly disingenuous of you though to attempt to lay the blame for that at my feet.

Considering that you trotted out the "code smell" argument regarding out parameters and continued to hammer how "fundamentally flawed" they are despite multiple comments regarding their irrelevance to this proposal you may want to reconsider the "disingenuity" of your argument.

Others could have looked up for themselves as to why a growing number of people view such things as a bad idea. Instead, they chose to ask me here and I answered them.

An editorial blog post by you does not signify an authoritative consensus on the subject. That said I don't doubt that there is such a body because in programming there is a body that disagrees with a great deal about every paradigm in every language. I'm not particularly interested in repeating 50 year old arguments. The beauty of the CLR is that it allows for a great many different languages and paradigms. There's little reason to drag this one language back and forth between competing paradigms and changing idioms. You are free to choose the tool that fits your opinions.

@aluanhaddad
Copy link

Back on topic:
A really nice aspect of this proposal is how it allows existing types to be augmented for integration with patterns. For example an extension operator is could be defined to match List<T> with ImmutableList<T> transparently.

@gafter
Copy link
Member

gafter commented Mar 9, 2016

string value = "...";
string pattern = ".";
if (value is RegexMatches(pattern, var match)) {
    Console.WriteLine($"Matched {match}");
}

How is the parser supposed to know what to parse as a pattern and what to parse as an expression?

@HaloFour
Copy link
Author

HaloFour commented Mar 9, 2016

@gafter Input parameters are expressions, output parameters are patterns.

@gafter
Copy link
Member

gafter commented Mar 9, 2016

@HaloFour And how is the parser supposed to know which is which when it is parsing a pattern-match (i.e. long before semantic analysis), e.g.

if (expr is Name(M(), M()))

See, the first one is an invocation expression so when we look-up M we skip types and only look at things that are invocable. Of course it would be a syntax error if there were a bare * between the parens, as that isn't an expression. The second one is a pattern, so it looks up a type named M (skipping any methods it runs across in the process). Of course it can have a * between the parens because that is a subpattern. They are different syntactic rules and different nodes in the compiler. But how does the parser know to parse them differently?

I guess you might argue that we should completely redesign the syntax for patterns so we can unify them with expressions.

@HaloFour
Copy link
Author

HaloFour commented Mar 9, 2016

@gafter I see. I don't know how to answer that question.

I guess you might argue that we should completely redesign the syntax for patterns so we can unify them with expressions.

Reading that statement in my head it could have so many different inflections that it's a little difficult for my parser to determine your intent. (see what I did there? 😄) If said unification could lead to more interesting scenarios with pattern matching then I could see that being worth it, but I have no idea what that would entail.

Note that I'm not married to the syntax that is in the proposal. I use it to illustrate the two concepts I'd like to see adopted, specifically creating custom is operators without having to declare a new type just to be its home, and to allow those operators to accept expressions as parameters.

@alrz
Copy link
Contributor

alrz commented Mar 9, 2016

@gafter But only constant expressions are allowed inside patterns. Currently as specified, the only expression that you can use in a pattern is a constant-expression (via constant-pattern) right? So I assume same would apply to an "extension pattern",

string value = "...";
const string pattern = ".";
if (value is RegexMatches(pattern, var match)) {
    Console.WriteLine($"Matched {match}");
}

All input parameters must be a constant. As you said about indexer patterns (#5811):

I have a bit of a problem with a general expression appearing as part of the pattern, as you have in the indexer pattern.

@gafter gafter closed this as completed Mar 9, 2016
@gafter gafter reopened this Mar 9, 2016
@gafter
Copy link
Member

gafter commented Mar 9, 2016

@alrz The fact that a pattern can only be a constant does not help disambiguate the syntax very much.

@alrz
Copy link
Contributor

alrz commented Mar 10, 2016

@gafter Right, and in an indexer pattern, the constant argument is not the part that is going to match.

@gafter
Copy link
Member

gafter commented Apr 21, 2017

Issue moved to dotnet/csharplang #481 via ZenHub

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants