Normative: allow duplicate named capture groups #2721
The only materially necessary change I see is `_j_<sup>th</sup> capture of _R_ is not *undefined*` → `_j_<sup>th</sup> element of _r_'s _captures_ List is not *undefined*`, but I also left some suggestions that you can adopt or ignore at your discretion.
spec.html
Outdated
@@ -34758,6 +34758,22 @@ <h1>
</emu-alg>
</emu-clause>

<emu-clause id="sec-patterns-static-semantics-can-both-participate" type="abstract operation">
<h1>
Static Semantics: CanBothParticipate (
Opposite polarity might lead to a more intuitive name, e.g. MutuallyExclusive.
I agree with @gibson042, the sense of the `CanBothParticipate` AO is confusing. It returns true when x and y are in the same Alternative and therefore we should throw a Syntax Error. Can we rename `CanBothParticipate` to `CannotBothParticipate`?
It's more like `MightBothParticipate`. As in, both could be components of a single match.
Renamed to `MightBothParticipate`.
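To make the "both could be components of a single match" reading concrete, here is a minimal sketch in JavaScript. It is not the spec's algorithm (the operation's body is not shown in this thread). It assumes a hypothetical simplified representation in which each group is described by the chain of disjunctions that enclose it, with the index of the alternative it sits in; the `disjunction` and `alternative` field names are made up for illustration.

```javascript
// Hypothetical representation: a group's "path" lists every enclosing
// disjunction (by an arbitrary id) and which alternative the group is in.
//   /(?<year>a)|(?<year>b)/  ->  x: [{disjunction: 0, alternative: 0}]
//                                y: [{disjunction: 0, alternative: 1}]
//   /(?<a>x)(?<a>y)/         ->  x: [], y: []  (no enclosing disjunction)
function mightBothParticipate(pathX, pathY) {
  for (const stepX of pathX) {
    for (const stepY of pathY) {
      // Same disjunction but different alternatives: at most one of the
      // two groups can take part in any single match.
      if (
        stepX.disjunction === stepY.disjunction &&
        stepX.alternative !== stepY.alternative
      ) {
        return false;
      }
    }
  }
  // No disjunction separates them, so both might participate.
  return true;
}
```

Under this sketch, two same-named groups are an early error exactly when `mightBothParticipate` returns `true` for them, which is why the "might" polarity reads more naturally than "can" or "cannot".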
spec.html
Outdated
1. For each integer _j_ such that _j_ ≠ _i_, _j_ ≥ 1, and _j_ ≤ _n_, do
1. If the _j_<sup>th</sup> capture of _R_ was defined with a |GroupName|, then
Would this be better expressed more procedurally?
- 1. For each integer _j_ such that _j_ ≠ _i_, _j_ ≥ 1, and _j_ ≤ _n_, do
- 1. If the _j_<sup>th</sup> capture of _R_ was defined with a |GroupName|, then
+ 1. For each integer _j_ such that _j_ ≥ 1 and _j_ ≤ _n_, in ascending order, do
+ 1. If _j_ ≠ _i_ and the _j_<sup>th</sup> capture of _R_ was defined with a |GroupName|, then
spec.html
Outdated
1. Let _isMatchedElsewhere_ be *false*.
1. For each integer _j_ such that _j_ ≠ _i_, _j_ ≥ 1, and _j_ ≤ _n_, do
1. If the _j_<sup>th</sup> capture of _R_ was defined with a |GroupName|, then
1. Let _sj_ be the CapturingGroupName of that |GroupName|.
I would not object to refactoring these aliases, e.g. s → groupName and sj → otherName.
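The steps quoted above are only a fragment, so the following JavaScript sketch is a guess at the intent rather than a transcription of the spec: when building the `groups` object, a duplicate-named group that did not participate should not clobber the value captured by its same-named sibling. The function name `buildGroups` and the parallel `names`/`captures` arrays are assumptions made for illustration; the variable names follow the `groupName`/`otherName` suggestion.

```javascript
// names[i]    : group name of the i-th capture, or undefined if unnamed.
// captures[i] : captured string, or undefined if the group did not participate.
function buildGroups(names, captures) {
  const groups = Object.create(null);
  for (let i = 0; i < names.length; i++) {
    const groupName = names[i];
    if (groupName === undefined) continue;

    // Does some other capture with the same name hold a value?
    let isMatchedElsewhere = false;
    for (let j = 0; j < names.length; j++) {
      const otherName = names[j];
      if (j !== i && otherName === groupName && captures[j] !== undefined) {
        isMatchedElsewhere = true;
      }
    }

    // Assumption: write this capture unless it is undefined and a
    // same-named group elsewhere supplied the value.
    if (captures[i] !== undefined || !isMatchedElsewhere) {
      groups[groupName] = captures[i];
    }
  }
  return groups;
}
```

For example, `buildGroups(["year", "year"], [undefined, "2024"])` yields an object whose `year` property is `"2024"`, regardless of which of the two alternatives matched.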
Happy to defer the editorial review; I'm told there've been enough eyes on this.
spec.html
Outdated
@@ -35715,7 +35715,7 @@ <h1>Static Semantics: Early Errors</h1>
It is a Syntax Error if CountLeftCapturingParensWithin(|Pattern|) ≥ 2<sup>32</sup> - 1.
</li>
<li>
- It is a Syntax Error if |Pattern| contains two or more |GroupSpecifier|s for which CapturingGroupName of |GroupSpecifier| is the same.
+ It is a Syntax Error if |Pattern| contains two distinct |GroupSpecifier|s _x_ and _y_ for which CapturingGroupName(_x_) is the same as CapturingGroupName(_y_) and such that CanBothParticipate(_x_, _y_) is *true*.
- PR "various editorial changes for comparisons" #2877 says to avoid "is the same as", use "is" for comparing Strings.
- It's rare (though not unheard of) to invoke an SDO with parenthesis notation. On the other hand, using the normal notation would put `CapturingGroupName of _y_` on the RHS of `is`, which I don't think we ever do.
- `... for which A and such that B` is odd. Seems like either the two arms should agree re "for which" vs "such that", or else just use one that covers both arms: `... such that A and B`

- It is a Syntax Error if |Pattern| contains two distinct |GroupSpecifier|s _x_ and _y_ for which CapturingGroupName(_x_) is the same as CapturingGroupName(_y_) and such that CanBothParticipate(_x_, _y_) is *true*.
+ It is a Syntax Error if |Pattern| contains two distinct |GroupSpecifier|s _x_ and _y_ such that CapturingGroupName(_x_) is CapturingGroupName(_y_) and CanBothParticipate(_x_, _y_) is *true*.
Changed to `It is a Syntax Error if |Pattern| contains two distinct |GroupSpecifier|s _x_ and _y_ such that the CapturingGroupName of _x_ is the CapturingGroupName of _y_ and such that CanBothParticipate(_x_, _y_) is *true*.`
LGTM other than @jmdyck's nits.
Comments addressed.
This allows you to have a regex like `/(?<year>[0-9]{4})-[0-9]{2}|[0-9]{2}-(?<year>[0-9]{4})/` where a capturing group name is re-used across alternatives. It continues to be illegal to re-use a name within the same alternative.

As currently specified, it also enforces that named backreferences correspond to capturing groups in the same alternative, which would make the following (currently legal) program illegal: `/(?<a>x)|\k<a>/`. There is no reason to write this because the `\k` can never refer to anything, meaning it will always match the empty string. For this reason I think it should have been illegal in the first place. But if we want to preserve that behavior, it's easy enough to specify.

EDIT: updated so that the above remains legal, per plenary.

(I have a proposal repo for this, but figured it might as well be a PR.)
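As a rough illustration of both points in the description, the sketch below uses the `year` pattern from above. Because engines that predate this change reject duplicate names with a SyntaxError, the pattern is built with `new RegExp` inside a `try` block and falls back to distinct names; the helper name `extractYear` and the fallback names `year1`/`year2` are hypothetical.

```javascript
// Hypothetical helper: pull the year out of "YYYY-MM" or "MM-YYYY".
function extractYear(input) {
  let re;
  try {
    // Duplicate name across alternatives, as permitted by this PR.
    re = new RegExp("(?<year>[0-9]{4})-[0-9]{2}|[0-9]{2}-(?<year>[0-9]{4})");
  } catch (e) {
    // Engines without support reject the pattern; use distinct names instead.
    re = new RegExp("(?<year1>[0-9]{4})-[0-9]{2}|[0-9]{2}-(?<year2>[0-9]{4})");
  }
  const match = re.exec(input);
  if (match === null) return null;
  const g = match.groups;
  // With duplicate names, `year` holds whichever alternative matched.
  return g.year ?? g.year1 ?? g.year2 ?? null;
}

// The backreference case: `a` never participates when the second
// alternative is taken, so \k<a> matches the empty string.
const backref = /(?<a>x)|\k<a>/.exec("y");
```

With duplicate names the caller reads a single `groups.year` property regardless of which alternative matched, which is the ergonomic gain over the `year1`/`year2` workaround.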