regex capture group names must use identifier syntax #23799
base: blead
(F) Group names must follow the rules for perl identifiers, meaning
they must start with a non-digit word character. A common cause of
this error is using (?&0) instead of (?0). See L<perlre>.
that ASCII-range ones must start with a non-digit word character. A
It's unclear what "ones" means in that sentence. s/ones/identifiers ?
s/ones/group names/
I<name> must not begin with a number, nor contain hyphens.
I<name> must follow the rules for perl identifiers
(L<perldata/Identifier parsing>) which means, for example, that they
can't begin with a number, nor contain hyphens.
better written as: can't begin with a number or contain hyphens.
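As a rough illustration of the two ASCII-range rules quoted from the perlre wording above (no leading digit, no hyphen), here is a minimal sketch. The helper name `compiles` and the sample patterns are made up for this example and are not part of the PR; the Unicode characters that the PR newly rejects are not exercised here.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the PR): returns 1 if perl accepts the
# pattern, 0 if regex compilation dies. The pattern is interpolated at
# run time, so a bad group name dies inside the eval instead of
# aborting compilation of this script.
sub compiles {
    my ($pattern) = @_;
    my $re = eval { qr/$pattern/ };
    return defined $re ? 1 : 0;
}

# Illustrative patterns: two well-formed group names, one starting with
# a digit, one containing a hyphen.
for my $pattern ('(?<year>\d+)', '(?<_y2>\d+)', '(?<1year>\d+)', '(?<my-year>\d+)') {
    printf "%-18s %s\n", $pattern, compiles($pattern) ? 'accepted' : 'rejected';
}
```

The first two patterns should be accepted and the last two rejected; the error text for a rejected pattern is left in `$@` after the `eval`.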
The code and test changes look fine. perldelta, maybe something like:
maybe with an example of something that was permitted but no longer is.
Prior to this commit the non-first characters could be any \w character. But an identifier excludes a few \w characters from appearing in them. This commit tightens what is allowed.

Commit xd1e2a852fbc901b45fba20906a8f42ca227ae462 gave a list of them, but I forgot a couple details in generating that list, so it wasn't quite right. The complete corrected list is:

    GREEK YPOGEGRAMMENI
    COMBINING CYRILLIC HUNDRED THOUSANDS SIGN
    COMBINING CYRILLIC MILLIONS SIGN
    COMBINING PARENTHESES OVERLAY
    COMBINING ENCLOSING CIRCLE
    COMBINING ENCLOSING SQUARE
    COMBINING ENCLOSING DIAMOND
    COMBINING ENCLOSING CIRCLE BACKSLASH
    COMBINING ENCLOSING SCREEN
    COMBINING ENCLOSING KEYCAP
    COMBINING ENCLOSING UPWARD POINTING TRIANGLE
    CIRCLED LATIN CAPITAL LETTER A - Z
    CIRCLED LATIN SMALL LETTER A - Z
    VERTICAL TILDE
    COMBINING CYRILLIC TEN MILLIONS SIGN
    COMBINING CYRILLIC HUNDRED MILLIONS SIGN
    COMBINING CYRILLIC THOUSAND MILLIONS SIGN
    ARABIC LIGATURE SHADDA WITH DAMMATAN ISOLATED FORM
    ARABIC LIGATURE SHADDA WITH KASRATAN ISOLATED FORM
    ARABIC LIGATURE SHADDA WITH FATHA ISOLATED FORM
    ARABIC LIGATURE SHADDA WITH DAMMA ISOLATED FORM
    ARABIC LIGATURE SHADDA WITH KASRA ISOLATED FORM
    ARABIC LIGATURE SHADDA WITH SUPERSCRIPT ALEF ISOLATED FORM
    ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM
    ARABIC LIGATURE JALLAJALALOUHOU
    ARABIC FATHATAN ISOLATED FORM
    ARABIC DAMMATAN ISOLATED FORM
    ARABIC KASRATAN ISOLATED FORM
    ARABIC FATHA ISOLATED FORM
    ARABIC DAMMA ISOLATED FORM
    ARABIC KASRA ISOLATED FORM
    ARABIC SHADDA ISOLATED FORM
    ARABIC SUKUN ISOLATED FORM
    SQUARED LATIN CAPITAL LETTER A - Z
    NEGATIVE CIRCLED LATIN CAPITAL LETTER A - Z
    NEGATIVE SQUARED LATIN CAPITAL LETTER A - Z
a3f5415 to c11897a
Prior to this, the non-first characters in a capture group name could be any \w character, even though group names were supposed to follow perl identifier syntax, and identifiers exclude a few \w characters. This PR tightens what is allowed.
#23775 gave a list of them, but I forgot a couple details in generating that list, so it wasn't quite right.
The complete corrected list is: