-
Notifications
You must be signed in to change notification settings - Fork 5
Make the parser conform to ICU MessageFormat #3
Conversation
b7f260d
to
d046478
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for taking so long to get to this. Overall, I like these changes. However, it would be good if you could split the quote-escaping changes into a separate PR, as that's a change that'll require us to update the major version of the parser -- it will break some users' strings.
parser.pegjs
Outdated
|
||
functionParamsStrict = _ ',' p:paramStrict { return p; } | ||
|
||
functionParamsDefault = _ ',' _ p:paramDefault _ { return p.replace(/^[ \t\n\r]*|[ \t\n\r]*$/g, ''); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not return p.trim()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because _
is just [ \t\n\r]
and trim is [\s]
, which is [ \f\n\r\t\v\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff]
, and I wanted to keep it the same.
Changing them both to \s would probably be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yah, let's do that. The _
definition is an ancient remnant.
I've split the function parameter stuff into #4. |
d046478
to
2f59f48
Compare
Apostrophes are now handled correctly, emulating ICU's default DOUBLE_OPTIONAL behavior. Octothorpe is handled correctly, and can be escaped using apostrophes, but only inside plural (depending on strictNumberSign). A single apostrophe only starts quoted literal text if it immediately precedes a curly brace ({}), or, if inside a plural, an octothorpe (#). The parser now supports the strictNumberSign option, since that determines whether a quoted octothorpe is parsed as `'#'` or just `#`. Since choice format isn't supported, the pipe symbol never causes an apostrophe to start quoted literal text. Fixes messageformat#2
2f59f48
to
ad6c0c2
Compare
\s doesn't work in peg.js. It has to be done manually: nkovacs@a0be6aa Also, the identifier syntax in the Java implementation is The |
Make the parser conform to ICU MessageFormat
Apostrophes are now handled correctly, emulating
ICU's default DOUBLE_OPTIONAL behavior.
Octothorpe is handled correctly, and can be escaped using
apostrophes, but only inside plural
(depending on strictNumberSign).
A single apostrophe only starts quoted literal
text if it immediately precedes a curly brace ({}),
or, if inside a plural, an octothorpe (#).
The parser now supports the strictNumberSign option,
since that determines whether a quoted octothorpe
is parsed as
'#'
or just#
.Since choice format isn't supported, the pipe symbol
never causes an apostrophe to start quoted literal text.
Parameters to functions may contain whitespace and
quoted special characters, but argStyle is still trimmed
and split into multiple parameters.
A new option, strictFunctionParams, activates ICU-compatible
parsing, which parses everything from the second
comma to the closing curly brace as a single "argStyleText"
parameter.
Fixes #1, fixes #2