Don't normalize CRLF to LF in Junk #193

Merged: 1 commit, Oct 30, 2018

@stasm stasm commented Oct 22, 2018

Fix #184.

This doesn't change the parsing story for unsupported line endings (\r, \u2028, \u2029). Right now, that story is largely undefined.

line_end)
either(
string("\u000A"),
string("\u000D\u000A"),
Contributor Author

I wonder if it would make sense to add other, technically unsupported, line endings here. In particular, I'm thinking of \r, \u2028 and \u2029, which are not matched by . in the JavaScript regex implementation.

We don't need to include all line terminators recognized by Unicode, but adding these extra three would make sure the reference parser can produce meaningful junk for files with these endings. Otherwise, the parser doesn't have any way to skip over the EOL in the junk_line production, and the parsing of the whole resource ends immediately.
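The behavior of . described above can be checked directly. The following is a minimal sketch, not code from the reference parser: matchedPrefix is a hypothetical helper introduced only for illustration. It shows that an unflagged JavaScript regex stops at each of the four line terminators, including the three unsupported ones.

```javascript
// Hypothetical helper for illustration; not part of the reference parser.
// Returns the prefix of `input` consumed by /.*/ anchored at the start.
function matchedPrefix(input) {
  return /^.*/.exec(input)[0];
}

// Without the `s` (dotAll) flag, `.` excludes \n, \r, \u2028 and \u2029,
// so every sample below is cut off right after "junk".
const samples = [
  "junk\nrest",
  "junk\rrest",
  "junk\u2028rest",
  "junk\u2029rest",
];

for (const sample of samples) {
  console.log(JSON.stringify(matchedPrefix(sample))); // "junk" each time
}
```

If line_end only accepts \n and \r\n, nothing consumes the character at which the match stopped in the last three cases, which is why parsing of the resource would end there.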

Contributor

Would regex(/[^\n]*/) be closer to our intent? And not change line_end at all?
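To make the difference concrete, here is a rough sketch that assumes the body of a junk line is matched by a single regex. The names withDot, withNegatedLF and junkLineBody are hypothetical and do not reproduce the reference parser's combinators.

```javascript
// Hypothetical stand-ins for a junk line body; illustration only.
const withDot = /^.*/;           // stops before \n, \r, \u2028 and \u2029
const withNegatedLF = /^[^\n]*/; // stops only before \n

// Returns the part of `input` that `pattern` consumes from the start.
function junkLineBody(pattern, input) {
  return pattern.exec(input)[0];
}

const crlfInput = "bad entry\r\nnext = ok\r\n";

console.log(JSON.stringify(junkLineBody(withDot, crlfInput)));
// "bad entry"
console.log(JSON.stringify(junkLineBody(withNegatedLF, crlfInput)));
// "bad entry\r"
```

With the negated class, the \r stays inside the junk content and the following \n still terminates the line, so a CRLF sequence passes through unchanged and line_end itself does not need to grow.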

Contributor Author

@stasm stasm Oct 29, 2018

This is a really good idea! Thanks.

I had some trouble making it work when I tested it. I was seeing additional \ns where I was sure there shouldn't be any. And then I realized that I needed to use echo -n for my CLI testing...

@stasm stasm requested a review from Pike October 22, 2018 14:13

stasm commented Oct 29, 2018

@Pike This is now ready for another round of review. Thanks!

@stasm stasm merged commit 561e87f into projectfluent:master Oct 30, 2018
@stasm stasm deleted the junk-crlf branch October 30, 2018 11:15