Lexer changes #15339
re-r? last two commits are new. @brson @jbclements
Rebasing those doc-comment changes must be painful. Why not make those a separate pull request, so we can just get those out of the way? EDIT: nvm, pretty much good to go here, pending q. about lifetimes
Ok, all done. Last commit is new, and completes RFC 21. Still an open question about whether 011b0fb is OK.
I'm inclined to say it's fine, since we do the same thing elsewhere. It's also wonderful shorthand to have...
@@ -362,28 +374,34 @@ impl<'a> StringReader<'a> {
}
self.bump();
}
let ret = self.with_str_from(start_bpos, |string| {
return self.with_str_from(start_bpos, |string| {
let tok;
tok = if ...
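For readers skimming the thread, the nit is about binding the result of the `if` expression directly instead of declaring `let tok;` and assigning it in each branch. A minimal sketch of the two shapes, assuming a hypothetical `Token` enum and hypothetical `classify_*` functions that are not taken from the PR:

```rust
// Hypothetical token type for illustration; not the PR's `token::Token`.
#[derive(Debug, PartialEq)]
enum Token {
    Underscore,
    Ident(String),
}

// Deferred initialization: declare the binding, then assign in every branch.
fn classify_deferred(string: &str) -> Token {
    let tok;
    if string == "_" {
        tok = Token::Underscore;
    } else {
        tok = Token::Ident(string.to_string());
    }
    tok
}

// Expression form: the `if` produces the value, so one `let` is enough.
fn classify_expr(string: &str) -> Token {
    let tok = if string == "_" {
        Token::Underscore
    } else {
        Token::Ident(string.to_string())
    };
    tok
}
```

Both functions return the same token for the same input; the second form just makes it harder to forget a branch.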
r=me with the 3 nits.
+9000 very timely for rustfmt (re: lexing comments / ws)
Now that the lexer is more robust, these tests don't need to be in separate files. Yay!
Rather than just dumping the id in the interner, which is useless, actually print the interned string. Adjust the lexer logging to use Show instead of Poly.
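To illustrate the difference between printing an interner id and printing the interned string, here is a rough sketch. The `Interner`, `Name`, and `Resolved` types are hypothetical stand-ins written for this example; the compiler's actual interner and its formatting traits are organised differently.

```rust
use std::collections::HashMap;
use std::fmt;

// Hypothetical handle into the interner: just an index.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Name(u32);

// Hypothetical minimal interner mapping strings to ids and back.
struct Interner {
    ids: HashMap<String, u32>,
    strings: Vec<String>,
}

impl Interner {
    fn new() -> Interner {
        Interner { ids: HashMap::new(), strings: Vec::new() }
    }

    // Returns the existing id for `s`, or allocates a new one.
    fn intern(&mut self, s: &str) -> Name {
        if let Some(&id) = self.ids.get(s) {
            return Name(id);
        }
        let id = self.strings.len() as u32;
        self.strings.push(s.to_string());
        self.ids.insert(s.to_string(), id);
        Name(id)
    }

    fn get(&self, name: Name) -> &str {
        &self.strings[name.0 as usize]
    }
}

// Pairs a name with its interner so that formatting shows the string.
struct Resolved<'a> {
    name: Name,
    interner: &'a Interner,
}

impl<'a> fmt::Display for Resolved<'a> {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{}", self.interner.get(self.name))
    }
}
```

Formatting a bare `Name` with `{:?}` only shows the index, which says little in a log; formatting a `Resolved` shows the text that was interned.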
This is technically unsafe but interned strings are considered immortal.
This shuffles things around a bit so that LIT_CHAR and co store an Ident which is the original, unaltered literal in the source. When creating the AST, unescape and postprocess them. This changes how syntax extensions can work, slightly, but otherwise poses no visible changes. To get a useful value out of one of these tokens, call `parse::{char_lit, byte_lit, bin_lit, str_lit}` [breaking-change]
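As a rough picture of the split described here, the sketch below keeps the literal's source text in the token and only unescapes it on request. The `Token` enum, `unescape_char`, and `char_value` are hypothetical names for this example; they play a role similar to the `parse::char_lit` helper mentioned above but are not its implementation, and only a handful of simple escapes are handled.

```rust
// Hypothetical token: stores the text between the quotes, escapes intact.
#[derive(Debug, PartialEq)]
enum Token {
    LitChar(String),
}

// Hypothetical helper: turns the raw text of a char literal into its value.
// Returns None for empty input, unknown escapes, or trailing characters.
fn unescape_char(raw: &str) -> Option<char> {
    let mut chars = raw.chars();
    let first = chars.next()?;
    let value = if first == '\\' {
        match chars.next()? {
            'n' => '\n',
            't' => '\t',
            'r' => '\r',
            '0' => '\0',
            '\\' => '\\',
            '\'' => '\'',
            _ => return None,
        }
    } else {
        first
    };
    if chars.next().is_some() {
        None
    } else {
        Some(value)
    }
}

// What AST construction (or a syntax extension) would call to get a value.
fn char_value(tok: &Token) -> Option<char> {
    match tok {
        Token::LitChar(raw) => unescape_char(raw),
    }
}
```

A token built from the source text `'\n'` would carry the two characters `\` and `n`; the newline character itself only appears once `char_value` is called.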
This removes a bunch of token types. Tokens now store the original, unaltered numeric literal (that is still checked for correctness), which is parsed into an actual number later, as needed, when creating the AST. This can change how syntax extensions work, but otherwise poses no visible changes. [breaking-change]
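The same idea for numbers can be sketched as a token that carries the literal's text plus a separate step that computes the value when it is needed. `parse_int_lit` below is a hypothetical helper written for this example, not a function from the PR; it assumes unsuffixed integer literals with an optional `0x`, `0o`, or `0b` prefix and `_` separators.

```rust
// Hypothetical helper: parse the raw text of an unsuffixed integer literal.
// Returns None if no digits remain, a digit is invalid for the radix,
// or the value does not fit in a u64.
fn parse_int_lit(raw: &str) -> Option<u64> {
    // Pick the radix from the prefix, if any.
    let (body, radix) = if let Some(rest) = raw.strip_prefix("0x") {
        (rest, 16)
    } else if let Some(rest) = raw.strip_prefix("0o") {
        (rest, 8)
    } else if let Some(rest) = raw.strip_prefix("0b") {
        (rest, 2)
    } else {
        (raw, 10)
    };
    // Underscores are separators and carry no value.
    let digits: String = body.chars().filter(|&c| c != '_').collect();
    if digits.is_empty() || !digits.chars().all(|c| c.is_digit(radix)) {
        return None;
    }
    u64::from_str_radix(&digits, radix).ok()
}
```

Because the token keeps the original text, a consumer that only wants to reproduce the source (a formatter, say) never has to call the parsing step at all.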
The lexer now categorizes every byte of its input according to the grammar, so whitespace and comments become tokens too. The parser skips over those tokens while parsing, which keeps them out of the input to syntax extensions.
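A toy version of that arrangement, assuming a made-up three-kind grammar (whitespace, `//` line comments, and everything else as a "word"): the `Token` enum, `lex`, and `significant` are hypothetical and far simpler than the real `StringReader`, but they show the two layers.

```rust
// Hypothetical token kinds; each one keeps its exact source text.
#[derive(Debug, PartialEq, Clone)]
enum Token {
    Whitespace(String),
    Comment(String),
    Word(String),
}

impl Token {
    fn text(&self) -> &str {
        match self {
            Token::Whitespace(s) | Token::Comment(s) | Token::Word(s) => s.as_str(),
        }
    }
}

// Lexer layer: every byte of `src` ends up in exactly one token.
fn lex(src: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut rest = src;
    while let Some(first) = rest.chars().next() {
        // Length in bytes of the next token.
        let len = if first.is_whitespace() {
            rest.find(|c: char| !c.is_whitespace()).unwrap_or(rest.len())
        } else if rest.starts_with("//") {
            rest.find('\n').unwrap_or(rest.len())
        } else {
            rest.find(|c: char| c.is_whitespace()).unwrap_or(rest.len())
        };
        let (text, tail) = rest.split_at(len);
        let text = text.to_string();
        tokens.push(if first.is_whitespace() {
            Token::Whitespace(text)
        } else if text.starts_with("//") {
            Token::Comment(text)
        } else {
            Token::Word(text)
        });
        rest = tail;
    }
    tokens
}

// Parser-facing layer: drop whitespace and comments before parsing.
fn significant(tokens: &[Token]) -> Vec<&Token> {
    tokens
        .iter()
        .filter(|t| match t {
            Token::Whitespace(_) | Token::Comment(_) => false,
            Token::Word(_) => true,
        })
        .collect()
}
```

Concatenating the text of every token from `lex` gives back the original input, which is the property a tool like a formatter would rely on, while `significant` gives the parser the stream it saw before.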
Mostly minor things that are becoming painful to rebase.