Lexer changes #15339

Merged
merged 15 commits into from
Jul 9, 2014

@emberian emberian commented Jul 2, 2014

Mostly minor things, but rebasing them is becoming painful.

@emberian
Member Author

emberian commented Jul 4, 2014

re-r? last two commits are new. @brson @jbclements

@jbclements
Contributor

Rebasing those doc comment changes must be painful. Why not make those a separate pull request, so we can just get those out of the way?

EDIT: nvm, pretty much good to go here, pending q. about lifetimes

@emberian
Member Author

emberian commented Jul 5, 2014

Ok, all done. Last commit is new, and completes RFC 21. Still an open question about whether 011b0fb is OK.

@emberian
Member Author

emberian commented Jul 5, 2014

I'm inclined to say it's fine, since we do the same thing elsewhere. It's also wonderful shorthand to have...

@@ -362,28 +374,34 @@ impl<'a> StringReader<'a> {
}
self.bump();
}
-let ret = self.with_str_from(start_bpos, |string| {
+return self.with_str_from(start_bpos, |string| {
let tok;

tok = if ...
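
The nit above reads as a suggestion to bind `tok` directly from the `if` expression instead of declaring it first and assigning it in each branch. A minimal sketch of the two styles in present-day Rust follows; the `Tok` enum and both function names are hypothetical and are not taken from the PR.

```rust
// Hypothetical token type, used only for illustration.
#[derive(Debug, PartialEq)]
enum Tok {
    Underscore,
    Ident(String),
}

// Style flagged in the review: declare first, assign in each branch.
fn classify_deferred(s: &str) -> Tok {
    let tok;
    if s == "_" {
        tok = Tok::Underscore;
    } else {
        tok = Tok::Ident(s.to_string());
    }
    tok
}

// Suggested style: `if` is an expression, so bind its value directly.
fn classify_expr(s: &str) -> Tok {
    let tok = if s == "_" {
        Tok::Underscore
    } else {
        Tok::Ident(s.to_string())
    };
    tok
}
```

Both functions return the same value for every input; the second simply avoids the uninitialized binding.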

@huonw
Member

huonw commented Jul 6, 2014

r=me with the 3 nits.

@olsonjeffery
Contributor

+9000 very timely for rustfmt (re: lexing comments / ws)

emberian added 10 commits July 9, 2014 00:06
Rather than just dumping the id in the interner, which is useless, actually
print the interned string. Adjust the lexer logging to use Show instead of
Poly.

This is technically unsafe but interned strings are considered immortal.
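
To illustrate what "immortal" interned strings buy you, here is a minimal sketch in present-day Rust. It is not the PR's code: `Interner` and `Name` are hypothetical types, and where the commit describes its approach as technically unsafe, this sketch swaps in the safe `Box::leak` to get `&'static str` values that are never freed.

```rust
use std::collections::HashMap;

// Hypothetical handle for an interned string: just an index.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct Name(u32);

// Hypothetical interner. Every stored string is leaked on purpose,
// so it lives for the rest of the program ("immortal").
struct Interner {
    map: HashMap<&'static str, Name>,
    strings: Vec<&'static str>,
}

impl Interner {
    fn new() -> Self {
        Interner { map: HashMap::new(), strings: Vec::new() }
    }

    // Returns the existing handle if `s` was interned before.
    fn intern(&mut self, s: &str) -> Name {
        if let Some(&name) = self.map.get(s) {
            return name;
        }
        // Leak the allocation: the memory is never reclaimed.
        let leaked: &'static str = Box::leak(s.to_owned().into_boxed_str());
        let name = Name(self.strings.len() as u32);
        self.strings.push(leaked);
        self.map.insert(leaked, name);
        name
    }

    // The returned reference is not tied to `&self`, so logging code
    // can print the string itself rather than the opaque id.
    fn get(&self, name: Name) -> &'static str {
        self.strings[name.0 as usize]
    }
}
```

The trade-off is that leaked strings are never reclaimed, which is acceptable only when interned strings are expected to live as long as the process.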

This shuffles things around a bit so that LIT_CHAR and co store an Ident
which is the original, unaltered literal in the source. When creating the AST,
unescape and postprocess them.

This changes how syntax extensions can work, slightly, but otherwise poses no
visible changes. To get a useful value out of one of these tokens, call
`parse::{char_lit, byte_lit, bin_lit, str_lit}`

[breaking-change]
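
As a rough illustration of the split described in this commit message, the sketch below keeps the raw literal text in the token and unescapes it only in a later step. It is written in present-day Rust under stated assumptions: `Token`, `lex_char`, and `unescape_char_lit` are hypothetical names, and the real `parse::char_lit` may have a different signature and handle more escape forms.

```rust
// Hypothetical token: stores the literal exactly as written in the
// source (without the surrounding quotes), escapes not yet processed.
#[derive(Debug, PartialEq)]
enum Token {
    LitChar(String),
}

// Lexing step: strip the quotes, keep the raw text untouched.
fn lex_char(src: &str) -> Option<Token> {
    let inner = src.strip_prefix('\'')?.strip_suffix('\'')?;
    if inner.is_empty() {
        return None;
    }
    Some(Token::LitChar(inner.to_string()))
}

// Later step (e.g. when building the AST): turn raw text into a value.
// Only a handful of simple escapes are handled in this sketch.
fn unescape_char_lit(raw: &str) -> Option<char> {
    let mut chars = raw.chars();
    let c = match chars.next()? {
        '\\' => match chars.next()? {
            'n' => '\n',
            't' => '\t',
            'r' => '\r',
            '0' => '\0',
            '\\' => '\\',
            '\'' => '\'',
            '"' => '"',
            _ => return None,
        },
        c => c,
    };
    // Anything left over means the literal held more than one char.
    if chars.next().is_some() {
        return None;
    }
    Some(c)
}
```

A syntax extension that receives such a token sees the two source characters `\n`, and has to call the unescaping step itself to get the newline character.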

This removes a bunch of token types. Tokens now store the original, unaltered
numeric literal (that is still checked for correctness), which is parsed into
an actual number later, as needed, when creating the AST.

This can change how syntax extensions work, but otherwise poses no visible
changes.

[breaking-change]
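
The same idea for numeric literals could look roughly like the sketch below: the lexer checks that the text is well formed but stores it unchanged, and a separate function computes the value when it is needed. This is present-day Rust with hypothetical names (`Token::LitInteger`, `lex_integer`, `integer_value`, `split_radix`); suffixes and floats are left out, and none of it is taken from the PR.

```rust
// Hypothetical token: the numeric literal exactly as written.
#[derive(Debug, PartialEq)]
enum Token {
    LitInteger(String),
}

// Split an optional radix prefix from the digits.
fn split_radix(s: &str) -> (&str, u32) {
    if let Some(rest) = s.strip_prefix("0x") {
        (rest, 16)
    } else if let Some(rest) = s.strip_prefix("0o") {
        (rest, 8)
    } else if let Some(rest) = s.strip_prefix("0b") {
        (rest, 2)
    } else {
        (s, 10)
    }
}

// Lexing step: validate the digits, but keep the original text.
fn lex_integer(src: &str) -> Option<Token> {
    let (digits, radix) = split_radix(src);
    let mut seen_digit = false;
    for c in digits.chars() {
        if c == '_' {
            continue; // digit separators are allowed
        }
        if !c.is_digit(radix) {
            return None;
        }
        seen_digit = true;
    }
    if seen_digit {
        Some(Token::LitInteger(src.to_string()))
    } else {
        None
    }
}

// Later step: compute the value from the raw text, on demand.
fn integer_value(raw: &str) -> Option<u64> {
    let (digits, radix) = split_radix(raw);
    let cleaned: String = digits.chars().filter(|&c| c != '_').collect();
    u64::from_str_radix(&cleaned, radix).ok()
}
```

Because the token keeps `1_000` or `0xff` as written, tools that work on tokens can reproduce the source text exactly, while the parser still gets a number by calling the second function.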

Now, the lexer will categorize every byte in its input according to the
grammar. The parser skips over these while parsing, thus avoiding their
presence in the input to syntax extensions.
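
A toy version of this arrangement, in present-day Rust, might look like the following. The grammar here is an assumption made for illustration (ASCII whitespace, `//` line comments, and everything else as a "word"), and `Kind`, `Token`, `lex_all`, and `significant` are hypothetical names rather than the PR's API.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Whitespace,
    Comment,
    Word,
}

// Byte range [start, end) of the source plus its category.
#[derive(Debug, PartialEq)]
struct Token {
    kind: Kind,
    start: usize,
    end: usize,
}

// Lexer: every byte of `src` ends up inside exactly one token.
fn lex_all(src: &str) -> Vec<Token> {
    let bytes = src.as_bytes();
    let mut tokens = Vec::new();
    let mut pos = 0;
    while pos < bytes.len() {
        let start = pos;
        let kind = if bytes[pos].is_ascii_whitespace() {
            while pos < bytes.len() && bytes[pos].is_ascii_whitespace() {
                pos += 1;
            }
            Kind::Whitespace
        } else if bytes[pos..].starts_with(b"//") {
            // Line comment: runs up to, but not including, the newline.
            while pos < bytes.len() && bytes[pos] != b'\n' {
                pos += 1;
            }
            Kind::Comment
        } else {
            while pos < bytes.len()
                && !bytes[pos].is_ascii_whitespace()
                && !bytes[pos..].starts_with(b"//")
            {
                pos += 1;
            }
            Kind::Word
        };
        tokens.push(Token { kind, start, end: pos });
    }
    tokens
}

// Parser-side filter: drop whitespace and comments before parsing,
// so later stages never see them.
fn significant(tokens: &[Token]) -> Vec<&Token> {
    tokens.iter().filter(|t| t.kind == Kind::Word).collect()
}
```

Since the token ranges are contiguous and cover the whole input, a formatter can walk the full token list and see comments and whitespace, while the parser works only on the filtered view.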

bors added a commit that referenced this pull request Jul 9, 2014
Mostly minor things, but rebasing them is becoming painful.
@bors bors closed this Jul 9, 2014
@bors bors merged commit 69a0cdf into rust-lang:master Jul 9, 2014