Clean up and consolidate the lexical specification. #567

Open
3 of 6 tasks
ehuss opened this issue Apr 21, 2019 · 0 comments
Labels
A-grammar Area: Syntax and parsing A-lexer Area: Lexical specification

ehuss commented Apr 21, 2019

The lexical specification needs some cleanup and organization. Some things I can think of:

  • There should be an overall introduction and overview of the lexical structure.
  • Paths are in the Lexical chapter, but I don't think they should be. See #937 ("Start documenting name resolution").
  • The UTF8BOM/SHEBANG definition is floating in a chapter outside of the Lexical chapter. I think it is relevant to lexing, so it should be incorporated into the Lexical chapter somehow. (Not sure how; things probably need to be rearranged a little.) See #1459 ("Input format").
  • I think there should be an appendix consolidating all the Lexer rules blocks. This should be generated automatically.
  • The "input format" subchapter is almost completely useless, and could be moved somewhere else. See #1459 ("Input format").
  • There should be a note about token ambiguity (this can be relatively brief, but should be mentioned). How the ambiguity is handled depends on the lexer/parser implementation: rustc splits tokens into smaller parts, whereas the proc_macro parser only issues the smaller tokens and uses the Spacing to determine whether they should be combined later on. The tokens that I'm aware of that cause this issue are:
Token    Possibly Split Into
+=       + =
&&       & &
||       | |
<<       < <
<-       < -
>>       > >
>>=      > >=
>=       > =
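
To make the two strategies above concrete, here is a minimal sketch of each. It is a toy model, not how rustc or proc_macro is actually implemented: `split_token` and `punct_stream` are hypothetical helpers invented for illustration, the local `Spacing` enum is only a stand-in for `proc_macro::Spacing` (so the sketch runs outside a procedural macro), and `punct_stream` assumes its input contains nothing but punctuation and whitespace.

```rust
/// Stand-in for `proc_macro::Spacing`, defined locally for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Spacing {
    /// Another punctuation character follows immediately.
    Joint,
    /// Followed by whitespace or the end of input.
    Alone,
}

/// Hypothetical model of the "split" strategy: break a compound token
/// into the two parts listed in the table above, or return `None` if the
/// token is not in the table.
fn split_token(tok: &str) -> Option<(&'static str, &'static str)> {
    match tok {
        "+=" => Some(("+", "=")),
        "&&" => Some(("&", "&")),
        "||" => Some(("|", "|")),
        "<<" => Some(("<", "<")),
        "<-" => Some(("<", "-")),
        ">>" => Some((">", ">")),
        ">>=" => Some((">", ">=")),
        ">=" => Some((">", "=")),
        _ => None,
    }
}

/// Hypothetical model of the "glue" strategy: emit one punctuation
/// character at a time, tagged with whether the next character follows
/// immediately, so a later stage can decide whether to combine them.
/// Assumes `src` holds only punctuation and whitespace.
fn punct_stream(src: &str) -> Vec<(char, Spacing)> {
    let chars: Vec<char> = src.chars().collect();
    let mut out = Vec::new();
    for (i, &c) in chars.iter().enumerate() {
        if c.is_whitespace() {
            continue;
        }
        let joint = matches!(chars.get(i + 1), Some(n) if n.is_ascii_punctuation());
        out.push((c, if joint { Spacing::Joint } else { Spacing::Alone }));
    }
    out
}
```

For example, `split_token(">>=")` yields `(">", ">=")`, while `punct_stream(">>")` tags the first `>` as `Joint` and the second as `Alone`. A familiar place where the split matters in real Rust is nested generics such as `Vec<Vec<u8>>`, where the closing `>>` has to be read as two separate `>` tokens.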

See also:
rust-lang/wg-grammar#3
https://internals.rust-lang.org/t/pre-pre-rfc-canonical-lexer-specification/4099
