Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation of RFC 2151, Raw Identifiers #48942

Merged
merged 7 commits into from
Mar 23, 2018
Merged

Conversation

Lymia
Copy link
Contributor

@Lymia Lymia commented Mar 11, 2018

See issue #48589.

This implements most of the actual compiler part, though diagnostics and documentation are somewhat lacking. Namely, rustdoc and errors do not yet refer to items with keyword names using raw identifiers.

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @pnkfelix (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@petrochenkov petrochenkov self-assigned this Mar 11, 2018
@pietroalbini pietroalbini added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Mar 12, 2018
@petrochenkov petrochenkov added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 12, 2018
@Lymia Lymia force-pushed the master branch 3 times, most recently from 3d9ecae to d644e65 Compare March 14, 2018 08:35
@Lymia Lymia changed the title [WIP] Implementation of RFC 2151, Raw Identifiers Implementation of RFC 2151, Raw Identifiers Mar 14, 2018
@bors
Copy link
Contributor

bors commented Mar 14, 2018

☔ The latest upstream changes (presumably #48811) made this pull request unmergeable. Please resolve the merge conflicts.

@Lymia
Copy link
Contributor Author

Lymia commented Mar 15, 2018

I don't think there's much else worth adding to this, unless it'd be better to also modify rustdoc in this PR? I don't even know where to begin with diagonstics, so, better to leave that to someone else.

@petrochenkov petrochenkov removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Mar 15, 2018
@bors
Copy link
Contributor

bors commented Mar 16, 2018

☔ The latest upstream changes (presumably #49051) made this pull request unmergeable. Please resolve the merge conflicts.

@bors
Copy link
Contributor

bors commented Mar 16, 2018

☔ The latest upstream changes (presumably #48097) made this pull request unmergeable. Please resolve the merge conflicts.

@Lymia
Copy link
Contributor Author

Lymia commented Mar 17, 2018

Help with this conflict? I'm not sure what's the right thing to do with this.

@petrochenkov
Copy link
Contributor

petrochenkov commented Mar 17, 2018

@Lymia

# Reset to the commit before merge
git reset --hard d644e652e65eea4e104c4df63a56781bc5648803
# Add rust-lang/rust as a remote in case this wasn't done before
git remote add upstream https://github.com/rust-lang/rust.git
# Fetch everything from rust-lang/rust
git fetch --all
# Rebase your changes on top of rust-lang/rust master
# In case of conflicts use `git rebase --continue` after resolving them
git rebase upstream/master
# Update this PR
git push -f origin master

@@ -455,6 +455,9 @@ declare_features! (

// Parentheses in patterns
(active, pattern_parentheses, "1.26.0", None, None),

// Raw identifiers allowing items with keyword names
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

allowing keyword names to be used in arbitrary identifier positions, not only item names

@@ -793,6 +793,9 @@ impl<'a> Parser<'a> {
return Err(err);
}
}
if is_raw {
self.sess.raw_identifier_spans.borrow_mut().push(self.span);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely sure all identifiers go through parse_ident_common, it's probably better to do in the lexer when an identifier is created.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh huh. I didn't catch that ParseSess was available in the lexer too. Definitely a better place for it.

@@ -273,17 +310,32 @@ impl Token {
}
}

pub fn ident(&self) -> Option<ast::Ident> {
fn ident_common(&self, allow_raw: bool) -> Option<ast::Ident> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be just pub fn ident(&self) -> Option<(ast::Ident, /* is_raw */ bool)> IMO.
It will avoid introducing all those many little helper functions.
(The fn nonraw_ident vs fn ident distinction also looks error-prone.)

@petrochenkov
Copy link
Contributor

20d0962bfe851db9ce3e19da0c7c7674a091c266 updates the LLVM submodule unintendedly.

@@ -203,6 +233,11 @@ impl Token {
Token::Interpolated(Lrc::new((nt, LazyTokenStream::new())))
}

/// Recovers a `Token` from an `ast::Ident`. This creates a raw identifier if necessary.
pub fn from_ast_ident(ident: ast::Ident) -> Token {
Ident(ident, is_reserved_ident(ident))
Copy link
Contributor

@petrochenkov petrochenkov Mar 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function and all its uses are very suspicious.
It's used when we previously lost the "rawness" property and then we are trying to guess it - if it's a keyword, then it was a raw identifier otherwise not. We shouldn't do this guessing, we need to keep the bool instead or use the "r#{}" string encoding like in proc macros if keeping bool is not possible for some reason.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This problem exists when we go from tokens to AST (including MetaItem) and then have to back to tokens.

Copy link
Contributor Author

@Lymia Lymia Mar 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where in the AST should this information be encoded? MetaItem can be handled with a new struct field, but in the other use of this method was impl ToTokens for ast::Ident (how is ToToken used anyway?). In the issue, you mentioned that ast::Ident would be a bad place for an is_raw parameter, because of how size could be an issue for it.

Putting it in the Symbol doesn't work in this case, because Symbol comparison is simply an integer comparison, and r#foo should match foo.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to leave this as is for now, it's possible that we won't have this roundtrip "Tokens -> AST -> Tokens" in the future and will just keep the original tokens.
I need to think about this a bit more.

@@ -82,10 +82,10 @@ pub fn expand_register_diagnostic<'cx>(ecx: &'cx mut ExtCtxt,
token_tree.get(1),
token_tree.get(2)
) {
(1, Some(&TokenTree::Token(_, token::Ident(ref code))), None, None) => {
(1, Some(&TokenTree::Token(_, token::Ident(ref code, false))), None, None) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the rawness property is not relevant to diagnostics/plugin.rs, these falses should probably be _.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a huge deal either way? These are internal macros, aren't they?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a huge deal either way?

No :)

return Ok(self.with_str_from(start, |string| {
// FIXME: perform NFKC normalization here. (Issue #2253)
if string == "_" {
token::Underscore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens with r#_? It should be an error.
Could you add a test for it?

if is_raw_ident && (
ident.name == keywords::SelfValue.name() ||
ident.name == keywords::SelfType.name() ||
ident.name == keywords::Super.name()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The full list is Self, self, super, extern and crate, see fn is_path_segment_keyword.

(Well, there's also $crate, but it's never produced by lexer.)

@petrochenkov
Copy link
Contributor

petrochenkov commented Mar 17, 2018

Questions from #48589:

I've found some other unexpected places where raw identifiers might show up while implementing this

Raw identifiers can appear in any context normal identifiers can appear. It looks like this PR already implements this behavior.

Given this macro definition, which branch should test_macro!(r#a) match:

macro_rules! test_macro {
    (a) => { ... };
    (r#a) => { ... };
}

This is an interesting question.
Macros accept DSLs in which language keywords don't play any special role, and, on the other hand, a macro can invent its own "keywords" treated differently from other identifiers.
So the question is whether r# should provide an opt-out from "keywords" invented by macros (aka idents on the left side of the macro) or not. I think that it should.
I case of the test_macro above a is such macro-specific keyword treated specially by DSL of that macro, and r#a provides an opt-out.
I see the implementation in this PR already implements this behavior. Nice!

@petrochenkov petrochenkov added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 17, 2018
@@ -452,6 +452,9 @@ declare_features! (

// `use path as _;` and `extern crate c as _;`
(active, underscore_imports, "1.26.0", Some(48216), None),

// Raw identifiers allowing keyword names to be used
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sentence seems unfinished.

@petrochenkov
Copy link
Contributor

r=me after addressing the new comments

@@ -268,7 +268,7 @@ impl<'a> base::Resolver for Resolver<'a> {
if k > 0 {
tokens.push(TokenTree::Token(path.span, Token::ModSep).into());
}
let tok = Token::Ident(segment.identifier);
let tok = Token::from_ast_ident(segment.identifier);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this in from_ast_ident, since it seems this is the best behavior in all these cases.

@Lymia
Copy link
Contributor Author

Lymia commented Mar 22, 2018

@bors r=petrochenkov

@bors
Copy link
Contributor

bors commented Mar 22, 2018

@Lymia: 🔑 Insufficient privileges: Not in reviewers

@Lymia
Copy link
Contributor Author

Lymia commented Mar 22, 2018

@petrochenkov I'm not sure what you want me to do here. :D

I clearly can't do that.

@petrochenkov
Copy link
Contributor

@bors r+
Thanks!

@bors
Copy link
Contributor

bors commented Mar 22, 2018

📌 Commit 57f9c4d has been approved by petrochenkov

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 22, 2018
bors added a commit that referenced this pull request Mar 23, 2018
@alexcrichton
Copy link
Member

@bors: r-

This unfortunately failed in a rollup, but fear not! I'm preemptively r-'ing this in case the rollup fails and we have to abandon it, but otherwise I've already fixed the issue.

If the rollup lands it's best to not rebase/push new commits here so this PR will automatically get closed, but if the rollup ends up getting closed then those fixes can be folded into this PR and we can re-r+

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Mar 23, 2018
@alexcrichton alexcrichton merged commit 57f9c4d into rust-lang:master Mar 23, 2018
@Lymia
Copy link
Contributor Author

Lymia commented Mar 24, 2018

What was the issue that ignore-pretty fixed, and how could I have detected that beforehand? For future reference.

@alexcrichton
Copy link
Member

Oh the builders run more tests than the default ones locally (typically those test suites fail very rarely), but I think this one could in theory have been detected by ./x.py test src/test/run-pass/pretty

@petrochenkov
Copy link
Contributor

@Lymia
As alexcrichton said, some tests are not run by x.py test by default because they are not super important and rarely break (but they are still run on PR merge). This includes pretty-printing tests.
#49369 fixes the pretty-printing issue caught by those tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants