Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

parser: Ensure that all nonterminals have tokens after parsing #84995

Merged
merged 1 commit into from
Jun 6, 2021

Conversation

petrochenkov
Copy link
Contributor

@petrochenkov petrochenkov commented May 6, 2021

parse_nonterminal should always result in something with tokens.

This requirement wasn't satisfied in two cases:

  • stmt nonterminal with expression statements (e.g. 0, or {}, or path + 1) because fn parse_stmt_without_recovery forgot to propagate force_collect in some cases.
  • expr nonterminal with expressions with built-in attributes (e.g. #[allow(warnings)] 0) due to an incorrect optimization in fn parse_expr_force_collect, it assumed that all expressions starting with # have their tokens collected during parsing, but that's not true if all the attributes on that expression are built-in and inert.

(Discovered when trying to implement eager cfg expansion for all attributes #83824 (comment).)

r? @Aaron1011

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 6, 2021
@@ -342,16 +342,10 @@ impl<'a> Parser<'a> {

// If we support tokens at all
if let Some(target_tokens) = ret.tokens_mut() {
if let Some(target_tokens) = target_tokens {
assert!(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you remove this assertion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's hit in cases like #[cfg_eval] #[rustc_dummy] 0.

Tokens for #[rustc_dummy] 0 are collected twice due to the new behavior of parse_expr_force_collect, but we still get to here despite the

        if !self.capture_cfg && matches!(ret.tokens_mut(), None | Some(Some(_))) {
            return Ok(ret);
        }

above because in case of cfg_eval the self.capture_cfg condition is true.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@Aaron1011 Aaron1011 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 6, 2021
@petrochenkov petrochenkov added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 6, 2021
@@ -0,0 +1,15 @@
// check-pass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file compiles successfully on the latest nightly. I think you need to force recollection of the nonterminals by a proc-macro in order to trigger the panic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the test assumes existence of the missing tokens panic added in this PR, it won't fail without it.

@Aaron1011
Copy link
Member

Let's see what the performance impact of disabling the expr optimization looks like:

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 6, 2021
@bors
Copy link
Contributor

bors commented May 6, 2021

⌛ Trying commit 112094c5215d6487cda23c05e3b804084ffc3dc1 with merge d0a79681ee7f5212b884bbc274a705e0e1e01d40...

@bors
Copy link
Contributor

bors commented May 6, 2021

☀️ Try build successful - checks-actions
Build commit: d0a79681ee7f5212b884bbc274a705e0e1e01d40 (d0a79681ee7f5212b884bbc274a705e0e1e01d40)

@rust-timer
Copy link
Collaborator

Queued d0a79681ee7f5212b884bbc274a705e0e1e01d40 with parent d44f647, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (d0a79681ee7f5212b884bbc274a705e0e1e01d40): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 7, 2021
@petrochenkov
Copy link
Contributor Author

Minor regression.
Not sure why exactly, stmt matchers and cfg_eval are exceedingly rare, and expressions with attributes in expr matchers should be rare as well.

@@ -85,21 +89,22 @@ impl<'a> Parser<'a> {
self.mk_stmt(lo, StmtKind::Empty)
} else if self.token != token::CloseDelim(token::Brace) {
// Remainder are line-expr stmts.
let e = self.parse_expr_res(Restrictions::STMT_EXPR, Some(attrs))?;
let e = if force_collect == ForceCollect::Yes {
self.collect_tokens_no_attrs(|this| {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a call to collect_tokens_trailing_token, with attrs passed in (since we do have outer attributes).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assumed this case is analogous to parse_expr_force_collect (https://github.com/rust-lang/rust/pull/84995/files#r629329546) so we can use collect_tokens_no_attrs.

Also, I don't think we can simply pass attrs here because they don't necessary apply to the whole statement, they can apply only to some sub-expression like in #[attr] 1 + 2 (grouped as (#[attr] 1) + 2).

// will never try to collect tokens if we don't have outer attributes.
self.collect_tokens_no_attrs(|this| this.parse_expr())
}
self.collect_tokens_no_attrs(|this| this.parse_expr())
Copy link
Member

@Aaron1011 Aaron1011 May 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally, this would be incorrect. collect_tokens_no_attrs was only supposed to be used when the target was guaranteed to not have any outer attributes - if outer attributes are present, we need to capture their tokens separately so that we can create a proper ReplaceRange.

However, we only actually need to know the attribute start position when we're in capture_cfg mode, which never happens when we're parsing nonterminals (we flatten all nonterminals before entering capture_cfg mode). Can you add a debug_assert!(!self.capture_cfg) to parse_expr_force_collect to make sure this method doesn't get misused?

I may do some refactoring in a follow-up PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assert will fire because parse_expr_force_collect is indeed used with capture_cfg == true.
Here - https://github.com/rust-lang/rust/blob/master/compiler/rustc_builtin_macros/src/cfg_eval.rs#L194

(The result of cfg_eval should have tokens collected, especially with #85073.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assumed that wrapping into collect_tokens_no_attrs is ok because there are two cases:

  • There are active attributes, then token collection will be called once again in parse_expr, will set tokens for the returned expression, so the tokens collected by collect_tokens_no_attrs will simply be thrown away.
    (I'm not sure what happens with range replacement in this case.)
  • There are no active attributes, then collect_tokens_no_attrs is actually doing the right thing and it will set tokens for the returned expression. (I'm again not sure what will happen with range replacement in this case.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will happen to do the right thing with replace ranges - we'll end up creating a ReplaceRange covering the entire expression (including outer attributes), as expected. Since we'll throw away the new tokens, the final captured tokens will not include the outer attributes, which will allow us to correctly perform cfg-expansion.

I forgot about the fact that we use force collection at the 'top level' of cfg_eval. It's unfortunate that we'll lose the asertion, but I don't think keeping it is worth the extra complexity of keeping track of the 'top-level' status on top of everything else.

@Aaron1011 Aaron1011 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 8, 2021
@petrochenkov petrochenkov added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 10, 2021
@Aaron1011
Copy link
Member

The interaction between nonterminal parsing, token collection, and attribute handling is quite messy (this is an issue predating this PR). In particular, I find the 'double-capturing' behavior to be difficult to reason about, which is why I had tried to avoid it before now.

I'd like to try to clean it up in another PR, but I don't think that needs to block this PR.

r=me with the test added.

}

fn main() {
mac!(expr #[allow(warnings)] 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized that we allow attributes on 'nonterminal' expressions, despite this normally being unstable. This may have gone unnoticed due to the fact that actually using the captured expression will trigger the feature gate. However, if it's thrown away (as in this example), it compiles on stable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feature-gating for expression/statement attributes has many holes.
I tried fixing them but it causes breakage in practice unfortunately.
So I thought perhaps it's better to work in the direction of the stabilization instead, which should cause less breakage.

@Aaron1011 Aaron1011 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 10, 2021
@JohnCSimon JohnCSimon added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 31, 2021
@@ -0,0 +1,540 @@
PRINT-DERIVE INPUT (DISPLAY): enum E { V = { let _ = #[allow(warnings)] 0 ; 0 }, }
PRINT-DERIVE DEEP-RE-COLLECTED (DISPLAY): enum E { V = { let _ = #[allow(warnings)] #[allow(warnings)] 0 ; 0 }, }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Aaron1011
The attribute in $expr #[allow(warnings)] 0 is duplicated for some reason.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The duplication reproduces with

macro mac($expr:expr) {
    print_bang!($expr);
}

fn main() {
    mac!(#[allow(warnings)] 0);
}

as well.

At least this is not a regression, previously these examples resulted in ICEs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made #86055 for tracking this.

@petrochenkov
Copy link
Contributor Author

@bors r=Aaron1011

@bors
Copy link
Contributor

bors commented Jun 6, 2021

📌 Commit cbdfa1e has been approved by Aaron1011

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 6, 2021
@bors
Copy link
Contributor

bors commented Jun 6, 2021

⌛ Testing commit cbdfa1e with merge 86b0baf...

@bors
Copy link
Contributor

bors commented Jun 6, 2021

☀️ Test successful - checks-actions
Approved by: Aaron1011
Pushing 86b0baf to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jun 6, 2021
@bors bors merged commit 86b0baf into rust-lang:master Jun 6, 2021
@rustbot rustbot added this to the 1.54.0 milestone Jun 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants