parser: Ensure that all nonterminals have tokens after parsing #84995

petrochenkov · 2021-05-06T16:38:58Z

parse_nonterminal should always result in something with tokens.

This requirement wasn't satisfied in two cases:

stmt nonterminal with expression statements (e.g. 0, or {}, or path + 1) because fn parse_stmt_without_recovery forgot to propagate force_collect in some cases.
expr nonterminal with expressions with built-in attributes (e.g. #[allow(warnings)] 0) due to an incorrect optimization in fn parse_expr_force_collect, it assumed that all expressions starting with # have their tokens collected during parsing, but that's not true if all the attributes on that expression are built-in and inert.

(Discovered when trying to implement eager cfg expansion for all attributes #83824 (comment).)

Aaron1011 · 2021-05-06T16:42:39Z

compiler/rustc_parse/src/parser/attr_wrapper.rs

@@ -342,16 +342,10 @@ impl<'a> Parser<'a> {

        // If we support tokens at all
        if let Some(target_tokens) = ret.tokens_mut() {
-            if let Some(target_tokens) = target_tokens {
-                assert!(


Why did you remove this assertion?

It's hit in cases like #[cfg_eval] #[rustc_dummy] 0.

Tokens for #[rustc_dummy] 0 are collected twice due to the new behavior of parse_expr_force_collect, but we still get to here despite the

if !self.capture_cfg && matches!(ret.tokens_mut(), None | Some(Some(_))) { return Ok(ret); }

above because in case of cfg_eval the self.capture_cfg condition is true.

src/test/ui/macros/expr-stmt-nonterminal-tokens.rs

compiler/rustc_parse/src/parser/nonterminal.rs

Aaron1011 · 2021-05-06T18:47:10Z

src/test/ui/macros/expr-stmt-nonterminal-tokens.rs

@@ -0,0 +1,15 @@
+// check-pass


This file compiles successfully on the latest nightly. I think you need to force recollection of the nonterminals by a proc-macro in order to trigger the panic.

Yes, the test assumes existence of the missing tokens panic added in this PR, it won't fail without it.

Aaron1011 · 2021-05-06T19:08:12Z

Let's see what the performance impact of disabling the expr optimization looks like:

@bors try @rust-timer queue

rust-timer · 2021-05-06T19:08:13Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2021-05-06T19:08:21Z

⌛ Trying commit 112094c5215d6487cda23c05e3b804084ffc3dc1 with merge d0a79681ee7f5212b884bbc274a705e0e1e01d40...

bors · 2021-05-06T19:56:44Z

☀️ Try build successful - checks-actions
Build commit: d0a79681ee7f5212b884bbc274a705e0e1e01d40 (d0a79681ee7f5212b884bbc274a705e0e1e01d40)

rust-timer · 2021-05-06T19:56:47Z

Queued d0a79681ee7f5212b884bbc274a705e0e1e01d40 with parent d44f647, future comparison URL.

rust-timer · 2021-05-07T03:32:18Z

Finished benchmarking try commit (d0a79681ee7f5212b884bbc274a705e0e1e01d40): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

petrochenkov · 2021-05-07T09:56:44Z

Minor regression.
Not sure why exactly, stmt matchers and cfg_eval are exceedingly rare, and expressions with attributes in expr matchers should be rare as well.

Aaron1011 · 2021-05-08T22:26:51Z

compiler/rustc_parse/src/parser/stmt.rs

@@ -85,21 +89,22 @@ impl<'a> Parser<'a> {
            self.mk_stmt(lo, StmtKind::Empty)
        } else if self.token != token::CloseDelim(token::Brace) {
            // Remainder are line-expr stmts.
-            let e = self.parse_expr_res(Restrictions::STMT_EXPR, Some(attrs))?;
+            let e = if force_collect == ForceCollect::Yes {
+                self.collect_tokens_no_attrs(|this| {


This should be a call to collect_tokens_trailing_token, with attrs passed in (since we do have outer attributes).

I assumed this case is analogous to parse_expr_force_collect (https://github.com/rust-lang/rust/pull/84995/files#r629329546) so we can use collect_tokens_no_attrs.

Also, I don't think we can simply pass attrs here because they don't necessary apply to the whole statement, they can apply only to some sub-expression like in #[attr] 1 + 2 (grouped as (#[attr] 1) + 2).

Aaron1011 · 2021-05-08T22:28:47Z

compiler/rustc_parse/src/parser/expr.rs

-            // will never try to collect tokens if we don't have outer attributes.
-            self.collect_tokens_no_attrs(|this| this.parse_expr())
-        }
+        self.collect_tokens_no_attrs(|this| this.parse_expr())


Normally, this would be incorrect. collect_tokens_no_attrs was only supposed to be used when the target was guaranteed to not have any outer attributes - if outer attributes are present, we need to capture their tokens separately so that we can create a proper ReplaceRange.

However, we only actually need to know the attribute start position when we're in capture_cfg mode, which never happens when we're parsing nonterminals (we flatten all nonterminals before entering capture_cfg mode). Can you add a debug_assert!(!self.capture_cfg) to parse_expr_force_collect to make sure this method doesn't get misused?

I may do some refactoring in a follow-up PR.

The assert will fire because parse_expr_force_collect is indeed used with capture_cfg == true.
Here - https://github.com/rust-lang/rust/blob/master/compiler/rustc_builtin_macros/src/cfg_eval.rs#L194

(The result of cfg_eval should have tokens collected, especially with #85073.)

I assumed that wrapping into collect_tokens_no_attrs is ok because there are two cases:

There are active attributes, then token collection will be called once again in parse_expr, will set tokens for the returned expression, so the tokens collected by collect_tokens_no_attrs will simply be thrown away.
(I'm not sure what happens with range replacement in this case.)

There are no active attributes, then collect_tokens_no_attrs is actually doing the right thing and it will set tokens for the returned expression. (I'm again not sure what will happen with range replacement in this case.)

I think it will happen to do the right thing with replace ranges - we'll end up creating a ReplaceRange covering the entire expression (including outer attributes), as expected. Since we'll throw away the new tokens, the final captured tokens will not include the outer attributes, which will allow us to correctly perform cfg-expansion.

I forgot about the fact that we use force collection at the 'top level' of cfg_eval. It's unfortunate that we'll lose the asertion, but I don't think keeping it is worth the extra complexity of keeping track of the 'top-level' status on top of everything else.

src/test/ui/macros/expr-stmt-nonterminal-tokens.rs

Aaron1011 · 2021-05-10T20:50:20Z

The interaction between nonterminal parsing, token collection, and attribute handling is quite messy (this is an issue predating this PR). In particular, I find the 'double-capturing' behavior to be difficult to reason about, which is why I had tried to avoid it before now.

I'd like to try to clean it up in another PR, but I don't think that needs to block this PR.

r=me with the test added.

Aaron1011 · 2021-05-10T20:53:53Z

src/test/ui/macros/expr-stmt-nonterminal-tokens.rs

+}
+
+fn main() {
+    mac!(expr #[allow(warnings)] 0);


I just realized that we allow attributes on 'nonterminal' expressions, despite this normally being unstable. This may have gone unnoticed due to the fact that actually using the captured expression will trigger the feature gate. However, if it's thrown away (as in this example), it compiles on stable.

Feature-gating for expression/statement attributes has many holes.
I tried fixing them but it causes breakage in practice unfortunately.
So I thought perhaps it's better to work in the direction of the stabilization instead, which should cause less breakage.

petrochenkov · 2021-06-06T11:25:26Z

src/test/ui/proc-macro/expr-stmt-nonterminal-tokens.stdout

@@ -0,0 +1,540 @@
+PRINT-DERIVE INPUT (DISPLAY): enum E { V = { let _ = #[allow(warnings)] 0 ; 0 }, }
+PRINT-DERIVE DEEP-RE-COLLECTED (DISPLAY): enum E { V = { let _ = #[allow(warnings)] #[allow(warnings)] 0 ; 0 }, }


@Aaron1011
The attribute in $expr #[allow(warnings)] 0 is duplicated for some reason.

The duplication reproduces with

macro mac($expr:expr) { print_bang!($expr); } fn main() { mac!(#[allow(warnings)] 0); }

as well.

At least this is not a regression, previously these examples resulted in ICEs.

Made #86055 for tracking this.

petrochenkov · 2021-06-06T11:43:35Z

@bors r=Aaron1011

bors · 2021-06-06T11:43:36Z

📌 Commit cbdfa1e has been approved by Aaron1011

bors · 2021-06-06T14:00:48Z

⌛ Testing commit cbdfa1e with merge 86b0baf...

bors · 2021-06-06T16:41:31Z

☀️ Test successful - checks-actions
Approved by: Aaron1011
Pushing 86b0baf to master...

rust-highfive assigned Aaron1011 May 6, 2021

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 6, 2021

Aaron1011 reviewed May 6, 2021

View reviewed changes

This comment has been minimized.

Sign in to view

petrochenkov force-pushed the tcollect branch from ce64d56 to b790feb Compare May 6, 2021 16:50

This comment has been minimized.

Sign in to view

petrochenkov commented May 6, 2021

View reviewed changes

src/test/ui/macros/expr-stmt-nonterminal-tokens.rs Outdated Show resolved Hide resolved

Aaron1011 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 6, 2021

petrochenkov force-pushed the tcollect branch from b790feb to 112094c Compare May 6, 2021 18:16

petrochenkov added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 6, 2021

petrochenkov mentioned this pull request May 6, 2021

Stabilize #[cfg_eval] and feature(macro_attributes_in_derive_output) #83824

Closed

Aaron1011 reviewed May 6, 2021

View reviewed changes

compiler/rustc_parse/src/parser/nonterminal.rs Outdated Show resolved Hide resolved

Aaron1011 reviewed May 6, 2021

View reviewed changes

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 6, 2021

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 7, 2021

petrochenkov force-pushed the tcollect branch from 112094c to 86b9387 Compare May 7, 2021 09:54

petrochenkov mentioned this pull request May 8, 2021

[experiment] Eagerly expand cfg and cfg_attr in all attribute inputs #85073

Closed

Aaron1011 reviewed May 8, 2021

View reviewed changes

Aaron1011 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 8, 2021

petrochenkov added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 10, 2021

Aaron1011 reviewed May 10, 2021

View reviewed changes

src/test/ui/macros/expr-stmt-nonterminal-tokens.rs Outdated Show resolved Hide resolved

Aaron1011 reviewed May 10, 2021

View reviewed changes

Aaron1011 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 10, 2021

JohnCSimon added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 31, 2021

parser: Ensure that all nonterminals have tokens after parsing

cbdfa1e

petrochenkov force-pushed the tcollect branch from 86b9387 to cbdfa1e Compare June 6, 2021 11:21

petrochenkov commented Jun 6, 2021

View reviewed changes

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 6, 2021

petrochenkov mentioned this pull request Jun 6, 2021

Inert attributes in $expr nonterminals are sometimes duplicated #86055

Closed

bors added the merged-by-bors This PR was explicitly merged by bors. label Jun 6, 2021

bors merged commit 86b0baf into rust-lang:master Jun 6, 2021

rustbot added this to the 1.54.0 milestone Jun 6, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

parser: Ensure that all nonterminals have tokens after parsing #84995

parser: Ensure that all nonterminals have tokens after parsing #84995

petrochenkov commented May 6, 2021 •

edited

Loading

Aaron1011 May 6, 2021

petrochenkov May 6, 2021

This comment has been minimized.

This comment has been minimized.

Aaron1011 May 6, 2021

petrochenkov May 6, 2021

Aaron1011 commented May 6, 2021

rust-timer commented May 6, 2021

bors commented May 6, 2021

bors commented May 6, 2021

rust-timer commented May 6, 2021

rust-timer commented May 7, 2021

petrochenkov commented May 7, 2021

Aaron1011 May 8, 2021

petrochenkov May 10, 2021

Aaron1011 May 8, 2021 •

edited

Loading

petrochenkov May 10, 2021

petrochenkov May 10, 2021

Aaron1011 May 10, 2021

Aaron1011 commented May 10, 2021

Aaron1011 May 10, 2021

petrochenkov Jun 5, 2021

petrochenkov Jun 6, 2021

petrochenkov Jun 6, 2021

petrochenkov Jun 6, 2021

petrochenkov commented Jun 6, 2021

bors commented Jun 6, 2021

bors commented Jun 6, 2021

bors commented Jun 6, 2021

		@@ -0,0 +1,540 @@
		PRINT-DERIVE INPUT (DISPLAY): enum E { V = { let _ = #[allow(warnings)] 0 ; 0 }, }
		PRINT-DERIVE DEEP-RE-COLLECTED (DISPLAY): enum E { V = { let _ = #[allow(warnings)] #[allow(warnings)] 0 ; 0 }, }

parser: Ensure that all nonterminals have tokens after parsing #84995

parser: Ensure that all nonterminals have tokens after parsing #84995

Conversation

petrochenkov commented May 6, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

This comment has been minimized.

This comment has been minimized.

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Aaron1011 commented May 6, 2021

rust-timer commented May 6, 2021

bors commented May 6, 2021

bors commented May 6, 2021

rust-timer commented May 6, 2021

rust-timer commented May 7, 2021

petrochenkov commented May 7, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Aaron1011 May 8, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Aaron1011 commented May 10, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

petrochenkov commented Jun 6, 2021

bors commented Jun 6, 2021

bors commented Jun 6, 2021

bors commented Jun 6, 2021

petrochenkov commented May 6, 2021 •

edited

Loading

Aaron1011 May 8, 2021 •

edited

Loading