Eliminate magic numbers from expression precedence #133603

dtolnay · 2024-11-28T22:55:44Z

Context: see #133140.

This PR continues on backporting Syn's expression precedence design into rustc. Rustc's design used mysterious integer quantities represented variously as i8 or usize (e.g. PREC_CLOSURE = -40i8), a special significance around 0 that is never named, and an extra PREC_FORCE_PAREN precedence level that does not correspond to any expression. Syn's design uses a C-like enum with variants that clearly correspond to specific sets of expression kinds.

This PR is a refactoring that has no intended behavior change on its own, but it unblocks other precedence work that rustc's precedence design was poorly suited to accommodate.

Asymmetrical precedence, so that a pretty-printer can tell (return 1) + 1 needs parens but 1 + return 1 does not.
Squashing the Closure and Jump cases into a single precedence level.
Numerous remaining false positives and false negatives in rustc pretty-printer's parenthesization of macro metavariables, for example in $e < rhs where $e is lhs as Thing<T>.

FYI @fmease — you don't need to review if rustbot picks someone else, but you mentioned being interested in the followup PRs.

rustbot · 2024-11-28T22:55:52Z

r? @lcnr

rustbot has assigned @lcnr.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

rustbot · 2024-11-28T22:55:54Z

Some changes occurred in src/tools/clippy

cc @rust-lang/clippy

lcnr

this looks correct to me and commits 3-7 are a clear improvement imo

Why do you think that entirely removing print_expr_maybe_paren is worth it? I feel like it's easier to read then the alternative. Removing it likely made the next commits easier, is that the main reason?

bors · 2024-11-29T23:01:45Z

☔ The latest upstream changes (presumably #133588) made this pull request unmergeable. Please resolve the merge conflicts.

dtolnay · 2024-11-30T01:06:23Z

I moved commits 1-2 to #133655 with a longer explanation. I can drop those commits if you prefer to keep print_expr_maybe_paren, or we can defer it until after seeing some of the rest of this work like the asymmetrical precedence changes.

Eliminate print_expr_maybe_paren function from pretty printers This PR is part of backporting Syn's expression precedence design into rustc. (See rust-lang#133603 for other work on this.) In Syn, our version of `print_expr_cond_paren` is called `print_subexpression` and it is called from 19 places. Of those calls, 12 of them need a "custom" behavior for the `needs_paren` argument, whereas only 7 use a "standard" behavior resembling `print_subexpression($e, $e.precedence() < Precedence::$Variant, ...)`. In other words the behavior that rustc_ast_pretty's `print_expr_maybe_paren` implements is actually not what you want most of the time. The current usage you see in rustc is overuse. <details> <summary>Aside: am I confident about the correctness of Syn's parenthesization? Yes. Click for details.</summary> --- The behavior is constrained by the following pair of tests which both run over every Rust source file of rustc and the standard library and tools and test suites: - To rule out **false positives**: for every expression in every source file, print the expression, parse it back, and verify that not a single new parenthesis got added. Since these are expressions parsed from source code, not macro-generated syntax trees, we know they must never need automatic parenthesis insertion. Rustc's pretty printer does not pass this. Pseudocode: `assert(expr == parse(print(expr)))` - To rule out **false negatives**: for every expression in every source file, replace every Expr::Paren node in the syntax tree with just its contents, i.e. stripping the parentheses but otherwise preserving the syntax tree structure. Then print the stripped expression performing parenthesis insertion wherever needed, and reparse it. Verify that the reparsed expression has identical structure to the original, despite there being no parentheses in the original prior to printing, i.e. all the right parentheses got re-inserted by the printer to preserve the expression's structure. Rustc's pretty printer does not pass this. See dtolnay/syn#1788 which reveals multiple rustc_ast_pretty bugs. Pseudocode: `assert(unparenthesize(expr) == unparenthesize(parse(print(unparenthesize(expr)))))` --- </details> If `print_expr_maybe_paren` is usually not correct, is there harm in keeping it for the minority of cases where it is correct? I think the answer is yes and Syn doesn't use any equivalent of this helper function. The problems with it are: - Having both `print_expr_maybe_paren` and `print_expr_cond_paren` applies counterproductive inertia against moving from the first to the second. When looking at a call site like `print_expr_maybe_paren(e, Precedence::$Variant, ...)` with parentheses not being inserted where they should be, anyone's first inclination would be to solve the bug by tweaking $Variant because that is the only knob that visibly appears in the function call. For example to pass "prec + 1", like tweaking the code to conditionally pass `Precedence::Prefix` instead of `Precedence::Cast`. Experience in Syn shows this is (almost?) never what you want the person to do. In a call `print_expr_cond_paren(e, e.precedence() < ExprPrecedence::$Variant, ...)` almost always the best fix involves one of: - Changing `e.precedence()`, e.g. to `fixup.leading_precedence(e)` and `fixup.trailing_precedence(e)` in cases of asymmetrical precedence (`(return 1) + 1` vs `1 + return 1`). - Changing `<` to `<=`, to handle associativity and other grammar restrictions like chained comparisons (which rustc gets wrong today). - Adding `||` and/or `&&` clauses to the condition. By using these 3 better knobs instead of $Variant, it upholds the property that any time we talk about precedence, it is always the precedence of some actual expression that our code is actively manipulating, instead of a value standing in for some imaginary precedence level that would exist between two consecutive [real levels](https://doc.rust-lang.org/1.83.0/reference/expressions.html#expression-precedence). For example consider that "`Cast` + 1" might be `Prefix` today, but only until some new Rust syntax ends up adding a level between those. - The `print_expr_maybe_paren` call sites look shorter, but they are not clearer. For myself, a function argument that says "does this subexpression need parenthesization" is a concrete thing that is easy to think about, while a function argument that is "what is the effective precedence level associated with this subexpression's placement inside its parent expression" is abstract and tricky to even state a precise meaning for. I expect that for someone less familiar with the pretty printer working on adding a new expression kind (like postfix match, recently), having every subexpression consistently printed using `print_expr_cond_paren` will be more beneficial, for the same reason, than having `print_expr_maybe_paren` available. r? `@lcnr`

Eliminate print_expr_maybe_paren function from pretty printers This PR is part of backporting Syn's expression precedence design into rustc. (See rust-lang#133603 for other work on this.) In Syn, our version of `print_expr_cond_paren` is called `print_subexpression` and it is called from 19 places. Of those calls, 12 of them need a "custom" behavior for the `needs_paren` argument, whereas only 7 use a "standard" behavior resembling `print_subexpression($e, $e.precedence() < Precedence::$Variant, ...)`. In other words the behavior that rustc_ast_pretty's `print_expr_maybe_paren` implements is actually not what you want most of the time. The current usage you see in rustc is overuse. <details> <summary>Aside: am I confident about the correctness of Syn's parenthesization? Yes. Click for details.</summary> --- The behavior is constrained by the following pair of tests which both run over every Rust source file of rustc and the standard library and tools and test suites: - To rule out **false positives**: for every expression in every source file, print the expression, parse it back, and verify that not a single new parenthesis got added. Since these are expressions parsed from source code, not macro-generated syntax trees, we know they must never need automatic parenthesis insertion. Rustc's pretty printer does not pass this. Pseudocode: `assert(expr == parse(print(expr)))` - To rule out **false negatives**: for every expression in every source file, replace every Expr::Paren node in the syntax tree with just its contents, i.e. stripping the parentheses but otherwise preserving the syntax tree structure. Then print the stripped expression performing parenthesis insertion wherever needed, and reparse it. Verify that the reparsed expression has identical structure to the original, despite there being no parentheses in the original prior to printing, i.e. all the right parentheses got re-inserted by the printer to preserve the expression's structure. Rustc's pretty printer does not pass this. See dtolnay/syn#1788 which reveals multiple rustc_ast_pretty bugs. Pseudocode: `assert(unparenthesize(expr) == unparenthesize(parse(print(unparenthesize(expr)))))` --- </details> If `print_expr_maybe_paren` is usually not correct, is there harm in keeping it for the minority of cases where it is correct? I think the answer is yes and Syn doesn't use any equivalent of this helper function. The problems with it are: - Having both `print_expr_maybe_paren` and `print_expr_cond_paren` applies counterproductive inertia against moving from the first to the second. When looking at a call site like `print_expr_maybe_paren(e, Precedence::$Variant, ...)` with parentheses not being inserted where they should be, anyone's first inclination would be to solve the bug by tweaking $Variant because that is the only knob that visibly appears in the function call. For example to pass "prec + 1", like tweaking the code to conditionally pass `Precedence::Prefix` instead of `Precedence::Cast`. Experience in Syn shows this is (almost?) never what you want the person to do. In a call `print_expr_cond_paren(e, e.precedence() < ExprPrecedence::$Variant, ...)` almost always the best fix involves one of: - Changing `e.precedence()`, e.g. to `fixup.leading_precedence(e)` and `fixup.trailing_precedence(e)` in cases of asymmetrical precedence (`(return 1) + 1` vs `1 + return 1`). - Changing `<` to `<=`, to handle associativity and other grammar restrictions like chained comparisons (which rustc gets wrong today). - Adding `||` and/or `&&` clauses to the condition. By using these 3 better knobs instead of $Variant, it upholds the property that any time we talk about precedence, it is always the precedence of some actual expression that our code is actively manipulating, instead of a value standing in for some imaginary precedence level that would exist between two consecutive [real levels](https://doc.rust-lang.org/1.83.0/reference/expressions.html#expression-precedence). For example consider that "`Cast` + 1" might be `Prefix` today, but only until some new Rust syntax ends up adding a level between those. - The `print_expr_maybe_paren` call sites look shorter, but they are not clearer. For myself, a function argument that says "does this subexpression need parenthesization" is a concrete thing that is easy to think about, while a function argument that is "what is the effective precedence level associated with this subexpression's placement inside its parent expression" is abstract and tricky to even state a precise meaning for. I expect that for someone less familiar with the pretty printer working on adding a new expression kind (like postfix match, recently), having every subexpression consistently printed using `print_expr_cond_paren` will be more beneficial, for the same reason, than having `print_expr_maybe_paren` available. r? ``@lcnr``

Rollup merge of rust-lang#133655 - dtolnay:maybeparen, r=lcnr Eliminate print_expr_maybe_paren function from pretty printers This PR is part of backporting Syn's expression precedence design into rustc. (See rust-lang#133603 for other work on this.) In Syn, our version of `print_expr_cond_paren` is called `print_subexpression` and it is called from 19 places. Of those calls, 12 of them need a "custom" behavior for the `needs_paren` argument, whereas only 7 use a "standard" behavior resembling `print_subexpression($e, $e.precedence() < Precedence::$Variant, ...)`. In other words the behavior that rustc_ast_pretty's `print_expr_maybe_paren` implements is actually not what you want most of the time. The current usage you see in rustc is overuse. <details> <summary>Aside: am I confident about the correctness of Syn's parenthesization? Yes. Click for details.</summary> --- The behavior is constrained by the following pair of tests which both run over every Rust source file of rustc and the standard library and tools and test suites: - To rule out **false positives**: for every expression in every source file, print the expression, parse it back, and verify that not a single new parenthesis got added. Since these are expressions parsed from source code, not macro-generated syntax trees, we know they must never need automatic parenthesis insertion. Rustc's pretty printer does not pass this. Pseudocode: `assert(expr == parse(print(expr)))` - To rule out **false negatives**: for every expression in every source file, replace every Expr::Paren node in the syntax tree with just its contents, i.e. stripping the parentheses but otherwise preserving the syntax tree structure. Then print the stripped expression performing parenthesis insertion wherever needed, and reparse it. Verify that the reparsed expression has identical structure to the original, despite there being no parentheses in the original prior to printing, i.e. all the right parentheses got re-inserted by the printer to preserve the expression's structure. Rustc's pretty printer does not pass this. See dtolnay/syn#1788 which reveals multiple rustc_ast_pretty bugs. Pseudocode: `assert(unparenthesize(expr) == unparenthesize(parse(print(unparenthesize(expr)))))` --- </details> If `print_expr_maybe_paren` is usually not correct, is there harm in keeping it for the minority of cases where it is correct? I think the answer is yes and Syn doesn't use any equivalent of this helper function. The problems with it are: - Having both `print_expr_maybe_paren` and `print_expr_cond_paren` applies counterproductive inertia against moving from the first to the second. When looking at a call site like `print_expr_maybe_paren(e, Precedence::$Variant, ...)` with parentheses not being inserted where they should be, anyone's first inclination would be to solve the bug by tweaking $Variant because that is the only knob that visibly appears in the function call. For example to pass "prec + 1", like tweaking the code to conditionally pass `Precedence::Prefix` instead of `Precedence::Cast`. Experience in Syn shows this is (almost?) never what you want the person to do. In a call `print_expr_cond_paren(e, e.precedence() < ExprPrecedence::$Variant, ...)` almost always the best fix involves one of: - Changing `e.precedence()`, e.g. to `fixup.leading_precedence(e)` and `fixup.trailing_precedence(e)` in cases of asymmetrical precedence (`(return 1) + 1` vs `1 + return 1`). - Changing `<` to `<=`, to handle associativity and other grammar restrictions like chained comparisons (which rustc gets wrong today). - Adding `||` and/or `&&` clauses to the condition. By using these 3 better knobs instead of $Variant, it upholds the property that any time we talk about precedence, it is always the precedence of some actual expression that our code is actively manipulating, instead of a value standing in for some imaginary precedence level that would exist between two consecutive [real levels](https://doc.rust-lang.org/1.83.0/reference/expressions.html#expression-precedence). For example consider that "`Cast` + 1" might be `Prefix` today, but only until some new Rust syntax ends up adding a level between those. - The `print_expr_maybe_paren` call sites look shorter, but they are not clearer. For myself, a function argument that says "does this subexpression need parenthesization" is a concrete thing that is easy to think about, while a function argument that is "what is the effective precedence level associated with this subexpression's placement inside its parent expression" is abstract and tricky to even state a precise meaning for. I expect that for someone less familiar with the pretty printer working on adding a new expression kind (like postfix match, recently), having every subexpression consistently printed using `print_expr_cond_paren` will be more beneficial, for the same reason, than having `print_expr_maybe_paren` available. r? ``@lcnr``

lcnr · 2024-12-02T08:34:22Z

@bors r+ rollup

ps: i'd have been totally happy to just merge this as a single PR after you gave the explanation in #133655, in case you felt like this wasn't the case

bors · 2024-12-02T08:34:24Z

📌 Commit 7ced18f has been approved by lcnr

It is now in the queue for this repository.

…llaumeGomez Rollup of 13 pull requests Successful merges: - rust-lang#133603 (Eliminate magic numbers from expression precedence) - rust-lang#133715 (rustdoc-json: Include safety of `static`s) - rust-lang#133721 (rustdoc-json: Add test for `impl Trait for dyn Trait`) - rust-lang#133725 (Remove `//@ compare-output-lines-by-subset`) - rust-lang#133730 (Add pretty-printer parenthesis insertion test) - rust-lang#133736 (Add `needs-target-has-atomic` directive) - rust-lang#133739 (Re-add myself to rotation) - rust-lang#133743 (Fix docs for `<[T]>::as_array`.) - rust-lang#133744 (Fix typo README.md) - rust-lang#133745 (Remove static HashSet for default IDs list) - rust-lang#133749 (mir validator: don't store mir phase) - rust-lang#133751 (remove `Ty::is_copy_modulo_regions`) - rust-lang#133757 (`impl Default for EarlyDiagCtxt`) r? `@ghost` `@rustbot` modify labels: rollup

Rollup merge of rust-lang#133603 - dtolnay:precedence, r=lcnr Eliminate magic numbers from expression precedence Context: see rust-lang#133140. This PR continues on backporting Syn's expression precedence design into rustc. Rustc's design used mysterious integer quantities represented variously as `i8` or `usize` (e.g. `PREC_CLOSURE = -40i8`), a special significance around `0` that is never named, and an extra `PREC_FORCE_PAREN` precedence level that does not correspond to any expression. Syn's design uses a C-like enum with variants that clearly correspond to specific sets of expression kinds. This PR is a refactoring that has no intended behavior change on its own, but it unblocks other precedence work that rustc's precedence design was poorly suited to accommodate. - Asymmetrical precedence, so that a pretty-printer can tell `(return 1) + 1` needs parens but `1 + return 1` does not. - Squashing the `Closure` and `Jump` cases into a single precedence level. - Numerous remaining false positives and false negatives in rustc pretty-printer's parenthesization of macro metavariables, for example in `$e < rhs` where $e is `lhs as Thing<T>`. FYI `@fmease` — you don't need to review if rustbot picks someone else, but you mentioned being interested in the followup PRs.

Eliminate magic numbers from expression precedence Context: see rust-lang#133140. This PR continues on backporting Syn's expression precedence design into rustc. Rustc's design used mysterious integer quantities represented variously as `i8` or `usize` (e.g. `PREC_CLOSURE = -40i8`), a special significance around `0` that is never named, and an extra `PREC_FORCE_PAREN` precedence level that does not correspond to any expression. Syn's design uses a C-like enum with variants that clearly correspond to specific sets of expression kinds. This PR is a refactoring that has no intended behavior change on its own, but it unblocks other precedence work that rustc's precedence design was poorly suited to accommodate. - Asymmetrical precedence, so that a pretty-printer can tell `(return 1) + 1` needs parens but `1 + return 1` does not. - Squashing the `Closure` and `Jump` cases into a single precedence level. - Numerous remaining false positives and false negatives in rustc pretty-printer's parenthesization of macro metavariables, for example in `$e < rhs` where $e is `lhs as Thing<T>`. FYI `@fmease` — you don't need to review if rustbot picks someone else, but you mentioned being interested in the followup PRs.

rustbot assigned lcnr Nov 28, 2024

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Nov 28, 2024

dtolnay force-pushed the precedence branch from 8d6ffc4 to f512c54 Compare November 28, 2024 23:08

This comment has been minimized.

Sign in to view

dtolnay force-pushed the precedence branch from f512c54 to fe141e2 Compare November 29, 2024 00:37

jieyouxu added the A-parser Area: The lexing & parsing of Rust source code to an AST label Nov 29, 2024

This comment has been minimized.

Sign in to view

dtolnay force-pushed the precedence branch from fe141e2 to 48e7f17 Compare November 29, 2024 01:20

dtolnay added the A-pretty Area: Pretty printing (including `-Z unpretty`) label Nov 29, 2024

lcnr reviewed Nov 29, 2024

View reviewed changes

dtolnay mentioned this pull request Nov 30, 2024

Eliminate print_expr_maybe_paren function from pretty printers #133655

Merged

dtolnay added 5 commits November 30, 2024 17:53

Eliminate precedence arithmetic from rustc_ast_pretty

c02032c

Eliminate precedence arithmetic from rustc_hir_pretty

34a65f2

Eliminate PREC_FORCE_PAREN

ca8f474

Eliminate precedence arithmetic from rustc_parse

539c863

Eliminate magic numbers from expression precedence

7ced18f

dtolnay force-pushed the precedence branch from 48e7f17 to 7ced18f Compare December 1, 2024 01:56

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 2, 2024

GuillaumeGomez mentioned this pull request Dec 2, 2024

Rollup of 13 pull requests #133760

Merged

bors merged commit 7dd0c83 into rust-lang:master Dec 2, 2024
6 checks passed

rustbot added this to the 1.85.0 milestone Dec 2, 2024

dtolnay deleted the precedence branch December 16, 2024 17:33

dtolnay removed the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Jan 14, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Eliminate magic numbers from expression precedence #133603

Eliminate magic numbers from expression precedence #133603

Uh oh!

dtolnay commented Nov 28, 2024 •

edited

Loading

Uh oh!

rustbot commented Nov 28, 2024

Uh oh!

rustbot commented Nov 28, 2024

Uh oh!

This comment has been minimized.

This comment has been minimized.

lcnr left a comment

Uh oh!

bors commented Nov 29, 2024

Uh oh!

dtolnay commented Nov 30, 2024

Uh oh!

lcnr commented Dec 2, 2024

Uh oh!

bors commented Dec 2, 2024

Uh oh!

Uh oh!

Uh oh!

Eliminate magic numbers from expression precedence #133603

Eliminate magic numbers from expression precedence #133603

Uh oh!

Conversation

dtolnay commented Nov 28, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rustbot commented Nov 28, 2024

Uh oh!

rustbot commented Nov 28, 2024

Uh oh!

This comment has been minimized.

This comment has been minimized.

lcnr left a comment

Choose a reason for hiding this comment

Uh oh!

bors commented Nov 29, 2024

Uh oh!

dtolnay commented Nov 30, 2024

Uh oh!

lcnr commented Dec 2, 2024

Uh oh!

bors commented Dec 2, 2024

Uh oh!

Uh oh!

Uh oh!

dtolnay commented Nov 28, 2024 •

edited

Loading