-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Eliminate magic numbers from expression precedence #133603
Conversation
Some changes occurred in src/tools/clippy cc @rust-lang/clippy |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks correct to me and commits 3-7 are a clear improvement imo
Why do you think that entirely removing print_expr_maybe_paren
is worth it? I feel like it's easier to read then the alternative. Removing it likely made the next commits easier, is that the main reason?
☔ The latest upstream changes (presumably #133588) made this pull request unmergeable. Please resolve the merge conflicts. |
I moved commits 1-2 to #133655 with a longer explanation. I can drop those commits if you prefer to keep |
Eliminate print_expr_maybe_paren function from pretty printers This PR is part of backporting Syn's expression precedence design into rustc. (See rust-lang#133603 for other work on this.) In Syn, our version of `print_expr_cond_paren` is called `print_subexpression` and it is called from 19 places. Of those calls, 12 of them need a "custom" behavior for the `needs_paren` argument, whereas only 7 use a "standard" behavior resembling `print_subexpression($e, $e.precedence() < Precedence::$Variant, ...)`. In other words the behavior that rustc_ast_pretty's `print_expr_maybe_paren` implements is actually not what you want most of the time. The current usage you see in rustc is overuse. <details> <summary>Aside: am I confident about the correctness of Syn's parenthesization? Yes. Click for details.</summary> --- The behavior is constrained by the following pair of tests which both run over every Rust source file of rustc and the standard library and tools and test suites: - To rule out **false positives**: for every expression in every source file, print the expression, parse it back, and verify that not a single new parenthesis got added. Since these are expressions parsed from source code, not macro-generated syntax trees, we know they must never need automatic parenthesis insertion. Rustc's pretty printer does not pass this. Pseudocode: `assert(expr == parse(print(expr)))` - To rule out **false negatives**: for every expression in every source file, replace every Expr::Paren node in the syntax tree with just its contents, i.e. stripping the parentheses but otherwise preserving the syntax tree structure. Then print the stripped expression performing parenthesis insertion wherever needed, and reparse it. Verify that the reparsed expression has identical structure to the original, despite there being no parentheses in the original prior to printing, i.e. all the right parentheses got re-inserted by the printer to preserve the expression's structure. Rustc's pretty printer does not pass this. See dtolnay/syn#1788 which reveals multiple rustc_ast_pretty bugs. Pseudocode: `assert(unparenthesize(expr) == unparenthesize(parse(print(unparenthesize(expr)))))` --- </details> If `print_expr_maybe_paren` is usually not correct, is there harm in keeping it for the minority of cases where it is correct? I think the answer is yes and Syn doesn't use any equivalent of this helper function. The problems with it are: - Having both `print_expr_maybe_paren` and `print_expr_cond_paren` applies counterproductive inertia against moving from the first to the second. When looking at a call site like `print_expr_maybe_paren(e, Precedence::$Variant, ...)` with parentheses not being inserted where they should be, anyone's first inclination would be to solve the bug by tweaking $Variant because that is the only knob that visibly appears in the function call. For example to pass "prec + 1", like tweaking the code to conditionally pass `Precedence::Prefix` instead of `Precedence::Cast`. Experience in Syn shows this is (almost?) never what you want the person to do. In a call `print_expr_cond_paren(e, e.precedence() < ExprPrecedence::$Variant, ...)` almost always the best fix involves one of: - Changing `e.precedence()`, e.g. to `fixup.leading_precedence(e)` and `fixup.trailing_precedence(e)` in cases of asymmetrical precedence (`(return 1) + 1` vs `1 + return 1`). - Changing `<` to `<=`, to handle associativity and other grammar restrictions like chained comparisons (which rustc gets wrong today). - Adding `||` and/or `&&` clauses to the condition. By using these 3 better knobs instead of $Variant, it upholds the property that any time we talk about precedence, it is always the precedence of some actual expression that our code is actively manipulating, instead of a value standing in for some imaginary precedence level that would exist between two consecutive [real levels](https://doc.rust-lang.org/1.83.0/reference/expressions.html#expression-precedence). For example consider that "`Cast` + 1" might be `Prefix` today, but only until some new Rust syntax ends up adding a level between those. - The `print_expr_maybe_paren` call sites look shorter, but they are not clearer. For myself, a function argument that says "does this subexpression need parenthesization" is a concrete thing that is easy to think about, while a function argument that is "what is the effective precedence level associated with this subexpression's placement inside its parent expression" is abstract and tricky to even state a precise meaning for. I expect that for someone less familiar with the pretty printer working on adding a new expression kind (like postfix match, recently), having every subexpression consistently printed using `print_expr_cond_paren` will be more beneficial, for the same reason, than having `print_expr_maybe_paren` available. r? `@lcnr`
Eliminate print_expr_maybe_paren function from pretty printers This PR is part of backporting Syn's expression precedence design into rustc. (See rust-lang#133603 for other work on this.) In Syn, our version of `print_expr_cond_paren` is called `print_subexpression` and it is called from 19 places. Of those calls, 12 of them need a "custom" behavior for the `needs_paren` argument, whereas only 7 use a "standard" behavior resembling `print_subexpression($e, $e.precedence() < Precedence::$Variant, ...)`. In other words the behavior that rustc_ast_pretty's `print_expr_maybe_paren` implements is actually not what you want most of the time. The current usage you see in rustc is overuse. <details> <summary>Aside: am I confident about the correctness of Syn's parenthesization? Yes. Click for details.</summary> --- The behavior is constrained by the following pair of tests which both run over every Rust source file of rustc and the standard library and tools and test suites: - To rule out **false positives**: for every expression in every source file, print the expression, parse it back, and verify that not a single new parenthesis got added. Since these are expressions parsed from source code, not macro-generated syntax trees, we know they must never need automatic parenthesis insertion. Rustc's pretty printer does not pass this. Pseudocode: `assert(expr == parse(print(expr)))` - To rule out **false negatives**: for every expression in every source file, replace every Expr::Paren node in the syntax tree with just its contents, i.e. stripping the parentheses but otherwise preserving the syntax tree structure. Then print the stripped expression performing parenthesis insertion wherever needed, and reparse it. Verify that the reparsed expression has identical structure to the original, despite there being no parentheses in the original prior to printing, i.e. all the right parentheses got re-inserted by the printer to preserve the expression's structure. Rustc's pretty printer does not pass this. See dtolnay/syn#1788 which reveals multiple rustc_ast_pretty bugs. Pseudocode: `assert(unparenthesize(expr) == unparenthesize(parse(print(unparenthesize(expr)))))` --- </details> If `print_expr_maybe_paren` is usually not correct, is there harm in keeping it for the minority of cases where it is correct? I think the answer is yes and Syn doesn't use any equivalent of this helper function. The problems with it are: - Having both `print_expr_maybe_paren` and `print_expr_cond_paren` applies counterproductive inertia against moving from the first to the second. When looking at a call site like `print_expr_maybe_paren(e, Precedence::$Variant, ...)` with parentheses not being inserted where they should be, anyone's first inclination would be to solve the bug by tweaking $Variant because that is the only knob that visibly appears in the function call. For example to pass "prec + 1", like tweaking the code to conditionally pass `Precedence::Prefix` instead of `Precedence::Cast`. Experience in Syn shows this is (almost?) never what you want the person to do. In a call `print_expr_cond_paren(e, e.precedence() < ExprPrecedence::$Variant, ...)` almost always the best fix involves one of: - Changing `e.precedence()`, e.g. to `fixup.leading_precedence(e)` and `fixup.trailing_precedence(e)` in cases of asymmetrical precedence (`(return 1) + 1` vs `1 + return 1`). - Changing `<` to `<=`, to handle associativity and other grammar restrictions like chained comparisons (which rustc gets wrong today). - Adding `||` and/or `&&` clauses to the condition. By using these 3 better knobs instead of $Variant, it upholds the property that any time we talk about precedence, it is always the precedence of some actual expression that our code is actively manipulating, instead of a value standing in for some imaginary precedence level that would exist between two consecutive [real levels](https://doc.rust-lang.org/1.83.0/reference/expressions.html#expression-precedence). For example consider that "`Cast` + 1" might be `Prefix` today, but only until some new Rust syntax ends up adding a level between those. - The `print_expr_maybe_paren` call sites look shorter, but they are not clearer. For myself, a function argument that says "does this subexpression need parenthesization" is a concrete thing that is easy to think about, while a function argument that is "what is the effective precedence level associated with this subexpression's placement inside its parent expression" is abstract and tricky to even state a precise meaning for. I expect that for someone less familiar with the pretty printer working on adding a new expression kind (like postfix match, recently), having every subexpression consistently printed using `print_expr_cond_paren` will be more beneficial, for the same reason, than having `print_expr_maybe_paren` available. r? ``@lcnr``
Rollup merge of rust-lang#133655 - dtolnay:maybeparen, r=lcnr Eliminate print_expr_maybe_paren function from pretty printers This PR is part of backporting Syn's expression precedence design into rustc. (See rust-lang#133603 for other work on this.) In Syn, our version of `print_expr_cond_paren` is called `print_subexpression` and it is called from 19 places. Of those calls, 12 of them need a "custom" behavior for the `needs_paren` argument, whereas only 7 use a "standard" behavior resembling `print_subexpression($e, $e.precedence() < Precedence::$Variant, ...)`. In other words the behavior that rustc_ast_pretty's `print_expr_maybe_paren` implements is actually not what you want most of the time. The current usage you see in rustc is overuse. <details> <summary>Aside: am I confident about the correctness of Syn's parenthesization? Yes. Click for details.</summary> --- The behavior is constrained by the following pair of tests which both run over every Rust source file of rustc and the standard library and tools and test suites: - To rule out **false positives**: for every expression in every source file, print the expression, parse it back, and verify that not a single new parenthesis got added. Since these are expressions parsed from source code, not macro-generated syntax trees, we know they must never need automatic parenthesis insertion. Rustc's pretty printer does not pass this. Pseudocode: `assert(expr == parse(print(expr)))` - To rule out **false negatives**: for every expression in every source file, replace every Expr::Paren node in the syntax tree with just its contents, i.e. stripping the parentheses but otherwise preserving the syntax tree structure. Then print the stripped expression performing parenthesis insertion wherever needed, and reparse it. Verify that the reparsed expression has identical structure to the original, despite there being no parentheses in the original prior to printing, i.e. all the right parentheses got re-inserted by the printer to preserve the expression's structure. Rustc's pretty printer does not pass this. See dtolnay/syn#1788 which reveals multiple rustc_ast_pretty bugs. Pseudocode: `assert(unparenthesize(expr) == unparenthesize(parse(print(unparenthesize(expr)))))` --- </details> If `print_expr_maybe_paren` is usually not correct, is there harm in keeping it for the minority of cases where it is correct? I think the answer is yes and Syn doesn't use any equivalent of this helper function. The problems with it are: - Having both `print_expr_maybe_paren` and `print_expr_cond_paren` applies counterproductive inertia against moving from the first to the second. When looking at a call site like `print_expr_maybe_paren(e, Precedence::$Variant, ...)` with parentheses not being inserted where they should be, anyone's first inclination would be to solve the bug by tweaking $Variant because that is the only knob that visibly appears in the function call. For example to pass "prec + 1", like tweaking the code to conditionally pass `Precedence::Prefix` instead of `Precedence::Cast`. Experience in Syn shows this is (almost?) never what you want the person to do. In a call `print_expr_cond_paren(e, e.precedence() < ExprPrecedence::$Variant, ...)` almost always the best fix involves one of: - Changing `e.precedence()`, e.g. to `fixup.leading_precedence(e)` and `fixup.trailing_precedence(e)` in cases of asymmetrical precedence (`(return 1) + 1` vs `1 + return 1`). - Changing `<` to `<=`, to handle associativity and other grammar restrictions like chained comparisons (which rustc gets wrong today). - Adding `||` and/or `&&` clauses to the condition. By using these 3 better knobs instead of $Variant, it upholds the property that any time we talk about precedence, it is always the precedence of some actual expression that our code is actively manipulating, instead of a value standing in for some imaginary precedence level that would exist between two consecutive [real levels](https://doc.rust-lang.org/1.83.0/reference/expressions.html#expression-precedence). For example consider that "`Cast` + 1" might be `Prefix` today, but only until some new Rust syntax ends up adding a level between those. - The `print_expr_maybe_paren` call sites look shorter, but they are not clearer. For myself, a function argument that says "does this subexpression need parenthesization" is a concrete thing that is easy to think about, while a function argument that is "what is the effective precedence level associated with this subexpression's placement inside its parent expression" is abstract and tricky to even state a precise meaning for. I expect that for someone less familiar with the pretty printer working on adding a new expression kind (like postfix match, recently), having every subexpression consistently printed using `print_expr_cond_paren` will be more beneficial, for the same reason, than having `print_expr_maybe_paren` available. r? ``@lcnr``
…llaumeGomez Rollup of 13 pull requests Successful merges: - rust-lang#133603 (Eliminate magic numbers from expression precedence) - rust-lang#133715 (rustdoc-json: Include safety of `static`s) - rust-lang#133721 (rustdoc-json: Add test for `impl Trait for dyn Trait`) - rust-lang#133725 (Remove `//@ compare-output-lines-by-subset`) - rust-lang#133730 (Add pretty-printer parenthesis insertion test) - rust-lang#133736 (Add `needs-target-has-atomic` directive) - rust-lang#133739 (Re-add myself to rotation) - rust-lang#133743 (Fix docs for `<[T]>::as_array`.) - rust-lang#133744 (Fix typo README.md) - rust-lang#133745 (Remove static HashSet for default IDs list) - rust-lang#133749 (mir validator: don't store mir phase) - rust-lang#133751 (remove `Ty::is_copy_modulo_regions`) - rust-lang#133757 (`impl Default for EarlyDiagCtxt`) r? `@ghost` `@rustbot` modify labels: rollup
Rollup merge of rust-lang#133603 - dtolnay:precedence, r=lcnr Eliminate magic numbers from expression precedence Context: see rust-lang#133140. This PR continues on backporting Syn's expression precedence design into rustc. Rustc's design used mysterious integer quantities represented variously as `i8` or `usize` (e.g. `PREC_CLOSURE = -40i8`), a special significance around `0` that is never named, and an extra `PREC_FORCE_PAREN` precedence level that does not correspond to any expression. Syn's design uses a C-like enum with variants that clearly correspond to specific sets of expression kinds. This PR is a refactoring that has no intended behavior change on its own, but it unblocks other precedence work that rustc's precedence design was poorly suited to accommodate. - Asymmetrical precedence, so that a pretty-printer can tell `(return 1) + 1` needs parens but `1 + return 1` does not. - Squashing the `Closure` and `Jump` cases into a single precedence level. - Numerous remaining false positives and false negatives in rustc pretty-printer's parenthesization of macro metavariables, for example in `$e < rhs` where $e is `lhs as Thing<T>`. FYI `@fmease` — you don't need to review if rustbot picks someone else, but you mentioned being interested in the followup PRs.
Eliminate magic numbers from expression precedence Context: see rust-lang#133140. This PR continues on backporting Syn's expression precedence design into rustc. Rustc's design used mysterious integer quantities represented variously as `i8` or `usize` (e.g. `PREC_CLOSURE = -40i8`), a special significance around `0` that is never named, and an extra `PREC_FORCE_PAREN` precedence level that does not correspond to any expression. Syn's design uses a C-like enum with variants that clearly correspond to specific sets of expression kinds. This PR is a refactoring that has no intended behavior change on its own, but it unblocks other precedence work that rustc's precedence design was poorly suited to accommodate. - Asymmetrical precedence, so that a pretty-printer can tell `(return 1) + 1` needs parens but `1 + return 1` does not. - Squashing the `Closure` and `Jump` cases into a single precedence level. - Numerous remaining false positives and false negatives in rustc pretty-printer's parenthesization of macro metavariables, for example in `$e < rhs` where $e is `lhs as Thing<T>`. FYI `@fmease` — you don't need to review if rustbot picks someone else, but you mentioned being interested in the followup PRs.
Context: see #133140.
This PR continues on backporting Syn's expression precedence design into rustc. Rustc's design used mysterious integer quantities represented variously as
i8
orusize
(e.g.PREC_CLOSURE = -40i8
), a special significance around0
that is never named, and an extraPREC_FORCE_PAREN
precedence level that does not correspond to any expression. Syn's design uses a C-like enum with variants that clearly correspond to specific sets of expression kinds.This PR is a refactoring that has no intended behavior change on its own, but it unblocks other precedence work that rustc's precedence design was poorly suited to accommodate.
Asymmetrical precedence, so that a pretty-printer can tell
(return 1) + 1
needs parens but1 + return 1
does not.Squashing the
Closure
andJump
cases into a single precedence level.Numerous remaining false positives and false negatives in rustc pretty-printer's parenthesization of macro metavariables, for example in
$e < rhs
where $e islhs as Thing<T>
.FYI @fmease — you don't need to review if rustbot picks someone else, but you mentioned being interested in the followup PRs.