Skip to content

Commit

Permalink
Add support for nested replacements inside format specifications (#6616)
Browse files Browse the repository at this point in the history
Closes #6442

Python string formatting like `"hello {place}".format(place="world")`
supports format specifications for replaced content such as `"hello
{place:>10}".format(place="world")` which will align the text to the
right in a container filled up to ten characters.

Ruff parses formatted strings into `FormatPart`s each of which is either
a `Field` (content in `{...}`) or a `Literal` (the normal content).
Fields are parsed into name and format specifier sections (we'll ignore
conversion specifiers for now).

There are a myriad of specifiers that can be used in a `FormatSpec`.
Unfortunately for linters, the specifier values can be dynamically set.
For example, `"hello {place:{align}{width}}".format(place="world",
align=">", width=10)` and `"hello {place:{fmt}}".format(place="world",
fmt=">10")` will yield the same string as before but variables can be
used to determine the formatting. In this case, when parsing the format
specifier we can't know what _kind_ of specifier is being used as their
meaning is determined by both position and value.

Ruff does not support nested replacements and our current data model
does not support the concept. Here the data model is updated to support
this concept, although linting of specifications with replacements will
be inherently limited. We could split format specifications into two
types, one without any replacements that we can perform lints with and
one with replacements that we cannot inspect. However, it seems
excessive to drop all parsing of format specifiers due to the presence
of a replacement. Instead, I've opted to parse replacements eagerly and
ignore their possible effect on other format specifiers. This will allow
us to retain a simple interface for `FormatSpec` and most syntax checks.
We may need to add some handling to relax errors if a replacement was
seen previously.

It's worth noting that the nested replacement _can_ also include a
format specification although it may fail at runtime if you produce an
invalid outer format specification. For example, `"hello
{place:{fmt:<2}}".format(place="world", fmt=">10")` is valid so we need
to represent each nested replacement as a full `FormatPart`.

## Test plan

Adding unit tests for `FormatSpec` parsing and snapshots for PLE1300
  • Loading branch information
zanieb committed Aug 17, 2023
1 parent 1334232 commit d0f2a8e
Show file tree
Hide file tree
Showing 6 changed files with 266 additions and 39 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,10 @@

"{:s} {:y}".format("hello", "world") # [bad-format-character]

"{:*^30s}".format("centered")
"{:*^30s}".format("centered") # OK
"{:{s}}".format("hello", s="s") # OK (nested replacement value not checked)

"{:{s:y}}".format("hello", s="s") # [bad-format-character] (nested replacement format spec checked)

## f-strings

Expand Down
4 changes: 3 additions & 1 deletion crates/ruff/src/rules/pyflakes/format.rs
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,9 @@ pub(crate) fn error_to_string(err: &FormatParseError) -> String {
FormatParseError::InvalidCharacterAfterRightBracket => {
"Only '.' or '[' may follow ']' in format field specifier"
}
FormatParseError::InvalidFormatSpecifier => "Max string recursion exceeded",
FormatParseError::PlaceholderRecursionExceeded => {
"Max format placeholder recursion exceeded"
}
FormatParseError::MissingStartBracket => "Single '}' encountered in format string",
FormatParseError::MissingRightBracket => "Expected '}' before end of string",
FormatParseError::UnmatchedBracket => "Single '{' encountered in format string",
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ F521.py:3:1: F521 `.format` call has invalid format string: Expected '}' before
5 | "{:{:{}}}".format(1, 2, 3)
|

F521.py:5:1: F521 `.format` call has invalid format string: Max string recursion exceeded
F521.py:5:1: F521 `.format` call has invalid format string: Max format placeholder recursion exceeded
|
3 | "{foo[}".format(foo=1)
4 | # too much string recursion (placeholder-in-placeholder)
Expand Down
43 changes: 33 additions & 10 deletions crates/ruff/src/rules/pylint/rules/bad_string_format_character.rs
Original file line number Diff line number Diff line change
Expand Up @@ -49,16 +49,39 @@ pub(crate) fn call(checker: &mut Checker, string: &str, range: TextRange) {
continue;
};

if let Err(FormatSpecError::InvalidFormatType) = FormatSpec::parse(format_spec) {
checker.diagnostics.push(Diagnostic::new(
BadStringFormatCharacter {
// The format type character is always the last one.
// More info in the official spec:
// https://docs.python.org/3/library/string.html#format-specification-mini-language
format_char: format_spec.chars().last().unwrap(),
},
range,
));
match FormatSpec::parse(format_spec) {
Err(FormatSpecError::InvalidFormatType) => {
checker.diagnostics.push(Diagnostic::new(
BadStringFormatCharacter {
// The format type character is always the last one.
// More info in the official spec:
// https://docs.python.org/3/library/string.html#format-specification-mini-language
format_char: format_spec.chars().last().unwrap(),
},
range,
));
}
Err(_) => {}
Ok(format_spec) => {
for replacement in format_spec.replacements() {
let FormatPart::Field { format_spec, .. } = replacement else {
continue;
};
if let Err(FormatSpecError::InvalidFormatType) =
FormatSpec::parse(format_spec)
{
checker.diagnostics.push(Diagnostic::new(
BadStringFormatCharacter {
// The format type character is always the last one.
// More info in the official spec:
// https://docs.python.org/3/library/string.html#format-specification-mini-language
format_char: format_spec.chars().last().unwrap(),
},
range,
));
}
}
}
}
}
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -48,7 +48,17 @@ bad_string_format_character.py:15:1: PLE1300 Unsupported format character 'y'
15 | "{:s} {:y}".format("hello", "world") # [bad-format-character]
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ PLE1300
16 |
17 | "{:*^30s}".format("centered")
17 | "{:*^30s}".format("centered") # OK
|

bad_string_format_character.py:20:1: PLE1300 Unsupported format character 'y'
|
18 | "{:{s}}".format("hello", s="s") # OK (nested replacement value not checked)
19 |
20 | "{:{s:y}}".format("hello", s="s") # [bad-format-character] (nested replacement format spec checked)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ PLE1300
21 |
22 | ## f-strings
|


Loading

0 comments on commit d0f2a8e

Please sign in to comment.