`logical_line` for call with f-string can result in invalid ast under Python 3.12 #1948

stephenfin · 2024-07-31T16:57:16Z

how did you install flake8?

$ pip install flake8

unmodified output of `flake8 --bug-report`

{}

describe the problem

what I expected to happen

Some of the plugins in hacking use a combo of ast.parse and logical_line to generate an AST for an individual line that is then fed into an ast.NodeVisitor subclass. These work just fine on Python 3.11 and earlier, but I've noticed they fail under specific circumstances on Python 3.12. I've included a minimal reproducer below. There's a chance I am "holding it wrong" but I'd like to confirm this first.

sample code

test.py

import unittest


class TestFoo(unittest.TestCase):

    def test_foo(self):
        response_key = 'foo'
        response_val = 'bar'
        body_output = ''
        self.assertEqual(
            f'{{"{response_key}": "{response_val}"}}', body_output
        )

checks.py:

import ast


class NoneArgChecker(ast.NodeVisitor):
    def __init__(self, func_name, num_args=2):
        self.func_name = func_name
        self.num_args = num_args
        self.none_found = False

    def visit_Call(self, node):
        # snipped
        ...


def foo(logical_line, noqa):
    if noqa:
        return

    for func_name in (
        'assertEqual', 'assertIs', 'assertNotEqual', 'assertIsNot'
    ):
        try:
            start = logical_line.index('.%s(' % func_name) + 1
        except ValueError:
            continue
        checker = NoneArgChecker(func_name)
        # print(logical_line)
        checker.visit(ast.parse(logical_line))
        continue
        if checker.none_found:
            yield start, "H203: Use assertIs(Not)None to check for None"

tox.ini:

[flake8:local-plugins]
extension =
    X123 = checks:foo
paths = .

commands ran

$ flake8 test.py
wow.py:1:23: E999 SyntaxError: invalid syntax. Perhaps you forgot a comma?

If I uncomment the print statement, I see different results for Python 3.12 compared with Python 3.10 and 3.11. Under 3.12:

self.assertEqual(f'x{x{response_key}xxxx{response_val}xx}', body_output)

Under 3.10 and 3.11:

self.assertEqual(f'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx', body_output)

additional information

The E999 error is associated with line 1 of the file, despite the error actually coming from a plugin. This led me on a wild goose chase as I tried to figure out why flake8 thought my file was invalid Python but Python itself did not. I suspect this might also be user error and I should be handling the potential exception from ast.parse in my plugin but again, I just wanted to confirm that this was expected practice.
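For reference, guarding the `ast.parse` call in the plugin could look roughly like the sketch below. This is only an illustration: `none_arg_check` and `ASSERT_NAMES` are hypothetical names, and the None-detection logic is a stand-in for the snipped `visit_Call`, not the actual hacking check.

```python
import ast

# Hypothetical set of assertion names, mirroring the reproducer above.
ASSERT_NAMES = {"assertEqual", "assertIs", "assertNotEqual", "assertIsNot"}


def none_arg_check(logical_line, noqa=False):
    """Yield (offset, message) for assert calls that receive a literal None."""
    if noqa:
        return
    try:
        tree = ast.parse(logical_line)
    except SyntaxError:
        # The logical line is not valid Python on its own; skip it rather
        # than letting the exception escape from the plugin.
        return
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr in ASSERT_NAMES
            and any(
                isinstance(arg, ast.Constant) and arg.value is None
                for arg in node.args
            )
        ):
            yield node.col_offset, "H203: Use assertIs(Not)None to check for None"


# A valid line is checked; the mangled line reported under 3.12 is skipped.
print(list(none_arg_check("self.assertEqual(None, x)")))
print(list(none_arg_check(
    "self.assertEqual(f'x{x{response_key}xxxx{response_val}xx}', body_output)"
)))
```

Skipping the line hides the symptom in the plugin but does not change what flake8 passes as `logical_line`.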


stephenfin · 2024-07-31T17:14:14Z

I think this may be a bug introduced in 43266a2. Simplifying the above reproducer:

test.py

key = 'foo'
val = 'bar'
print(f'{{"{key}": "{val}"}}')

checks.py

def foo(logical_line):
    print(logical_line)

If we run this we get:

❯ flake8 test.py
key = 'xxx'
val = 'xxx'
print(f'x{x{key}xxxx{val}xx}')

I think we want to get:

❯ flake8 test.py
key = 'xxx'
val = 'xxx'
print(f'xxxxxxxxxxxxxxxxxxxx')

Right?

We parse 'logical_line's in a couple of extensions. There is currently a potential bug in flake8 [1] that means these lines are not valid Python. While we wait on a fix, simply skip these lines.

[1] PyCQA/flake8#1948

Change-Id: Ia0f2d729ee48f85afaa58ddb6e983e12d4f298a2
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>

* Update hacking from branch 'master' to 634cb78484a9c58f067c2df8cba7f49458c82043 - Ignore SyntaxError exceptions

asottile · 2024-07-31T21:54:34Z

nope! the method is meant to redact string contents and not code

stephenfin · 2024-07-31T21:58:53Z

Then we should have:

print(f'xxx{key}xxxx{val}xxx')

?

asottile · 2024-07-31T22:30:23Z

ah yes I see the quirk. ugh. the FSTRING_MIDDLE token is misleading for curly braces -- should be an easy fix (pycodestyle will likely need the same fix)

want to try a patch?

To use a curly brace in an f-string, you must escape it. For example:

    >>> k = 1
    >>> f'{{{k}'
    '{1'

Saving this as a script and running the 'tokenize' module highlights something odd around the counting of tokens:

    ❯ python -m tokenize wow.py
    0,0-0,0: ENCODING 'utf-8'
    1,0-1,1: NAME 'k'
    1,2-1,3: OP '='
    1,4-1,5: NUMBER '1'
    1,5-1,6: NEWLINE '\n'
    2,0-2,2: FSTRING_START "f'"
    2,2-2,3: FSTRING_MIDDLE '{'  # <-- here...
    2,4-2,5: OP '{'  # <-- and here
    2,5-2,6: NAME 'k'
    2,6-2,7: OP '}'
    2,7-2,8: FSTRING_END "'"
    2,8-2,9: NEWLINE '\n'
    3,0-3,0: ENDMARKER ''

The FSTRING_MIDDLE character we have is the escaped/post-parse single curly brace rather than the raw double curly brace. However, while our end index of this token accounts for the parsed form, the start index of the next token does not (put another way, it jumps from 3 -> 4). This triggers some existing, unrelated code that we need to bypass. Do just that.

Signed-off-by: Stephen Finucane <stephen@that.guru>
Closes: PyCQA#1948
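The jump from column 3 to column 4 described in the commit message can be detected mechanically. The sketch below is an illustration only: `suspicious_gaps` is a hypothetical helper, and the token positions are copied by hand from the `python -m tokenize` output quoted above rather than produced by running the tokenizer, so it behaves the same on any Python version.

```python
import tokenize


def suspicious_gaps(tokens):
    """Return (skipped_text, next_token_string) pairs for same-row gaps
    between consecutive tokens that skip over non-whitespace source."""
    found = []
    previous = None
    for tok in tokens:
        if previous is not None and previous.end[0] == tok.start[0]:
            skipped = tok.line[previous.end[1]:tok.start[1]]
            if skipped.strip():
                found.append((skipped, tok.string))
        previous = tok
    return found


# Positions copied from the tokenize output in the commit message.
# The token type is irrelevant to this check, so OP is used throughout.
LINE = "f'{{{k}'\n"
TOKENS = [
    tokenize.TokenInfo(tokenize.OP, "f'", (2, 0), (2, 2), LINE),  # FSTRING_START
    tokenize.TokenInfo(tokenize.OP, "{", (2, 2), (2, 3), LINE),   # FSTRING_MIDDLE
    tokenize.TokenInfo(tokenize.OP, "{", (2, 4), (2, 5), LINE),   # OP
    tokenize.TokenInfo(tokenize.OP, "k", (2, 5), (2, 6), LINE),   # NAME
    tokenize.TokenInfo(tokenize.OP, "}", (2, 6), (2, 7), LINE),   # OP
    tokenize.TokenInfo(tokenize.OP, "'", (2, 7), (2, 8), LINE),   # FSTRING_END
]

print(suspicious_gaps(TOKENS))  # [('{', '{')]
```

The single reported gap is the second brace of the escaped `{{`, which no token claims. The same helper accepts the `TokenInfo` objects yielded by `tokenize.generate_tokens`, should you want to try it on real source.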


stephenfin · 2024-08-01T11:18:50Z

want to try a patch?

Done. Apologies in advance for the fuzzy wording: I have no idea what the below code is intended to account for:

flake8/src/flake8/processor.py

Lines 207 to 218 in 2a811cc

    
    if previous_row:
        (start_row, start_column) = start
        if previous_row != start_row:
            row_index = previous_row - 1
            column_index = previous_column - 1
            previous_text = self.lines[row_index][column_index]
            if previous_text == "," or (
                previous_text not in "{[(" and text not in "}])"
            ):
                text = f" {text}"
        elif previous_column != start_column:
            text = line[previous_column:start_column] + text

(and that's after `git blame`-ing my way back as far as 23c9091...)
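To make the interaction concrete, here is a simplified model of the `elif previous_column != start_column` branch quoted above. It is a sketch, not flake8's code: `join_same_row` is a hypothetical function, tokens are plain `(text, start_column, end_column)` tuples on one row, and the "widened" variant is just one possible compensation assumed for illustration, not necessarily what the eventual patch does.

```python
def join_same_row(tokens, line):
    """Join token texts, prepending any source text skipped between tokens."""
    parts = []
    previous_column = None
    for text, start_column, end_column in tokens:
        if previous_column is not None and previous_column != start_column:
            # Same idea as the quoted branch: copy whatever sits between
            # the end of the previous token and the start of this one.
            text = line[previous_column:start_column] + text
        parts.append(text)
        previous_column = end_column
    return "".join(parts)


line = "f'{{{k}'"

# Positions as reported by tokenize, with the FSTRING_MIDDLE text redacted
# to a single "x" (the token covers one column although "{{" is two wide).
reported = [("f'", 0, 2), ("x", 2, 3), ("{", 4, 5),
            ("k", 5, 6), ("}", 6, 7), ("'", 7, 8)]
print(join_same_row(reported, line))  # f'x{{k}'

# Assumed compensation: widen the redacted token so it covers both braces.
widened = [("f'", 0, 2), ("xx", 2, 4), ("{", 4, 5),
           ("k", 5, 6), ("}", 6, 7), ("'", 7, 8)]
print(join_same_row(widened, line))  # f'xx{k}'
```

In the first case the unclaimed brace is copied in front of the `{` that opens the replacement field, giving `{{k}`, which is no longer a valid f-string. In the second the redaction has the same width as the raw source and the replacement field survives intact.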


sigmavirus24 · 2024-08-01T12:37:03Z

want to try a patch?

Done. Apologies in advance for the fuzzy wording: I have no idea what the below code is intended to account for:

flake8/src/flake8/processor.py

Lines 207 to 218 in 2a811cc

    if previous_row:
        (start_row, start_column) = start
        if previous_row != start_row:
            row_index = previous_row - 1
            column_index = previous_column - 1
            previous_text = self.lines[row_index][column_index]
            if previous_text == "," or (
                previous_text not in "{[(" and text not in "}])"
            ):
                text = f" {text}"
        elif previous_column != start_column:
            text = line[previous_column:start_column] + text

(and that's after git blameing my way back as far as 23c9091...)

Not looking at where you blamed back to, but a bunch of this was implemented to keep compat with pycodestyle. If you want the reasoning, it's likely in that project.


We disable the E203 (whitespace before ':') and E501 (line too long) linter rules since these conflict with ruff-format. We also rework a statement in 'keystoneauth1/tests/unit/test_session.py' since it's triggering a bug in flake8 [1] that is currently (at time of authoring) unresolved.

[1] PyCQA/flake8#1948

Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Change-Id: Ief5c1c57d1e72db9fc881063d4c7e1030e76da43

stephenfin changed the title ~~logical_line with f-string can result in invalid ast~~ logical_line for call with f-string can result in invalid ast under Python 3.12 Jul 31, 2024

stephenfin mentioned this issue Aug 1, 2024

Handle escaped braces in f-strings #1949

Merged

asottile closed this as completed in #1949 Aug 4, 2024

asottile added this to the 7.1.1 milestone Aug 4, 2024
