Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pycodestyle] Handle f-strings properly for invalid-escape-sequence (W605) #14748

Merged
merged 7 commits into from
Dec 4, 2024

Conversation

dylwil3
Copy link
Collaborator

@dylwil3 dylwil3 commented Dec 3, 2024

When fixing an invalid escape sequence in an f-string, each f-string element is analyzed for valid escape characters prior to creating the diagnostic and fix. This allows us to safely prefix with r to create a raw string if no valid escape characters were found anywhere in the f-string, and otherwise insert backslashes.

This fixes a bug in the original implementation: each "f-string part" was treated separately, so it was not possible to tell whether a valid escape character was or would be used elsewhere in the f-string.

Progress towards #11491 but format specifiers are not handled in this PR.

@dylwil3 dylwil3 added bug Something isn't working fixes Related to suggested fixes for violations labels Dec 3, 2024
Copy link
Contributor

github-actions bot commented Dec 3, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense and roughly matches what we do in the implicit concatenated string formatting.

Do we need to have any special handling for escapes in format specifiers or the debug text parts of an f-string? We don't have to fix this as part of this PR but format specifiers and debug texts were something that caused me a a lot of pain when working on f-string formatting

@dylwil3
Copy link
Collaborator Author

dylwil3 commented Dec 4, 2024

Do we need to have any special handling for escapes in format specifiers or the debug text parts of an f-string? We don't have to fix this as part of this PR but format specifiers and debug texts were something that caused me a a lot of pain when working on f-string formatting

As far as I can tell you can't really put an escape in either of those places: for format specifiers I think that's a syntax error and for debug text I think you could only do that inside a nested string (e.g. f"{'\InHere'=}"), which is handled fine (added test to fixture).

But maybe I missed something so if someone comes up with an example I can do a followup PR.

Thank you for the thoughtful review!

@MichaReiser
Copy link
Member

MichaReiser commented Dec 4, 2024

>>> a = 10
>>> f"{a:\xFF}"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Unknown format code '\xff' for object of type 'int'

I think you can have arbitrary escapes but the above is recognized as a hex string. What you linked to is only the list of standard format specifiers but an object can implement support for arbitrary format specifiers

@dylwil3
Copy link
Collaborator Author

dylwil3 commented Dec 4, 2024

What you linked to is only the list of standard format specifiers but an object can implement support for arbitrary format specifiers

Terrifying.

Yes, this doesn't seem quite right when I make the escape invalid:

cargo run -p ruff -- check tmp.py --no-cache --select W605 --fix --diff
--- tmp.py
+++ tmp.py
@@ -1,2 +1,2 @@
 a = 10
-f"{a:\Xff}"
+rf"{a:\Xff}"

Would fix 1 error.

I'll leave the linked issue open and do a followup PR once I understand better how these things work...

Thank you!

@dylwil3 dylwil3 merged commit c617b2a into astral-sh:main Dec 4, 2024
21 checks passed
@dylwil3 dylwil3 deleted the escape-f branch December 4, 2024 12:59
dcreager added a commit that referenced this pull request Dec 4, 2024
* main:
  [red-knot] Test: Hashable/Sized => A/B (#14769)
  [`flake8-type-checking`] Expands TC006 docs to better explain itself (#14749)
  [`pycodestyle`] Handle f-strings properly for `invalid-escape-sequence (W605)` (#14748)
  [red-knot] Add fuzzer to catch panics for invalid syntax (#14678)
  Check `AIR001` from builtin or providers `operators` module (#14631)
  [airflow]: extend removed names (AIR302) (#14734)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working fixes Related to suggested fixes for violations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants