-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
various f-string fixes #52
various f-string fixes #52
Conversation
d9b2c26
to
e1ead33
Compare
Out for the day, we have ~30 remaining test failures which I was able to at least group and I think none of them too hard to solve:
|
@@ -1495,7 +1495,8 @@ expr_ty _PyPegen_formatted_value(Parser *p, expr_ty expression, Token *debug, ex | |||
if (PyUnicode_GET_LENGTH(conversion->v.Name.id) > 1 || | |||
!(first == 's' || first == 'r' || first == 'a')) { | |||
RAISE_SYNTAX_ERROR_KNOWN_LOCATION(conversion, | |||
"f-string: invalid conversion character: expected 's', 'r', or 'a'"); | |||
"f-string: invalid conversion character %R: expected 's', 'r', or 'a'", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we are removing f-string:
from the other tests should we remove it from here as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess there's also the question, should we maybe find a way to keep f-string
in all the error messages that come from inside an f-string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My reasoning was, errors that can happen in a normal code should not have f-string: in front of them (e.g. 1.X()
vs f"{1.X()}"
) but errors that are specific to f-strings (like this one or the error we give when saying unmatched brace even though there was a previous brace f"{)}" vs "{)}") should include f-string:
prefix to denote that this rule only applies within the f-string context. But am open to other ideas as well (fully removing or automatically adding f-string:
at the top level error delegating part).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I prefer to remove all the f-string:
based on two things:
- We don't mention other constructs elsewhere (as in "error in import statement") or something like that.
- The parser doesn't really know that some syntax error is really inside an f-string. Sure, it can know what is happening inside one but it doesn't need to know how deep into it is.
- I don't really think it helps a lot the user once is clear where the error is when is printed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, let's go with no f-string:
then. While I agree with @isidentical's view and it makes sense, I think in this case consistency is a bit more important, so let's remove them from everywhere if you're all alright with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, let's make a quick poll. React to this comment with 👍 for removing the prefix and 👎 to keep it only on f-string related errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I support removing all f-string:
prefix, but insert f-string
somewhere in the context when it's ambiguous.
Now everything inside the {
}
is just as normal as everything out of the f-string
.
I think the f-string
exists previously is because the former parser will send the f"{123+}"
into another parser and output
(123+)
^
SyntaxError: f-string: invalid syntax
With the new parser, the user can now locate the error easily without telling f-string
, thus we can remove all prefix.
But for other errors, such as missing the conversion letter,
SyntaxError: f-string: invalid conversion character: expected 's', 'r', or 'a'
Removing all the f-string
suffix would cause some ambiguity. You cannot search for the correct answer without telling Google it's the f-string
.
https://www.google.com/search?q=python+conversion+characters
I think it would be better if we tell the user
SyntaxError: invalid f-string conversion character: expected 's', 'r', or 'a'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually a very good point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed, add some context saying it's a f-string in these cases can be very helpful. I also agree on adding the context in the error message and not as a f-string:
prefix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, this is what we are going to do: let's move this forward and let's make another PR to "normalize" the error messages.
Uh, seems like we are having a weird problem with the parser on the latest revision of
It might be related to extra peeks since if I add a bunch of characters after the first brace |
OK, I think I have a fix at #56. |
Landed the fix, good catch! |
If a syntax error is not specific to the situation of an f-string (like empty expression etc.) then it probably doesn't worth the extra complexity for us to introduce codepaths in tokenizer in order to add an "f-string" prefix in front of the errors.
Note: we can safely disregard whitespace after the conversion character since it doesn't relate to neither the conversion opener (`!`) nor the conversion character itself, it is part of the conversion group (which I think is OK to keep as is).
Co-authored-by: Marta Gómez Macías <mgmacias@google.com>
c5e83df
to
83b2adf
Compare
if (tok->tok_mode_stack_index > 0 && opening == '{') { | ||
assert(current_tok->bracket_stack >= 0); | ||
int previous_bracket = current_tok->bracket_stack - 1; | ||
if (previous_bracket == current_tok->bracket_mark[current_tok->bracket_mark_index]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: maybe we could make a macro to simplify current_tok->bracket_mark[current_tok->bracket_mark_index]
.
This has been landed |
Please review individual commits. Each commit tries to reduce the number of test failures, currently: