Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-102856: Allow comments inside multi-line f-string expresions #104006

Merged
merged 3 commits into from
May 22, 2023

Conversation

cmaureir
Copy link
Contributor

@cmaureir cmaureir commented Apr 30, 2023

Enabling comments only on multi-line f-strings, but starting from the second line of it.

@cmaureir
Copy link
Contributor Author

The initial condition was based on tok->multi_line_start but then revisiting PEP701 I noticed that the example:

>>> f'''A complex trick: {
... bag['bag']  # recursive bags!
... }'''

contained the opening curly bracket on the same level, so it was failing, but all the other cases were not.

My goal was to skip that error condition, but I couldn't make a difference from the following cases:

f"something {#

and

f"""something {# some lines ahead
...

so I decided not to accept comments on the first line.

Ideas are welcome :)

@pablogsal
Copy link
Member

@lysnikolaou @isidentical @mgmacias95 thoughts?

@lysnikolaou
Copy link
Member

Well, I'm not sure I like the idea of disallowing comments in the first line. One of the main points of 701 was that explaining f-string features would be much easier and this goes in the opposite direction. "You can add comments in f-string expressions" is much easier to remember than "You can add comments in f-string expressions, but not the first line". And this already comes on top of only allowing them in multi-line f-strings.

This is not a very strong opinion and I'd obviously be okay to go with it, if working around it is very difficult and there's consensus that this is okay.

@cmaureir
Copy link
Contributor Author

cmaureir commented May 3, 2023

Another option we could have, is start iterating on the rest of the line characters, and if I find a '}' I know we are on the same line and it's not valid, and if we find a '\n' we know it's a multi-line expression.

Meaning:

>>> f"nope {#}"
  File "<stdin>", line 1
    f"nope {#}"
             ^
SyntaxError: f-string expression part cannot include '#'
>>> f""" yep { # something
... 1 + 2 # math!
... }"""
' yep 3'
>>> 

Can update the PR, but the code will look like this:

          if (INSIDE_FSTRING(tok) && (tok->lineno == 1)) {                                               
             while (c != '\n') {                                                                        
                 c = tok_nextc(tok);                                                                    
                 if (c == '}')                                                                          
                     return MAKE_TOKEN(syntaxerror(tok, "f-string expression part cannot include '#'"));
             }                                                                                          
          }

Do you think something like this would make more sense @lysnikolaou ?

If people gets creative I we will have problems as well with cases like:

>>> f"meh {# I really like the } character
  File "<stdin>", line 1
    f"meh {# I really like the } character
                               ^
SyntaxError: f-string expression part cannot include '#'

edit: if tok>lineno == 1 doesn't look too good, we could also condition it wrt the quote size, so we could have current_tok->f_string_quote_size == 1 instead. After all, we could detect if we are on a possible multi-line string.
This might open another strange-looking code:

>>> f"""{#}"""
... 42}"""
'42'

but maybe it's better?

@lysnikolaou
Copy link
Member

edit: if tok>lineno == 1 doesn't look too good, we could also condition it wrt the quote size, so we could have current_tok->f_string_quote_size == 1 instead. After all, we could detect if we are on a possible multi-line string.

This is the way to go in my opinion. The snippet you showed, while a bit tricky to parse for a human at first, should be okay, since this is also the case outside of f-strings:

a_set = (#{
    {
        1, 2, 3, 4#}
    }
)

@pablogsal
Copy link
Member

This is the way to go in my opinion.

I agree with this 👍

I think that the simpler the rules the easier to explain, specially since most syntax highlighters will do the right thing and show the comment in a darker shade

@cmaureir
Copy link
Contributor Author

cmaureir commented May 4, 2023

Thanks for your feedback 🎉
Let me know if something else then is required for this before it's merged.

if (INSIDE_FSTRING(tok) && (current_tok->f_string_quote_size == 1)) {
while (c != '\n') {
c = tok_nextc(tok);
if (c == '}')
Copy link
Member

@pablogsal pablogsal May 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may raise "false positives". What if for instance the text contains sone dictionary literal inside the expression or escaped }} or anything similar?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to play around with more cases, but I couldn't find a false-positive:

>>> f"# something"
'# something'
>>> f"# {1}"
'# 1'
>>> f"{{}} # {{}}"
'{} # {}'
>>> f"# {{f'# {{}}'}}"
"# {f'# {}'}"
>>> f"""{f"{f'''{f'# {1} #'}'''}"}"""
'# 1 #'
>>> d = {"a": 1}
>>> f"{''.join(f" # {k}: {v}" for k, v in d.items())}"
' # a: 1'

Maybe you have something else in mind? 🤔
Perhaps you were thinking of using INSIDE_FSTRING_EXPR(current_tok) as well?

Copy link
Member

@pablogsal pablogsal May 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example (check the location of the error caret):

>>>f"{sadasdfsd # {1,2,3,4}   }"
  File "/Users/pgalindo3/github/python/main/lel.py", line 1
    f"{sadasdfsd # {1,2,3,4}   }"
                           ^
SyntaxError: single line f-string expression part cannot include '#'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another example:

>>> f"{sadasdfsd # {1,2,3,4}

    }"
  File "/Users/pgalindo3/github/python/main/lel.py", line 1
    f"{sadasdfsd # {1,2,3,4}
                           ^
SyntaxError: single line f-string expression part cannot include '#'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another example:

>>> f"{1 # look some weird escaped }}
+
1
    }"
  File "/Users/pgalindo3/github/python/main/lel.py", line 1
    f"{1 # look some weird escaped }}
                                   ^
SyntaxError: single line f-string expression part cannot include '#'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we may just want to drop this check because IIRC the error you will get is that there is a { that is never closed and it should point correctly to the one that's not closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, also then we would need to define if we close the expression the moment we find a '}' or we agreed on something else, because something like:

f"The result is: {2 # a nice number} and {17 # a prime number}"

should work by that condition, but things like:

f"something : {2 # I like the '}' parenthesis}"
#                                            ^ this is the one that closes the expr

will not. And the only solution I think could be possible, would be to either allow the '}}' escape pattern, so it could be valid with:

f"something : {2 # I like the '}}' parenthesis}"

or we simply don't allow having { or } in the comments while being inside a f-string expression.

Not sure if we have another workaround.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah I think the approach here is that a comment is a comment and we should not try to be clever about this. Let allow any comment inside and let's try to focus on the fact that the error message makes sort of sense (I expect that most of them would be unclosed brackets).

if (INSIDE_FSTRING(tok) && (current_tok->f_string_quote_size == 1)) {
while (c != '\n') {
c = tok_nextc(tok);
if (c == '}')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Add braces per PEP 7

@lysnikolaou
Copy link
Member

Hey @cmaureir, any progress on this? Will you get some time to address the comments so that we can merge this?

@cmaureir
Copy link
Contributor Author

Hey @cmaureir, any progress on this? Will you get some time to address the comments so that we can merge this?

Hey, I got lost in the corner-cases, and I got distracted afterwards.
I tried to think of a mechanism that can enable things like this:
f"something {1 # one} and {2 # two}"
but some solutions are leaving out the cases that Pablo was highlighting, like:

>>> f"{1 # look some weird escaped }}
+
1
    }"

So I'd need some input if
(1) the moment I found a comment, then we don't care what's there and we know it's a comment until '\n'
e.g.:

f"something {1 # this is a comment } { no matter what is around on the rest of the line...
... }"

or
(2) if we start a comment, we iterate the characters until we find the first }

f"something {1 # this is a comment } <- the expr and comment ended there, the rest is just a str"

With (1), we simple remove the error from the tokenizer, and we would expect behavior like this (from Pablo's comments):

>>> a = 1
>>> f"{a # {1,2,3,4}   }"
... }"
'1'

>>> f"{a # {1,2,3,4}
... }"
'1'
>>> f"{1 # look some weird escaped }}
... +
... 1
... }"
'2'

What do you think @lysnikolaou would what be enough?

@lysnikolaou
Copy link
Member

Yeah, let's go with (1).

@cmaureir cmaureir force-pushed the fstring_multiline_comments branch from f5e9e87 to 59d5c34 Compare May 19, 2023 18:49
@cmaureir
Copy link
Contributor Author

For that behavior, then it should be simpler, and only removing the error condition will be enough. If more test cases are required, please let me know.

Copy link
Member

@lysnikolaou lysnikolaou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@lysnikolaou
Copy link
Member

Btw, let's add some more tests, at least covering comments in single-quoted f-string as well, in a different PR.

@lysnikolaou lysnikolaou merged commit 0a77960 into python:main May 22, 2023
@cmaureir
Copy link
Contributor Author

Btw, let's add some more tests, at least covering comments in single-quoted f-string as well, in a different PR.

Understood. Thanks for your time on this :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants