Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Snippet grammar fixes and minor formal improvements throughout the LSP spec #1886

Merged
merged 4 commits into from
Jan 30, 2024

Conversation

falko17
Copy link
Contributor

@falko17 falko17 commented Jan 17, 2024

While reading through the specification, I've noticed a few style issues as well as some shortcomings in the EBNF grammar for snippets. This PR should fix them. See the list below for details on what was changed.

Detailed changes

  • Most of the changes are various grammar, spelling, punctuation, formatting, or consistency improvements1.
  • Some incorrect TypeScript syntax in the examples has been fixed.
  • The EBNF grammar for snippets has been improved in the following ways:
    • Incorrect grouping of alternatives in some parts has been fixed.
    • Ambiguities present in the grammar have been fixed, the grammar should now be LL(1)-parseable.
    • The grammar has been made consistent with the description above it. Specifically, characters which have to be escaped per the textual description now also need to be escaped per the grammar rules2.
    • A description for the ':+', ':?', and ':-' modifiers has been added, as they seem to not be explained anywhere else in the spec.
      • I have used the VSCode source code as a basis for these.
    • The link has been updated to refer to the right EBNF specification, as the grammar uses the XML variant rather than the one described in the originally linked Wikipedia article.

Note

These changes have only been applied to the upcoming 3.18 LSP specification. If they should be applied to the current version 3.17, I could port the changes over as well.

Footnotes

  1. Not all the changes necessarily fix errors, some just reword certain passages to sound more natural, which may be somewhat subjective.

  2. Technically, the text clarifying escaping rules could be removed, as the grammar already describes them now. Keeping the text for clarity's sake might still make sense, though.

This mostly fixes some grammar, spelling, and punctuation issues
throughout version 3.18 of the LSP specification. Additionally,
some incorrect TypeScript syntax has been fixed.

Meta model files have not been changed.
To be precise, "fix" refers to the following changes in this commit:
* Incorrect grouping of alternatives has been remedied
* Ambiguities present in the grammar have been fixed,
  the grammar should now be LL(1)-parseable.
* Grammar has been made consistent with the description above it.
  Specifically, characters which have to be escaped per the textual
  description now also need to be escaped per the grammar rules.
* A description for the ':+', ':?', and ':-' modifiers has been
  added, as they seem to not be explained anywhere else in the spec.
  * I have used the VSCode source code as a basis for these.
* The link has been updated to refer to the right EBNF specification,
  as the grammar uses the XML variant rather than the one described
  in the originally linked Wikipedia article.
@falko17
Copy link
Contributor Author

falko17 commented Jan 17, 2024

@microsoft-github-policy-service agree

if ::= text
else ::= text
text ::= ([^$}\] | '\$' | '\}' | '\\')*
choicetext ::= ([^,|\] | '\,' | '\|' | '\\')*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO we need to make choicetext as follows:

([^$},|\] | '\$' | '\}' |'\,' | '\|' | '\\')*

Any reason why you didn't

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason why $ and } don't need to be escaped within a choice construct is that it cannot occur in the choice in as part of any other construct, so it is uniquely parseable. Only a ,—indicating the next choicetext—and a |}—indicating the end of the choice construct—can occur here as tokens with semantics attached. Anything else (including $ and }) can be ignored by the parser, and thus doesn't need to be escaped.

For example, ${1|close},$variable|} represents a choice construct with choices close} and $variable. These two choices don't have further meaning and are interpreted purely as strings. Contrast this with e.g. ${1|one,\,,two}, which is a choice construct with choices one, ,, and two. If the comma here were not escaped, it would be unclear whether the choice consists of two empty elements or a comma, hence the escape is necessary.

(Formally, $FIRST(\texttt{choicetext}) \cap FOLLOW(\texttt{choicetext}) = \emptyset$ must be true here1. As $FOLLOW(\texttt{choicetext}) = \{'\texttt{|}',\ '\texttt{,}'\}$, it is only these two characters that need to be escaped, so that they are not part of $FIRST(\texttt{choicetext})$.)

Footnotes

  1. FIRST/FIRST conflicts can occur too, but are only relevant insofar that the backslash needs to be escaped.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@falko17 thanks for the great explanation!

@dbaeumer
Copy link
Member

@falko17 thanks you so much for the great PR. Highly appreciated. I have several comments about indentation. If you are too busy to fix them I am happy to do so. I have one additional comment about the changes you did to the snippet syntax I would like to get your comment on.

Copy link
Member

@dbaeumer dbaeumer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See all my inline comments and my final one.

@falko17
Copy link
Contributor Author

falko17 commented Jan 29, 2024

Thank you for the detailed review, @dbaeumer! I've addressed your comments, please tell me if there is anything else that should be done. Also, sorry about all the indentation problems, my editor displayed tabs as 2 spaces, which is why I didn't catch these. Should be all fixed now.

@dbaeumer dbaeumer added this to the February 2024 milestone Jan 30, 2024
@dbaeumer dbaeumer merged commit a678705 into microsoft:gh-pages Jan 30, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants