PML 2.0 and Attributes Lenient Parsing #56
Replies: 6 comments 12 replies
-
Lenient parsing was more lenient (and more difficult to parse) prior to version 2. For example, instead of:
... you could instead write this in the previous version 1.5.0:
... and the parser would use some over-complex regex to figure out that "Final Thoughts" is the value for attribute
Yes, I fully agree.
That would certainly be the ideal solution. Even more in future PML versions with extensions that power-users will love, but that will be even more difficult to support in editor plugins. For example: user-defined nodes, and embedded source code that generates PML markup.
Very hard and time consuming, I guess. At least for me, because I have no practical experience with language server implementations. However, as the new parser is no longer written in PPL, but written entirely in Java (including lenient parsing, and all text processing rules and nodes such as
In my opinion, editor plugins should not try to implement error-tolerant parsing, because it's just too hard (and in some contexts even impossible) to make it work correctly in all cases. They should just stop parsing at the first error encountered.
The new PML parser distinguishes between 'cancelling' and 'non-cancelling' errors. In case of a 'cancelling error' (e.g. the final quote of a quoted attribute value is missing), parsing is simply cancelled, because IMO it's impossible to reliably guess how to continue.
Take IntelliJ IDEA, for example. Support for Java is really awesome in this IDE. However, when it comes to fault-tolerant parsing, it often fails miserably, and displays a whole avalanche of false positives that are just disturbing. Sometimes, even correct code that precedes the error is displayed as illegal. I would much prefer to have the first error displayed in red, and all subsequent code just displayed in grey.
In a nutshell: for editor plugins, I suggest keeping it simple and:
As you pointed out already, ideal editor support can only be achieved with a dedicated language server that uses the pXML parser.
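As a rough illustration of the 'stop at the first cancelling error' strategy described above, here is a minimal Python sketch. It is not the PML parser: the `name=value` toy syntax, the `ParseResult` type and the `parse_attributes` function are hypothetical, and treating a duplicate attribute as a non-cancelling error is only an assumption made for the example. The missing closing quote is the cancelling case mentioned in the comment.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ParseResult:
    attributes: dict = field(default_factory=dict)
    errors: list = field(default_factory=list)  # (offset, message) tuples
    cancelled_at: Optional[int] = None  # offset where parsing gave up, if any


def parse_attributes(text: str) -> ParseResult:
    """Parse whitespace-separated name=value pairs (toy syntax, not real PML)."""
    result = ParseResult()
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        start = pos
        eq = text.find("=", pos)
        if eq == -1:
            # Cancelling error: we cannot tell where this attribute ends.
            result.errors.append((start, "expected '='"))
            result.cancelled_at = start
            return result
        name = text[pos:eq].strip()
        pos = eq + 1
        if pos < len(text) and text[pos] == '"':
            close = text.find('"', pos + 1)
            if close == -1:
                # Cancelling error: no reliable way to guess how to continue.
                result.errors.append((pos, "missing closing quote"))
                result.cancelled_at = start
                return result
            value = text[pos + 1:close]
            pos = close + 1
        else:
            end = pos
            while end < len(text) and not text[end].isspace():
                end += 1
            value = text[pos:end]
            pos = end
        if name in result.attributes:
            # Non-cancelling error (assumed for this sketch): report it, keep parsing.
            result.errors.append((start, f"duplicate attribute '{name}'"))
        else:
            result.attributes[name] = value
    return result


broken = parse_attributes('id=x title="Final Thoughts')
print(broken.attributes)    # {'id': 'x'}
print(broken.cancelled_at)  # 5
```

An editor plugin following this approach could highlight the first reported error in red and render everything from `cancelled_at` onwards in grey, instead of trying to recover.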
-
I Think I've Nailed It!
@pml-lang, after various tests with Sublime PML I think I've now found a method to implement the new attributes system in a way that supports both lenient parsing and smart completions. I'm not 100% sure, but the tests so far seem to be working and promising (the new approach can be viewed in the
And the good news is that I didn't have to exploit the new ST4 syntax features either (i.e. branching or multi-pop), which means that it should be possible to replicate this in VSCode too, using the old TextMate syntax format (i.e. if it supports meta scopes the way ST does).
The challenge was how to pinpoint the specific scopes following an opening tag where attributes could occur, i.e. determining when the valid zone for node attributes ends and node contents begin. This is a twofold requirement:
Lenient parsing makes this harder because in some cases attributes might not be enclosed within parentheses. But I've found a way to switch context from lenient to enclosed attributes, which means that we'll be able to keep smart completions (which was my main worry). The whole process adds a bit of overhead, but not as much as I had foreseen, for I've come up with a new approach that allows sharing attribute definitions among nodes without losing tag-specific contexts. There are still some unexplored/undocumented questions left, which I'll have to work out by further trial and error:
The answers to the above questions need to be taken into account in PML editor syntaxes, especially with the new approach I've come up with, since these RegEx based syntax definitions are all about handling expected tokens (valid or invalid) to determine when contexts start and end.
The new JSON Tags file really helped me find this solution, because it gave me a better picture of the different tag groups, which is why I've been updating the mustache templates at the PML Playground these days.
Documenting the New Method
Since the new approach is fairly intricate, I'm thinking of writing it out in a document first, so I have a reference doc to work with, which I can then use for the Rouge syntax, Sublime PML and the VSCode syntax. Documenting it would provide a better action plan to stick with as I go along. That's something I have been planning anyway, since I believe that it would be very useful to have a guide for syntax developers (i.e. syntax highlighters or editor syntaxes). The document will cover practical parsing details which are not to be found in the official PML docs, i.e. dealing with edge cases, context switching, etc., all of which are important to developers working on PML syntax support for third party tools.
So my next step will be to start drafting this document, focusing on providing a list of all the nodes that require support for the different types of lenient parsing (there are not many, but one needs to know which they are and which kind of parsing leniency they support). Once I have a clear reference to work with, the rest of the work on Sublime PML will be just writing out the rules, one node at a time, until the whole syntax is covered.
[ EDIT ] The documentation can now be found at:
VSCode Reference Links
-
Great! That's very very good news!
Whether in lenient parsing mode or not, an invalid attribute always raises an error. For example, this code:
... generates the following error:
That would raise an error too, because either all attributes must be enclosed in parentheses or none of them. For example, this code:
... generates:
Yes, absolutely. And, besides being useful in the context of other plugins for PML, in the future it could also be useful for other projects that use the PDML syntax.
-
Yes.
Yes.
Rather than adding an
Maybe that would complicate the PML plugin, because you then would have to consider two JSON files (if you want to include support for PDML extension nodes in PML). Alternatively, you could simply scope all extension nodes (i.e. nodes with
PDML extension nodes cannot directly be used to override native PML nodes. Overriding native PML nodes could later be achieved with "user-defined nodes" (UDNs) in PML (which have been added in version 2.2). UDNs can currently only be used to add new nodes, not to override existing native PML nodes. The ability to override native nodes (by defining a UDN with the same name as a native node, and adding a field like
However, because UDNs are defined in PDML files, PDML extension nodes can be used to define UDNs. For example, you could use PDML extension node
-
PML 3 & Lenient Parsing Attributes
@pml-lang, I noticed that in the PML Changelog for v3.0.0 it mentions:
Does this mean that the lenient parsing rule that allowed omitting the key for default attributes no longer applies to PML in general? I've noticed that the User Manual doesn't mention it any longer, but since the Changelog doesn't specifically say that the rule has been dropped, I wanted to be sure that's the case. Also, since now the Furthermore, regarding the new
-
The current PMLC version 3.0.0 does not support positional attributes/parameters. I'm not sure yet if it's a good idea to add positional attributes in a future version. Yes, positional attributes save keystrokes, but they can also make code less readable and more error-prone. Moreover, they make parsing PML/PDML documents more challenging, and (as you mentioned) it is not easy (or even impossible) to support them in editor plugins and other PML tools. Hence, I do not plan to add positional PML/PDML attributes in the near future. Maybe we should even remove field The reason to have positional parameters in Moreover,
Yes, it's like that. However, if positional and named arguments are both supported, then we also need clear usage rules, because ideally, it should be possible to mix positional/named arguments, and to (optionally) use names for positional arguments (to increase readability). Another reason to think twice before adding positional arguments to PML/PDML.
Yes, that would be the best solution, as agreed already.
Yes, absolutely. PML and the PML lang server must evolve together seamlessly, and without code duplication.
-
@pml-lang, I was looking into the syntax changes of PML v2.0, and how the attributes notation has changed.
Although the new syntax is way cooler on the end user's side, all the new optional features of the notation, and especially the new Lenient Parsing conventions, are going to make the creation of PML syntaxes for editors much harder.
Without a proper parser and context awareness, it's going to be very hard, if not impossible, to handle all these notation variants. With some luck (and a lot of hard work) it should be possible to cover most cases in Sublime Text 4, thanks to the new syntax branching features which can roll back a parsing context before enforcing it, but I think that there's simply no way that we could implement a PML 2.0 syntax in VSCode (or any TextMate based grammar) at this point.
I think you really need to look into creating an official PML Language Server right now, so that the PML syntax can be achieved at least by editors that support PML. Ideally, the PML language server should be part of the PML package itself, so that whoever has installed the PML converter will also have the matching language server on the machine, which would spare end users from having to install and update them separately.
I'll try to update the Sublime PML syntax to PML 2.0, which is going to take quite some time due to having to update all the syntax tests and completions, along with updating the syntax itself. But I'm still not 100% sure that I'll ultimately be able to fully cover the new syntax.
For example, new leniency rules like:
are really hard to cover with any RegEx based syntax definition. So far, the ST syntax defined attribute types, which were handled differently and captured thanks to the attribute tag. But now that some of these tags have become optional, it's going to be very hard to handle them via guesswork: in syntax definitions we have no variables, and branching conditions are painfully emulated via context switches, which are dumb in terms of context awareness.
Add to that the fact that editor syntaxes need to account for malformed markup too (and catch it as an invalid case), and things get even more complicated.
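To make the limitation concrete, here is a minimal Python sketch of how a RegEx based syntax definition works in principle: a stack of contexts, each holding an ordered list of rules, with no variables and no lookback. Everything in it is an assumption made for illustration: the bracket syntax is a simplified toy and not the real PML grammar, and the context names, scope names and `tokenize` function are hypothetical rather than taken from Sublime PML.

```python
import re

# Each context is an ordered list of (regex, scope, action) rules.
# action is None, "pop", ("push", context) or ("set", context).
CONTEXTS = {
    "main": [
        (re.compile(r"\[(\w+)"), "tag.name", ("push", "after_tag")),
        (re.compile(r"\]"), "tag.end", None),
        (re.compile(r"[^\[\]]+"), "text", None),
    ],
    "after_tag": [
        (re.compile(r"\s+"), None, None),
        (re.compile(r"\("), "attributes.begin", ("set", "attributes")),
        # Empty match: anything else means the attributes zone is over.
        (re.compile(r""), None, "pop"),
    ],
    "attributes": [
        (re.compile(r"\s+"), None, None),
        (re.compile(r"\w+(?==)"), "attribute.name", None),
        (re.compile(r"="), "operator", None),
        (re.compile(r'"[^"]*"|[^\s()"]+'), "attribute.value", None),
        (re.compile(r"\)"), "attributes.end", "pop"),
    ],
}


def tokenize(text):
    """Return (scope, lexeme) pairs using only the current context's rules."""
    stack = ["main"]
    pos = 0
    tokens = []
    while pos < len(text):
        for pattern, scope, action in CONTEXTS[stack[-1]]:
            m = pattern.match(text, pos)
            if m is None:
                continue
            if scope is not None:
                tokens.append((scope, m.group(0)))
            pos = m.end()
            if action == "pop":
                stack.pop()
            elif action is not None:
                kind, target = action
                if kind == "push":
                    stack.append(target)
                else:  # "set": replace the current context
                    stack[-1] = target
            break
        else:
            # No rule matched: flag one character as invalid and move on.
            tokens.append(("invalid", text[pos]))
            pos += 1
    return tokens


print(tokenize("[ch (id=intro) Hello]"))
print(tokenize("[ch id=intro Hello]"))
```

With the parentheses present, `id` and `intro` are scoped as an attribute name and value. Without them, the `after_tag` context sees no `(` and pops, so `id=intro Hello` is scoped as plain text: the definition has no way to know that an unenclosed attribute was intended unless extra contexts are written for every node that allows it.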
Hence I think that resorting to LSP is the only viable option at this point.
Did you manage to look into a PML Lang Server?
How hard would it be to integrate the Lang Server into the PML project and package, so that it can automatically mirror the latest PML version and automatically ship with every package?