Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PDML vs PML Docs Bug: Different ID Patterns #70

Closed
tajmone opened this issue Dec 27, 2021 · 6 comments
Closed

PDML vs PML Docs Bug: Different ID Patterns #70

tajmone opened this issue Dec 27, 2021 · 6 comments
Labels
bug Something isn't working

Comments

@tajmone
Copy link
Contributor

tajmone commented Dec 27, 2021

BUG: ID pattern mentioned in PDML and PML differ!

PDML:  [a-zA-Z_][a-zA-Z1-9_\.-]*
PML:   [a-zA-Z][a-zA-Z0-9_-]*

@pml-lang, in #56 you mentioned:

Yes. To cover all cases, the plugin should use the regex shown in the PDML Specification: [a-zA-Z_][a-zA-Z1-9_\.-]*

which prompted me to check the PML docs on how ID patterns are defined, and I immediately noticed that in the PML Manual and PML Reference whenever the custom ID pattern is mentioned it's: [a-zA-Z][a-zA-Z0-9_-]* and not [a-zA-Z_][a-zA-Z1-9_\.-]* — i.e.:

  1. the dot \. mentioned in the PDML Specification is missing).
  2. The first character can be an underscore _ according to PDML (but not PML).

The same when invoking PMLC for tags info:

> pmlc ti -t admon

    Admonition
----------
Tag            : admon
[...]
Attributes:
1.
Id             : id
Title          : Node Identifier
Type           : string or null
Description    : A unique identifier for the node.
                [...]

                 Note for programmers: The regex of an identifier is: [a-zA-Z]
                 [a-zA-Z0-9_]*. Identifiers are case-sensitive.

So which one is correct? the one with the . and _ or the one without them?

@pml-lang
Copy link
Owner

ID pattern mentioned in PDML and PML differ!

Well spotted! Thanks.

So which one is correct?

OMG! They are both wrong. Sorry for that.
The correct regex is [a-zA-Z_][a-zA-Z0-9_\.-]*
It's now fixed everywhere:

It's also fixed in PMLC, and the fix be included in the next public version 2.3.0.

@pml-lang pml-lang added the bug Something isn't working label Dec 28, 2021
@tajmone
Copy link
Contributor Author

tajmone commented Dec 28, 2021

Well spotted! Thanks.

As of lately, being buried in Sublime PML and all it's matching patterns, I immediately had a hunch that the PDML pattern you mentioned didn't quite ring correct, but of course I wanted to check it before opening an Issue.

OMG! They are both wrong. Sorry for that.
The correct regex is [a-zA-Z_][a-zA-Z0-9_\.-]*
It's now fixed everywhere:

How could this happen? Aren't such core definitions hard-coded somewhere like a constant which is then reused all over the place (in the source code and docs source alike)?

Adding the ID Pattern to JSON File

I was also wondering whether this pattern might belong to the info exposed via the generated pml_tags.json. Currently the JSON file is structured:

{
    "pml_tags":{
        "pml_version":"2.2.0",
        "pml_release_date":"2021-12-14",
        "tags":[
            {

which focuses mainly on the PML tags.

Wouldn't it be worth adding the ID pattern too, e.g.:

{
    "pml_tags":{
        "pml_version":"2.2.0",
        "pml_release_date":"2021-12-14",
        "custom_id_re":"[a-zA-Z_][a-zA-Z0-9_\.-]*", // '\' Needs escaping?
        "tags":[
            {

(or something along those lines) so that end users can extract the pattern either to use it in their syntaxes or documentation, or just to check the current pattern status? (after all, in the future the pattern might change).

Re-Organizing JSON File?

Currently the generated JSON file focuses on the PML tags and their attributes, although it does contain some basic PML version info.

Wouldn't it make more sense to rename the JSON file from pml_tags.json to pml_info.json and expand its contents to also include other info unrelated to tags? i.e. as opposed to having multiple JSON files that could be generated via PMLC options?

E.g.

{
    "pml_info":{
        "pml_meta":{
            "pml_version":"2.2.0",
            "pml_release_date":"2021-12-14",
            "pml_website":"https://www.pml-lang.dev",
            "pmlc_repository":"https://github.com/pml-lang/converter"
        },
        "nodes_meta":{
            "custom_id_re":"[a-zA-Z_][a-zA-Z0-9_\.-]*",
            "nodes_total":45, // <-- i.e. whatever the number of PML nodes is!
            // etc.
        },
        "tags":[
            {

The above entries are just imaginary examples to make the point (although the PML links would be good candidates for a meta section).

IMO, it's better to pack all the info in a single JSON file, rather than having PMLC produce multiple info files. Since the JSON file is usually intended for machine processing, having too much info shouldn't be a problem — as long as the info is well organized in self-contained sections which can be filtered out.

@pml-lang
Copy link
Owner

Aren't such core definitions hard-coded somewhere like a constant which is then reused all over the place (in the source code and docs source alike)?

The regex should of course be defined at one place only. The reason for having it (currently) defined multiple times is that there are different types of source code files that cannot easily share data: PPL files, Java files, PML files, and HTML files.

Wouldn't it be worth adding the ID pattern too

Yes, good idea.

Wouldn't it make more sense to rename the JSON file from pml_tags.json to pml_info.json and expand its contents to also include other info unrelated to tags?

Yes, excellent idea too.

IMO, it's better to pack all the info in a single JSON file, rather than having PMLC produce multiple info files.

I agree, because it's for machine consumption. And it eliminates the risk of accidentally mixing JSON files related to different PML versions.

I suggest to create a dedicated discussion to clearly define and agree on the improved structure. After everything is well defined and thought out, it will be easy to implement it.

@tajmone
Copy link
Contributor Author

tajmone commented Dec 28, 2021

The regex should of course be defined at one place only. The reason for having it (currently) defined multiple times is that there are different types of source code files that cannot easily share data: PPL files, Java files, PML files, and HTML files.

That's why we have tools like mustache! You can generate those source files via mustache, so regardless of the source type you can use the same JSON/YAML data to fill in some critical parts. Wasn't the scope of the JSON file to help maintain syntax highlighters definitions and the like?

Here is the same, except that it's PL files, Java files, PML files, and HTML files — none of which matters to mustache. Of course, you'll add an extra step to your toolchain, but it's worth the price — I always use this approach for projects that require automating the documentation, because trying to handle these details manually is too error prone.

I suggest to create a dedicated discussion to clearly define and agree on the improved structure. After everything is well defined and thought out, it will be easy to implement it.

Good idea!

@pml-lang
Copy link
Owner

you'll add an extra step to your toolchain

That's why I didn't do it yet. My plan for later is to create a customized, one-click build tool that takes care of everything.

@pml-lang
Copy link
Owner

As said already, the regexes have been fixed everywhere.

My plan for later is to create a customized, one-click build tool that takes care of everything.

Done since version 3.0.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants