dmitriid/pegjs

An implementation of PEG.js grammar for Erlang

This is a rather straightforward port/implementation of the grammar defined for PEG.js.

Current status

  • As far as I can tell, implements everything from the PEG.js grammar

  • Generates complete, usable parsers

  • The project is bootstrapped (see priv/pegjs_parse.pegjs). Original grammar for Neotoma is also available in priv/pegjs_parse.peg

  • It's based on an earlier definition of the grammar (probably this) than the one that currently exists for PEG.js.

    A current-ish version of the grammar has been ported to priv/parser.pegjs, but it causes the VM to quit with an out-of-memory exception on sufficiently large grammars (including its own). See the How to contribute/develop section for more info

  • Implements support for @append extension (see, e.g. core-pegjs in the for-GET project)

Further work

  • Dialyze, create dialyzer-friendly parsers

How to use

> pegjs:file("extra/csv_pegjs.peg").
ok
> c("extra/csv_pegjs.erl").
{ok, csv_pegjs}
> csv_pegjs:parse("a,b,c").
[{<<"head">>,
  [{<<"head">>,[[[[],[],<<"a">>]]]},
   {<<"tail">>,
    [[[<<",">>],[[[[],[],<<"b">>]]]],
     [[<<",">>],[[[[],[],<<"c">>]]]]]}]},
 {<<"tail">>,[]}]

There are several options you can pass along to pegjs:file(File, Options::options()):

-type options() :: [option()].

%% options for pegjs

-type option()  :: {output, Dir::string() | binary()} %% where to put the generated file
                                                      %% Default: directory of the input file
                 | {module, string() | binary()}      %% to change the module name
                                                      %% Default: name of the input file
                 | pegjs_analyze:option().

%% options for pegjs_analyze

-type option()  :: {ignore_unused, boolean()}        %% ignore unused rules. Default: true
                 | {ignore_duplicates, boolean()}    %% ignore duplicate rules. Default: false
                 | {ignore_unparsed, boolean()}      %% ignore incomplete parses. Default: false
                 | {ignore_missing_rules, boolean()} %% Default: false
                 | {ignore_invalid_code, boolean()}  %% Default: false
                 | {parser, atom()}                  %% use a different module to parse grammars. 
                                                     %% Default: pegjs_parse
                 | {root, Dir::string() | binary()}. %% root directory for @append instructions. 
                                                     %% Default: undefined
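
As a rough illustration of how these options combine, here is a minimal sketch of a call that sets several of them at once. The option names come from the type specs above; the "build/" directory and the "csv_strict" module name are made up for the example, and the comment on ignore_unused is an assumption based on its description.

```erlang
%% Sketch only: "build/" and "csv_strict" are hypothetical values.
Options = [{output, "build/"},       %% put the generated .erl file here
           {module, "csv_strict"},   %% name the generated module csv_strict
           {ignore_unused, false}],  %% presumably stops ignoring unused rules
pegjs:file("extra/csv_pegjs.peg", Options).
```

Options you leave out keep the defaults listed above, so a plain pegjs:file(File) call should behave like passing an empty list.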

How to contribute/develop

Suggestions and improvements are more than welcome!

The grammar in priv/pegjs_parse.peg is written for Neotoma, so you need Neotoma if you want to tweak pegjs through it.

pegjs_analyze module is inspired by neotoma_analyze from the 2.0-refactor branch of neotoma.

Non-generated parser combinators can be found in priv/pegjs.template.

A safe, working parser is always available at src/pegjs_parse.erl.safe.

pegjs grammar

The current grammar from which the project is now bootstrapped lives in priv/pegjs_parse.pegjs. When you've tweaked it and want to try your changes, generate a different module and tell pegjs to use your new module instead:

> pegjs:file("priv/pegjs_parse.pegjs", [{output, "src"}, {module, modified_parser}]).
ok
> c(modified_parser).
{ok, modified_parser}
> pegjs:file("extra/json.pegjs", [{parser, modified_parser}]).
ok
... etc. ...

Once you're satisfied with your changes, overwrite pegjs_parse (which is used by default):

> pegjs:file("priv/pegjs_parse.pegjs", [{output, "src"}]).
ok
> c(pegjs_parse).
{ok, pegjs_parse}
> pegjs:file("extra/json.pegjs").
ok
... etc. ...

Up-to-date grammar

A port of a current-ish version of the PEG.js grammar can be found in priv/parser.pegjs. src/pegjs.erl, src/pegjs.hrl and src/pegjs_analyze.erl have all been updated to work with this grammar and will generate a parser for you. Note, however, that priv/pegjs.template doesn't contain code for the action combinator.

To generate a parser from this grammar:

> pegjs:file("priv/parser.pegjs", [{output, "src/"}]).
ok
> c("src/parser.erl").
{ok, parser}
> pegjs:file("extra/csv.pegjs", [{parser, parser}]).
ok
... etc ...

However, the parser causes the VM to fail with an out-of-memory exception for sufficiently large grammars (including parser.pegjs). YMMV. The culprit is the escape/1,2 function (see initializer section). I haven't figured out what to do about this yet.

Original neotoma

The original parser for pegjs was derived from a grammar defined for Neotoma. You can also start your work from there:

> neotoma:file("priv/pegjs_parse.peg", [{output, "src/"}]).
ok
> pegjs:file(.... etc ... )

However, the original grammar will get increasingly outdated as time goes on, so it's there for reference only.
