need example rule for single-line comments #653

Open
mvolkmann opened this issue Feb 21, 2025 · 2 comments

Comments

@mvolkmann

Can someone point me to an example Nearley grammar that handles single-line comments, such as those that begin with # or // in many programming languages? I suspect the problem I'm hitting is that newline characters are discarded by default, so I can't write a rule that matches # followed by any characters up to the next newline. Perhaps there's an option I can pass to the Parser constructor that tells it not to discard newline characters, but I haven't found one yet.

@mvolkmann
Author

I found a solution. Rather than rely on nearley's default behavior of reading the input character by character, I added a Moo lexer to my grammar ".ne" file and configured it to create tokens from single-line comments. Here's a snippet of that code:

@{%
  const moo = require('moo');
  const lexer = moo.compile({
    colon:   ':',
    comma:   ',',
    comment: /(?:#|\/\/).*$/,
    equal:   '=',
    lbrace:  '{',
    lsquare: '[',
    name:    /[A-Za-z]\w*/,
    number:  /0|[1-9][0-9]*/,
    rbrace:  '}',
    rsquare: ']',
    string:  /"(?:\\["\\]|[^\n"\\])*"/,
    ws:      { match: /[ \n\t]+/, lineBreaks: true },
  });

  // Redefine the lexer's next function so that whitespace and comment
  // tokens are skipped and never reach the grammar.
  const originalNext = lexer.next;
  lexer.next = function () {
    while (true) {
      const token = originalNext.call(this);
      if (!token) return null; // end of tokens
      if (token.type !== 'ws' && token.type !== 'comment') {
        return token;
      }
    }
  };
%}
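
To see what the comment rule and the next override do together without installing anything, here is a minimal, dependency-free sketch. It is not Moo: tokenize, significantTokens, and the reduced rule table are hypothetical stand-ins, and the sketch assumes the rules are tried in order with sticky, multiline regular expressions, which is my understanding of how Moo applies them rather than something confirmed in this thread.

```javascript
// Hypothetical stand-in for a lexer; this is NOT Moo's implementation.
// Each rule is tried in order at the current offset (sticky flag "y").
// The "m" flag on the comment rule makes `$` match just before a newline.
const rules = [
  ['comment', /(?:#|\/\/).*$/my],
  ['name', /[A-Za-z]\w*/y],
  ['number', /0|[1-9][0-9]*/y],
  ['equal', /=/y],
  ['ws', /[ \n\t]+/y],
];

// Yield { type, value } tokens until the input is exhausted.
function* tokenize(input) {
  let offset = 0;
  while (offset < input.length) {
    let matched = false;
    for (const [type, regex] of rules) {
      regex.lastIndex = offset; // sticky regexes match only at lastIndex
      const m = regex.exec(input);
      if (m && m[0].length > 0) {
        yield { type, value: m[0] };
        offset += m[0].length;
        matched = true;
        break;
      }
    }
    if (!matched) {
      throw new Error(`Unexpected character at offset ${offset}`);
    }
  }
}

// Same filtering idea as the overridden next(): drop whitespace and comments.
function significantTokens(input) {
  return [...tokenize(input)].filter(
    (t) => t.type !== 'ws' && t.type !== 'comment'
  );
}

const source = 'x = 1 # set x\ny = 2 // set y';
console.log(significantTokens(source).map((t) => t.value).join(' '));
// prints "x = 1 y = 2"
```

Because `.` never matches a newline, each comment token stops at the end of its own line, and the grammar only ever sees the tokens that survive the filter.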

@rhys-vdw

rhys-vdw commented Mar 9, 2025

I was able to parse single-line comments by tokenizing the newline separately from other whitespace. The problem I hit was that the grammar couldn't detect EOF, so a comment on the last line never matched; to work around that, I always appended a single newline to the end of the input before passing it to nearley.
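
The EOF workaround can be sketched as a small normalization step applied before the input reaches the parser. ensureTrailingNewline is a hypothetical helper name, and the regular expression only stands in for a grammar rule that requires a comment to be closed by a newline token; the actual grammar isn't shown in this thread, so that shape is an assumption. Unlike the description above, this variant appends the newline only when one is missing.

```javascript
// Hypothetical helper: guarantee the input ends with a newline so that a
// rule shaped like "comment -> marker anything newline" can match the
// final line of the input.
function ensureTrailingNewline(input) {
  return input.endsWith('\n') ? input : input + '\n';
}

// Stand-in for such a rule: a comment must be closed by an explicit "\n".
const commentLine = /^(?:#|\/\/)[^\n]*\n/;

const lastLine = '# final comment with no newline';
console.log(commentLine.test(lastLine)); // false
console.log(commentLine.test(ensureTrailingNewline(lastLine))); // true

// With nearley, the normalized string is what would be fed to the parser,
// for example: parser.feed(ensureTrailingNewline(source));
```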
