Reusable tokens #1494

Open
mAAdhaTTah opened this issue Jul 22, 2018 · 6 comments
Comments

@mAAdhaTTah
Member

Suggestion by @LeaVerou (#1479 (comment)):

Btw we should probably have common tokens as re-usable variables or something (E.g. Prism.NUMBER). Currently every language needs to redefine numbers, and in most cases they're the same... We could also have factory helpers, e.g. Prism.keywords("foo", "bar", ...) that generate regexes for common things.
What do you think?
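
To make the suggestion concrete, here is a minimal sketch of what a shared constant and a keyword factory might look like. Both `NUMBER` and `keywords` are hypothetical names standing in for the proposed `Prism.NUMBER` and `Prism.keywords`; neither is part of Prism's API, and the number pattern is just one possible choice (decimal integers and floats with an optional exponent).

```javascript
// Hypothetical helper, not part of Prism's API: builds a whole-word
// alternation regex from a list of keyword strings.
function keywords(...words) {
  // Escape regex metacharacters so each keyword is matched literally.
  const escaped = words.map(w => w.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
  return new RegExp('\\b(?:' + escaped.join('|') + ')\\b');
}

// Hypothetical shared constant (an assumption about what "NUMBER" covers):
// decimal integers and floats with an optional exponent.
const NUMBER = /\b\d+(?:\.\d+)?(?:e[+-]?\d+)?\b/i;

// A grammar-like object reusing both pieces.
const grammar = {
  keyword: keywords('if', 'else', 'return'),
  number: NUMBER
};
```

A grammar that needs something different would still have to define its own pattern, so the benefit depends on how many languages can share the same definition.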

@RunDevelopment
Member

What are the use cases of this?

Let's take e.g. NUMBER:
What should it be like? Only decimal numbers? If so, are floating point numbers included (with or without exponents)? What about constants like NaN, Infinity or ?
And once you think about hexadecimal and binary numbers and optional underscores within numbers, the whole thing becomes a mess. (And I don't need something like \d+.)
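
As a rough illustration of how quickly the definitions diverge, compare a decimal-only pattern with one that also accepts hexadecimal, binary, and underscore-separated digits. Both patterns and the `firstMatch` helper are written for this example only and are not taken from any Prism grammar.

```javascript
// Illustrative patterns only; not taken from any Prism grammar.

// Decimal integers and floats with an optional exponent.
const decimalOnly = /\b\d+(?:\.\d+)?(?:e[+-]?\d+)?\b/i;

// Adds hexadecimal (0x...), binary (0b...), and underscores between digit groups.
const withHexBinaryUnderscore =
  /\b(?:0x[\da-f]+(?:_[\da-f]+)*|0b[01]+(?:_[01]+)*|\d+(?:_\d+)*(?:\.\d+(?:_\d+)*)?(?:e[+-]?\d+)?)\b/i;

// Small helper for the comparison: returns the first matched text or null.
function firstMatch(re, text) {
  const m = re.exec(text);
  return m ? m[0] : null;
}

for (const sample of ['42', '3.14e-2', '0xFF', '0b1010', '1_000']) {
  console.log(sample, firstMatch(decimalOnly, sample), firstMatch(withHexBinaryUnderscore, sample));
}
```

The simpler pattern finds nothing in `0xFF`, `0b1010`, or `1_000`, while the broader one would be wrong for any language that does not support those forms.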

I don't think that there are many places where we can get away with 'number': Prism.NUMBER or even 'number': [ number-style 1, 2, 3, 4, simple pattern for edge cases ].

Patterns like COMMENT or STRING look more promising but they also will fall short in many cases.

The cases where you can easily share patterns are between languages with very similar syntax and these are already taken care of by Prism.languages.extend.

So, where will they be used?

@mAAdhaTTah
Member Author

'number': Prism.NUMBER

I think places like these are the intention. If the same pattern isn't used in multiple places, then we don't gain much and it isn't worth it.

@LeaVerou
Member

Also the helper that generates keyword regexes could be general enough to be able to combine other regexes as alternatives. Then we could have a Prism.NUMBER for the actual numbers, and then we could do something like Prism.or(Prism.NUMBER, 'NaN', 'Infinity') for JS numbers for example. I think something like that has the potential to simplify many grammars, if done well.
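
A minimal sketch of such a combinator, under stated assumptions: `or` and `NUMBER` are hypothetical stand-ins for the proposed `Prism.or` and `Prism.NUMBER`, string arguments are treated as literal whole words, and flags on the input regexes are dropped (a real helper would need to decide how to merge them).

```javascript
// Hypothetical helper, not part of Prism's API: combines regexes and
// literal strings into a single alternation.
function or(...alternatives) {
  const parts = alternatives.map(alt => {
    if (alt instanceof RegExp) {
      // Reuse the pattern source as-is; flags are NOT carried over in this sketch.
      return alt.source;
    }
    // Treat strings as literal whole words, escaping regex metacharacters.
    const escaped = String(alt).replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
    return '\\b' + escaped + '\\b';
  });
  // Wrap each part in its own group so alternations inside a part stay contained.
  return new RegExp(parts.map(p => '(?:' + p + ')').join('|'));
}

// Hypothetical shared constant: decimal numbers with an optional exponent.
const NUMBER = /\b\d+(?:\.\d+)?(?:[eE][+-]?\d+)?\b/;

// The JS-flavoured number pattern from the comment above.
const jsNumber = or(NUMBER, 'NaN', 'Infinity');
```

With this `NUMBER`, `jsNumber` matches `1e5`, `NaN`, and `Infinity`, but not hexadecimal literals such as `0xFF`, so the usefulness of the combinator still depends on what the shared constant covers.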

@RunDevelopment
Member

Prism.or(Prism.NUMBER, 'NaN', 'Infinity')

Doesn't this imply that NUMBER will be a rather broad pattern, matching all kinds of numbers?
So in general, are we going to tolerate false positives to increase the applicability of these constants?

It might be OK because we don't validate syntax, but do we really want to be that inclusive?

@LeaVerou
Member

Doesn't this imply

How does it imply that? 🤔

@RunDevelopment
Member

RunDevelopment commented Jul 24, 2018

How does it imply that?

If NUMBER didn't include stuff like e.g. hexadecimal numbers, it wouldn't work for JS.
Or did I take your example too literally? If so, I'm sorry.

But I would still like to hear the answer to the other questions. 😄

This was referenced Aug 22, 2018