make pattern for regex_string token less character-hungry #150

Closed


ronaldevers

I think this fixes queries with two regexes, like so:

select * from /.../ where x =~ /.../

The lexer would previously extract one big regex from the first to the last '/' character.
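To illustrate the difference, here is a minimal sketch using Python's `re` module. Both patterns are assumed stand-ins for the lexer rule (they are not copied from the project's lex file), and the regex bodies `a.*` and `b.*` are made-up placeholders for the elided `...` in the query above.

```python
import re

# Hypothetical stand-ins for the REGEX_STRING rule; not the project's actual patterns.
GREEDY = re.compile(r"/.+/")      # runs from the first '/' to the last '/'
BOUNDED = re.compile(r"/[^/]+/")  # stops at the next '/'

# Placeholder query with two regexes (bodies are illustrative only).
query = "select * from /a.*/ where x =~ /b.*/"


def regex_tokens(pattern, text):
    """Return every substring the pattern would claim as a regex token."""
    return [m.group(0) for m in pattern.finditer(text)]


greedy_tokens = regex_tokens(GREEDY, query)
bounded_tokens = regex_tokens(BOUNDED, query)

print(greedy_tokens)   # one token swallowing both regexes
print(bounded_tokens)  # two separate tokens
```

The bounded form uses a negated character class rather than a non-greedy quantifier, since that style also works in lexer generators without lazy matching. It does not handle an escaped '/' inside a regex.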

A problem that still remains is this:

select a/2, b/2 from x
ERROR: Error at 841508205:2713675. syntax error, unexpected REGEX_STRING, expecting FROM

Don't know how to fix that yet.
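A small sketch of why the division case still fails, assuming the lexer rule behaves like the bounded pattern `/[^/]+/` (an assumption, not the project's actual rule): the two division slashes still pair up into something that looks like a regex.

```python
import re

# Assumed shape of the less character-hungry rule; hypothetical, for illustration.
BOUNDED = re.compile(r"/[^/]+/")

query = "select a/2, b/2 from x"

# The first '/' (after 'a') and the second '/' (after 'b') are matched as delimiters.
match = BOUNDED.search(query)
swallowed = match.group(0) if match else None

print(swallowed)
```

Everything between the two division operators is consumed as a single token, so the parser sees a regex where it expects the rest of the select list.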

@ronaldevers
Author

I'm thinking you should probably not eat the regex_string token in the lexer but just pass the '/' character as is and decide in the parser on the meaning of the characters after the '/'. In the parser you know if you are in the "select" part of the query, where the slash means "divide", or in the "from" part where it means "regex_string follows". But then the problem becomes how to lex the characters comprising a regex string in between slashes. Hmm...
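One way to sidestep the question of how to lex the regex body is a variation on this idea: keep the decision in the tokenizer, but base it on the previous token instead of on the clause. This is the approach commonly used for regex literals in JavaScript-style lexers, not something proposed in this pull request. The sketch below is a hand-written tokenizer under simplified assumptions; the token names, the keyword list, and the `tokenize` function are all hypothetical and cover only a tiny subset of the query language.

```python
import re

# Hypothetical token shapes; the real grammar has many more token types.
TOKEN = re.compile(r"(?P<word>[A-Za-z_]\w*)|(?P<num>\d+)|(?P<op>=~|==|[*,()/])")
KEYWORDS = {"select", "from", "where", "and", "or"}


def tokenize(query):
    """Split a query into (kind, text) pairs, treating '/' by context."""
    tokens = []
    pos = 0
    prev_is_operand = False  # True after an identifier, number, regex or ')'
    while pos < len(query):
        if query[pos].isspace():
            pos += 1
            continue
        # A '/' that does not follow an operand cannot be division,
        # so read up to the next '/' as a regex body.
        if query[pos] == "/" and not prev_is_operand:
            end = query.find("/", pos + 1)
            if end == -1:
                raise ValueError("unterminated regex at offset %d" % pos)
            tokens.append(("REGEX", query[pos + 1:end]))
            pos = end + 1
            prev_is_operand = True
            continue
        m = TOKEN.match(query, pos)
        if m is None:
            raise ValueError("unexpected character at offset %d" % pos)
        kind, text = m.lastgroup, m.group(0)
        if kind == "word" and text.lower() in KEYWORDS:
            tokens.append(("KEYWORD", text))
            prev_is_operand = False
        elif kind == "word":
            tokens.append(("IDENT", text))
            prev_is_operand = True
        elif kind == "num":
            tokens.append(("NUMBER", text))
            prev_is_operand = True
        else:
            tokens.append(("OP", text))
            prev_is_operand = text == ")"
        pos = m.end()
    return tokens


division = tokenize("select a/2, b/2 from x")
regexes = tokenize("select * from /a.*/ where x =~ /b.*/")
print(division)
print(regexes)
```

With this rule, a slash after an identifier or number is emitted as a division operator, while a slash after a keyword or operator starts a regex. Escaped slashes inside a regex and string literals are left out of the sketch.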

@pauldix
Member

pauldix commented Dec 29, 2013

slashes aren't valid characters in either time series names or column names, so I'm not sure this is actually a problem.

@ronaldevers
Author

Well, maybe not directly in time series or column names, but according to the docs you can specify regular expressions (delimited by slashes) in the from clause, and in the select clause a slash is the arithmetic division operator. Division is not actually in the docs, but it is specified in the lex file and it works. It just doesn't work when there is another slash in the query. For example, this query from the docs does not work as expected (even if you add the missing slash):

select * from events where (email =~ /.*gmail.* or email =~ /.*yahoo.*/) and state == 'ny';

The current lexer will match a REGEX_STRING token containing .*gmail.* or email =~ /.*yahoo.*, right? Or am I missing something?

@pauldix
Member

pauldix commented Dec 29, 2013

ah right, seems to be an issue... hmmm. May wait for @jvshahid to get back later this week and discuss with him.
