Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Find regexp feature #104

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

[WIP] Find regexp feature #104

wants to merge 2 commits into from

Conversation

sam2x
Copy link
Collaborator

@sam2x sam2x commented May 23, 2016

The find_regexp_feature :

  • This feature doesnt break the current find feature, it is transparent
  • Current tests have been passed, some have been added
  • To use regexp feature, just add to your indexed/uniques arguments {regexp:'your_pattern'}

Example of use :

You may still pass a string to your indexed/uniques field, but to trigger the regexp you need to pass an object, with key as 'regexp', your pattern as value :

Model.find({
  email:{
    regexp:'*@domain.*'
  }
}, cb)

Some explanation about logic:

The find function is iterating each property, looking at the property of the key in the Model to define the kind of lookup to apply. I added 3 main things :

1- If the property is an object and has regexp key, i add it to a dedicated list of regexp key. There are two lists of regexp key, the ones who has 'index' property, and the ones who has 'uniques' property.
2- For each list i fetch the appropiate redis prefix, and run keys functions, then i fetch ids of these keys with the appropriate command: smembers or get (because index is a redis set datatype, and unique is a redis string datatype).
3- Once i have fetched all the keys, find function have 4 arrays :

  • sets keys (from simple index/uniques string search)
  • zsetskeys (from numeric search)
  • regexp uniques keys
  • regexp sets keys.

3bis- What i do from there, i use reduce function which lookup for values of each arrays (set/zsetsKeys since regexp are already resolved), and make an intersection of Ids found for each array. Logical 'AND' is still applied as initially for the find function.
I wanted to add "operator OR/AND" feature (where find argument is not obj but array of obj with a key 'operator' defined to OR or AND), but in future (not in my needs actually)

ps: Despite my grammar who is wrong (sorry!), there is a typo on regexp error message, i say "object must be {pattern:'regexp'}", pattern should be replaced by "regexp".

sam2nx added 2 commits May 23, 2016 16:47

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
@sam2x sam2x mentioned this pull request May 23, 2016
@maritz maritz closed this May 23, 2016
@maritz maritz reopened this May 23, 2016
@maritz
Copy link
Owner

maritz commented May 23, 2016

Travis was borking, had to close/reopen.

@sam2x
Copy link
Collaborator Author

sam2x commented May 23, 2016

Hey, spotted broken/useless code (i guess it's when i started to modify the file and trying to figure out how to pass 'regexp'), line 231 :

else if (searches.regexp)
{
  searches = {}
}

Can you remove it ? Else it will create edge case when user has 'regexp' as parameter name.

@sam2x
Copy link
Collaborator Author

sam2x commented May 25, 2017

Please don't merge the current patch. I grealy improved it due to a sidecase. I need to push my patch to reflect the final change.

@sam2x sam2x changed the title Find regexp feature [WIP] Find regexp feature May 25, 2017
@fatfatson
Copy link

is it in working?
I think the fuzzy search is very necessary!

@sam2x
Copy link
Collaborator Author

sam2x commented Feb 10, 2018

Yes it's working, I just didnt clean my code and sync it. Thought nobody was interested about this feature, will do it soon ;)

@maritz
Copy link
Owner

maritz commented Feb 10, 2018

@sam2x before you do that: I will not pull this into the 0.9.x branch, I think. So it'd have to be re-written in the 2.x branch.

@sam2x
Copy link
Collaborator Author

sam2x commented Mar 22, 2018

@maritz Before looking for 2.x code, Is there any importants points/directions to consider like breaking API, major change, current state of 2.x (is that usable right now?) I would like to start using 2.x and submit PR. May also start to check if others redis client works "out of the box" (or with minors modifications).

@maritz
Copy link
Owner

maritz commented Mar 22, 2018

I've been using v2 in my own small projects, so I'd say it is mostly usable. There are a few things not yet implemented properly in v2, like custom validation files.

There are quite a few breaking changes. The biggest one for you as a contributor: it's Typescript now.

The docs are my main focus right now, I have a branch with most things updated to v2 and a mostly-complete changelog of breaking changes. Haven't pushed it yet, because I'm currently only working on that while in the train where I don't have internet. I'm hoping to have that finished and pushed sometime in the next week.

The tests are all converted to v2, so if in doubt you can look at them.

@maritz maritz added the Feature label Mar 22, 2018
@maritz maritz added the v2.1 label Mar 22, 2018
@maritz maritz added this to the 2.1 milestone Mar 22, 2018
@zzswang
Copy link

zzswang commented Jul 5, 2019

Do we already have this feature in v2?

@maritz
Copy link
Owner

maritz commented Jul 5, 2019

Do we already have this feature in v2?

No. This PR would have to be converted to typescript and included into the current code.

I'm also not sure if I want this in the main code, to be honest. It uses a lot of KEYS commands which is not really something you're supposed to do.

To quote redis.io:

Warning: consider KEYS as a command that should only be used in production environments with extreme care. It may ruin performance when it is executed against large databases. This command is intended for debugging and special operations, such as changing your keyspace layout. Don't use KEYS in your regular application code. If you're looking for a way to find keys in a subset of your keyspace, consider using SCAN or sets.

Implementing this in a better way would probably be a bit more difficult, so making it a different layer seems reasonable to me.

If this is just about having a fuzzy search option, I would recommend an external indexing method. I've previously used https://github.com/tj/reds with good results. More comprehensive full-text search solutions exist but are usually more involved and completely out of scope here.

@zzswang
Copy link

zzswang commented Jul 6, 2019

Yes, previously we use SCAN to implement regex search.

I'll take a look at reds, thanks.

@zzswang
Copy link

zzswang commented Jul 6, 2019

@maritz

Any suggestions about how to integrate with external indexing method.

How to execute Nohm's find query within a given ids array ? For supporting pagination.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants