-
-
Notifications
You must be signed in to change notification settings - Fork 784
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Match all tokens should look in different keys as well. #235
Comments
Interesting. Need think on this some more 😅. |
This would be awesome! =) Opens a lot of new possibilities!! |
I've hit the same problem and am currently working around this limitation by using my own getFn which simply bangs all fields together into a single string ... this requires another hack, where I allow multiple fields to be specified in a single key, separated by ",". Yes, an ugly hack, but gets the job done. |
Why do you think it's an ugly hack? That's something that could potentially become a feature. |
One does not preclude the other... ;) |
Show us the code! |
One downside I see is that reporting hit ranges will not work, because these relate to the virtual combined item, not the real key items. |
Could really use this feature for my current project! Would love to see it implemented. |
This would be really helpful |
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days |
Any update on this? I think searching across fields would be a common task like the examples stated above by @masimplo and I'm not sure if my solution would be a recommended implementation based on best practices. Between @thediveo's solution with the getFn and this on StackOverflow https://stackoverflow.com/questions/51187276/fuse-js-retrieve-records-which-exactly-match-a-multiple-word-search that adds a keywords field as an array of the desired search fields, the solution that worked for me was a combination. I memoized my search list with a new field "keywords" as an array of the search fields concatenated by a space in different orders. Example:
would be defined to Fuse instead as
by doing
I found that an array of just the search fields without concatenating them did not work when I tried it. For example, it did not work when I did this:
I found that my search word order would matter without having the order combinations in the keywords array.
In my opinion, this only works for minimal fields as order combinations would grow exponentially, and, Also, I'm not sure if there are any Fuse options that would have allowed my solution to work differently, but they were left as their defaults. |
For anyone who ends up here from a google search or otherwise, we ended up taking a different approach in sparkletown/sparkle#1460 (thanks to @yarikoptic's awesome work debugging, exploring, and refining this) We basically split our search query using regex ( /**
* Split the provided string by spaces (ignoring spaces within "quoted text") into an array of tokens.
*
* @param string
*
* @see https://stackoverflow.com/a/16261693/1265472
*
* @debt Depending on the outcome of https://github.com/github/codeql/issues/5964 we may end up needing to change
* this regex for performance reasons.
*/
export const tokeniseStringWithQuotesBySpaces = (string: string): string[] =>
string.match(/("[^"]*?"|[^"\s]+)+(?=\s*|\s*$)/g) ?? []; (Note: Please check github/codeql#5964 as the regex may have a ReDoS vulnerability, but it also might just be a false positive in the CodeQL scanner) With our standard Fuse config: new Fuse(filteredPosterVenues, {
keys: [
"name",
"poster.title",
"poster.authorName",
"poster.categories",
],
threshold: 0.2, // 0.1 seems to be exact, default 0.6: brings too distant if anyhow related hits
ignoreLocation: true, // default False: True - to search ignoring location of the words.
findAllMatches: true,
}), But then use our const tokenisedSearchQuery = tokeniseStringWithQuotesBySpaces(
normalizedSearchQuery
);
if (tokenisedSearchQuery.length === 0) return filteredPosterVenues;
return fuseVenues
.search({
$and: tokenisedSearchQuery.map((searchToken: string) => {
const orFields: Fuse.Expression[] = [
{ name: searchToken },
{ "poster.title": searchToken },
{ "poster.authorName": searchToken },
{ "poster.categories": searchToken },
];
return {
$or: orFields,
};
}),
})
.map((fuseResult) => fuseResult.item); This seems to work pretty effectively for our needs from my testing of it all today. |
Note to anyone using the solution above by @0xdevalias: There is a bug in scoring when searching the same token across multiple keys because the FIRST match's score is used, not the BEST match. I fixed, tested, and wrote more about the bug here: #593 Edit: |
Is this lib not maintained anymore? I cant believe that 4 years passed and this is still not implemented, this issue is even not open lol. I think proposed solutions, when you have to assemble keyword field yourself, really defeats the purpose of consuming this lib at all. |
Another 2 years passed by and I guess it's still not there |
From what I understand using tokenize with match all tokens currently forces a match of all tokens on the same key. Would it be possible to force a match of all tokens across any keys of a single match?
e.g.
searching for
Old War
will match this record, butOld War John
will not.This would be useful in searches where you want two or more attributes of a record to match.
i.e.
There should be a way for a search to return only the first record when search for "lawyer france"
The text was updated successfully, but these errors were encountered: