Using a singular term and not getting plurals of it in the results #444

Closed
eaquigley opened this issue Jul 9, 2014 · 2 comments
Labels
Type: Feature a feature request

Comments

@eaquigley


Author Name: Elizabeth Quigley (@eaquigley)
Original Redmine Issue: 3859, https://redmine.hmdc.harvard.edu/issues/3859
Original Date: 2014-04-15
Original Assignee: Elda Sotiri


If a user searches for "tree", they see only results for "tree" when they should see results for both "tree" and "trees".


Related issue(s): #46, #326, #445
Redmine related issue(s): 2967, 3453, 3741, 3860


@eaquigley


Original Redmine Comment
Author Name: Philip Durbin (@pdurbin)
Original Date: 2014-04-29T14:01:33Z


Yesterday, Gustavo, Liz, and I discussed the need to improve English-language searches and we decided that for beta at least, it's ok to make search more tied to English. Previously, we were using text_general but as of the commit below we are using text_en, and you can see in scripts/search/tests/expected/highlighting-nick-trees and scripts/search/tests/expected/highlighting-pete-trees that we are getting more matches on variations of English words (e.g. trees vs. tree):

9b5c634#diff-91dfc90fdc21f4d36a01adbcbf86a954L382
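
To illustrate why an English-aware field type matches "tree" against "trees", here is a minimal sketch in Python. It is not Solr's analysis chain: `naive_stem`, `build_index`, `search`, and the sample documents are all hypothetical, and the suffix rules are a deliberately crude stand-in for a real English stemmer. The point is only that the same normalization is applied at index time and at query time.

```python
# Toy illustration only. A real English analyzer uses a proper stemmer;
# these suffix rules are a hypothetical stand-in.

def naive_stem(token: str) -> str:
    """Lowercase a token and strip a possessive or simple plural suffix."""
    token = token.lower()
    if token.endswith("'s"):
        return token[:-2]
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token


def build_index(docs: dict) -> dict:
    """Map each stemmed token to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.split():
            index.setdefault(naive_stem(word), set()).add(doc_id)
    return index


def search(index: dict, query: str) -> list:
    """Stem the query the same way the documents were stemmed, then look it up."""
    return sorted(index.get(naive_stem(query), set()))


# Hypothetical documents, loosely echoing the terms discussed in this issue.
docs = {
    1: "oak tree",
    2: "pine trees",
    3: "Pete's data",
    4: "child care",
    5: "childs play",
}
index = build_index(docs)

print(search(index, "tree"))
print(search(index, "Pete"))
```

Without the stemming step, the query "tree" would only hit document 1, which is the behavior this issue describes.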

Please note that updating your Solr schema.xml is required (cp ~/NetBeansProjects/dataverse/conf/solr/4.6.0/schema.xml solr/collection1/conf/schema.xml), as described at https://github.com/IQSS/dataverse/blob/master/doc/Sphinx/source/Developers/dev-main.rst

From what I understand, solr/collection1/conf/lang/stopwords_en.txt is now being used (according to the comment beginning "<!-- A text field with defaults appropriate for English: it"), so we might want to move #3860 to QA as well.

It seems like we now have better matching for "Pete" vs. "Pete's" as well, but I didn't spend a lot of time testing this.

Finally, I wanted to mention that at one point I was getting "org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: All checkers need to use the same Analyzer". The fix was to disable the spell checking code, but later I moved more fields over from text_general to text_en, so in the end I did not disable the spell checking code. We might even want to re-evaluate the spellcheck/"did you mean" feature in #2967 now that English search is seemingly improved.
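
When moving fields from text_general to text_en, it can help to list which fields still use the old type. The following is a rough sketch, assuming the schema declares fields as `<field name="..." type="..."/>` elements. The sample fragment and its field names are hypothetical, not taken from the Dataverse schema.xml, and the script does not reproduce or diagnose the Solr exception itself.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical schema fragment; the field names are made up for illustration.
SAMPLE_SCHEMA = """
<schema name="example" version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="title" type="text_en" indexed="true" stored="true"/>
    <field name="description" type="text_general" indexed="true" stored="true"/>
  </fields>
</schema>
"""


def fields_by_type(schema_xml: str) -> dict:
    """Group field names by their declared type attribute."""
    root = ET.fromstring(schema_xml)
    grouped = defaultdict(list)
    for field in root.iter("field"):
        grouped[field.get("type")].append(field.get("name"))
    return dict(grouped)


grouped = fields_by_type(SAMPLE_SCHEMA)
# Fields that have not yet been moved over to text_en.
leftover = grouped.get("text_general", [])
print(leftover)
```

To run this against a real file, read its contents into a string and pass that to `fields_by_type` in place of `SAMPLE_SCHEMA`.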

I know we want to support multiple languages, so I left a note in the code to think more about the implications of switching from text_general to text_en: 9b5c634#diff-e15254291ddb5024c56b68a2886af250R50 . After Beta, we should consider how to make this configurable.

Passing to QA.

@eaquigley


Original Redmine Comment
Author Name: Elda Sotiri (@esotiri)
Original Date: 2014-04-29T21:00:11Z


A search for "child" brings back both "child" and "childs".
