Search: Datasets not appearing in results but should (Solr schema.xml copyField) #174

Closed
eaquigley opened this issue Jul 9, 2014 · 4 comments
Labels: Type: Bug (a defect)

Comments

@eaquigley
Contributor


Author Name: Kevin Condon (@kcondon)
Original Redmine Issue: 3586, https://redmine.hmdc.harvard.edu/issues/3586
Original Date: 2014-02-25
Original Assignee: Elda Sotiri


From 2/21/14 Smoke test

Search: Datasets are not appearing in search results, but they are visible under "show children" (and they have titles that should match the search).


Related issue(s): #72, #191
Redmine related issue(s): 3480, 3603


@eaquigley
Contributor Author


Original Redmine Comment
Author Name: Philip Durbin (@pdurbin)
Original Date: 2014-02-28T03:30:00Z


I think I need to narrow the scope of this ticket a bit. I'd like it to focus on the importance of using a custom schema.xml for Solr, now that we are getting away from hard-coded Solr fields and moving ever more toward dynamic fields, especially for datasets.

As promised I updated the dev guide to explain where to get the new Dataverse-specific schema.xml file. I'll paste the relevant content below.

I also adjusted the autocomplete to operate not on the statically defined "SearchFields.NAME" field that's used for dataverse and filenames but rather the catchall "text" field. Doing this will illustrate at autocomplete time that more Solr documents are matching thanks to the updated schema.xml, which copies dynamic fields into the Solr "catchall" "text" field.
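As a rough way to see the catchall field at work outside the application, one could query Solr directly. The sketch below only builds the query URL. It assumes the Solr 4.6.0 "example" server on its default port (8983) with the default "collection1" core, and `solr_select_url` and the search term are hypothetical names chosen for illustration, not part of Dataverse.

```shell
#!/bin/sh
# Assumption: Solr "example" server on localhost:8983 with the default
# "collection1" core. solr_select_url is a hypothetical helper.
# Only simple, single-word terms are handled (no URL encoding is done).
solr_select_url() {
  field="$1"
  term="$2"
  printf 'http://localhost:8983/solr/collection1/select?q=%s:%s&wt=json&rows=5\n' "$field" "$term"
}

# Build a query against the catchall "text" field.
# "dataverse" is just a placeholder search term.
url=$(solr_select_url text dataverse)
echo "$url"

# With Solr running, the query could then be issued with, for example:
#   curl -s "$url"
```

If a word from a dataset title returns documents when searched against "text", the dynamic fields are being copied into the catchall field.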

Only a few lines were added to schema.xml. Here they are:

   <!-- Added for Dataverse 4.0 alpha 1 for dynamic datasetfields  -->
   <!-- https://redmine.hmdc.harvard.edu/issues/3586 -->
   <copyField source="*_s" dest="text" maxChars="3000"/>
   <copyField source="*_ss" dest="text" maxChars="3000"/>
   <copyField source="*_i" dest="text" maxChars="3000"/>
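
To make the effect of these three patterns concrete, the sketch below mimics in shell which field names a suffix-style `source` pattern would select. It is an illustration only: the field names are made up for the example, `copied_to_text` is a hypothetical helper, and the real matching happens inside Solr at index time.

```shell
#!/bin/sh
# Illustration only: report whether a field name would be selected by
# one of the three copyField source patterns (*_s, *_ss, *_i).
# All field names used below are hypothetical.
copied_to_text() {
  case "$1" in
    *_s|*_ss|*_i) echo "yes" ;;  # matches one of the added patterns
    *)            echo "no"  ;;  # not covered by the added patterns
  esac
}

for field in title_s authorName_ss productionYear_i description_t; do
  printf '%s -> %s\n' "$field" "$(copied_to_text "$field")"
done
```

Fields with other suffixes (such as the hypothetical `description_t`) are not affected by these lines and reach "text" only if some other copyField rule in schema.xml covers them.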


And here's the updated part of the dev guide.

$ grep xvfz doc/Sphinx/source/Developers/dev4.rst -B4 -A6

Download solr-4.6.0.tgz from http://lucene.apache.org/solr/ to any directory you like but in the example below, we have downloaded the tarball to a directory called "solr" in our home directory. For now we are using the "example" template but we are replacing schema.xml with our own.

  • cd ~/solr
  • tar xvfz solr-4.6.0.tgz
  • cd solr-4.6.0/example
  • cp ~/NetBeansProjects/dataverse_temp/conf/solr/4.6.0/schema.xml solr/collection1/conf/schema.xml
  • java -jar start.jar
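
One way to confirm that the Dataverse-specific schema.xml is really in place is to check the installed file for the copyField lines quoted earlier in this comment. This is a minimal sketch, not part of the dev guide: `has_dynamic_copyfields` is a hypothetical helper, and the path assumes the command is run from solr-4.6.0/example as in the steps above.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given schema.xml contains
# all three copyField lines added for dynamic dataset fields.
has_dynamic_copyfields() {
  schema="$1"
  for suffix in _s _ss _i; do
    # -F treats the pattern as a fixed string, so "*" is matched literally.
    grep -qF "<copyField source=\"*${suffix}\" dest=\"text\"" "$schema" || return 1
  done
  return 0
}

# Path as used in the steps above, relative to solr-4.6.0/example.
schema_path="solr/collection1/conf/schema.xml"
if [ -f "$schema_path" ] && has_dynamic_copyfields "$schema_path"; then
  echo "schema.xml contains the dynamic-field copyField lines"
else
  echo "schema.xml is missing or lacks the copyField lines" >&2
fi
```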

Please note: If you prefer, once the proper schema.xml file is in place, you can simply double-click "start.jar" rather than running java -jar start.jar from the command line.

If we could focus this ticket on getting this initial schema.xml in place that would be my preference. We can continue to improve search in #3453 (relevance ticket) and new tickets.

Also, the ticket for the final schema.xml is here: #3480.

The commit: 4088e49 require schema.xml to improve search results on dynamic fields

@eaquigley eaquigley added this to the Dataverse 4.0: Beta 1 milestone Jul 9, 2014
@eaquigley
Contributor Author


Original Redmine Comment
Author Name: Philip Durbin (@pdurbin)
Original Date: 2014-03-04T19:22:00Z


I'm stealing this ticket back because issues identified in #3603 will likely result in me changing how indexing works and possibly the schema.xml.

@eaquigley
Contributor Author


Original Redmine Comment
Author Name: Gustavo Durand (@scolapasta)
Original Date: 2014-03-14T18:04:45Z


We think this should be all set for now.

@eaquigley
Contributor Author


Original Redmine Comment
Author Name: Elda Sotiri (@esotiri)
Original Date: 2014-03-24T18:34:15Z


Datasets are now part of the search results.

janvanmansum pushed a commit to janvanmansum/dataverse that referenced this issue Oct 10, 2023
* FD-7423. Fix upload zip with backslashes

* Fix upload zip with backslashes