Updated tsv metadata blocks for Beta #485
Comments
Original Redmine Comment It sounds like Eleni might need to fill in more information for coverage.Spectral.Wavelength, for example, which doesn't have a fieldType: https://github.com/IQSS/dataverse/blob/2948e6710b2c9a9dcae216bdcb0d9f1798f09a60/scripts/api/data/metadatablocks/astrophysics.tsv In code I've written, I call anything without a fieldType an "odd" field and just make it a text_en field:
So I'd really appreciate it if fieldType is always filled in. We should probably put a check in parseDatasetField ( dsf.setFieldType(values[5]) ) for this: dataverse/src/main/java/edu/harvard/iq/dataverse/api/DatasetFieldServiceApi.java Line 212 in 9b5c634
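A minimal sketch of the kind of check suggested above, assuming each row is a tab-separated line whose sixth column (index 5) holds the fieldType, as in the dsf.setFieldType(values[5]) call. The class and method names (FieldTypeCheck, requireFieldType) are hypothetical and not part of the Dataverse code base, and the position of the field name (column 1) is an assumption used only to build the error message.

```java
/** Hypothetical helper; not part of the Dataverse code base. */
final class FieldTypeCheck {

    // Column index taken from the dsf.setFieldType(values[5]) call quoted above.
    static final int FIELD_TYPE_COLUMN = 5;
    // Assumption: the field name sits in column 1; it is used only for the error message.
    static final int NAME_COLUMN = 1;

    /** Returns the trimmed fieldType of one TSV row, or throws if it is missing or blank. */
    static String requireFieldType(String line) {
        // A limit of -1 keeps trailing empty columns, so a blank fieldType shows up as "".
        String[] values = line.split("\t", -1);
        String name = values.length > NAME_COLUMN ? values[NAME_COLUMN] : "(unknown)";
        if (values.length <= FIELD_TYPE_COLUMN || values[FIELD_TYPE_COLUMN].trim().isEmpty()) {
            throw new IllegalArgumentException("Missing fieldType for dataset field: " + name);
        }
        return values[FIELD_TYPE_COLUMN].trim();
    }
}
```

Failing at load time like this would surface a blank fieldType when the TSV is imported, rather than leaving it to be handled as an "odd" field later on.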
Original Redmine Comment Fixed missing FieldTypes and missing metadatablock_id. Sending back to Phil to run schema.xml.
Original Redmine Comment Eleni Castro wrote:
I can't regenerate schema.xml because loading the metadata blocks is throwing exceptions. I'm passing this to Gustavo since he works on all the code that loads metadata blocks from TSV. To give some more detail, I pulled in this commit ("Fixed missing field types and metadatablock_id", e15a9bd, IQSS/dataverse) and tried running scripts/api/datasetfields.sh. It failed with exceptions in DatasetFieldServiceApi.java. The line numbers in the exceptions below correspond to this version of the file:
Exceptions from server.log:
Original Redmine Comment Found that under #controlledvocabulary in astrophysics.tsv the DatasetFieldValue did not match any of the actual dataset field names. Changed "type" to "astroType" and committed the changes in git. Also removed the #controlledvocabulary row in social_science.tsv since it is currently not being used for this block. Ready for Phil to try again! Fingers crossed.
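The mismatch described above (a controlled vocabulary row pointing at "type" when the field is named "astroType") could be caught before loading with a cross-check along these lines. This is only a sketch: the class and method names are hypothetical, and it assumes the dataset field names and the field names referenced by the vocabulary rows have already been read out of the TSV by some other code.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper; not part of the Dataverse code base. */
final class VocabularyCheck {

    /**
     * Returns the distinct field names referenced by controlled vocabulary rows
     * that no dataset field defines, in first-seen order.
     */
    static List<String> unknownFieldReferences(Set<String> datasetFieldNames,
                                               List<String> vocabularyFieldRefs) {
        Set<String> unknown = new LinkedHashSet<>();
        for (String ref : vocabularyFieldRefs) {
            if (!datasetFieldNames.contains(ref)) {
                unknown.add(ref); // the set drops repeats of the same bad name
            }
        }
        return new ArrayList<>(unknown);
    }
}
```

An empty result means every vocabulary row refers to a field that exists; anything else lists the names to fix in the TSV.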
Original Redmine Comment I'm no longer seeing exceptions when importing the TSVs. I did a little sanity checking by creating a dataset, and as long as you have the new Solr schema.xml from this commit, I don't expect anyone to have any problems: "bring Solr schema.xml up to date with bb178ce #3900" (2ce1c4f, IQSS/dataverse). Obviously, if anyone does see problems related to indexing, please pass this ticket to me. Please note that I recently ( https://redmine.hmdc.harvard.edu/issues/3783#note-10 ) learned that "int" and "float" are being added to the Google Spreadsheet, but I have not yet changed any of the logic to accommodate this. That is to say, anything marked as "int" or "float" in the Google Spreadsheet will be indexed as "text_en" (English text) in Solr. Some day, I'd like to index these properly as ints and floats in Solr so we can do range queries, as mentioned (sort of) in #3816 and #3478. We could probably use help from Gus or other astro folks on what the UI should look like (if someone could make a separate ticket for this, that would be great; for now let's use this Google Doc: https://docs.google.com/a/harvard.edu/document/d/19DB4heSUMTm2CNTJFjt9AUDfaaZUZvdoh1QGhedXWkk/edit?usp=sharing ). Passing to QA. Again, please test with this version of schema.xml (or newer) in place: https://github.com/IQSS/dataverse/blob/2ce1c4f7876af206913df36f4703e38026cb7596/conf/solr/4.6.0/schema.xml
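To make the "int"/"float" behaviour described above concrete, here is a hedged sketch of the two mappings: today's fallback to text_en, and the native numeric indexing that might be added later. The class, method, and flag names are hypothetical, and the Solr type names "int" and "float" are assumptions that would only work if schema.xml defines field types with those names.

```java
import java.util.Locale;

/** Hypothetical helper; not part of the Dataverse code base. */
final class SolrTypeMapping {

    /**
     * Maps a numeric fieldType from the spreadsheet ("int" or "float") to a Solr type name.
     * With indexNumbersNatively == false this mirrors the behaviour described in the
     * comment above: numeric fields are indexed as English text.
     */
    static String numericSolrType(String fieldType, boolean indexNumbersNatively) {
        String type = fieldType == null ? "" : fieldType.trim().toLowerCase(Locale.ROOT);
        if (!type.equals("int") && !type.equals("float")) {
            // Other field types are outside the scope of this sketch.
            throw new IllegalArgumentException("Not a numeric fieldType: " + fieldType);
        }
        // Assumption: schema.xml defines Solr field types named "int" and "float".
        return indexNumbersNatively ? type : "text_en";
    }
}
```

Range queries on wavelength or redshift values would depend on the native branch, since a text_en field is analyzed as text rather than compared numerically.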
Original Redmine Comment Tested on 5/1. (Mostly) everything works as described. The contributor facets are in the list but could not be tested due to another bug about not enabling facets. Saw a minor issue where Wavelength Min was not appearing on the search results card, although Max and Central Wavelength were. Will speak with Phil. http://dvn-build.hmdc.harvard.edu/dataverse.xhtml?id=41&q=Wavelength+Min FITS fields are not being added to dataset metadata because the field names changed: astrotype, astrofacility, astroinstrument. Leonid is aware of this and will add it to his FITS indexing ticket.
Original Redmine Comment Tested on 5/6. Everything works, with the exception of a Solr exception on multiple instances of multi-value date fields, which has been opened as a separate ticket. Also, as mentioned, FITS ingest will be tested separately when ready. Closing ticket.
Author Name: Eleni Castro (@posixeleni)
Original Redmine Issue: 3900, https://redmine.hmdc.harvard.edu/issues/3900
Original Date: 2014-04-29
Original Assignee: Kevin Condon
Updated the following TSV metadata blocks for Beta, which will need schema.xml to be run and then QA on the following changes:
** changed descriptions to be more uniform definitions rather than instructions including:
*** updated Subject to have a more helpful definition for users (Issue #3769)
*** updated Contact to say that email will not show.
** moved Data Sources into Social Science block
** added watermark for Contributor Name
** created compound fields for "Time Period Covered" and "Date of Collection"; also added YYYY-MM-DD watermarks for Start and End
** made Contributor Name and Contributor Type Facetable and Advanced Searchable (boolean TRUE)
** removed Contributor Logo URL and Contributor URL (based on Issue #3621)
** removed compound fields for Coverage and Redshift.
** Made new compounds for fields that would be treated as ranges (dates, units).
*** coverage.Spectral.Wavelength
*** coverage.Temporal
*** coverage.RedshiftValue
** For field name disambiguation I changed the field names (field titles remain the same):
*** astroType, astroFacility, astroInstrument
For QA: I'm assuming they will first need to confirm that the fields are there and working as designed. For astro especially, because of the special ingest feature we have for FITS, they will need to test with actual data to make sure it works. I think it would be good to have Gus or some other astro people look these over to make sure the data is displaying as expected.
Related issue(s): #72, #368, #495
Redmine related issue(s): 3480, 3783, 3910