Search API results payload extension #10811

Merged
merged 18 commits into from
Sep 11, 2024
1 change: 1 addition & 0 deletions conf/solr/schema.xml
@@ -142,6 +142,7 @@

<field name="dvName" type="text_en" stored="true" indexed="true" multiValued="false"/>
<field name="dvAlias" type="text_en" stored="true" indexed="true" multiValued="false"/>
<field name="dvParentAlias" type="text_en" stored="true" indexed="true" multiValued="false"/>
<field name="dvAffiliation" type="text_en" stored="true" indexed="true" multiValued="false"/>
<field name="dvDescription" type="text_en" stored="true" indexed="true" multiValued="false"/>

52 changes: 52 additions & 0 deletions doc/release-notes/10810-search-api-payload-extensions.md
@@ -0,0 +1,52 @@
The Search API (/api/search) response now includes new fields for each entity type.

For Dataverse:

- "affiliation"
- "parentDataverseName"
- "parentDataverseIdentifier"
- "image_url" (optional)

```javascript
"items": [
{
"name": "Darwin's Finches",
...
"affiliation": "Dataverse.org",
"parentDataverseName": "Root",
"parentDataverseIdentifier": "root",
"image_url":"data:image/png;base64,iVBORw0..."
(etc, etc)
```

For DataFile:

- "releaseOrCreateDate"
- "image_url" (optional)

```javascript
"items": [
{
"name": "test.txt",
...
"releaseOrCreateDate": "2016-05-10T12:53:39Z",
"image_url":"data:image/png;base64,iVBORw0..."
(etc, etc)
```

For Dataset:

- "image_url" (optional)

```javascript
"items": [
{
...
"image_url": "http://localhost:8080/api/datasets/2/logo"
...
(etc, etc)
```

The image_url field was already part of the SolrSearchResult JSON (and incorrectly appeared in the Search API documentation), but the API did not return it because it was set only after the Solr query had executed, in the JSF SearchIncludeFragment. The field is now set in SearchServiceBean, so the API returns it whenever an image is available.

The Solr schema.xml file has been updated with a new field, dvParentAlias, which backs the new "parentDataverseIdentifier" response field. Upgrading to the next Dataverse release will therefore require applying the new schema.xml and reindexing Solr.
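
Since image_url can hold either an inline `data:` URL or a regular link (compare the Dataverse and Dataset examples above), a client may want to check which form it received. The following is a minimal client-side sketch using only the JDK. The class `ImageUrls` and its method names are hypothetical and are not part of Dataverse.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Optional;

/** Hypothetical client-side helper; not part of the Dataverse code base. */
final class ImageUrls {
    private static final String BASE64_MARKER = ";base64,";

    /** True when the value is an inline data: URL rather than a fetchable link. */
    static boolean isDataUrl(String imageUrl) {
        return imageUrl != null && imageUrl.startsWith("data:");
    }

    /**
     * Decodes the payload of a Base64 data: URL.
     * Returns empty for regular URLs, missing values, or data: URLs without a Base64 marker.
     * Base64.Decoder#decode throws IllegalArgumentException if the payload is malformed.
     */
    static Optional<byte[]> decodeInlineImage(String imageUrl) {
        if (!isDataUrl(imageUrl)) {
            return Optional.empty();
        }
        int marker = imageUrl.indexOf(BASE64_MARKER);
        if (marker < 0) {
            return Optional.empty();
        }
        String payload = imageUrl.substring(marker + BASE64_MARKER.length());
        return Optional.of(Base64.getDecoder().decode(payload));
    }

    public static void main(String[] args) {
        // Illustrative values only; a real response would carry image bytes.
        String inline = "data:text/plain;base64,aGVsbG8=";
        String link = "http://localhost:8080/api/datasets/2/logo";
        System.out.println(isDataUrl(inline));
        System.out.println(isDataUrl(link));
        byte[] bytes = decodeInlineImage(inline).orElseThrow();
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

A client that only needs to display the image can usually pass either form straight to an image element; decoding is only needed when the bytes themselves are required.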
42 changes: 33 additions & 9 deletions doc/sphinx-guides/source/api/search.rst
@@ -61,25 +61,37 @@ https://demo.dataverse.org/api/search?q=trees
"name":"Trees",
"type":"dataverse",
"url":"https://demo.dataverse.org/dataverse/trees",
"image_url":"https://demo.dataverse.org/api/access/dvCardImage/7",
"image_url":"data:image/png;base64,iVBORw0...",
"identifier":"trees",
"description":"A tree dataverse with some birds",
"published_at":"2016-05-10T12:53:38Z"
"published_at":"2016-05-10T12:53:38Z",
"publicationStatuses": [
"Published"
],
"affiliation": "Dataverse.org",
"parentDataverseName": "Root",
"parentDataverseIdentifier": "root"
},
{
"name":"Chestnut Trees",
"type":"dataverse",
"url":"https://demo.dataverse.org/dataverse/chestnuttrees",
"image_url":"https://demo.dataverse.org/api/access/dvCardImage/9",
"image_url":"data:image/png;base64,iVBORw0...",
"identifier":"chestnuttrees",
"description":"A dataverse with chestnut trees and an oriole",
"published_at":"2016-05-10T12:52:38Z"
"published_at":"2016-05-10T12:52:38Z",
"publicationStatuses": [
"Published"
],
"affiliation": "Dataverse.org",
"parentDataverseName": "Root",
"parentDataverseIdentifier": "root"
},
{
"name":"trees.png",
"type":"file",
"url":"https://demo.dataverse.org/api/access/datafile/12",
"image_url":"https://demo.dataverse.org/api/access/fileCardImage/12",
"image_url":"data:image/png;base64,iVBORw0...",
"file_id":"12",
"description":"",
"published_at":"2016-05-10T12:53:39Z",
@@ -91,16 +103,26 @@ https://demo.dataverse.org/api/search?q=trees
"dataset_name": "Dataset One",
"dataset_id": "32",
"dataset_persistent_id": "doi:10.5072/FK2/XTT5BV",
"dataset_citation":"Spruce, Sabrina, 2016, \"Spruce Goose\", http://dx.doi.org/10.5072/FK2/XTT5BV, Root Dataverse, V1"
"dataset_citation":"Spruce, Sabrina, 2016, \"Spruce Goose\", http://dx.doi.org/10.5072/FK2/XTT5BV, Root Dataverse, V1",
"publicationStatuses": [
"Published"
],
"releaseOrCreateDate": "2016-05-10T12:53:39Z"
Member

Files do not get parentDataverseIdentifier, right?

Contributor Author

Right! Just for Dataverses

},
{
"name":"Birds",
"type":"dataverse",
"url":"https://demo.dataverse.org/dataverse/birds",
"image_url":"https://demo.dataverse.org/api/access/dvCardImage/2",
"image_url":"data:image/png;base64,iVBORw0...",
"identifier":"birds",
"description":"A bird Dataverse collection with some trees",
"published_at":"2016-05-10T12:57:27Z"
"published_at":"2016-05-10T12:57:27Z",
"publicationStatuses": [
"Published"
],
"affiliation": "Dataverse.org",
"parentDataverseName": "Root",
"parentDataverseIdentifier": "root"
},
{
"name":"Darwin's Finches",
Member

Datasets can have identifier_of_dataverse but not parentDataverseIdentifier, right?

Contributor Author

Yes. parentDataverseIdentifier is only for Dataverses.

@@ -151,6 +173,8 @@ https://demo.dataverse.org/api/search?q=trees
}
}

Note that the image_url field, if it exists, is returned as a regular URL for Datasets, while for Files and Dataverses it is returned as a Base64 URL. We plan to standardize this behavior so that the field always returns a regular URL. (See: https://github.com/IQSS/dataverse/issues/10831)

.. _advancedsearch-example:

Advanced Search Examples
@@ -178,7 +202,7 @@ In this example, ``show_relevance=true`` matches per field are shown. Available
"name":"Finches",
"type":"dataverse",
"url":"https://demo.dataverse.org/dataverse/finches",
"image_url":"https://demo.dataverse.org/api/access/dvCardImage/3",
"image_url":"data:image/png;base64,iVBORw0...",
"identifier":"finches",
"description":"A Dataverse collection with finches",
"published_at":"2016-05-10T12:57:38Z",
@@ -58,6 +58,10 @@ public String getFileCardImageAsBase64Url(SolrSearchResult result) {
if (result.isHarvested()) {
return null;
}

if (result.getEntity() == null) {
return null;
}

Long imageFileId = result.getEntity().getId();

@@ -278,6 +278,7 @@ public Future<String> indexDataverse(Dataverse dataverse, boolean processPaths)
if (dataverse.getOwner() != null) {
solrInputDocument.addField(SearchFields.PARENT_ID, dataverse.getOwner().getId());
solrInputDocument.addField(SearchFields.PARENT_NAME, dataverse.getOwner().getName());
solrInputDocument.addField(SearchFields.DATAVERSE_PARENT_ALIAS, dataverse.getOwner().getAlias());
}
}
List<String> dataversePathSegmentsAccumulator = new ArrayList<>();
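
The hunk above writes the parent alias only when the collection has an owner, which is why a root collection ends up with no `dvParentAlias` value in the index. A minimal sketch of that rule follows, with a plain `Map` standing in for the Solr input document. `CollectionStub` and `toIndexFields` are hypothetical names for illustration, and only the two field names that appear in `SearchFields` in this diff are used.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CollectionIndexSketch {

    /** Hypothetical stand-in for a Dataverse collection: an alias and an optional owner. */
    static final class CollectionStub {
        final String alias;
        final CollectionStub owner; // null for the root collection

        CollectionStub(String alias, CollectionStub owner) {
            this.alias = alias;
            this.owner = owner;
        }
    }

    /** Builds the alias-related index fields; the map is an assumption replacing SolrInputDocument. */
    static Map<String, String> toIndexFields(CollectionStub collection) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("dvAlias", collection.alias);
        // The parent alias is added only when an owner exists, as in the diff.
        if (collection.owner != null) {
            doc.put("dvParentAlias", collection.owner.alias);
        }
        return doc;
    }

    public static void main(String[] args) {
        CollectionStub root = new CollectionStub("root", null);
        CollectionStub trees = new CollectionStub("trees", root);
        System.out.println(toIndexFields(root));
        System.out.println(toIndexFields(trees));
    }
}
```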
@@ -94,6 +94,7 @@ public class SearchFields {
public static final String UNF = "unf";
public static final String DATAVERSE_NAME = "dvName";
public static final String DATAVERSE_ALIAS = "dvAlias";
public static final String DATAVERSE_PARENT_ALIAS = "dvParentAlias";
public static final String DATAVERSE_AFFILIATION = "dvAffiliation";
public static final String DATAVERSE_DESCRIPTION = "dvDescription";
public static final String DATAVERSE_CATEGORY = "dvCategory";
@@ -1,14 +1,6 @@
package edu.harvard.iq.dataverse.search;

import edu.harvard.iq.dataverse.DataFile;
import edu.harvard.iq.dataverse.DatasetFieldConstant;
import edu.harvard.iq.dataverse.DatasetFieldServiceBean;
import edu.harvard.iq.dataverse.DatasetFieldType;
import edu.harvard.iq.dataverse.DatasetVersionServiceBean;
import edu.harvard.iq.dataverse.Dataverse;
import edu.harvard.iq.dataverse.DataverseFacet;
import edu.harvard.iq.dataverse.DataverseMetadataBlockFacet;
import edu.harvard.iq.dataverse.DvObjectServiceBean;
import edu.harvard.iq.dataverse.*;
import edu.harvard.iq.dataverse.authorization.groups.Group;
import edu.harvard.iq.dataverse.authorization.groups.GroupServiceBean;
import edu.harvard.iq.dataverse.authorization.users.AuthenticatedUser;
@@ -40,6 +32,7 @@
import jakarta.ejb.EJBTransactionRolledbackException;
import jakarta.ejb.Stateless;
import jakarta.ejb.TransactionRolledbackLocalException;
import jakarta.inject.Inject;
import jakarta.inject.Named;
import jakarta.persistence.NoResultException;
import org.apache.solr.client.solrj.SolrQuery;
@@ -78,6 +71,8 @@ public class SearchServiceBean {
SystemConfig systemConfig;
@EJB
SolrClientService solrClientService;
@Inject
ThumbnailServiceWrapper thumbnailServiceWrapper;

/**
* Import note: "onlyDatatRelatedToMe" relies on filterQueries for providing
@@ -501,11 +496,14 @@ public SolrQueryResponse search(
String dvTree = (String) solrDocument.getFirstValue(SearchFields.SUBTREE);
String identifierOfDataverse = (String) solrDocument.getFieldValue(SearchFields.IDENTIFIER_OF_DATAVERSE);
String nameOfDataverse = (String) solrDocument.getFieldValue(SearchFields.DATAVERSE_NAME);
String dataverseAffiliation = (String) solrDocument.getFieldValue(SearchFields.DATAVERSE_AFFILIATION);
String dataverseParentAlias = (String) solrDocument.getFieldValue(SearchFields.DATAVERSE_PARENT_ALIAS);
String dataverseParentName = (String) solrDocument.getFieldValue(SearchFields.PARENT_NAME);
Long embargoEndDate = (Long) solrDocument.getFieldValue(SearchFields.EMBARGO_END_DATE);
Long retentionEndDate = (Long) solrDocument.getFieldValue(SearchFields.RETENTION_END_DATE);
//
Boolean datasetValid = (Boolean) solrDocument.getFieldValue(SearchFields.DATASET_VALID);

List<String> matchedFields = new ArrayList<>();

SolrSearchResult solrSearchResult = new SolrSearchResult(query, name);
@@ -592,10 +590,10 @@ public SolrQueryResponse search(
if (type.equals("dataverses")) {
solrSearchResult.setName(name);
solrSearchResult.setHtmlUrl(baseUrl + SystemConfig.DATAVERSE_PATH + identifier);
// Do not set the ImageUrl, let the search include fragment fill in
// the thumbnail, similarly to how the dataset and datafile cards
// are handled.
//solrSearchResult.setImageUrl(baseUrl + "/api/access/dvCardImage/" + entityid);
solrSearchResult.setDataverseAffiliation(dataverseAffiliation);
solrSearchResult.setDataverseParentAlias(dataverseParentAlias);
solrSearchResult.setDataverseParentName(dataverseParentName);
solrSearchResult.setImageUrl(thumbnailServiceWrapper.getDataverseCardImageAsBase64Url(solrSearchResult));
/**
* @todo Expose this API URL after "dvs" is changed to
* "dataverses". Also, is an API token required for published
@@ -605,6 +603,7 @@ public SolrQueryResponse search(
} else if (type.equals("datasets")) {
solrSearchResult.setHtmlUrl(baseUrl + "/dataset.xhtml?globalId=" + identifier);
solrSearchResult.setApiUrl(baseUrl + "/api/datasets/" + entityid);
solrSearchResult.setImageUrl(thumbnailServiceWrapper.getDatasetCardImageAsUrl(solrSearchResult));
//Image url now set via thumbnail api
//solrSearchResult.setImageUrl(baseUrl + "/api/access/dsCardImage/" + datasetVersionId);
// No, we don't want to set the base64 thumbnails here.
@@ -653,6 +652,7 @@ public SolrQueryResponse search(
}
solrSearchResult.setHtmlUrl(baseUrl + "/dataset.xhtml?persistentId=" + parentGlobalId);
solrSearchResult.setDownloadUrl(baseUrl + "/api/access/datafile/" + entityid);
solrSearchResult.setImageUrl(thumbnailServiceWrapper.getFileCardImageAsBase64Url(solrSearchResult));
/**
* @todo We are not yet setting the API URL for files because
* not all files have metadata. Only subsettable files (those
@@ -95,6 +95,7 @@ public class SolrSearchResult {
private String fileChecksumValue;
private String dataverseAlias;
private String dataverseParentAlias;
private String dataverseParentName;
// private boolean statePublished;
/**
* @todo Investigate/remove this "unpublishedState" variable. For files that
@@ -504,8 +505,11 @@ public JsonObjectBuilder json(boolean showRelevance, boolean showEntityIds, bool

// displayName = null; // testing NullSafeJsonBuilder
// because we are using NullSafeJsonBuilder key/value pairs will be dropped if the value is null
NullSafeJsonBuilder nullSafeJsonBuilder = jsonObjectBuilder().add("name", displayName)
.add("type", getDisplayType(getType())).add("url", preferredUrl).add("image_url", getImageUrl())
NullSafeJsonBuilder nullSafeJsonBuilder = jsonObjectBuilder()
.add("name", displayName)
.add("type", getDisplayType(getType()))
.add("url", preferredUrl)
.add("image_url", getImageUrl())
// .add("persistent_url", this.persistentUrl)
// .add("download_url", this.downloadUrl)
/**
@@ -536,7 +540,8 @@ public JsonObjectBuilder json(boolean showRelevance, boolean showEntityIds, bool
* @todo Expose MIME Type:
* https://github.com/IQSS/dataverse/issues/1595
*/
.add("file_type", this.filetype).add("file_content_type", this.fileContentType)
.add("file_type", this.filetype)
.add("file_content_type", this.fileContentType)
.add("size_in_bytes", getFileSizeInBytes())
/**
* "md5" was the only possible value so it's hard-coded here but
@@ -545,12 +550,18 @@ public JsonObjectBuilder json(boolean showRelevance, boolean showEntityIds, bool
*/
.add("md5", getFileMd5())
.add("checksum", JsonPrinter.getChecksumTypeAndValue(getFileChecksumType(), getFileChecksumValue()))
.add("unf", getUnf()).add("file_persistent_id", this.filePersistentId).add("dataset_name", datasetName)
.add("dataset_id", datasetId).add("publisher", publisherName)
.add("dataset_persistent_id", datasetPersistentId).add("dataset_citation", datasetCitation)
.add("deaccession_reason", this.deaccessionReason).add("citationHtml", this.citationHtml)
.add("unf", getUnf())
.add("file_persistent_id", this.filePersistentId)
.add("dataset_name", datasetName)
.add("dataset_id", datasetId)
.add("publisher", publisherName)
.add("dataset_persistent_id", datasetPersistentId)
.add("dataset_citation", datasetCitation)
.add("deaccession_reason", this.deaccessionReason)
.add("citationHtml", this.citationHtml)
.add("identifier_of_dataverse", this.identifierOfDataverse)
.add("name_of_dataverse", this.nameOfDataverse).add("citation", this.citation);
.add("name_of_dataverse", this.nameOfDataverse)
.add("citation", this.citation);
// Now that nullSafeJsonBuilder has been instatiated, check for null before adding to it!
if (showRelevance) {
nullSafeJsonBuilder.add("matches", getRelevance());
@@ -668,6 +679,15 @@ public JsonObjectBuilder json(boolean showRelevance, boolean showEntityIds, bool

nullSafeJsonBuilder.add("metadataBlocks", metadataFieldBuilder);
}
} else if (this.entity.isInstanceofDataverse()) {
nullSafeJsonBuilder.add("affiliation", dataverseAffiliation);
nullSafeJsonBuilder.add("parentDataverseName", dataverseParentName);
nullSafeJsonBuilder.add("parentDataverseIdentifier", dataverseParentAlias);
} else if (this.entity.isInstanceofDataFile()) {
// "published_at" field is only set when the version state is not draft.
// On the contrary, this field also takes into account DataFiles in draft version,
// returning the creation date if the DataFile is not published, or the publication date otherwise.
nullSafeJsonBuilder.add("releaseOrCreateDate", getFormattedReleaseOrCreateDate());
}
}
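
The new branches rely on the null-dropping behavior noted earlier in `json(...)`: a key is omitted when its value is null, so a collection without a parent simply lacks the parent fields. The sketch below imitates that behavior with a `LinkedHashMap`. `NullDroppingBuilder` and `describe` are hypothetical stand-ins, since the real `NullSafeJsonBuilder` is not shown in this diff, and a plain string type tag replaces the `isInstanceof...` checks.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified, hypothetical stand-in for a builder that silently drops null values. */
final class NullDroppingBuilder {
    private final Map<String, Object> fields = new LinkedHashMap<>();

    NullDroppingBuilder add(String key, Object value) {
        if (value != null) {
            fields.put(key, value);
        }
        return this;
    }

    Map<String, Object> build() {
        return fields;
    }

    /** Adds type-specific fields in the same spirit as the new else-if branches. */
    static Map<String, Object> describe(String type, String affiliation, String parentName,
                                        String parentAlias, String releaseOrCreateDate) {
        NullDroppingBuilder builder = new NullDroppingBuilder().add("type", type);
        if ("dataverse".equals(type)) {
            builder.add("affiliation", affiliation)
                   .add("parentDataverseName", parentName)
                   .add("parentDataverseIdentifier", parentAlias);
        } else if ("file".equals(type)) {
            builder.add("releaseOrCreateDate", releaseOrCreateDate);
        }
        return builder.build();
    }

    public static void main(String[] args) {
        // A collection with no parent: the two parent keys are absent from the output.
        System.out.println(describe("dataverse", "Dataverse.org", null, null, null));
        // A file: only the date field is added, regardless of the other arguments.
        System.out.println(describe("file", null, "Root", "root", "2016-05-10T12:53:39Z"));
    }
}
```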

@@ -747,11 +767,15 @@ private Map<String, List<String>> computeRequestedMetadataFieldMapNames(List<Str
private String getDateTimePublished() {
String datePublished = null;
if (draftState == false) {
datePublished = releaseOrCreateDate == null ? null : Util.getDateTimeFormat().format(releaseOrCreateDate);
datePublished = getFormattedReleaseOrCreateDate();
}
return datePublished;
}

private String getFormattedReleaseOrCreateDate() {
return releaseOrCreateDate == null ? null : Util.getDateTimeFormat().format(releaseOrCreateDate);
}

public String getId() {
return id;
}
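
The refactoring above means "published_at" and "releaseOrCreateDate" share one formatter and differ only in the draft check. The sketch below restates that relationship in isolation. It assumes `java.time.Instant` and `DateTimeFormatter.ISO_INSTANT` in place of the `Util.getDateTimeFormat()` helper, whose exact pattern is not shown in this diff, and `DateFieldsSketch` is a hypothetical name.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;

final class DateFieldsSketch {

    /** "published_at": suppressed while the result is in draft state. */
    static String publishedAt(boolean draftState, Instant releaseOrCreateDate) {
        return draftState ? null : format(releaseOrCreateDate);
    }

    /** "releaseOrCreateDate": returned for drafts too, as long as a date is present. */
    static String releaseOrCreate(Instant releaseOrCreateDate) {
        return format(releaseOrCreateDate);
    }

    /** Shared null-safe formatter; the ISO-8601 instant format is an assumption. */
    private static String format(Instant date) {
        return date == null ? null : DateTimeFormatter.ISO_INSTANT.format(date);
    }

    public static void main(String[] args) {
        Instant created = Instant.parse("2016-05-10T12:53:39Z");
        System.out.println(publishedAt(true, created));   // draft: no published_at value
        System.out.println(releaseOrCreate(created));     // draft: date still available
    }
}
```

With a null-dropping JSON builder, the draft case therefore yields a result that has "releaseOrCreateDate" but no "published_at" key.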
@@ -1223,6 +1247,13 @@ public void setDataverseParentAlias(String dataverseParentAlias) {
this.dataverseParentAlias = dataverseParentAlias;
}

/**
* @param dataverseParentName the dataverseParentName to set
*/
public void setDataverseParentName(String dataverseParentName) {
this.dataverseParentName = dataverseParentName;
}

public float getScore() {
return score;
}