Merge branch 'IQSS:develop' into develop
lubitchv authored Jul 20, 2023
2 parents 63eb6af + 8f882b7 commit 01788fb
Showing 124 changed files with 4,855 additions and 1,384 deletions.
2 changes: 2 additions & 0 deletions doc/release-notes/5042-add-mydata-doc-api.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
Dataverse supports an API named 'MyData'. Documentation describing its use has been added (PR #9596).
This API is used to get a list of only the objects (datasets, dataverses or datafiles) that an authenticated user can modify.
3 changes: 3 additions & 0 deletions doc/release-notes/8889-filepids-in-collections.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
It is now possible to configure registering PIDs for files in individual collections.

For example, registration of PIDs for files can be enabled in a specific collection when it is disabled instance-wide. Or it can be disabled in specific collections where it is enabled by default. See the [:FilePIDsEnabled](https://guides.dataverse.org/en/latest/installation/config.html#filepidsenabled) section of the Configuration guide for details.
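
As an illustration, a per-collection override might be set through the collection attributes API. This is only a sketch under assumptions: the `filePIDsEnabled` attribute name and the `/api/dataverses/{id}/attribute/{name}?value=` form should be checked against the Native API Guide, and `SERVER_URL`, `API_TOKEN`, the alias `physics`, and both helper functions are hypothetical names introduced here.

```shell
# Assumed default for illustration only.
SERVER_URL="${SERVER_URL:-http://localhost:8080}"

# Hypothetical helper: builds the attribute URL for a collection (ID or alias)
# and the desired true/false value. The endpoint shape is an assumption to verify.
file_pids_attribute_url() {
  printf '%s/api/dataverses/%s/attribute/filePIDsEnabled?value=%s\n' "$SERVER_URL" "$1" "$2"
}

# Hypothetical helper: sends the change. API_TOKEN must belong to a user
# who is allowed to change the collection's attributes.
set_file_pids_enabled() {
  curl -X PUT -H "X-Dataverse-key:$API_TOKEN" "$(file_pids_attribute_url "$1" "$2")"
}
```

With these helpers, `set_file_pids_enabled physics true` would request file PIDs for the hypothetical `physics` collection, and `set_file_pids_enabled physics false` would turn them off there.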
4 changes: 4 additions & 0 deletions doc/release-notes/9331-extract-bounding-box.md
Original file line number Diff line number Diff line change
@@ -1 +1,5 @@
An attempt will be made to extract a geospatial bounding box (west, south, east, north) from NetCDF and HDF5 files and then insert these values into the geospatial metadata block, if enabled.

The following JVM setting has been added:

- dataverse.netcdf.geo-extract-s3-direct-upload
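
A minimal sketch of how this setting might be applied with Payara's `asadmin create-jvm-options`, assuming the setting takes a boolean value (the note above does not say so). The helper functions `jvm_option` and `enable_geo_extract_for_direct_upload` are hypothetical names used only for this example.

```shell
# Hypothetical helper: formats a -D JVM option from a setting name and a value.
jvm_option() {
  printf '%s%s=%s\n' "-D" "$1" "$2"
}

# Hypothetical helper: registers the option with Payara.
# Assumes asadmin is on the PATH; a Payara restart is typically needed afterwards.
enable_geo_extract_for_direct_upload() {
  asadmin create-jvm-options "$(jvm_option dataverse.netcdf.geo-extract-s3-direct-upload true)"
}
```

Running `enable_geo_extract_for_direct_upload` on the application server would add the option. The installation guide remains the authority on accepted values.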
1 change: 1 addition & 0 deletions doc/release-notes/9387-system_metadata_blocks.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Dataverse supports requiring a secret key to add or edit metadata in specified 'system' metadata blocks. Changing the metadata in such system metadata blocks is not allowed without the key and is currently only allowed via API.
4 changes: 4 additions & 0 deletions doc/release-notes/9431-checksum-alg-in-direct-uploads.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
Direct upload via the Dataverse UI will now support any algorithm configured via the :FileFixityChecksumAlgorithm setting.
External apps using the direct upload API can now query Dataverse to discover which algorithm should be used.

Sites that have been using an algorithm other than MD5 and direct upload and/or dvwebloader may want to use the /api/admin/updateHashValues call (see https://guides.dataverse.org/en/latest/installation/config.html?highlight=updatehashvalues#filefixitychecksumalgorithm) to replace any MD5 hashes on existing files.
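
The call might be issued roughly as follows. This sketch assumes the endpoint takes the target algorithm as a path segment, is invoked with POST, and accepts an API token header; the linked configuration guide is the authority on the exact form. `SERVER_URL`, `API_TOKEN`, and the helper names are hypothetical.

```shell
# Assumed default for illustration only.
SERVER_URL="${SERVER_URL:-http://localhost:8080}"

# Hypothetical helper: builds the admin URL for a target algorithm.
# The algorithm should match the value configured in :FileFixityChecksumAlgorithm.
update_hash_url() {
  printf '%s/api/admin/updateHashValues/%s\n' "$SERVER_URL" "$1"
}

# Hypothetical helper: asks Dataverse to recompute stored hashes with that algorithm.
update_hashes() {
  curl -X POST -H "X-Dataverse-key:$API_TOKEN" "$(update_hash_url "$1")"
}
```

For example, `update_hashes SHA-256` would target installations that have switched to SHA-256.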
1 change: 1 addition & 0 deletions doc/release-notes/9480-h5web.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
A file previewer called H5Web is now available for exploring and visualizing NetCDF and HDF5 files.
6 changes: 6 additions & 0 deletions doc/release-notes/9588-datasets-api-extension.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
The following APIs have been added:

- /api/datasets/summaryFieldNames
- /api/datasets/privateUrlDatasetVersion/{privateUrlToken}
- /api/datasets/privateUrlDatasetVersion/{privateUrlToken}/citation
- /api/datasets/{datasetId}/versions/{version}/citation
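
The following sketch shows how these endpoints might be called with curl. The paths come from the list above; `SERVER_URL`, the example dataset ID `24`, the version `1.0`, and the helper functions are assumptions made for illustration. Whether a given call also needs an API token is not stated in this note.

```shell
# Assumed default for illustration only.
SERVER_URL="${SERVER_URL:-https://demo.dataverse.org}"

# Hypothetical helper: joins the server URL with a path under /api/datasets.
dataset_api_url() {
  printf '%s/api/datasets/%s\n' "$SERVER_URL" "$1"
}

# Hypothetical helper: lists the dataset summary field names.
get_summary_field_names() {
  curl "$(dataset_api_url summaryFieldNames)"
}

# Hypothetical helper: fetches a dataset version through a private URL token.
get_private_url_version() {
  curl "$(dataset_api_url "privateUrlDatasetVersion/$1")"
}

# Hypothetical helper: fetches the citation for a dataset ID and version.
get_version_citation() {
  curl "$(dataset_api_url "$1/versions/$2/citation")"
}
```

For example, `get_version_citation 24 1.0` would request the citation for version 1.0 of the hypothetical dataset 24.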
5 changes: 5 additions & 0 deletions doc/release-notes/9656-api-optional-dataset-params.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
The following fields are now available in the native JSON output:

- alternativePersistentId
- publicationDate
- citationDate
1 change: 1 addition & 0 deletions doc/release-notes/9667-metadatablocks-api-extension.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
The DatasetFieldType attribute "displayFormat" is now returned by the API.
1 change: 1 addition & 0 deletions doc/release-notes/9691-go-client-library.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
A Go client library is now available. See https://preview.guides.gdcc.io/en/develop/api/client-libraries.html
1 change: 1 addition & 0 deletions doc/release-notes/9698-embargo-info-in-ORE.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
The OAI_ORE metadata export (and hence the archival Bag for a dataset) now includes information about file embargoes.
Original file line number Diff line number Diff line change
Expand Up @@ -2,5 +2,5 @@ Tool Type Scope Description
Data Explorer explore file A GUI which lists the variables in a tabular data file allowing searching, charting and cross tabulation analysis. See the README.md file at https://github.com/scholarsportal/dataverse-data-explorer-v2 for the instructions on adding Data Explorer to your Dataverse.
Whole Tale explore dataset A platform for the creation of reproducible research packages that allows users to launch containerized interactive analysis environments based on popular tools such as Jupyter and RStudio. Using this integration, Dataverse users can launch Jupyter and RStudio environments to analyze published datasets. For more information, see the `Whole Tale User Guide <https://wholetale.readthedocs.io/en/stable/users_guide/integration.html>`_.
Binder explore dataset Binder allows you to spin up custom computing environments in the cloud (including Jupyter notebooks) with the files from your dataset. `Installation instructions <https://github.com/data-exp-lab/girder_ythub/issues/10>`_ are in the Data Exploration Lab girder_ythub project. See also :ref:`binder`.
File Previewers explore file A set of tools that display the content of files - including audio, html, `Hypothes.is <https://hypothes.is/>`_ annotations, images, PDF, text, video, tabular data, spreadsheets, GeoJSON, zip, and NcML files - allowing them to be viewed without downloading the file. The previewers can be run directly from github.io, so the only required step is using the Dataverse API to register the ones you want to use. Documentation, including how to optionally brand the previewers, and an invitation to contribute through github are in the README.md file. Initial development was led by the Qualitative Data Repository and the spreadsheet previewer was added by the Social Sciences and Humanities Open Cloud (SSHOC) project. https://github.com/gdcc/dataverse-previewers
File Previewers explore file A set of tools that display the content of files - including audio, html, `Hypothes.is <https://hypothes.is/>`_ annotations, images, PDF, text, video, tabular data, spreadsheets, GeoJSON, zip, HDF5, NetCDF, and NcML files - allowing them to be viewed without downloading the file. The previewers can be run directly from github.io, so the only required step is using the Dataverse API to register the ones you want to use. Documentation, including how to optionally brand the previewers, and an invitation to contribute through github are in the README.md file. Initial development was led by the Qualitative Data Repository and the spreadsheet previewer was added by the Social Sciences and Humanities Open Cloud (SSHOC) project. https://github.com/gdcc/dataverse-previewers
Data Curation Tool configure file A GUI for curating data by adding labels, groups, weights and other details to assist with informed reuse. See the README.md file at https://github.com/scholarsportal/Dataverse-Data-Curation-Tool for the installation instructions.
Original file line number Diff line number Diff line change
Expand Up @@ -14,14 +14,14 @@
{
"locale":"{localeCode}"
}
],
"allowedApiCalls": [
{
"name":"retrieveDatasetJson",
"httpMethod":"GET",
"urlTemplate":"/api/v1/datasets/{datasetId}",
"timeOut":10
}
]
}
]
},
"allowedApiCalls": [
{
"name":"retrieveDatasetJson",
"httpMethod":"GET",
"urlTemplate":"/api/v1/datasets/{datasetId}",
"timeOut":10
}
]
}
Original file line number Diff line number Diff line change
Expand Up @@ -21,14 +21,14 @@
{
"locale":"{localeCode}"
}
],
"allowedApiCalls": [
{
"name":"retrieveDataFile",
"httpMethod":"GET",
"urlTemplate":"/api/v1/access/datafile/{fileId}",
"timeOut":270
}
]
}
},
"allowedApiCalls": [
{
"name":"retrieveDataFile",
"httpMethod":"GET",
"urlTemplate":"/api/v1/access/datafile/{fileId}",
"timeOut":270
}
]
}
47 changes: 43 additions & 4 deletions doc/sphinx-guides/source/admin/dataverses-datasets.rst
Original file line number Diff line number Diff line change
Expand Up @@ -118,6 +118,28 @@ Creates a link between a dataset and a Dataverse collection (see the :ref:`datas

  curl -H "X-Dataverse-key: $API_TOKEN" -X PUT http://$SERVER/api/datasets/$linked-dataset-id/link/$linking-dataverse-alias

List Collections that are Linked from a Dataset
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Lists the link(s) created between a dataset and a Dataverse collection (see the :ref:`dataset-linking` section of the User Guide for more information). ::

  curl -H "X-Dataverse-key: $API_TOKEN" http://$SERVER/api/datasets/$linked-dataset-id/links

It returns a list in the following format:

.. code-block:: json

  {
    "status": "OK",
    "data": {
      "dataverses that link to dataset id 56782": [
        "crc990 (id 18802)"
      ]
    }
  }

.. _unlink-a-dataset:

Unlink a Dataset
^^^^^^^^^^^^^^^^

Expand All @@ -131,15 +153,32 @@ Mint a PID for a File That Does Not Have One
In the following example, the database id of the file is 42::

export FILE_ID=42
curl http://localhost:8080/api/admin/$FILE_ID/registerDataFile
curl "http://localhost:8080/api/admin/$FILE_ID/registerDataFile"

Mint PIDs for all unregistered published files in the specified collection
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Mint PIDs for Files That Do Not Have Them
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The following API will register PIDs for all published files that are not yet registered in the datasets **directly within the collection** specified by its alias::

If you have a large number of files, you might want to consider minting PIDs for files individually using the ``registerDataFile`` endpoint above in a for loop, sleeping between each registration::
  curl "http://localhost:8080/api/admin/registerDataFiles/{collection_alias}"

It will not attempt to register the datafiles in its sub-collections, so this call will need to be repeated on any sub-collections where files need to be registered as well. File-level PID registration must be enabled on the collection. (Note that it is possible to have it enabled for a specific collection, even when it is disabled for the Dataverse installation as a whole. See :ref:`collection-attributes-api` in the Native API Guide.)

This API will sleep for 1 second between registration calls by default. A longer sleep interval can be specified with an optional ``sleep=`` parameter::

  curl "http://localhost:8080/api/admin/registerDataFiles/{collection_alias}?sleep=5"

Mint PIDs for ALL unregistered files in the database
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The following API will attempt to register the PIDs for all the published files in your instance that do not yet have them::

  curl http://localhost:8080/api/admin/registerDataFileAll

The application will attempt to sleep for 1 second between registration attempts so as not to overload your persistent identifier service provider. Note that if you have a large number of files that need to be registered in your Dataverse installation, you may want to consider minting file PIDs within individual collections, or even for individual files, using the ``registerDataFiles`` and/or ``registerDataFile`` endpoints above in a loop, with a longer sleep interval between calls.
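
A per-file loop of that kind could be sketched as follows. It assumes you already have a list of database ids for the files to register. The ids ``42 43 44``, the 5 second interval, and the helper names ``register_file_url`` and ``register_files`` are hypothetical and chosen only for illustration:

.. code-block:: bash

  # Assumed default for illustration only.
  SERVER_URL="${SERVER_URL:-http://localhost:8080}"

  # Hypothetical helper: builds the registerDataFile URL for one file id.
  register_file_url() {
    printf '%s/api/admin/%s/registerDataFile\n' "$SERVER_URL" "$1"
  }

  # Hypothetical helper: first argument is the sleep interval in seconds,
  # the remaining arguments are file database ids.
  register_files() {
    interval="$1"
    shift
    for file_id in "$@"; do
      curl "$(register_file_url "$file_id")"
      sleep "$interval"
    done
  }

For example, ``register_files 5 42 43 44`` would register three files, pausing 5 seconds after each call.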



Mint a New DOI for a Dataset with a Handle
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Expand Down
61 changes: 61 additions & 0 deletions doc/sphinx-guides/source/admin/metadatacustomization.rst
Original file line number Diff line number Diff line change
Expand Up @@ -95,6 +95,11 @@ Each of the three main sections own sets of properties:
| displayName | Acts as a brief label for display related to this | Should be relatively brief. The limit is 256 character, |
| | #metadataBlock. | but very long names might cause display problems. |
+----------------+---------------------------------------------------------+---------------------------------------------------------+
| displayFacet | Label displayed in the search area when this | Should be brief. Long names will cause display problems |
| | #metadataBlock is configured as a search facet | in the search area. |
| | for a collection. See | |
| | :ref:`the API <metadata-block-facet-api>`. | |
+----------------+---------------------------------------------------------+---------------------------------------------------------+
| blockURI | Associates the properties in a block with an external | The citation #metadataBlock has the blockURI |
| | URI. | https://dataverse.org/schema/citation/ which assigns a |
| | Properties will be assigned the | default global URI to terms such as |
Expand Down Expand Up @@ -452,12 +457,16 @@ metadatablock.name=(the value of **name** property from #metadatablock)

metadatablock.displayName=(the value of **displayName** property from #metadatablock)

metadatablock.displayFacet=(the value of **displayFacet** property from #metadatablock)

example:

metadatablock.name=citation

metadatablock.displayName=Citation Metadata

metadatablock.displayFacet=Citation

#datasetField (field) properties
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
datasetfieldtype.(the value of **name** property from #datasetField).title=(the value of **title** property from #datasetField)
Expand Down Expand Up @@ -579,6 +588,58 @@ The scripts required can be hosted locally or retrieved dynamically from https:/

Please note that in addition to the :ref:`:CVocConf` described above, an alternative is the :ref:`:ControlledVocabularyCustomJavaScript` setting.

Protecting MetadataBlocks
-------------------------

Dataverse can be configured to only allow entries for a metadata block to be changed (created, edited, deleted) by entities that know a defined secret key.
Metadata blocks protected by such a key are referred to as "System" metadata blocks.
A primary use case for system metadata blocks is to handle metadata created by third-party tools interacting with Dataverse where unintended changes to the metadata could cause a failure. Examples might include archiving systems or workflow engines.
To protect an existing metadatablock, one must set a key (recommended to be long and un-guessable) for that block:

dataverse.metadata.block-system-metadata-keys.<block name>=<key value>

This can be done using system properties (see :ref:`jvm-options`), environment variables or other MicroProfile Config mechanisms supported by the app server.
`See Payara docs for supported sources <https://docs.payara.fish/community/docs/documentation/microprofile/config/README.html#config-sources>`_. Note that a Payara restart may be required to enable the new option.

For these secret keys, Payara password aliases are recommended.

Alias creation example using the codemeta metadata block (actual name: codeMeta20):

.. code-block:: shell

  echo "AS_ADMIN_ALIASPASSWORD=1234ChangeMeToSomethingLong" > /tmp/key.txt
  asadmin create-password-alias --passwordfile /tmp/key.txt dataverse.metadata.block-system-metadata-keys.codeMeta20
  rm /tmp/key.txt

Alias deletion example for the codemeta metadata block (removes protected status)

.. code-block:: shell

  asadmin delete-password-alias dataverse.metadata.block-system-metadata-keys.codeMeta20

A Payara restart is required after these example commands.

When protected via a key, a metadata block will not be shown in the user interface when a dataset is being created or when metadata is being edited. Entries in such a system metadata block will be shown to users, consistent with Dataverse's design in which all metadata in published datasets is publicly visible.

Note that protecting a block with required fields, or using a template with an entry in a protected block, will make it impossible to create a new dataset via the user interface. Also note that for this reason protecting the citation metadatablock is not recommended. (Creating a dataset also automatically sets the date of deposit field in the citation block, which would be prohibited if the citation block is protected.)

To remove protected status and return a block to working normally, remove the associated key.

To add metadata to a system metadata block via API, one must include an additional key of the form

mdkey.<blockName>=<key value>

as an HTTP Header or query parameter (case sensitive) for each system metadata block to any API call in which metadata values are changed in that block. Multiple keys are allowed if more than one system metadatablock is being changed in a given API call.

For example, following the :ref:`Add Dataset Metadata <add-semantic-metadata>` example from the :doc:`/developers/dataset-semantic-metadata-api`:

.. code-block:: bash

  curl -X PUT -H X-Dataverse-key:$API_TOKEN -H 'Content-Type: application/ld+json' -H 'mdkey.codeMeta20:1234ChangeMeToSomethingLong' -d '{"codeVersion": "1.0.0", "@context":{"codeVersion": "https://schema.org/softwareVersion"}}' "$SERVER_URL/api/datasets/$DATASET_ID/metadata"

  curl -X PUT -H X-Dataverse-key:$API_TOKEN -H 'Content-Type: application/ld+json' -d '{"codeVersion": "1.0.1", "@context":{"codeVersion": "https://schema.org/softwareVersion"}}' "$SERVER_URL/api/datasets/$DATASET_ID/metadata?mdkey.codeMeta20=1234ChangeMeToSomethingLong&replace=true"

Tips from the Dataverse Community
---------------------------------

Expand Down
4 changes: 4 additions & 0 deletions doc/sphinx-guides/source/api/client-libraries.rst
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,10 @@ https://github.com/aeonSolutions/OpenScience-Dataverse-API-C-library is the offi

This C/C++ library was created and is currently maintained by `Miguel T. <https://www.linkedin.com/in/migueltomas/>`_ To learn how to install and use it, see the project's `wiki page <https://github.com/aeonSolutions/OpenScience-Dataverse-API-C-library/wiki>`_.

Go
--
https://github.com/libis/rdm-dataverse-go-api is a Go API library that can be used in your project by simply adding ``github.com/libis/rdm-dataverse-go-api`` as a dependency in your ``go.mod`` file. See the GitHub page for more details and usage examples.

Java
----

Expand Down
19 changes: 19 additions & 0 deletions doc/sphinx-guides/source/api/curation-labels.rst
Original file line number Diff line number Diff line change
Expand Up @@ -93,3 +93,22 @@ To get the list of allowed curation labels allowed for a given Dataset
curl -H X-Dataverse-key:$API_TOKEN "$SERVER_URL/api/datasets/:persistentId/allowedCurationLabels?persistentId=$DATASET_PID"
You should expect a 200 ("OK") response with a comma-separated list of allowed labels contained in a JSON 'data' object.


Get a Report on the Curation Status of All Datasets
---------------------------------------------------

This API call returns a CSV file listing the curation label assigned to each Dataset with a draft version, along with the creation and last modification dates and a list of those with permission to publish the version.

This API call is restricted to superusers.

.. code-block:: bash

  export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  export SERVER_URL=https://demo.dataverse.org

  # Example: get the report
  curl -H X-Dataverse-key:$API_TOKEN "$SERVER_URL/api/datasets/listCurationStates"

You should expect a 200 ("OK") response with the report in CSV format.
14 changes: 14 additions & 0 deletions doc/sphinx-guides/source/api/getting-started.rst
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,20 @@ If you ever want to check an environment variable, you can "echo" it like this:
echo $SERVER_URL

With curl version 7.56.0 and higher, it is recommended to pass JSON using ``--form-string`` with the whole argument wrapped in single quotes, rather than ``-F`` with an unquoted argument.

For example, the curl parameter below might cause an error such as ``warning: garbage at end of field specification: ,"categories":["Data"]}``.

.. code-block:: bash

  -F jsonData={\"description\":\"My description.\",\"categories\":[\"Data\"]}

Instead, use ``--form-string`` with outer quotes. See https://github.com/curl/curl/issues/2022

.. code-block:: bash

  --form-string 'jsonData={"description":"My description.","categories":["Data"]}'

If you don't like curl, don't have curl, or want to use a different programming language, you are encouraged to check out the Python, JavaScript, R, and Java options in the :doc:`client-libraries` section.

.. _curl: https://curl.haxx.se
Expand Down