Commit 5726a99

Author: Don Sizemore
Commit message: #10179 merge with develop
2 parents a9f091c + c4940c8 commit 5726a99

File tree: 469 files changed, +18127 −7471 lines


.env

+2-1
@@ -1,4 +1,5 @@
 APP_IMAGE=gdcc/dataverse:unstable
-POSTGRES_VERSION=13
+POSTGRES_VERSION=16
 DATAVERSE_DB_USER=dataverse
 SOLR_VERSION=9.3.0
+SKIP_DEPLOY=0
@@ -0,0 +1,101 @@
+name: Maven Cache Management
+
+on:
+  # Every push to develop should trigger cache rejuvenation (dependencies might have changed)
+  push:
+    branches:
+      - develop
+  # According to https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#usage-limits-and-eviction-policy
+  # all caches are deleted after 7 days of no access. Make sure we rejuvenate every 7 days to keep it available.
+  schedule:
+    - cron: '23 2 * * 0' # Run for 'develop' every Sunday at 02:23 UTC (3:23 CET, 21:23 ET)
+  # Enable manual cache management
+  workflow_dispatch:
+  # Delete branch caches once a PR is merged
+  pull_request:
+    types:
+      - closed
+
+env:
+  COMMON_CACHE_KEY: "dataverse-maven-cache"
+  COMMON_CACHE_PATH: "~/.m2/repository"
+
+jobs:
+  seed:
+    name: Drop and Re-Seed Local Repository
+    runs-on: ubuntu-latest
+    if: ${{ github.event_name != 'pull_request' }}
+    permissions:
+      # Write permission needed to delete caches
+      # See also: https://docs.github.com/en/rest/actions/cache?apiVersion=2022-11-28#delete-a-github-actions-cache-for-a-repository-using-a-cache-id
+      actions: write
+      contents: read
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+      - name: Determine Java version from Parent POM
+        run: echo "JAVA_VERSION=$(grep '<target.java.version>' modules/dataverse-parent/pom.xml | cut -f2 -d'>' | cut -f1 -d'<')" >> ${GITHUB_ENV}
+      - name: Set up JDK ${{ env.JAVA_VERSION }}
+        uses: actions/setup-java@v4
+        with:
+          java-version: ${{ env.JAVA_VERSION }}
+          distribution: temurin
+      - name: Seed common cache
+        run: |
+          mvn -B -f modules/dataverse-parent dependency:go-offline dependency:resolve-plugins
+      # This non-obvious order is due to the fact that the download via Maven above will take a very long time (7-8 min).
+      # Jobs should not be left without a cache. Deleting and saving in one go leaves only a small chance for a cache miss.
+      - name: Drop common cache
+        run: |
+          gh extension install actions/gh-actions-cache
+          echo "🛒 Fetching list of cache keys"
+          cacheKeys=$(gh actions-cache list -R ${{ github.repository }} -B develop | cut -f 1 )
+
+          ## Setting this to not fail the workflow while deleting cache keys.
+          set +e
+          echo "🗑️ Deleting caches..."
+          for cacheKey in $cacheKeys
+          do
+            gh actions-cache delete $cacheKey -R ${{ github.repository }} -B develop --confirm
+          done
+          echo "✅ Done"
+        env:
+          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+      - name: Save the common cache
+        uses: actions/cache@v4
+        with:
+          path: ${{ env.COMMON_CACHE_PATH }}
+          key: ${{ env.COMMON_CACHE_KEY }}
+          enableCrossOsArchive: true
+
+  # Let's delete feature branch caches once their PR is merged - we only have 10 GB of space before eviction kicks in
+  deplete:
+    name: Deplete feature branch caches
+    runs-on: ubuntu-latest
+    if: ${{ github.event_name == 'pull_request' }}
+    permissions:
+      # `actions:write` permission is required to delete caches
+      # See also: https://docs.github.com/en/rest/actions/cache?apiVersion=2022-11-28#delete-a-github-actions-cache-for-a-repository-using-a-cache-id
+      actions: write
+      contents: read
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+      - name: Cleanup caches
+        run: |
+          gh extension install actions/gh-actions-cache
+
+          BRANCH=refs/pull/${{ github.event.pull_request.number }}/merge
+          echo "🛒 Fetching list of cache keys"
+          cacheKeysForPR=$(gh actions-cache list -R ${{ github.repository }} -B $BRANCH | cut -f 1 )
+
+          ## Setting this to not fail the workflow while deleting cache keys.
+          set +e
+          echo "🗑️ Deleting caches..."
+          for cacheKey in $cacheKeysForPR
+          do
+            gh actions-cache delete $cacheKey -R ${{ github.repository }} -B $BRANCH --confirm
+          done
+          echo "✅ Done"
+        env:
+          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
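The "Determine Java version from Parent POM" step above extracts the version with a grep/cut pipeline. As a rough illustration, the same extraction can be sketched in Python (the POM fragment and the version value `17` are assumptions for the example, not taken from this commit):

```python
import re

# Assumed POM fragment for illustration; the real value lives in
# modules/dataverse-parent/pom.xml.
POM_SNIPPET = """
<properties>
    <target.java.version>17</target.java.version>
</properties>
"""

def java_version_from_pom(pom_text: str) -> str:
    # Equivalent of: grep '<target.java.version>' | cut -f2 -d'>' | cut -f1 -d'<'
    match = re.search(r"<target\.java\.version>([^<]*)</target\.java\.version>", pom_text)
    if match is None:
        raise ValueError("target.java.version not found")
    return match.group(1)

print(java_version_from_pom(POM_SNIPPET))  # → 17
```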

.github/workflows/maven_unit_test.yml

+2
@@ -4,13 +4,15 @@ on:
   push:
     paths:
       - "**.java"
+      - "**.sql"
       - "pom.xml"
       - "modules/**/pom.xml"
       - "!modules/container-base/**"
       - "!modules/dataverse-spi/**"
   pull_request:
     paths:
       - "**.java"
+      - "**.sql"
       - "pom.xml"
       - "modules/**/pom.xml"
       - "!modules/container-base/**"

.gitignore

+1
@@ -61,3 +61,4 @@ src/main/webapp/resources/images/dataverseproject.png.thumb140
 
 # Docker development volumes
 /docker-dev-volumes
+/.vs

CONTRIBUTING.md

+2-2
@@ -56,12 +56,12 @@ If you are interested in working on the main Dataverse code, great! Before you s
 
 Please read http://guides.dataverse.org/en/latest/developers/version-control.html to understand how we use the "git flow" model of development and how we will encourage you to create a GitHub issue (if it doesn't exist already) to associate with your pull request. That page also includes tips on making a pull request.
 
-After making your pull request, your goal should be to help it advance through our kanban board at https://github.com/orgs/IQSS/projects/2 . If no one has moved your pull request to the code review column in a timely manner, please reach out. Note that once a pull request is created for an issue, we'll remove the issue from the board so that we only track one card (the pull request).
+After making your pull request, your goal should be to help it advance through our kanban board at https://github.com/orgs/IQSS/projects/34 . If no one has moved your pull request to the code review column in a timely manner, please reach out. Note that once a pull request is created for an issue, we'll remove the issue from the board so that we only track one card (the pull request).
 
 Thanks for your contribution!
 
 [dataverse-community Google Group]: https://groups.google.com/group/dataverse-community
 [Community Call]: https://dataverse.org/community-calls
 [dataverse-dev Google Group]: https://groups.google.com/group/dataverse-dev
 [community contributors]: https://docs.google.com/spreadsheets/d/1o9DD-MQ0WkrYaEFTD5rF_NtyL8aUISgURsAXSL7Budk/edit?usp=sharing
-[dev efforts]: https://github.com/orgs/IQSS/projects/2#column-5298405
+[dev efforts]: https://github.com/orgs/IQSS/projects/34/views/6

README.md

+2-1
@@ -3,7 +3,7 @@ Dataverse&#174;
 
 Dataverse is an [open source][] software platform for sharing, finding, citing, and preserving research data (developed by the [Dataverse team](https://dataverse.org/about) at the [Institute for Quantitative Social Science](https://iq.harvard.edu/) and the [Dataverse community][]).
 
-[dataverse.org][] is our home on the web and shows a map of Dataverse installations around the world, a list of [features][], [integrations][] that have been made possible through [REST APIs][], our development [roadmap][], and more.
+[dataverse.org][] is our home on the web and shows a map of Dataverse installations around the world, a list of [features][], [integrations][] that have been made possible through [REST APIs][], our [project board][], our development [roadmap][], and more.
 
 We maintain a demo site at [demo.dataverse.org][] which you are welcome to use for testing and evaluating Dataverse.
 
@@ -29,6 +29,7 @@ Dataverse is a trademark of President and Fellows of Harvard College and is regi
 [Installation Guide]: https://guides.dataverse.org/en/latest/installation/index.html
 [latest release]: https://github.com/IQSS/dataverse/releases
 [features]: https://dataverse.org/software-features
+[project board]: https://github.com/orgs/IQSS/projects/34
 [roadmap]: https://www.iq.harvard.edu/roadmap-dataverse-project
 [integrations]: https://dataverse.org/integrations
 [REST APIs]: https://guides.dataverse.org/en/latest/api/index.html

conf/proxy/Caddyfile

+12
@@ -0,0 +1,12 @@
+# This configuration is intended to be used with Caddy, a very small high-perf proxy.
+# It will serve the application container's Payara Admin GUI via HTTP instead of HTTPS,
+# avoiding the trouble of self-signed certificates for local development.
+
+:4848 {
+    reverse_proxy https://dataverse:4848 {
+        transport http {
+            tls_insecure_skip_verify
+        }
+        header_down Location "^https://" "http://"
+    }
+}
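The `header_down Location "^https://" "http://"` line rewrites the Location header of redirects coming back from Payara, keeping the browser on plain HTTP. A minimal Python sketch of that observable effect (not of Caddy's implementation):

```python
import re

def rewrite_location(location_header: str) -> str:
    # Mirror of the header_down directive above: replace a leading
    # "https://" with "http://" in the downstream Location header.
    return re.sub(r"^https://", "http://", location_header)

# A redirect issued by the Payara Admin GUI stays on HTTP for the browser.
print(rewrite_location("https://dataverse:4848/common/index.jsf"))
```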

conf/solr/9.3.0/schema.xml

+8-5
@@ -157,7 +157,8 @@
     <field name="publicationStatus" type="string" stored="true" indexed="true" multiValued="true"/>
     <field name="externalStatus" type="string" stored="true" indexed="true" multiValued="false"/>
     <field name="embargoEndDate" type="plong" stored="true" indexed="true" multiValued="false"/>
-
+    <field name="retentionEndDate" type="plong" stored="true" indexed="true" multiValued="false"/>
+
     <field name="subtreePaths" type="string" stored="true" indexed="true" multiValued="true"/>
 
     <field name="fileName" type="text_en" stored="true" indexed="true" multiValued="true"/>
@@ -229,6 +230,8 @@
 
     <!-- incomplete datasets issue 8822 -->
     <field name="datasetValid" type="boolean" stored="true" indexed="true" multiValued="false"/>
+
+    <field name="license" type="string" stored="true" indexed="true" multiValued="false"/>
 
     <!--
         METADATA SCHEMA FIELDS
@@ -327,7 +330,7 @@
     <field name="keywordVocabularyURI" type="text_en" multiValued="true" stored="true" indexed="true"/>
     <field name="kindOfData" type="text_en" multiValued="true" stored="true" indexed="true"/>
     <field name="language" type="text_en" multiValued="true" stored="true" indexed="true"/>
-    <field name="northLongitude" type="text_en" multiValued="true" stored="true" indexed="true"/>
+    <field name="northLatitude" type="text_en" multiValued="true" stored="true" indexed="true"/>
     <field name="notesText" type="text_en" multiValued="false" stored="true" indexed="true"/>
     <field name="originOfSources" type="text_en" multiValued="false" stored="true" indexed="true"/>
     <field name="otherDataAppraisal" type="text_en" multiValued="false" stored="true" indexed="true"/>
@@ -370,7 +373,7 @@
     <field name="software" type="text_en" multiValued="true" stored="true" indexed="true"/>
     <field name="softwareName" type="text_en" multiValued="true" stored="true" indexed="true"/>
     <field name="softwareVersion" type="text_en" multiValued="true" stored="true" indexed="true"/>
-    <field name="southLongitude" type="text_en" multiValued="true" stored="true" indexed="true"/>
+    <field name="southLatitude" type="text_en" multiValued="true" stored="true" indexed="true"/>
     <field name="state" type="text_en" multiValued="true" stored="true" indexed="true"/>
     <field name="studyAssayCellType" type="text_en" multiValued="true" stored="true" indexed="true"/>
     <field name="studyAssayMeasurementType" type="text_en" multiValued="true" stored="true" indexed="true"/>
@@ -566,7 +569,7 @@
     <copyField source="keywordVocabularyURI" dest="_text_" maxChars="3000"/>
     <copyField source="kindOfData" dest="_text_" maxChars="3000"/>
     <copyField source="language" dest="_text_" maxChars="3000"/>
-    <copyField source="northLongitude" dest="_text_" maxChars="3000"/>
+    <copyField source="northLatitude" dest="_text_" maxChars="3000"/>
     <copyField source="notesText" dest="_text_" maxChars="3000"/>
     <copyField source="originOfSources" dest="_text_" maxChars="3000"/>
     <copyField source="otherDataAppraisal" dest="_text_" maxChars="3000"/>
@@ -609,7 +612,7 @@
     <copyField source="software" dest="_text_" maxChars="3000"/>
     <copyField source="softwareName" dest="_text_" maxChars="3000"/>
     <copyField source="softwareVersion" dest="_text_" maxChars="3000"/>
-    <copyField source="southLongitude" dest="_text_" maxChars="3000"/>
+    <copyField source="southLatitude" dest="_text_" maxChars="3000"/>
     <copyField source="state" dest="_text_" maxChars="3000"/>
     <copyField source="studyAssayCellType" dest="_text_" maxChars="3000"/>
     <copyField source="studyAssayMeasurementType" dest="_text_" maxChars="3000"/>
@@ -0,0 +1,10 @@
+Detection of MIME types based on a filename with extension, and detection of RO-Crate metadata files.
+
+From now on, filenames with extensions can be added to the `MimeTypeDetectionByFileName.properties` file. Filenames added there take precedence over simple recognition of files by extension. For example, two new filenames have been added to that file:
+```
+ro-crate-metadata.json=application/ld+json; profile="http://www.w3.org/ns/json-ld#flattened http://www.w3.org/ns/json-ld#compacted https://w3id.org/ro/crate"
+ro-crate-metadata.jsonld=application/ld+json; profile="http://www.w3.org/ns/json-ld#flattened http://www.w3.org/ns/json-ld#compacted https://w3id.org/ro/crate"
+```
+
+As a result, files named `ro-crate-metadata.json` will now be detected as RO-Crate metadata files rather than as generic JSON files.
+For more information on the RO-Crate specifications, see https://www.researchobject.org/ro-crate
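The precedence rule described in this note can be sketched as follows; the lookup tables and fallback type are simplified assumptions for illustration, not the actual Dataverse implementation:

```python
# Simplified sketch of filename-before-extension MIME detection.
BY_FILENAME = {
    # Entry analogous to the new MimeTypeDetectionByFileName.properties lines
    # (profile parameter shortened for readability).
    "ro-crate-metadata.json": 'application/ld+json; profile="https://w3id.org/ro/crate"',
}
BY_EXTENSION = {
    "json": "application/json",
}

def detect_mime_type(filename: str) -> str:
    # Exact filename matches win over plain extension-based detection.
    if filename in BY_FILENAME:
        return BY_FILENAME[filename]
    extension = filename.rsplit(".", 1)[-1].lower()
    return BY_EXTENSION.get(extension, "application/octet-stream")

print(detect_mime_type("ro-crate-metadata.json"))  # RO-Crate type wins
print(detect_mime_type("data.json"))               # falls back to extension
```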
@@ -0,0 +1,5 @@
+If your S3 store does not support tagging and gives an error when you configure direct uploads, you can disable tagging by using the ``dataverse.files.<id>.disable-tagging`` JVM option. For more details, see https://dataverse-guide--10029.org.readthedocs.build/en/10029/developers/big-data-support.html#s3-tags as well as #10022 and #10029.
+
+## New config options
+
+- dataverse.files.<id>.disable-tagging
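As a hedged example of how such a JVM option is typically set on a Payara-based installation (the store id `mystore` is a placeholder, and the path to `asadmin` varies per install):

```shell
# Placeholder store id "mystore"; substitute your own S3 store's id.
./asadmin create-jvm-options "-Ddataverse.files.mystore.disable-tagging=true"
```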
@@ -0,0 +1 @@
+Fixed a bug where the ``incomplete metadata`` label was shown for published datasets with incomplete metadata in certain scenarios. This label is now shown for draft versions of such datasets and for published datasets that the user can edit. The label can also be hidden for published datasets (regardless of edit rights) by setting the new option ``dataverse.ui.show-validity-label-when-published`` to `false`.
@@ -0,0 +1 @@
+New API endpoints have been added to allow you to add or remove featured collections from a Dataverse collection.
@@ -0,0 +1,5 @@
+You can now add HTTP request headers required by the external vocabulary services you are implementing.
+
+Combined documentation can be found in pull request [#10404](https://github.com/IQSS/dataverse/pull/10404).
+
+For more information, see issue [#10316](https://github.com/IQSS/dataverse/issues/10316) and pull request [gdcc/dataverse-external-vocab-support#19](https://github.com/gdcc/dataverse-external-vocab-support/pull/19).
@@ -0,0 +1 @@
+The API endpoint for getting the dataset version has been extended to include `latestVersionPublishingStatus`.

doc/release-notes/10339-workflow.md

+3
@@ -0,0 +1,3 @@
+The computational workflow metadata block has been updated to present a clickable link for the External Code Repository URL field.
+
+Release notes should include the usual instructions, for those who have installed this optional block, to update the computational_workflow block. (PR #10441)
@@ -0,0 +1,6 @@
+New optional query parameters have been added to the ``api/metadatablocks`` and ``api/dataverses/{id}/metadatablocks`` endpoints:
+
+- ``returnDatasetFieldTypes``: Whether or not to return the dataset field types present in each metadata block. If not set, the default value is false.
+- ``onlyDisplayedOnCreate``: Whether or not to return only the metadata blocks that are displayed on dataset creation. If ``returnDatasetFieldTypes`` is true, only the dataset field types shown on dataset creation will be returned within each metadata block. If not set, the default value is false.
+
+A new ``displayOnCreate`` field has been added to the MetadataBlock and DatasetFieldType payloads.
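For illustration, a client might assemble a request URL using both new parameters like this (the base URL and the collection alias `root` are assumptions for the example):

```python
from urllib.parse import urlencode

BASE = "http://localhost:8080/api/dataverses/root/metadatablocks"

params = {
    "returnDatasetFieldTypes": "true",  # include dataset field types per block
    "onlyDisplayedOnCreate": "true",    # restrict to what is shown at dataset creation
}
url = f"{BASE}?{urlencode(params)}"
print(url)
```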
@@ -0,0 +1,4 @@
+The following have been optimized for scenarios involving API calls on large datasets (with numerous files, e.g. ~10k):
+
+- The search API endpoint.
+- The permission-checking logic in PermissionServiceBean.
@@ -0,0 +1,3 @@
+A new file, licenseMIT.json, has been added for importing the MIT License into Dataverse.
+
+Documentation has been added to the guides explaining the procedure for adding new licenses.
@@ -0,0 +1,3 @@
+The Metadata Source facet has been updated to show the name of the harvesting client rather than grouping all such datasets under 'harvested'.
+
+TODO: for the v6.13 release note: Please add a full re-index using http://localhost:8080/api/admin/index to the upgrade instructions.
@@ -0,0 +1 @@
+DataLad has been integrated with Dataverse. For more information, see https://dataverse-guide--10470.org.readthedocs.build/en/10470/admin/integrations.html#datalad
@@ -0,0 +1,3 @@
+Changed ``api/dataverses/{id}/metadatablocks`` so that setting the query parameter ``onlyDisplayedOnCreate=true`` also returns metadata blocks with dataset field type input levels configured as required on the General Information page of the collection, in addition to the metadata blocks and their fields with the property ``displayOnCreate=true`` (which was the original behavior).
+
+A new endpoint, ``api/dataverses/{id}/inputLevels``, has been created for updating the dataset field type input levels of a collection via API.
@@ -0,0 +1,22 @@
+The Dataverse object returned by /api/dataverses has been extended to include "isReleased": {boolean}.
+```javascript
+{
+  "status": "OK",
+  "data": {
+    "id": 32,
+    "alias": "dv6f645bb5",
+    "name": "dv6f645bb5",
+    "dataverseContacts": [
+      {
+        "displayOrder": 0,
+        "contactEmail": "54180268@mailinator.com"
+      }
+    ],
+    "permissionRoot": true,
+    "dataverseType": "UNCATEGORIZED",
+    "ownerId": 1,
+    "creationDate": "2024-04-12T18:05:59Z",
+    "isReleased": true
+  }
+}
+```
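A consumer of this endpoint can read the new flag straight from the payload; a small sketch using a trimmed copy of the sample response above:

```python
import json

# Trimmed version of the sample response shown in the release note.
RESPONSE = '''
{
  "status": "OK",
  "data": {
    "id": 32,
    "alias": "dv6f645bb5",
    "isReleased": true
  }
}
'''

payload = json.loads(RESPONSE)
is_released = payload["data"]["isReleased"]
print(is_released)  # → True
```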

doc/release-notes/6.1-release-notes.md

+2-1
@@ -247,7 +247,7 @@ Upgrading requires a maintenance window and downtime. Please plan ahead, create
 
 These instructions assume that you've already upgraded through all the 5.x releases and are now running Dataverse 6.0.
 
-0\. These instructions assume that you are upgrading from 6.0. If you are running an earlier version, the only safe way to upgrade is to progress through the upgrades to all the releases in between before attempting the upgrade to 5.14.
+0\. These instructions assume that you are upgrading from 6.0. If you are running an earlier version, the only safe way to upgrade is to progress through the upgrades to all the releases in between before attempting the upgrade to 6.1.
 
 If you are running Payara as a non-root user (and you should be!), **remember not to execute the commands below as root**. Use `sudo` to change to that user first. For example, `sudo -i -u dataverse` if `dataverse` is your dedicated application user.
 
@@ -288,6 +288,7 @@ As noted above, deployment of the war file might take several minutes due a data
 
 6a\. Update Citation Metadata Block (to make Alternative Title repeatable)
 
+- `wget https://github.com/IQSS/dataverse/releases/download/v6.1/citation.tsv`
 - `curl http://localhost:8080/api/admin/datasetfield/load -H "Content-type: text/tab-separated-values" -X POST --upload-file scripts/api/data/metadatablocks/citation.tsv`
 
 7\. Upate Solr schema.xml to allow multiple Alternative Titles to be used. See specific instructions below for those installations without custom metadata blocks (7a) and those with custom metadata blocks (7b).
