-
Notifications
You must be signed in to change notification settings - Fork 500
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
IQSS/6497 migrate dataset api #7504
IQSS/6497 migrate dataset api #7504
Conversation
FWIW: We have an error handler for the edu.harvard.iq.dataverse.util.json.JsonParseException class but not for javax.json.stream.JsonParsingException which was getting caught by the Throwable handler and returned as a 500 error with json message {}
and avoid having contexts with specific entries for terms that are in a namespace already
Conflicts: src/main/java/edu/harvard/iq/dataverse/util/bagit/OREMap.java
which can be used with existing AbstractApiBean.ok()
Co-authored-by: Philip Durbin <philipdurbin@gmail.com>
IQSS/6497-migrate_dataset_api Conflicts: src/main/java/edu/harvard/iq/dataverse/util/json/JSONLDUtil.java
…obalDataverseCommunityConsortium/dataverse.git into IQSS/6497-migrate_dataset_api
@@ -22,7 +22,7 @@ This dashboard tool allows you to define sets of local datasets to make availabl | |||
Metadata Export | |||
--------------- | |||
|
|||
This part of the Dashboard is simply a reminder message that metadata export happens through the Dataverse Software API. See the :doc:`metadataexport` section and the :doc:`/api/native-api` section of the API Guide for more details. | |||
This part of the Dashboard is simply a reminder message that metadata export happens through the Dataverse Software API. See the :doc:`/admin/metadataexport` section and the :doc:`/api/native-api` section of the API Guide for more details. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW: unrelated - just found it when fixing another link
if(migrating) { | ||
// unparsable PID passed. Terminate. | ||
throw new BadRequestException("Cannot parse the @id. Make sure it is in valid form - see Dataverse Native API documentation."); | ||
if (migrating) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW: Submitting an '@id' value for the dataset PID, and a modificationdate (next diff) are only allowed for migrate (not valid in the create dataset case), so these are cleaner fixes to just skip running into the null values in the create case.
OK - with the semantic api PR merged, this is ready for review - just minor changes beyond the new migrate and releasemigrated methods. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code-wise, I think this is ready for QA but there are two things I think we should fix:
- The new integrations tests are yielding a 500 error.
- We should update the list of APIs to include this one and others that have been recently added. Otherwise they're hard to find. @qqmyers let me know if you'd like me to make this change.
// Now use the migrate API to recreate the dataset | ||
response = UtilIT.recreateDatasetJsonLD(apiToken, dataverseAlias, expectedString); | ||
String body = response.getBody().asString(); | ||
response.then().assertThat().statusCode(CREATED.getStatusCode()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -0,0 +1,51 @@ | |||
Dataset Migration API |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we're seeing a number of APIs show up under the Dev Guide, it's probably time to update the "List of Dataverse APIs" I started when I did a light re-write of the API Guide. As of https://guides.dataverse.org/en/5.5/api/intro.html#lists-of-dataverse-apis this is how it looks:
@qqmyers can you please update the "Please note that some APIs are only documented in other guides that are more suited to their audience" section and add "Developer Guide" and under that recent APIs that have been added? (Please include aux files in this.) Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you please update the "Please note that some APIs are only documented in other guides that are more suited to their audience" section and add "Developer Guide" and under that recent APIs that have been added?
@qqmyers I just opened this pull request for this: GlobalDataverseCommunityConsortium#13
add dev guide links to list of APIs IQSS#6497
OK - @pdurbin made the doc change he asked for and I fixed the test - moving to QA (feel free to drop it back if there's something else that needs to happen first). |
Some questions/issues encountered:
|
What this PR does / why we need it: This PR supports migration of datasets from other systems, keeping their PID, version, and publicationdate, and allowing file upload via the existing API. It leverages leveraging the json-ld metadata api from #7414.
(Note that #7414 in turn requires resolution of updating our Java version or having a work-around for Java 8 added there - see that PR for details).
Which issue(s) this PR closes:
Closes #6497
Special notes for your reviewer: This is similar to the :import API but is different (hopefully usefully) in a few ways:
FYI: The last step is slightly different than the normal publish step since version and publication dates are preserved. The PR does update the PID/DOI (e.g. the landing URL specifically) assuming it has the authority/shoulder configured in Dataverse.
Suggestions on how to test this: The DVUploader uses this API to re-import Bags. Nominally running an upload on Bags of the sample datasets would test this fairly well (current versions at https://drive.google.com/drive/folders/1YXAmJLziFs7XhsvRfqfZ6o9AvLXywyBT?usp=sharing , note that work to improve the Bag/ORE contents is ongoing, so these may get out of date w.r.t. code, so check dates/remind me if an update looks like its needed).
Assuming #7414 is tested separately, the testing here could/should focus on just the added migrate calls, so looking for datasets with a broad range of metadata probably isn't necessary (since testing #7414 covers that.) and the things to look at in a migrated dataset are the DOI, creation/publication dates, whether the DOI is updated at DataCite, etc.
Does this PR introduce a user interface change? If mockups are available, please link/include them here: Not directly - there are questions in the community (remote store/registry metadata working group topic, DANS migration, etc.) as to how one should indicate a migrated dataset, what metadata about the source should be captured, etc. This PR doesn't add any metadata or make an display changes, but this PR may increase interest in resolving these other questions.
Is there a release notes update needed for this change?:
Probably worthwhile to note it exists. As a v~1 implementation, I'd phrase this as an ~experimental API (I don't think we've decided exactly how to phrase this but there have been other recent API adds in this category).
Additional documentation: