Prefix FILEID for pointer file METS creation #473

Merged
merged 1 commit into from
Jun 25, 2019

@ross-spencer ross-spencer commented Jun 23, 2019

This commit ensures that IDs and FILEIDs are prefixed when
generated for pointer files belonging to an AIP. The prefix, along
with the name clean-up procedures in the transfer/ingest workflow, ensures
that the ID/FILEID remains a valid NCName, as the XML standards
require for ID values.

Connected to archivematica/Issues#660

The output will be a fileSec and structMap in a pointer file that looks as follows:

  <mets:fileSec>
    <mets:fileGrp USE="Archival Information Package">
      <mets:file ID="file-my-transfer-5bc5ac2a-951f-4bc5-9039-08ec619eef1f" GROUPID="Group-file-5bc5ac2a-951f-4bc5-9039-08ec619eef1f" ADMID="amdSec_2">
        <mets:FLocat xlink:href="/var/archivematica/sharedDirectory/www/AIPsStore/5bc5/ac2a/951f/4bc5/9039/08ec/619e/ef1f/my-transfer-5bc5ac2a-951f-4bc5-9039-08ec619eef1f.7z" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>
        <mets:transformFile TRANSFORMTYPE="decompression" TRANSFORMORDER="1" TRANSFORMALGORITHM="bzip2"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap ID="structMap_1" LABEL="Archivematica default" TYPE="physical">
    <mets:div TYPE="Archival Information Package" LABEL="my-transfer-5bc5ac2a-951f-4bc5-9039-08ec619eef1f.7z">
      <mets:fptr FILEID="file-my-transfer-5bc5ac2a-951f-4bc5-9039-08ec619eef1f"/>
    </mets:div>
  </mets:structMap>
  <mets:structMap ID="structMap_2" LABEL="Normative Directory Structure" TYPE="logical">
    <mets:div TYPE="Archival Information Package" LABEL="my-transfer-5bc5ac2a-951f-4bc5-9039-08ec619eef1f.7z"/>
  </mets:structMap>

Note the ID and FILEID in the respective instances are both prefixed with file- per the original issue.
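
To illustrate why the prefix matters, here is a minimal sketch in Python. It is not the code from this PR: the helper names `is_ncname` and `prefixed_id` are hypothetical, and the regular expression is a simplified, ASCII-only approximation of the NCName rule (the real production also allows many non-ASCII characters). An NCName cannot start with a digit, so an ID built from a bare UUID or from a transfer name that starts with a number could be invalid without the prefix.

```python
import re

# Simplified, ASCII-only approximation of an NCName: the first character
# must be a letter or underscore; later characters may also be digits,
# dots, or hyphens. Colons are never allowed.
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_ncname(value):
    """Return True if value matches the simplified NCName pattern."""
    return NCNAME_ASCII.fullmatch(value) is not None


def prefixed_id(name, prefix="file-"):
    """Hypothetical helper: prepend a prefix so the ID starts with a letter."""
    return prefix + name


aip_uuid = "5bc5ac2a-951f-4bc5-9039-08ec619eef1f"
print(is_ncname(aip_uuid))               # starts with a digit
print(is_ncname(prefixed_id(aip_uuid)))  # starts with "file-"
```

The prefix only guarantees a valid first character; the rest of the name still depends on the clean-up performed earlier in the workflow.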

This structure will also persist following reingest, which is why this PR requires artefactual-labs/mets-reader-writer#72.
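
One way to check that the ID and FILEID stay in step, for example before and after a reingest, is sketched below. It uses only the Python standard library rather than metsrw, the function name `dangling_fptrs` is hypothetical, and the sample document is a trimmed version of the output above wrapped in an assumed `mets:mets` root so that it parses on its own.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# Trimmed sample based on the pointer file output shown above. The root
# element and namespace declaration are added here so the snippet parses.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:fileSec>
    <mets:fileGrp USE="Archival Information Package">
      <mets:file ID="file-my-transfer-5bc5ac2a-951f-4bc5-9039-08ec619eef1f"/>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap ID="structMap_1" TYPE="physical">
    <mets:div TYPE="Archival Information Package">
      <mets:fptr FILEID="file-my-transfer-5bc5ac2a-951f-4bc5-9039-08ec619eef1f"/>
    </mets:div>
  </mets:structMap>
</mets:mets>"""


def dangling_fptrs(mets_xml):
    """Return FILEID values that do not match any mets:file ID."""
    root = ET.fromstring(mets_xml)
    file_ids = {el.get("ID") for el in root.iter("{%s}file" % METS_NS)}
    return [
        el.get("FILEID")
        for el in root.iter("{%s}fptr" % METS_NS)
        if el.get("FILEID") not in file_ids
    ]


# An empty list means every fptr resolves to a file entry.
print(dangling_fptrs(SAMPLE))
```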

NB. This PR is complicated by the need to use a different metsrw. Once the code has been reviewed and approved, I will push the new metsrw and then update this PR to change the requirements so they reference metsrw-0.3.10.

@@ -5,7 +5,6 @@
# pip-compile --output-file=test.txt test.in
#
agentarchives==0.4.0
appnope==0.1.0 # via ipython
Contributor Author

(This deleted itself running the requirements Makefile)

Contributor

Argh, I was wondering when this was added. It looks like ipython requires appnope on OS X. See here: https://github.com/ipython/ipython/blob/master/setup.py#L204

Contributor

@jraddaoui jraddaoui left a comment

Thanks @ross-spencer!

This looks good, but I wonder about the addition of setuptools in the test requirements ...

@@ -113,3 +112,6 @@ virtualenv==16.6.0 # via tox
wcwidth==0.1.7 # via prompt-toolkit
whitenoise==3.3.0
wrapt==1.11.1

# The following packages are considered to be unsafe in a requirements file:
# setuptools==41.0.1 # via ipdb, ipython, pytest, tox
Contributor

Was this auto-added too? Should we remove it from the requirements file and require it somewhere else if it's really needed?

Contributor Author

Discussed as a team in Slack. There's a little bit of chatter here too: jazzband/pip-tools#522. The follow-up is to log a new issue to refine our use of pip-compile so that it catches potentially undesirable changes like this.

3 participants