Problem: AIP pointer file validation is failing #1292

Open
5 tasks
tw4l opened this issue Aug 27, 2020 · 1 comment
Labels
📍 Pointer-files Status: ready The issue is sufficiently described/scoped to be picked up by a developer. Type: bug A flaw in the code that causes the software to produce an incorrect or unexpected result.

Comments

tw4l commented Aug 27, 2020

Expected behaviour

AIP pointer files pass validation.

Current behaviour

All AIP pointer files are failing validation. The Storage Service debug logs are full of messages like:

/var/log/archivematica/storage-service/storage_service_debug.log:153406:ERROR     2020-08-27 10:45:51  locations.models.package:package:create_pointer_file:1450:  Pointer file constructed for 9e6c5762-ba8b-4bd1-86b3-8cc8758aa497 is not valid.

Looking more closely at the logs, we see:

archivematica-storage-service_1  | ERROR     2020-08-27 08:08:58  locations.models.package:package:create_pointer_file:1450:  Pointer file constructed for bb71cef4-41af-41d5-b995-2dac14323730 is not valid.
archivematica-storage-service_1  | Schematron Error(s):
archivematica-storage-service_1  | 1. A techMD mdWrap element must contain a PREMIS object element.
archivematica-storage-service_1  |    test: m:xmlData/p:object
archivematica-storage-service_1  |    location: /*[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='amdSec' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='techMD' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='mdWrap' and namespace-uri()='http://www.loc.gov/METS/']
archivematica-storage-service_1  | 
archivematica-storage-service_1  | 
archivematica-storage-service_1  | 2. A techMD mdWrap element MUST contain an XML schema location.
archivematica-storage-service_1  |    test: m:xmlData/p:object/@xsi:schemaLocation
archivematica-storage-service_1  |    location: /*[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='amdSec' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='techMD' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='mdWrap' and namespace-uri()='http://www.loc.gov/METS/']
archivematica-storage-service_1  | 
archivematica-storage-service_1  | 
archivematica-storage-service_1  | 3. A techMD mdWrap element MUST have an xsi:type attribute of file.
archivematica-storage-service_1  |    test: m:xmlData/p:object/@xsi:type = 'premis:file'
archivematica-storage-service_1  |    location: /*[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='amdSec' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='techMD' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='mdWrap' and namespace-uri()='http://www.loc.gov/METS/']
archivematica-storage-service_1  | 
archivematica-storage-service_1  | 
archivematica-storage-service_1  | 4. A digiprovMD mdWrap element MUST contain an XML schema location.
archivematica-storage-service_1  |    test: @MDTYPE = 'PREMIS:AGENT' or m:xmlData/p:*/@xsi:schemaLocation
archivematica-storage-service_1  |    location: /*[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='amdSec' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='digiprovMD' and namespace-uri()='http://www.loc.gov/METS/'][1]/*[local-name()='mdWrap' and namespace-uri()='http://www.loc.gov/METS/']
archivematica-storage-service_1  | 
archivematica-storage-service_1  | 
archivematica-storage-service_1  | 5. A digiprovMD mdWrap element MUST contain an XML schema location.
archivematica-storage-service_1  |    test: @MDTYPE = 'PREMIS:AGENT' or m:xmlData/p:*/@xsi:schemaLocation
archivematica-storage-service_1  |    location: /*[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='amdSec' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='digiprovMD' and namespace-uri()='http://www.loc.gov/METS/'][2]/*[local-name()='mdWrap' and namespace-uri()='http://www.loc.gov/METS/']
archivematica-storage-service_1  | 
archivematica-storage-service_1  | 
archivematica-storage-service_1  | 6. A PREMIS:EVENT must be represented by a PREMIS event element.
archivematica-storage-service_1  |    test: m:xmlData/p:event
archivematica-storage-service_1  |    location: /*[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='amdSec' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='digiprovMD' and namespace-uri()='http://www.loc.gov/METS/'][1]/*[local-name()='mdWrap' and namespace-uri()='http://www.loc.gov/METS/']
archivematica-storage-service_1  | 
archivematica-storage-service_1  | 
archivematica-storage-service_1  | 7. A PREMIS:EVENT must be represented by a PREMIS event element.
archivematica-storage-service_1  |    test: m:xmlData/p:event
archivematica-storage-service_1  |    location: /*[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='amdSec' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='digiprovMD' and namespace-uri()='http://www.loc.gov/METS/'][2]/*[local-name()='mdWrap' and namespace-uri()='http://www.loc.gov/METS/']
archivematica-storage-service_1  | 
archivematica-storage-service_1  | 
archivematica-storage-service_1  | 
archivematica-storage-service_1  | XMLSchema (xsd) Error(s):
archivematica-storage-service_1  | 
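To make the repeated Schematron messages above easier to triage, the log excerpt can be collapsed into structured records. This is a minimal sketch and not part of the Storage Service: the function names `parse_schematron_errors` and `simplify_location` are hypothetical, and the prefix pattern assumes docker-compose style lines of the form `<container> | <message>`, as in the excerpt.

```python
import re

# Assumption: docker-compose prepends "<container> | " to every log line.
PREFIX = re.compile(r"^\S+\s+\|\s?")
# A numbered Schematron message, e.g. "1. A techMD mdWrap element must ...".
MESSAGE = re.compile(r"^(\d+)\.\s+(.*)$")
# The element name inside one step of a "location:" XPath.
LOCAL_NAME = re.compile(r"local-name\(\)='([^']+)'")
# An optional positional predicate such as "[2]" at the end of a step.
POSITION = re.compile(r"\]\[(\d+)\]$")


def simplify_location(xpath):
    """Reduce the verbose namespace-qualified XPath to e.g. 'mets/amdSec/techMD/mdWrap'."""
    steps = []
    # Each step starts with "/*["; the namespace URIs never contain that sequence.
    for segment in xpath.split("/*[")[1:]:
        name = LOCAL_NAME.search(segment)
        if not name:
            continue
        position = POSITION.search(segment)
        suffix = f"[{position.group(1)}]" if position else ""
        steps.append(name.group(1) + suffix)
    return "/".join(steps)


def parse_schematron_errors(log_text):
    """Return a list of dicts with 'message', 'test' and 'path' keys."""
    errors = []
    current = None
    for raw in log_text.splitlines():
        line = PREFIX.sub("", raw).strip()
        match = MESSAGE.match(line)
        if match:
            current = {"message": match.group(2), "test": None, "path": None}
            errors.append(current)
        elif current is not None and line.startswith("test:"):
            current["test"] = line[len("test:"):].strip()
        elif current is not None and line.startswith("location:"):
            current["path"] = simplify_location(line[len("location:"):].strip())
    return errors
```

Grouping the resulting records by `path` makes it easier to see that every failure in the excerpt sits on a `techMD` or `digiprovMD` `mdWrap` element.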

xmllint output gets us a little closer to the problem:

➜ xmllint --schema mets.xsd pointer.e1d88fc7-18d8-4e9d-a59e-e758f114d6c0.xml --noout 
pointer.e1d88fc7-18d8-4e9d-a59e-e758f114d6c0.xml:8: element object: Schemas validity error : Element '{http://www.loc.gov/premis/v3}object', attribute '{http://www.w3.org/2001/XMLSchema-instance}type': The QName value '{http://www.loc.gov/premis/v3}file' of the xsi:type attribute does not resolve to a type definition.
pointer.e1d88fc7-18d8-4e9d-a59e-e758f114d6c0.xml:8: element object: Schemas validity error : Element '{http://www.loc.gov/premis/v3}object': The type definition is absent.
pointer.e1d88fc7-18d8-4e9d-a59e-e758f114d6c0.xml fails to validate

This does not cause the ingest to fail, because the validation results aren't passed to MCP Server. The AIP is stored successfully, and the pointer file is created and otherwise appears fine.

Steps to reproduce

  • Ingest a compressed AIP
  • Look through Storage Service debug logs for error messages
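The second step can be scripted rather than done by eye. The sketch below is only an illustration: the function name `invalid_pointer_uuids` is hypothetical, and the pattern assumes the message wording shown in the log lines quoted earlier in this issue.

```python
import re

# Matches the message logged by create_pointer_file, as quoted in this issue.
INVALID_POINTER = re.compile(
    r"Pointer file constructed for "
    r"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
    r" is not valid"
)


def invalid_pointer_uuids(lines):
    """Return the AIP UUIDs reported as invalid, de-duplicated, in first-seen order."""
    seen = {}  # dicts preserve insertion order, so this doubles as an ordered set
    for line in lines:
        match = INVALID_POINTER.search(line)
        if match:
            seen.setdefault(match.group(1), None)
    return list(seen)
```

Because the function accepts any iterable of lines, an open file handle for `/var/log/archivematica/storage-service/storage_service_debug.log` can be passed to it directly.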

Your environment (version of Archivematica, operating system, other relevant details)

qa/1.x / qa/0.x pre-1.12 release

Additional details

This isn't entirely new: I see the same error message in #380, where it was noted as needing investigation but was tangential to the main problem described there.


For Artefactual use:

Before you close this issue, you must check off the following:

  • All pull requests related to this issue are properly linked
  • All pull requests related to this issue have been merged
  • A testing plan for this issue has been implemented and passed (testing plan information should be included in the issue body or comments)
  • Documentation regarding this issue has been written and merged (if applicable)
  • Details about this issue have been added to the release notes (if applicable)
@tw4l tw4l added Type: bug A flaw in the code that causes the software to produce an incorrect or unexpected result. Status: refining The issue needs additional details to ensure that requirements are clear. labels Aug 27, 2020
@replaceafill replaceafill added this to the 1.15.0 milestone Jun 24, 2023
@replaceafill replaceafill self-assigned this Jun 24, 2023
@replaceafill replaceafill added the Status: in progress Issue that is currently being worked on. label Jun 28, 2023
@replaceafill replaceafill removed their assignment Jun 28, 2023
@replaceafill replaceafill added Status: ready The issue is sufficiently described/scoped to be picked up by a developer. and removed Status: in progress Issue that is currently being worked on. Status: refining The issue needs additional details to ensure that requirements are clear. labels Jun 28, 2023
@replaceafill replaceafill removed this from the 1.15.0 milestone Jun 28, 2023
@replaceafill
Member

I investigated this today and realized it's caused by the mets-reader-writer's Schematron files (for METS and pointer file validation) still targeting PREMIS 2.2, while the Storage Service writes PREMIS 3.0.
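A quick way to confirm which PREMIS version a given pointer file actually uses is to inspect the namespaces of the elements wrapped in METS `xmlData`. The sketch below is a hedged illustration, not code from mets-reader-writer or the Storage Service: `premis_versions` is a hypothetical helper, and it assumes the standard namespace URIs `info:lc/xmlns/premis-v2` for PREMIS 2.x and `http://www.loc.gov/premis/v3` for PREMIS 3.x (the latter is the one reported by xmllint above).

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# Assumption: these are the only PREMIS namespaces of interest here.
PREMIS_NAMESPACES = {
    "info:lc/xmlns/premis-v2": "PREMIS 2.x",
    "http://www.loc.gov/premis/v3": "PREMIS 3.x",
}


def premis_versions(pointer_xml):
    """Return the set of PREMIS versions found directly inside METS xmlData elements."""
    root = ET.fromstring(pointer_xml)
    versions = set()
    for xml_data in root.iter(f"{{{METS_NS}}}xmlData"):
        for child in xml_data:
            # ElementTree spells qualified tags as "{namespace}localname".
            if child.tag.startswith("{"):
                namespace = child.tag[1:].split("}", 1)[0]
                if namespace in PREMIS_NAMESPACES:
                    versions.add(PREMIS_NAMESPACES[namespace])
    return versions
```

If this reports only PREMIS 3.x for a pointer file while the Schematron rules bind their `p:` prefix to the PREMIS 2 namespace, tests such as `m:xmlData/p:object` cannot match, which would be consistent with the errors logged above.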

Projects
None yet
Development

No branches or pull requests

3 participants