Problem: package structure for migrated AIPs has changed #97

Closed
sallain opened this issue Dec 4, 2024 · 3 comments

sallain commented Dec 4, 2024

Describe the bug

The package structure for migration packages has changed. This affects the BornDigitalAIP and DigitizedAIP transfer types.

AIPs for both of these transfer types will be bagged, so there will be a second-level /data directory that pushes everything else down in the hierarchy. Various preprocessing activities may fail with a bagged package, including:

  • Identify SIP structure
  • Validate SIP structure
  • Verify SIP manifest
  • Verify SIP checksums
  • Validate SIP metadata
  • Restructure SIP

Note that this may not be an exhaustive list!
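A preprocessing step could check whether a package is bagged before running the activities above. The following is a minimal sketch, not code from the project: the function name `is_bagged` is hypothetical, and the check is only a heuristic based on the `bagit.txt` file and `data` directory shown in the structures in this issue.

```python
from pathlib import Path


def is_bagged(package_dir: Path) -> bool:
    """Return True if package_dir looks like a BagIt bag.

    Heuristic (an assumption for illustration): a bag has a bagit.txt
    declaration file and a data/ payload directory at its top level.
    """
    has_declaration = (package_dir / "bagit.txt").is_file()
    has_payload_dir = (package_dir / "data").is_dir()
    return has_declaration and has_payload_dir
```

This does not validate the bag (manifests and checksums are not read); it only signals that the payload sits one level lower than the existing activities expect.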

Going forward, DigitizedAIPs will have the following structure:

7537ab2c-4e6b-4820-95bf-bd2c577351c3
├── bag-info.txt
├── bagit.txt
├── data
│   ├── additional
│   │   ├── 7537ab2c-4e6b-4820-95bf-bd2c577351c3-premis.xml
│   │   └── UpdatedAreldaMetadata.xml
│   └── content
│       ├── content
│       │   └── d_0000001
│       │       ├── 00000001.jp2
│       │       ├── 00000001_PREMIS.xml
...
│       │       ├── 00000020.jp2
│       │       ├── 00000020_PREMIS.xml
│       │       └── Prozess_Digitalisierung_PREMIS.xml
│       └── header
│           ├── old
│           │   └── SIP
│           │       └── metadata.xml
│           └── xsd
│               ├── ablieferung.xsd
│               ├── archivischeNotiz.xsd
│               ├── archivischerVorgang.xsd
│               ├── arelda.xsd
│               ├── base.xsd
│               ├── datei.xsd
│               ├── dokument.xsd
│               ├── dossier.xsd
│               ├── ordner.xsd
│               ├── ordnungssystemposition.xsd
│               ├── ordnungssystem.xsd
│               ├── paket.xsd
│               ├── provenienz.xsd
│               └── zusatzDaten.xsd
├── manifest-sha256.txt
├── manifest-sha512.txt
├── tagmanifest-sha256.txt
└── tagmanifest-sha512.txt

BornDigitalAIPs will have the following structure:

Test-AIP-Files/
├── bag-info.txt
├── bagit.txt
├── data
│   ├── additional
│   │   ├── Test-AIP-Files-premis.xml
│   │   └── UpdatedAreldaMetadata.xml
│   └── content
│       ├── content
│       │   ├── d0001
│       │   │   ├── p0001.pdf
│       │   │   ├── p0002.pdf
│       │   │   └── p0003.pdf
│       │   ├── d0002
│       │   │   ├── p0004.pdf
│       │   │   └── p0005.pdf
│       │   └── d0003
│       │       └── d0004
│       │           ├── p0006.pdf
│       │           └── p0007.pdf
│       └── header
│           ├── old
│           │   └── SIP
│           │       └── metadata.xml
│           └── xsd
│               ├── ablieferung.xsd
│               ├── archivischeNotiz.xsd
│               ├── archivischerVorgang.xsd
│               ├── arelda.xsd
│               ├── base.xsd
│               ├── datei.xsd
│               ├── dokument.xsd
│               ├── dossier.xsd
│               ├── ordner.xsd
│               ├── ordnungssystemposition.xsd
│               ├── ordnungssystem.xsd
│               ├── paket.xsd
│               ├── provenienz.xsd
│               └── zusatzDaten.xsd
├── manifest-sha256.txt
├── manifest-sha512.txt
├── tagmanifest-sha256.txt
└── tagmanifest-sha512.txt

Note that the package structure criteria listed in #79 still apply, but the activities need to be adjusted to accommodate the bag structure.
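One way to keep the existing structure criteria unchanged would be to resolve a "payload root" first and evaluate every expected path relative to it. This is a sketch under assumptions: `payload_root` and `metadata_path` are hypothetical helpers, and the `content/header/old/SIP/metadata.xml` path is taken from the trees above rather than from the criteria in #79.

```python
from pathlib import Path


def payload_root(package_dir: Path) -> Path:
    """Return the directory that structure checks should treat as the root.

    For a bagged package this is the data/ directory; otherwise it is the
    package directory itself.
    """
    data_dir = package_dir / "data"
    if (package_dir / "bagit.txt").is_file() and data_dir.is_dir():
        return data_dir
    return package_dir


def metadata_path(package_dir: Path) -> Path:
    """Expected location of metadata.xml, per the trees in this issue."""
    return payload_root(package_dir) / "content" / "header" / "old" / "SIP" / "metadata.xml"
```

With this approach the same relative paths apply whether or not the package is bagged; only the starting directory changes.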

I will provide sample packages once SFA has confirmed the structure.

@sallain sallain added this to Enduro Dec 4, 2024
@sallain sallain moved this to 🛠 Refining in Enduro Dec 4, 2024
@sallain sallain changed the title WIP ISSUE Problem: package structure for migrated AIPs has changed Problem: package structure for migrated AIPs has changed Dec 5, 2024
@sallain sallain moved this from 🛠 Refining to 👍 Ready in Enduro Dec 5, 2024
@djjuhasz djjuhasz self-assigned this Dec 13, 2024
@djjuhasz djjuhasz moved this from 👍 Ready to 🧐 QA in Enduro Dec 13, 2024
@djjuhasz

@sallain this is fixed by a682cec, which unbags the SIP by deleting the BagIt files (bagit.txt, manifests, etc.), moving the contents of the "data" directory to the base SIP directory, and then deleting the "data" directory. The SIP is unbagged right before SIP identification, so identification should work as expected.
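The three unbagging steps described in this comment can be sketched as follows. This is an illustration only, not the code from a682cec: the function name `unbag`, the `BAGIT_FILE_PATTERNS` list (based on the files shown in the trees in this issue), and the error handling are all assumptions.

```python
import shutil
from fnmatch import fnmatchcase
from pathlib import Path

# Top-level BagIt files as they appear in the trees in this issue
# (an assumed list; a real bag may contain other tag files).
BAGIT_FILE_PATTERNS = (
    "bagit.txt",
    "bag-info.txt",
    "manifest-*.txt",
    "tagmanifest-*.txt",
)


def unbag(sip_dir: Path) -> None:
    """Flatten a bagged SIP in place so its payload sits at the top level."""
    data_dir = sip_dir / "data"
    if not data_dir.is_dir():
        raise ValueError(f"{sip_dir} has no data directory")

    # Step 1: delete the BagIt files at the top level of the SIP.
    for entry in list(sip_dir.iterdir()):
        if entry.is_file() and any(
            fnmatchcase(entry.name, pattern) for pattern in BAGIT_FILE_PATTERNS
        ):
            entry.unlink()

    # Step 2: move everything in data/ up to the base SIP directory.
    for entry in list(data_dir.iterdir()):
        target = sip_dir / entry.name
        if target.exists():
            raise FileExistsError(f"cannot move {entry}: {target} already exists")
        shutil.move(str(entry), str(target))

    # Step 3: delete the now-empty data/ directory.
    data_dir.rmdir()
```

After `unbag(sip_dir)`, the `additional` and `content` directories sit directly under the SIP directory. Note that this sketch discards the manifests without verifying them, so any checksum validation of the bag would have to happen before this step.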

@github-project-automation github-project-automation bot moved this from 🧐 QA to 🎉 Done in Enduro Dec 13, 2024
@djjuhasz djjuhasz reopened this Dec 13, 2024
@sallain sallain moved this from 🎉 Done to 🧐 QA in Enduro Jan 8, 2025
@fiver-watson

Currently blocked by #93.

sallain commented Jan 9, 2025

Similar to comments on #94, this seems to be working as expected!

@sallain sallain closed this as completed Jan 9, 2025
@github-project-automation github-project-automation bot moved this from 🧐 QA to 🎉 Done in Enduro Jan 9, 2025