checksum mismatch on metadata file when auditing zx284xc2961 (services-disk10) #1397

Closed
jmartin-sul opened this issue Feb 22, 2020 · 4 comments
Assignees
Labels
moab_remediation online moab that may need remediation (e.g. missing files, extraneous files, corrupted content)

Comments

@jmartin-sul
Member

spawned from #1324

CompleteMoab.joins(:preserved_object, :moab_storage_root).where.not(status: :ok).pluck(:status, :storage_location, :druid, :status_details)
[
...snip other moabs with other types of errors...
["invalid_checksum",
    "/services-disk10/sdr2objects",
    "zx284xc2961",
    "validate_checksums (actual location: services-disk10; ) checksums for /services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/v0004/data/metadata/.versionMetadata.xml.swp version 4 do not match. && CompleteMoab status changed from validity_unknown to invalid_checksum"],
...snip other moabs with other types of errors...
]

the oddly named metadata file is indeed there:

[pres@preservation-catalog-prod-02 ~]$ ls -Ra /services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/
/services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/:
.  ..  v0001  v0002  v0003  v0004  v0005

/services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/v0001:
.  ..  data  manifests

/services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/v0001/data:
.  ..  content  metadata

/services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/v0001/data/content:
.  ..  Warren Lau - PhD Thesis - March 2017-augmented.pdf  Warren Lau - PhD Thesis - March 2017.pdf

/services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/v0001/data/metadata:
.                    descMetadata.xml     identityMetadata.xml      rightsMetadata.xml     workflows.xml
..                   embargoMetadata.xml  provenanceMetadata.xml    technicalMetadata.xml
contentMetadata.xml  events.xml           relationshipMetadata.xml  versionMetadata.xml

/services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/v0001/manifests:
.  ..  fileInventoryDifference.xml  manifestInventory.xml  signatureCatalog.xml  versionAdditions.xml  versionInventory.xml

/services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/v0002:
.  ..  data  manifests

/services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/v0002/data:
.  ..  metadata

/services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/v0002/data/metadata:
.  ..  descMetadata.xml  events.xml  identityMetadata.xml  provenanceMetadata.xml  versionMetadata.xml  workflows.xml

/services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/v0002/manifests:
.  ..  fileInventoryDifference.xml  manifestInventory.xml  signatureCatalog.xml  versionAdditions.xml  versionInventory.xml

/services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/v0003:
.  ..  data  manifests

/services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/v0003/data:
.  ..  metadata

/services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/v0003/data/metadata:
.  ..  contentMetadata.xml  events.xml  identityMetadata.xml  provenanceMetadata.xml  versionMetadata.xml  workflows.xml

/services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/v0003/manifests:
.  ..  fileInventoryDifference.xml  manifestInventory.xml  signatureCatalog.xml  versionAdditions.xml  versionInventory.xml

/services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/v0004:
.  ..  data  manifests

/services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/v0004/data:
.  ..  metadata

/services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/v0004/data/metadata:
.  ..  provenanceMetadata.xml  versionMetadata.xml  .versionMetadata.xml.swp  workflows.xml

/services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/v0004/manifests:
.  ..  fileInventoryDifference.xml  manifestInventory.xml  signatureCatalog.xml  versionAdditions.xml  versionInventory.xml

/services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/v0005:
.  ..  data  manifests

/services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/v0005/data:
.  ..  metadata

/services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/v0005/data/metadata:
.  ..  embargoMetadata.xml  events.xml  provenanceMetadata.xml  rightsMetadata.xml  versionMetadata.xml  workflows.xml

/services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961/v0005/manifests:
.  ..  fileInventoryDifference.xml  manifestInventory.xml  signatureCatalog.xml  versionAdditions.xml  versionInventory.xml
[pres@preservation-catalog-prod-02 ~]$ 
@jmartin-sul jmartin-sul added the moab_remediation online moab that may need remediation (e.g. missing files, extraneous files, corrupted content) label Feb 22, 2020
@jmartin-sul jmartin-sul changed the title checksum mismatch on metadata file when auditing zx284xc2961 checksum mismatch on metadata file when auditing zx284xc2961 (services-disk10) Feb 22, 2020
@jmartin-sul
Member Author

questions for all of these remediations: have the moabs in question been replicated? if so, do the archives need to be re-pushed?

related useful query: https://github.com/sul-dlss/preservation_catalog/tree/master/db#view-the-zip-parts-for-a-given-druid

input> druid = 'ab123cd4567'
input> ZipPart.joins(zipped_moab_version: [{ complete_moab: [:preserved_object] }, :zip_endpoint]).where(preserved_objects: { druid: druid }).pluck(:druid, 'current_version AS highest_version', 'zipped_moab_versions.version AS zip_version', :endpoint_name, :status)

@jermnelson
Contributor

@andrewjbtw, the .versionMetadata.xml.swp file was added in version 4 of druid:zx284xc2961. To me it looks like an editor swap file left behind from having versionMetadata.xml open in vi or vim (both of which are installed on the production VM). If you expunge the .versionMetadata.xml.swp file from this druid, we could then see whether the validation errors continue.
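
A minimal sketch of how such swap files could be located before deciding what to expunge, using only the Ruby standard library. The helper name `find_swap_files` and the file name pattern are assumptions for illustration (vim usually names swap files `.<name>.swp`, falling back to `.swo`, `.swn` and so on); none of this is part of preservation_catalog.

```ruby
require 'find'

# Assumed naming pattern for vim-style swap files: a leading dot and a
# trailing ".sw" plus one letter (e.g. ".versionMetadata.xml.swp").
SWAP_FILE_PATTERN = /\A\..+\.sw[a-p]\z/

# Hypothetical helper: return the sorted paths of all regular files under
# moab_root whose base name looks like an editor swap file.
def find_swap_files(moab_root)
  Find.find(moab_root).select { |path|
    File.file?(path) && File.basename(path).match?(SWAP_FILE_PATTERN)
  }.sort
end
```

For this druid it could be called as `find_swap_files('/services-disk10/sdr2objects/zx/284/xc/2961/zx284xc2961')`. It only reports candidates; whether a given file is really extraneous still needs a human decision.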

@andrewjbtw

I took a look and the .swp file somehow got added to the v0004 manifests:

[pres@preservation-robots1-prod zx284xc2961]$ grep -i swp v000*/manifests/*
v0004/manifests/fileInventoryDifference.xml:      <file change="added" basisPath="" otherPath=".versionMetadata.xml.swp">
v0004/manifests/signatureCatalog.xml:  <entry originalVersion="4" groupId="metadata" storagePath=".versionMetadata.xml.swp">
v0004/manifests/versionAdditions.xml:      <fileInstance path=".versionMetadata.xml.swp" datetime="2019-07-08T22:52:05Z"/>
v0004/manifests/versionInventory.xml:      <fileInstance path=".versionMetadata.xml.swp" datetime="2019-07-08T22:52:05Z"/>
v0005/manifests/fileInventoryDifference.xml:      <file change="deleted" basisPath=".versionMetadata.xml.swp" otherPath="">
v0005/manifests/signatureCatalog.xml:  <entry originalVersion="4" groupId="metadata" storagePath=".versionMetadata.xml.swp">

I'm sure it shouldn't be there (in the manifests or the Moab), but I will need to do some more manifest editing to get it completely out of the Moab.
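
As a rough Ruby counterpart to the grep above, the sketch below groups the matching line numbers by manifest file, which may help when working out which manifests still need editing. The helper name `manifest_references` is hypothetical, and the code assumes the `v000N/manifests/*.xml` layout shown in the directory listing; it does a plain substring match and does not parse the XML.

```ruby
# Hypothetical helper: map each manifest (path relative to moab_root) to the
# 1-based line numbers that mention file_name. Manifests with no match are omitted.
def manifest_references(moab_root, file_name)
  manifests = Dir.glob('v[0-9]*/manifests/*.xml', base: moab_root).sort
  manifests.each_with_object({}) do |relative_path, hits|
    line_numbers = []
    File.foreach(File.join(moab_root, relative_path)).with_index(1) do |line, number|
      line_numbers << number if line.include?(file_name)
    end
    hits[relative_path] = line_numbers unless line_numbers.empty?
  end
end
```

Calling `manifest_references(moab_dir, '.versionMetadata.xml.swp')` before and after the manifest edits gives a quick check that no references remain; an empty hash means none were found.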

@andrewjbtw

I removed the extraneous .swp file and updated the manifests. This Moab is now valid:

[2] pry(main)> Audit::Checksum.validate_druid('zx284xc2961')
I, [2020-03-25T10:47:12.274494 #21026]  INFO -- : 2020-03-25T17:47:12Z CV validate_druid starting for zx284xc2961
D, [2020-03-25T10:47:12.278174 #21026] DEBUG -- : Found 1 complete moabs.
I, [2020-03-25T10:47:12.490595 #21026]  INFO -- : validate_checksums(zx284xc2961, services-disk10) checksum(s) match
I, [2020-03-25T10:47:12.491774 #21026]  INFO -- : validate_checksums(zx284xc2961, services-disk10) CompleteMoab status changed from invalid_checksum to ok
I, [2020-03-25T10:47:13.042073 #21026]  INFO -- : [{:moab_checksum_valid=>"checksum(s) match"}, {:cm_status_changed=>"CompleteMoab status changed from invalid_checksum to ok"}] for zx284xc2961
I, [2020-03-25T10:47:13.042393 #21026]  INFO -- : 2020-03-25T17:47:13Z CV validate_druid ended for zx284xc2961
=> [#<AuditResults:0x0000000008e09d10
  @actual_version=nil,
  @check_name="validate_checksums",
  @druid="zx284xc2961",
  @log_msg_prefix="validate_checksums(zx284xc2961, services-disk10)",
  @moab_storage_root=
   #<MoabStorageRoot:0x0000000008e09d88
    id: 9,
    name: "services-disk10",
    created_at: Thu, 18 Jan 2018 18:55:35 UTC +00:00,
    updated_at: Thu, 18 Jan 2018 18:55:35 UTC +00:00,
    storage_location: "/services-disk10/sdr2objects">,
  @result_array=[{:moab_checksum_valid=>"checksum(s) match"}, {:cm_status_changed=>"CompleteMoab status changed from invalid_checksum to ok"}],
  @string_prefix="validate_checksums (actual location: services-disk10; )">]

I don't think there is anything more to do to remediate this because the workflows still match the number of versions. Closing the ticket.


4 participants