missing objects on service disk 5? (was: C2M existence check failed for druid:cm180ts4203) #1177

Closed
jmartin-sul opened this issue Jul 6, 2019 · 6 comments


jmartin-sul commented Jul 6, 2019

looked into this alert: https://app.honeybadger.io/projects/54415/faults/45795762
Errno::ENOENT: No such file or directory @ dir_initialize - /services-disk05/sdr2objects/cm/180/ts/4203/cm180ts4203

app/services/moab_validation_handler.rb:32:in `moab_validation_errors`
app/services/moab_validation_handler.rb:69:in `set_status_as_seen_on_disk`
app/lib/audit/catalog_to_moab.rb:71:in `block in compare_version_and_take_action`

i went to manually check up on things via the unix and rails consoles on prod. indeed, the object seems to be missing from preservation:

pry(main)> PreservedObject.find_by(druid: 'cm180ts4203').complete_moabs.first.moab_storage_root
=> #<MoabStorageRoot:0x...
 id: ...,
 name: "services-disk05",
 created_at: Thu, 18 Jan 2018 18:55:35 UTC +00:00,
 updated_at: Thu, 18 Jan 2018 18:55:35 UTC +00:00,
 storage_location: "/services-disk05/sdr2objects">
pry(main)>
pry(main)> cm = PreservedObject.find_by(druid: 'cm180ts4203').complete_moabs.first
=> #<CompleteMoab:0x...
 id: ...,
 version: 5,
 preserved_object_id: ...,
 moab_storage_root_id: 4,
 created_at: Sat, 20 Jan 2018 22:26:39 UTC +00:00,
 updated_at: Wed, 17 Apr 2019 12:00:17 UTC +00:00,
 last_moab_validation: Mon, 12 Nov 2018 11:23:17 UTC +00:00,
 last_checksum_validation: Thu, 14 Feb 2019 16:50:47 UTC +00:00,
 size: 286150609733,
 status: "ok",
 last_version_audit: Thu, 14 Feb 2019 16:50:47 UTC +00:00,
 last_archive_audit: Wed, 17 Apr 2019 12:00:17 UTC +00:00>
pry(main)>
pry(main)> c2m = Audit::CatalogToMoab.new(cm, "/services-disk05/sdr2objects")
=> #<Audit::CatalogToMoab:0x...
 @complete_moab=
  #<CompleteMoab:0x...
   id: ...,
   version: 5,
   preserved_object_id: ...,
   moab_storage_root_id: 4,
   created_at: Sat, 20 Jan 2018 22:26:39 UTC +00:00,
   updated_at: Wed, 17 Apr 2019 12:00:17 UTC +00:00,
   last_moab_validation: Mon, 12 Nov 2018 11:23:17 UTC +00:00,
   last_checksum_validation: Thu, 14 Feb 2019 16:50:47 UTC +00:00,
   size: 286150609733,
   status: "ok",
   last_version_audit: Thu, 14 Feb 2019 16:50:47 UTC +00:00,
   last_archive_audit: Wed, 17 Apr 2019 12:00:17 UTC +00:00>,
 @druid="cm180ts4203",
 @results=
  #<AuditResults:0x...
   @actual_version=nil,
   @check_name=nil,
   @druid="cm180ts4203",
   @moab_storage_root=
    #<MoabStorageRoot:0x...
     id: ...,
     name: "services-disk05",
     created_at: Thu, 18 Jan 2018 18:55:35 UTC +00:00,
     updated_at: Thu, 18 Jan 2018 18:55:35 UTC +00:00,
     storage_location: "/services-disk05/sdr2objects">,
   @result_array=[]>,
 @storage_dir="/services-disk05/sdr2objects">
pry(main)>
pry(main)> c2m.check_catalog_version
Errno::ENOENT: No such file or directory @ dir_initialize - /services-disk05/sdr2objects/cm/180/ts/4203/cm180ts4203
from /opt/app/pres/preservation_catalog/shared/bundle/ruby/2.5.0/gems/moab-versioning-4.2.2/lib/moab/storage_object_validator.rb:169:in `open'
$ ls /services-disk05/sdr2objects/cm/180/ts/4203/cm180ts4203
ls: cannot access /services-disk05/sdr2objects/cm/180/ts/4203/cm180ts4203: No such file or directory
$ ls /services-disk05/sdr2objects/cm/180/ts/
$
$ ls /services-disk05/sdr2objects/cm/180/
ts

so two things seem amiss here, both of which seem like a big deal:

  1. there seems to be an object missing from preservation
  2. even though the CatalogToMoab audit is running against it and erroring, status remains ok

it does appear to be in DOR, as it shows up in argo: https://argo.stanford.edu/view/druid:cm180ts4203

hopefully i'm mistaken, and it's preserved somewhere? but there was only one CompleteMoab for the PreservedObject for the druid in question, so if it is preserved somewhere, pres cat is currently unaware of the copy.
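for reference, the path in the error follows the druid-tree layout visible in the `ls` output above (two letters, three digits, two letters, four digits, then the full druid). a minimal sketch of that mapping, assuming a simplified druid pattern; `druid_tree_path` and `DRUID_PATTERN` are hypothetical names for illustration, not helpers from preservation_catalog or moab-versioning:

```ruby
# Hypothetical helper for illustration only; not part of preservation_catalog.
# Assumption: a druid is 2 letters, 3 digits, 2 letters, 4 digits (simplified).
DRUID_PATTERN = /\A([a-z]{2})(\d{3})([a-z]{2})(\d{4})\z/.freeze

# Builds the expected on-disk directory for a druid under a storage location.
def druid_tree_path(storage_location, druid)
  bare = druid.delete_prefix('druid:') # accept "druid:cm180ts4203" or "cm180ts4203"
  match = DRUID_PATTERN.match(bare)
  raise ArgumentError, "not a druid: #{druid.inspect}" unless match

  # e.g. cm / 180 / ts / 4203 / cm180ts4203
  File.join(storage_location, *match.captures, bare)
end

puts druid_tree_path('/services-disk05/sdr2objects', 'druid:cm180ts4203')
# prints /services-disk05/sdr2objects/cm/180/ts/4203/cm180ts4203
```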

jmartin-sul changed the title from "C2M existence check failed for druid:cm180ts4203" to "missing objects on service disk 5? (was: C2M existence check failed for druid:cm180ts4203)" on Jul 6, 2019

jmartin-sul commented Jul 6, 2019

here's another similar thing, also on service disk 5, but this came up when doing checksum validation...

https://app.honeybadger.io/projects/54415/faults/49685373

NoMethodError: undefined method `version_pathname' for nil:NilClass
app/services/checksum_validator.rb:151:in `latest_signature_catalog_path`
app/services/checksum_validator.rb:158:in `rescue in latest_signature_catalog_entries`
app/services/checksum_validator.rb:155:in `latest_signature_catalog_entries`

the druid wasn't readily available in the error info captured by honeybadger, but the ID for the CompleteMoab that ran into the error was, so i was able to:

> cm = CompleteMoab.find('...')
=> #<CompleteMoab:0x...
 id: ...,
 version: 6,
 preserved_object_id: ...,
 moab_storage_root_id: 4,
 created_at: Sun, 21 Jan 2018 09:00:46 UTC +00:00,
 updated_at: Wed, 17 Apr 2019 13:23:35 UTC +00:00,
 last_moab_validation: Mon, 12 Nov 2018 20:30:43 UTC +00:00,
 last_checksum_validation: Thu, 14 Feb 2019 15:14:23 UTC +00:00,
 size: 227050284340,
 status: "ok",
 last_version_audit: Fri, 15 Feb 2019 10:05:17 UTC +00:00,
 last_archive_audit: Wed, 17 Apr 2019 13:23:35 UTC +00:00>
>
> ChecksumValidator.new(cm).validate_checksums
NoMethodError: undefined method `version_pathname' for nil:NilClass
from /opt/app/pres/preservation_catalog/releases/20190701161459/app/services/checksum_validator.rb:151:in `latest_signature_catalog_path'
Caused by NoMethodError: undefined method `signature_catalog' for nil:NilClass
from /opt/app/pres/preservation_catalog/releases/20190701161459/app/services/checksum_validator.rb:156:in `latest_signature_catalog_entries'
>
> exit
$
$ ls /services-disk05/sdr2objects/vx/143/
vh
$ ls /services-disk05/sdr2objects/vx/143/vh/
$

worrisome for both of the same reasons as the prior example (something seems to be missing, and status is still ok). as with the other one, this object is in argo: https://argo.stanford.edu/view/druid:vx143vh9242 (and is part of the same collection).

FWIW, it seems like the fix for the status thing should be pretty straightforward: catch the error that's being raised and set the status accordingly. we probably still want to alert through honeybadger, since this should hopefully be a pretty exceptional occurrence. good thing honeybadger alerted us here.
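a rough sketch of that idea, not the actual handler code: `MoabRecord`, `NOT_FOUND_STATUS`, `audit_moab_dir`, and the `notifier` callable are all hypothetical stand-ins (the real app would use the `CompleteMoab` model, its own status values, and Honeybadger), and the "validation" here is reduced to listing the directory:

```ruby
# Stand-in for the CompleteMoab record; only the fields this sketch needs.
MoabRecord = Struct.new(:druid, :status)

# Assumed status value for illustration; the app's real status name may differ.
NOT_FOUND_STATUS = 'online_moab_not_found'

# Tries to read the moab directory. If it is missing, records that on the
# catalog record instead of letting the exception leave status as "ok",
# and still sends an alert through the notifier.
def audit_moab_dir(record, object_dir, notifier: ->(msg) { warn msg })
  begin
    Dir.children(object_dir) # raises Errno::ENOENT when the directory is gone
    record.status = 'ok'
  rescue Errno::ENOENT => e
    record.status = NOT_FOUND_STATUS
    notifier.call("moab missing for #{record.druid}: #{e.message}")
  end
  record
end

record = MoabRecord.new('cm180ts4203', 'ok')
audit_moab_dir(record, '/services-disk05/sdr2objects/cm/180/ts/4203/cm180ts4203')
puts record.status
```

passing the notifier in as a callable keeps the sketch runnable without Honeybadger; in the app it would presumably be whatever alerting call is already in use.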


jmartin-sul commented Jul 9, 2019

if these errors were a result of moving the moab, it would seem that the procedure wasn't followed completely, and the old record wasn't cleaned up: https://github.com/sul-dlss/preservation_catalog/wiki/A-Moab-Has-Moved

but it also seems like M2C would've caught the unexpected object in its new location, and then errored when trying to insert it into the catalog, because there's a unique constraint on the druid column in the preserved_objects table.

ndushay self-assigned this on Jul 23, 2019

ndushay commented Jul 23, 2019

I found cm180ts4203 on disk 16:

[pres@preservation-catalog-prod-01 current]$ ls /services-disk*/sdr2objects/cm/180/ts
/services-disk05/sdr2objects/cm/180/ts:

/services-disk16/sdr2objects/cm/180/ts:
4203
[pres@preservation-catalog-prod-01 current]$ 

and vx143vh9242 on disk 16 also:

[pres@preservation-catalog-prod-01 current]$ ls /services-disk*/sdr2objects/vx/143/vh
/services-disk05/sdr2objects/vx/143/vh:

/services-disk16/sdr2objects/vx/143/vh:
9242

and also:

  • vf742yx0561
  • rb378fk5493
    ...

there are 942 occurrences of https://app.honeybadger.io/projects/54415/faults/45795762 ...


ndushay commented Jul 24, 2019

per honeybadger showing 8 recurring errors on a weekly basis, these seem to be the 8 errant druids:

  • cm180ts4203
  • vx143vh9242
  • vf742yx0561
  • rb378fk5493
  • qt808nj0703
  • qk452gk8977
  • fc249qf5364
  • dh967mz5785

All were on disk05 and are now on disk16
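
The `ls /services-disk*/...` checks from the earlier comment can be scripted per druid. This is only a sketch under assumptions: `moab_dir` and `storage_locations_holding` are hypothetical helper names, the druid pattern is simplified, and the storage locations are discovered with a glob rather than read from the catalog's `MoabStorageRoot` records.

```ruby
# Hypothetical helpers for illustration; not part of preservation_catalog.

# Expected druid-tree directory for a druid under one storage location.
# Assumption: simplified druid pattern (2 letters, 3 digits, 2 letters, 4 digits).
def moab_dir(storage_location, druid)
  parts = druid.match(/\A([a-z]{2})(\d{3})([a-z]{2})(\d{4})\z/)
  raise ArgumentError, "not a druid: #{druid.inspect}" unless parts

  File.join(storage_location, *parts.captures, druid)
end

# Returns every storage location that actually has a directory for the druid.
def storage_locations_holding(druid, storage_locations)
  storage_locations.select { |location| Dir.exist?(moab_dir(location, druid)) }
end

# The 8 druids listed above; on the prod host this would show where each lives.
druids = %w[cm180ts4203 vx143vh9242 vf742yx0561 rb378fk5493
            qt808nj0703 qk452gk8977 fc249qf5364 dh967mz5785]
locations = Dir.glob('/services-disk*/sdr2objects').sort

druids.each do |druid|
  puts "#{druid}: #{storage_locations_holding(druid, locations).inspect}"
end
```

An empty array for a druid would mean no storage location has it, and more than one entry would mean duplicate copies; either case deserves a closer look.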

@jmartin-sul

yay! sounds like an incomplete manual move, which is better than inexplicably disappearing moabs.


ndushay commented Jul 24, 2019

I have updated the preservation_catalog for the 8 Moabs indicated; I have manually run M2C from the rails console for these 8 objects and all have run cleanly.

Since the errors that triggered this ticket have already been marked resolved in Honeybadger, any further failed audits should show up as new Honeybadger alerts of at least the same sort until issues #1184 and #1185 are addressed.

ndushay closed this as completed on Jul 24, 2019