Updated PRONOM signature files - v77 #49

Merged
carlwilson merged 6 commits into openpreserve:master from the pronom-v74 branch on Jul 22, 2014

mistydemeo
Contributor

I noticed the signature files have fallen a few releases behind; here are the V74 signature files.

If this is accepted, can you also tag a new release? Thanks, and thanks for all your work on FIDO. :D I'll submit updates for the upcoming signature files too.

@Kris-LIBIS

I guess you should be able to update the signature file yourself, no? The included script update_signatures.py should do the job.

However, when I tried it, I got an error message:

FIDO signature updater v1.2.2
Contacting PRONOM...
Querying latest signaturefile version...
Downloading signature file version 74...
Writing DROID_SignatureFile-v74.xml...
Extracting PRONOM PUID's from signature file...
Traceback (most recent call last):
  File "update_signatures.py", line 171, in <module>
    main(defaults)
  File "update_signatures.py", line 69, in main
    for node in tree.iter("{http://www.nationalarchives.gov.uk/pronom/SignatureFile}FileFormat"):
AttributeError: ElementTree instance has no attribute 'iter'

@mistydemeo
Contributor Author

Yep, these new signature files were generated from that script. I'm submitting the new ones here so a version of fido with the latest PRONOM signatures can be packaged.

@Kris-LIBIS

I'm sorry, I overlooked the pull request.

Any idea why the update script isn't working in my case?

@mistydemeo
Contributor Author

Are you using an older Python release? The ElementTree.iter method was introduced in Python 2.7, so it wouldn't be present if you're using 2.5 or 2.6.
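
For anyone stuck on an older interpreter, a minimal compatibility sketch follows. It is not part of update_signatures.py: the helper name `iter_elements` and the tiny sample document are made up for illustration, and only the namespaced tag is taken from the traceback above. The idea is to use `iter` where it exists and fall back to the older `getiterator` otherwise.

```python
import xml.etree.ElementTree as ET

# Namespace copied from the traceback above.
NS = "{http://www.nationalarchives.gov.uk/pronom/SignatureFile}"


def iter_elements(tree, tag):
    """Iterate over matching elements on both old and new ElementTree APIs.

    Hypothetical helper: prefers iter() (Python 2.7+) and falls back to
    getiterator(), which is what Python 2.5/2.6 provide.
    """
    iterate = getattr(tree, "iter", None)
    if iterate is None:
        iterate = tree.getiterator
    return iterate(tag)


# Made-up sample standing in for a real DROID signature file.
SAMPLE = (
    '<FFSignatureFile xmlns="http://www.nationalarchives.gov.uk/pronom/SignatureFile">'
    "<FileFormatCollection>"
    '<FileFormat ID="1" PUID="fmt/1"/>'
    '<FileFormat ID="2" PUID="fmt/2"/>'
    "</FileFormatCollection>"
    "</FFSignatureFile>"
)

tree = ET.ElementTree(ET.fromstring(SAMPLE))
puids = [node.get("PUID") for node in iter_elements(tree, NS + "FileFormat")]
print(puids)
```

Checking for `iter` first matters because `getiterator` was later removed from newer Python 3 releases, so the fallback is only reached on interpreters that lack `iter`.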

@Kris-LIBIS

Thanks. That's it. Sorry, I'm a Python noob.

@mistydemeo
Contributor Author

No worries! Yeah, I think FIDO only works with 2.7, and 3.2+.

@mistydemeo
Contributor Author

Ping - any chance of getting a new release with the latest PRONOM signatures? We're prepping a new Archivematica release, and it would be great to have an up-to-date FIDO/PRONOM to include.

@mistydemeo
Contributor Author

Updating this branch with the new v75 signature files. A few changes are also required to make downloading the signature files work after the reorganization of the National Archives website.

mistydemeo changed the title from "Updated PRONOM signature files - v74" to "Updated PRONOM signature files - v75" on Jul 17, 2014
@mistydemeo
Contributor Author

Updated to the v75 signature files.

Also includes a few updates and bugfixes for the updater, required to be able to do the new import.

I'm not 100% sure that 7852c1f is correct/the right solution, but it does appear to work. I don't know that the newline is meant to be significant given what appears to be the structure of byte signatures.

@anjackson
Member

Weird. In the v75 sig file it looks like this:

...
                    <LeftFragment MaxOffset="0" MinOffset="0" Position="3">687474703A2F2F7777772E696E7465726C69732E63682F494E5445524C4953322E33</LeftFragment>
                    <LeftFragment MaxOffset="0" MinOffset="0" Position="4">22</LeftFragment>
                    <LeftFragment MaxOffset="0" MinOffset="0" Position="4">27</LeftFragment>
                    <LeftFragment MaxOffset="0" MinOffset="0" Position="5">
3C5452414E5346455220786D6C6E733D</LeftFragment>
...

i.e. the content of that last element starts with a newline.

From context, I think it's probably safe to assume that this whitespace can be ignored, but I'll alert the PRONOM folks anyway. (done)
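
A rough sketch of what ignoring that whitespace could look like is below. This is not necessarily what 7852c1f does; the helper name `normalised_sequence` is hypothetical, and the snippet is the last fragment quoted above. It also assumes the fragment is plain hex, which holds for this one but not for PRONOM sequences that contain wildcards.

```python
import binascii
import xml.etree.ElementTree as ET

# The last LeftFragment quoted above, including the stray leading newline.
SNIPPET = (
    '<LeftFragment MaxOffset="0" MinOffset="0" Position="5">\n'
    "3C5452414E5346455220786D6C6E733D</LeftFragment>"
)


def normalised_sequence(element):
    """Return the element text with all whitespace removed.

    Hypothetical helper: assumes whitespace is never significant inside a
    byte sequence, as suggested in the comment above.
    """
    return "".join((element.text or "").split())


fragment = ET.fromstring(SNIPPET)
sequence = normalised_sequence(fragment)

# Only valid because this particular fragment is plain hex.
decoded = binascii.unhexlify(sequence)
print(repr(fragment.text))
print(sequence)
print(decoded)
```

Removing every whitespace character, rather than only trimming the ends, also covers a line break landing in the middle of a long sequence.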

@mistydemeo
Contributor Author

V77 should fix the stray newlines. Removing the two commits that deal with newlines, and testing an import of V77. If that works I'll update the branch.

@mistydemeo
Contributor Author

Should I keep the commits with the v74 and v75 signature files, or include only v77?

mistydemeo changed the title from "Updated PRONOM signature files - v75" to "Updated PRONOM signature files - v77" on Jul 18, 2014
carlwilson added a commit that referenced this pull request on Jul 22, 2014: Updated PRONOM signature files - v77
carlwilson merged commit 6223bf1 into openpreserve:master on Jul 22, 2014
@mistydemeo
Contributor Author

Thanks!

mistydemeo deleted the pronom-v74 branch on July 22, 2014 at 18:50