Commit a1913d7

Merge pull request #72 from hathitrust/postZephir-documentation
Add summary of changes that postZephir.pm makes to records
2 parents 737b080 + 10e0c68 commit a1913d7

1 file changed

+18
-0
lines changed

README.md

Lines changed: 18 additions & 0 deletions
@@ -183,3 +183,21 @@ For test coverage, replace the previous `docker compose run` with
```bash
docker compose run --rm pz bash -c "perl -MDevel::Cover=-silent,1 t/*.t && cover -nosummary /usr/src/app/cover_db"
```

## Changes `postZephir.pm` makes to bib records

`postZephir.pm` cleans up records coming from Zephir and adds rights data.

* removes `PST`, `LOC`, and `SBL` fields (we are not sure when these fields might be present)
* removes `974` fields where the rights attribute is `supp` (suppressed)
* removes tabs and newlines from the leader, tags (e.g. `100` or `245`), control field values, and subfield values
* replaces non-breaking spaces (Unicode `U+00A0`) in the leader, control field values, subfield values, and tags with a single blank space
* replaces non-ASCII characters in control fields with spaces
* replaces subfield codes other than alphanumeric, `%`, `*`, `?`, or `@` with `a` (we are not sure in what context such subfields might appear, although the [MARC specifications](https://www.loc.gov/marc/specifications/specrecstruc.html) do say that non-alphanumeric values can be used as subfield codes for local purposes)
* removes `974` fields for duplicate "dollar barcode" items: if both `uc1.b123456` and `uc1.$b123456` are present, it removes `uc1.$b123456`. This cleanup was completed long ago, so this should no longer happen
* if leader character 5 (record status) is `d` (deleted), changes it to `c` (corrected)
* adds rights to the items (`974` fields)
* sets `974$y` to the date the rights algorithm determined, if it determined something other than `9999`
* sets `974$r` to the rights attribute, `974$q` to the rights reason, and `974$t` to an explanation/summary of the reason for bib-determined rights
* if there is a change in bib-determined rights, sets `974$d` to the current date
* if there are no remaining `974` fields, does not output the record
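
The character-level cleanup rules in the list above can be sketched roughly as follows. This is a minimal Python illustration, not the Perl code in `postZephir.pm`. The function names are hypothetical, and the sketch assumes that "alphanumeric" means ASCII letters and digits and that "newlines" means line feed characters.

```python
import re

NBSP = "\u00a0"  # Unicode non-breaking space

# Assumption: "alphanumeric" is read as ASCII letters and digits.
ALLOWED_SUBFIELD_CODE = re.compile(r"[A-Za-z0-9%*?@]")


def clean_text(value: str) -> str:
    """Drop tabs and line feeds, then turn non-breaking spaces into blanks."""
    without_breaks = re.sub(r"[\t\n]", "", value)
    return without_breaks.replace(NBSP, " ")


def clean_control_value(value: str) -> str:
    """Apply the general cleanup, then blank out any non-ASCII character."""
    return re.sub(r"[^\x00-\x7f]", " ", clean_text(value))


def clean_subfield_code(code: str) -> str:
    """Keep an allowed one-character subfield code; otherwise fall back to 'a'."""
    return code if ALLOWED_SUBFIELD_CODE.fullmatch(code) else "a"


def fix_leader_status(leader: str) -> str:
    """Change record status 'd' (deleted) at leader position 5 to 'c' (corrected)."""
    if len(leader) > 5 and leader[5] == "d":
        return leader[:5] + "c" + leader[6:]
    return leader
```

In this sketch `clean_text` would apply to the leader, tags, and subfield values, while `clean_control_value` adds the extra non-ASCII rule that the list describes for control fields only.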
