Publish a WLC TEI edition that includes macula identifiers #122
Gives us a consistent starting place for any additional transforms
* w elements are not at morpheme level
* xml:id indicators are not valid Macula IDs
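A minimal sketch of how an xml:id could be checked against the shape of a Macula ID. The layout is an assumption inferred only from the two IDs quoted in this PR's description (o090010090111 and o090010090121, shown next to osisRef=1Sam.1.9): the letter "o", then book (2 digits), chapter (3), verse (3), word (3), and part (1). The function name `parse_macula_id` is hypothetical and not part of the pipeline.

```python
import re

# Assumed layout, inferred from the example IDs in this PR only:
# "o" + book(2) + chapter(3) + verse(3) + word(3) + part(1).
# This is not taken from a Macula specification.
MACULA_ID = re.compile(r"o([0-9]{2})([0-9]{3})([0-9]{3})([0-9]{3})([0-9])")


def parse_macula_id(xml_id):
    """Return the numeric fields of the ID as a dict, or None if it does not match."""
    m = MACULA_ID.fullmatch(xml_id)
    if m is None:
        return None
    book, chapter, verse, word, part = (int(g) for g in m.groups())
    return {
        "book": book,
        "chapter": chapter,
        "verse": verse,
        "word": word,
        "part": part,
    }


if __name__ == "__main__":
    for candidate in ["o090010090111", "o090010090121", "w1"]:
        print(candidate, parse_macula_id(candidate))
```

A check like this only tests the shape of an ID; it cannot tell whether the ID points at the right word or morpheme.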
Look at RUT 1:8 (k, q) elems. Nodes include q but omit k. RUT 4:17 has a pe element that is rendered on Tanach.us and on marble.bible, but is treated as an "after" attribute in Nodes
Consult with Tanach.xsd for other allowed elements.
Ends up being some differences, e.g.
* RUT 1:2!9
* Numbering in RUT 2:1
* RUT 4:17!16 (nearly like Marble)
This approach is really just using Tanach.us XML as a "skeleton" to hang the w elements; whitespace sensitivity coming next if we go to the morpheme level.
I was able to compare Genesis 1 and Psalm 1 to the equivalent WLC on tanach.us, and they look good. I did a spot check on both paseq and ׃פ in both tanach.us and symphony in Genesis 1 to make sure they show up in the same places.
I can't guarantee no errors, but this is definitely better than the status quo.
This PR:
* tei-transform pipeline that fetches XML from Tanach.us and transforms it into the TEI dataset published at WLC/tei
* .sources/tanach.us/xml (all processing happens on the pipeline, making it easier to "refresh" the upstream XML as-needed)
The XML file for each book can be viewed in a browser and will be rendered with the wlc-tei.css stylesheet: WLC/tei/08-ruth.xml
samekh and pe elements are rendered with additional whitespace / line breaks, mimicking the display of the HTML on Tanach.us: WLC/tei/09-1samuel.xml
Or view in the Symphony Browser
The XML uses significant whitespace, e.g. o090010090111 and o090010090121 are in separate words, but there is no whitespace between them.
Rendering of the new XML can be viewed in the Symphony Browser by adding a tanachTEI=y querystring parameter, e.g. https://deploy-preview-370--symphony-preview.netlify.app/?workspace=reading&osisRef=1Sam.1.9&tanachTEI=y&selectedLemma=%D7%A8%D6%B8%D7%90%D6%B8%D7%94
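The significant-whitespace point can be illustrated with a small sketch. The fragment below is hypothetical: the `s` wrapper element, the third ID (o090010090131), and the placeholder ASCII word text are assumptions for illustration, not content from WLC/tei. It shows that two adjacent w elements with no whitespace between them read as one visual unit, and that pretty-printing the XML would silently change that.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment. The first two IDs are the ones quoted above; the
# third ID, the <s> wrapper, and the ASCII word text are made up.
FRAGMENT = (
    "<s>"
    '<w xml:id="o090010090111">AAA</w><w xml:id="o090010090121">BBB</w>'
    ' <w xml:id="o090010090131">CCC</w>'
    "</s>"
)


def visual_units(elem):
    """Join all text (including inter-element whitespace) and split on whitespace."""
    return "".join(elem.itertext()).split()


root = ET.fromstring(FRAGMENT)
before = visual_units(root)

# ET.indent (Python 3.9+) rewrites whitespace-only text between elements,
# which is exactly the whitespace that carries meaning here.
ET.indent(root)
after = visual_units(root)

if __name__ == "__main__":
    print("as published:  ", before)
    print("pretty-printed:", after)
```

Any tool that touches these files would need to preserve inter-element whitespace exactly; reindenting or reformatting the XML is not a safe operation under this assumption.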