ACAI Deity Data

Initial Draft Edition, v1.00, 2024-10-08

Rick Brannan, rick.brannan@biblionexus.org, and Jessica Parks, jessica.parks@biblionexus.org, 2024-01-22

Context

BiblioNexus and Biblica are working together to create data representing explicit and implicit instances of people and places (and other things) in the Bible. This information is being compiled as part of the ACAI project, the Aquifer Concept Architecture for Information, which itself is a part of the Bible Aquifer project.

A Note on Jesus

For purposes of this analysis, "Jesus" (the Messiah, the incarnate Word of God) is considered human, and thus data regarding Jesus is aggregated in the People data. This is not a statement about the divine and human natures of Jesus; rather, the data for Jesus needs a single home, and it is best included with the People data. Instances of "Lord" (κυριος, kurios) that refer to Jesus are aggregated in that entry as well.

Data regarding "God" and "Holy Spirit" are included in this Deity data.

Deity Types

In this data, the notion of "deity" is wide. Any regional or national god is understood as "deity."

  • deity: entity is treated as a god by some nation or cultural group.
  • angel: entity is sub-deity and treated as a messenger or agent of deity.
  • demon: entity is sub-deity and aligned against the primary deity of the nation/culture.
  • other: something else. Currently Teraphim (household gods) and Nehushtan (the name of Moses' bronze serpent) form this class/type.
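
Consumers of the data may want to check `deity_type` values against the four types listed above. A minimal sketch, assuming Python; the names `DeityType` and `parse_deity_type` are hypothetical and not part of the ACAI codebase:

```python
from enum import Enum


class DeityType(str, Enum):
    """The four deity types described in this README (hypothetical helper)."""

    DEITY = "deity"
    ANGEL = "angel"
    DEMON = "demon"
    OTHER = "other"


def parse_deity_type(value: str = "deity") -> DeityType:
    """Return the matching DeityType; raises ValueError for unknown values.

    The default of "deity" mirrors the default documented for `deity_type`.
    """
    return DeityType(value)
```

Because the enum subclasses `str`, its members compare equal to the plain strings stored in the JSON, so `DeityType.ANGEL == "angel"` holds.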

Sources

  • Biblica/Clear-Bible's Macula Hebrew
  • Biblica/Clear-Bible's Macula Greek
  • Biblica/Clear-Bible's speaker-quotations, an attempt to identify the original language words, in both the Old and New Testaments, translated as quotations (material using "double" and 'single' quotation marks) in various English Bibles. It also attempts to associate speakers with the quotations, where possible, using data from Faith Comes By Hearing.
  • United Bible Societies' UBS Dictionary of Biblical Hebrew (UBSDBH) and UBS Dictionary of the Greek New Testament (UBSDGNT). See SemanticDictionary.org for an implementation and the git repo ubs-open-license for data (English, French, Spanish, and Chinese). Macula Hebrew and Greek encode domains and references from these resources at the word level for most OT and NT words.
  • Robert Rouse's theographic-bible-metadata (aka viz.bible)
  • STEPBible TIPNR
  • Copenhagen Alliance versification-specification. Bible references within this data reflect the 'ORG' scheme specified by the Copenhagen Alliance. This means that Old Testament references assume the versification structure of the Hebrew Bible, and New Testament references assume the structure of the Greek New Testament. For use with translations, the references may need to be converted to the Copenhagen Alliance 'ENG' scheme. The repo cited above has information and sample code for how to achieve that; if assistance is needed, please contact us.

Each of these sources is available as CC-BY-4.0 or CC-BY-SA-4.0 licensed data.

In particular, the definitions provided by the UBS Dictionaries supply a decent starting point, and several of our English person descriptions are directly inspired by these definitions. We owe much to the UBS team (and Reinier de Blois in particular) for their work and for their licensing of the material under a CC-BY-SA-4.0 license.

In addition to these sources, BiblioNexus have done a significant amount of curation and supplementation in order to account for and model the data according to the needs of the ACAI project.

Aggregation

This process started with Biblica's Macula Hebrew and Macula Greek data, which has word-level semantic domain annotations from the UBS Dictionary of Biblical Hebrew (UBSDBH) and UBS Dictionary of Biblical Greek (UBSDBG). This was used in an initial pass to identify explicitly named people and places.

We next processed data from viz.bible and STEPBible's TIPNR data to prepare person data to be integrated into one cohesive set.

Upon processing these datasets, we realized that STEPBible's TIPNR data did the best job of grouping together the different ways of referring to the same deity (or non-person entity). It therefore made sense to begin with the TIPNR data as a basic representation, incorporate the semantic, instance, and referent information aggregated from the UBS Dictionaries and from the Macula datasets, and then fold in data from viz.bible.

We identified similar entities from the different datasets for merging through a comparison of labeling and known references. We then merged this data together with data that modeled relationships between places (and, importantly, curated descriptions of the locations) that had been on a separate development/curation track. In that merge, we considered this data primary and at times had to re-arrange our snapshot of the TIPNR data to support the merge.
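
The comparison of labeling and known references can be pictured with a small sketch. This is not the procedure actually used for ACAI; it is a simplified illustration that assumes each record is a dictionary with a hypothetical `label` key and a `references` list, and it uses a Jaccard overlap with an arbitrary threshold as a stand-in for whatever comparison was really applied:

```python
def normalize_label(label: str) -> str:
    """Case-fold and collapse whitespace so trivially different labels match."""
    return " ".join(label.casefold().split())


def reference_overlap(refs_a, refs_b) -> float:
    """Jaccard overlap of two reference lists (0.0 when either is empty)."""
    a, b = set(refs_a), set(refs_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def is_merge_candidate(entry_a: dict, entry_b: dict, min_overlap: float = 0.5) -> bool:
    """Flag two records as candidates for merging.

    Assumption: records are candidates when their normalized labels agree
    and their reference lists overlap by at least `min_overlap`.
    The threshold of 0.5 is arbitrary and only for illustration.
    """
    same_label = normalize_label(entry_a["label"]) == normalize_label(entry_b["label"])
    overlap = reference_overlap(entry_a["references"], entry_b["references"])
    return same_label and overlap >= min_overlap
```

In practice, candidates flagged this way would still need human review, since the same label can refer to distinct entities.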

STEPBible's TIPNR data had a few subentries in the "LORD" entry (for different common names of "God" in the Hebrew Bible) that had incomplete reference lists. These entries contained the reference "Etc.0.0", indicating "and the rest." We supplemented these entries using the lemmatization in Biblica's Macula Hebrew as well as information from the UBS Dictionary of Biblical Hebrew. Further, TIPNR considers the Greek word κυριος a title and does not annotate its instances. For our purposes, however, we find it useful to annotate when κυριος refers to Jesus (with people data in the entry for Jesus) and to God (with deity data in the entry for LORD). We also added data for "Holy Spirit" based on information retrieved from the UBS Dictionary of the Greek New Testament and supplemented with data from Macula Greek. We have supplemented and adjusted the TIPNR data in various other ways to align with the word segmentation of Macula Hebrew and the text of the SBL Greek New Testament.

Status / Completeness of Project

While we believe we've identified explicitly named deities in the Bible and associated references with them, there are some aspects of this data that are not complete.

Areas Still Under Development

Word-level references, particularly within the Hebrew Bible, may have omissions. The UBS Dictionary of Biblical Hebrew remains a work in progress, and not all word tokens of the Hebrew Bible have been analyzed. Also, the Macula analysis of the Hebrew Bible is not as developed as its Greek New Testament counterpart, and the referent data can only be considered an initial draft.

JSON Schema Documentation

This documentation represents the current (as of 2024-05-01) schema and is based on the Python dataclasses used while processing the place-specific data. We will do our best to keep this up to date, but there may be discrepancies.
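
For orientation only, a dataclass of this kind might look roughly like the sketch below. It is not the project's actual code: the class name `DeityEntrySketch` and the `is_primary` helper are hypothetical, and only a handful of the documented properties are included.

```python
from dataclasses import dataclass, field


@dataclass
class DeityEntrySketch:
    """A trimmed, hypothetical stand-in for the documented deity entry."""

    id: str
    primary_id: str
    type: str = "deity"        # for deities, `deity` is the only valid type
    deity_type: str = "deity"  # `deity`, `demon`, `angel`, or `other`
    referred_to_as: list = field(default_factory=list)
    references: list = field(default_factory=list)

    @property
    def is_primary(self) -> bool:
        """The primary entry is the one where `primary_id` == `id`."""
        return self.primary_id == self.id
```

For example, `DeityEntrySketch(id="deity:HolySpirit", primary_id="deity:HolySpirit").is_primary` evaluates to `True`.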

AcaiDeityEntry

| property | type | description |
| --- | --- | --- |
| id | string | a string representing the unique identifier of this deity (e.g. `deity:HolySpirit`) |
| primary_id | string | a deity that has multiple representations has a primary entry identified by this string. The primary entry is identified where `primary_id` == `id`. |
| alternate_sources | AlternateSources dataclass | a dataclass that allows each possible alternate source to provide some information |
| type | string | for deities, `deity` is the only valid type |
| deity_type | string | default is `deity`; acceptable values are: `deity`, `demon`, `angel`, or `other`. |
| related_places | dict[str, list] | a dictionary listing places related to the deity, with `str` indicating the reason for the relation (supported reasons include `birth_place` and `death_place`) and a list of relevant `place:` ids. |
| ubsdbg | list | a list of domain.article (##.##) strings representing the UBSDBG annotation |
| ubsdbh | list | a list of strings representing the article identifier(s) from UBSDBH |
| localizations | dict[str, dict[str, list]] | an object for collecting strings and other structures by language for localization purposes. This is where `preferred_label`, `alternate_labels`, and `description` are collected. |
| referred_to_as | list | a list of `id` values for deities that are considered functionally equivalent |
| only_mentioned_in_apocrypha | bool | If `TRUE`, this deity is only mentioned in the apocryphal portions of the Bible (note: here "apocrypha" are the apocryphal books of the Protestant edition of the NRSV with Apocrypha). |
| non_biblical | bool | If `TRUE`, there is no mention whatever of this deity in the Bible. |
| lemmas | dict[str, list] | The key is the language (`el`, `he`, or `arc`) with lemmas as values in the list. |
| key_references | list | A list of BCV8 style references, where eight digits encode the zero-padded book (two digits), chapter (three digits), and verse (three digits) of the reference: `BBCCCVVV`. Note these references reflect the versification of the Hebrew Bible and Greek New Testament via the Copenhagen Alliance 'ORG' scheme. |
| references | list | A list of BCV8 style references, where eight digits encode the zero-padded book (two digits), chapter (three digits), and verse (three digits) of the reference: `BBCCCVVV`. Note these references reflect the versification of the Hebrew Bible and Greek New Testament via the Copenhagen Alliance 'ORG' scheme. |
| explicit_instances | dict[str, list] | initial key is edition (SBLGNT, WLC); list is corpus-specific word references where a 13 digit string encodes the reference to the word part position: `[on]BBCCCVVVWWWP`. Note these references reflect the versification of the Hebrew Bible and Greek New Testament via the Copenhagen Alliance 'ORG' scheme. |
| pronominal_referents | dict[str, list] | initial key is edition (SBLGNT, WLC); list is corpus-specific word references where a 13 digit string encodes the reference to the word part position: `[on]BBCCCVVVWWWP`. Note these references reflect the versification of the Hebrew Bible and Greek New Testament via the Copenhagen Alliance 'ORG' scheme. |
| subject_referents | dict[str, list] | initial key is edition (SBLGNT, WLC); list is corpus-specific word references where a 13 digit string encodes the reference to the word part position: `[on]BBCCCVVVWWWP`. Note these references reflect the versification of the Hebrew Bible and Greek New Testament via the Copenhagen Alliance 'ORG' scheme. |
| speeches | dict[str, list[dict[str, obj]]] | initial key is edition (SBLGNT, WLC); list is word-level information regarding reported speech the entity is responsible for. Each speech has a `quote_type` property (usually `Normal` but sometimes `Questioning` or other contextually important data) as well as `words`, a list of corpus-specific word references where a 13 digit string encodes the reference to the word part position: `[on]BBCCCVVVWWWP`. Note these references reflect the versification of the Hebrew Bible and Greek New Testament via the Copenhagen Alliance 'ORG' scheme. |
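
The two reference encodings described above (`BBCCCVVV` and `[on]BBCCCVVVWWWP`) can be unpacked with a pair of small helpers. This is a sketch under stated assumptions: it reads `[on]` as a single-letter prefix (`o` or `n`) followed by twelve digits, the names `VerseRef`, `WordRef`, `parse_bcv8`, and `parse_word_ref` are hypothetical, and the patterns would need adjusting if a corpus uses a different width.

```python
import re
from typing import NamedTuple


class VerseRef(NamedTuple):
    book: int
    chapter: int
    verse: int


class WordRef(NamedTuple):
    corpus: str  # the leading letter, "o" or "n"
    book: int
    chapter: int
    verse: int
    word: int
    part: int


# BBCCCVVV: two-digit book, three-digit chapter, three-digit verse.
BCV8 = re.compile(r"([0-9]{2})([0-9]{3})([0-9]{3})")
# [on]BBCCCVVVWWWP: prefix letter, then book, chapter, verse, word, part.
WORD_REF = re.compile(r"([on])([0-9]{2})([0-9]{3})([0-9]{3})([0-9]{3})([0-9])")


def parse_bcv8(ref: str) -> VerseRef:
    """Split a BCV8 string into integer book, chapter, and verse."""
    m = BCV8.fullmatch(ref)
    if m is None:
        raise ValueError(f"not a BCV8 reference: {ref!r}")
    return VerseRef(*(int(g) for g in m.groups()))


def parse_word_ref(ref: str) -> WordRef:
    """Split a word-level reference into its prefix and integer fields."""
    m = WORD_REF.fullmatch(ref)
    if m is None:
        raise ValueError(f"not a word-level reference: {ref!r}")
    corpus, *numbers = m.groups()
    return WordRef(corpus, *(int(n) for n in numbers))
```

For instance, `parse_bcv8("01001001")` yields book 1, chapter 1, verse 1. The book numbers follow whatever numbering the data uses; these helpers do not map them to book names.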

AlternateSources

| property | type | description |
| --- | --- | --- |
| aquifer | list | A pipe-delimited string (id\|name\|resource) derived from aquifer.bible |
| obi | list | Information derived from openbible.info |
| ubsdbh | list | Information derived from the UBS Dictionary of Biblical Hebrew |
| ubsdbg | list | Information derived from the UBS Dictionary of New Testament Greek |
| digital_atlas_roman_empire | list | Relevant URL and ID from the Digital Atlas of the Roman Empire, extracted from OpenBible.info data |
| pleiades | list | Relevant URL and ID from the Pleiades Project, extracted from OpenBible.info data |
| tipnr | list | Relevant ID from Tyndale House's STEPBible project, Translators Individualized Proper Names with all References (TIPNR), extracted from OpenBible.info data |
| wikidata | list | Relevant ID from Wikidata, extracted from OpenBible.info data |
| wikipedia | list | Relevant ID from Wikipedia, extracted from OpenBible.info data |
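
The pipe-delimited `aquifer` value can be split into its three parts with a short helper. A minimal sketch, assuming each item in the `aquifer` list is one `id|name|resource` string with exactly three fields; the names `AquiferLink` and `parse_aquifer` are hypothetical, and names that themselves contain a pipe are not handled:

```python
from typing import NamedTuple


class AquiferLink(NamedTuple):
    id: str
    name: str
    resource: str


def parse_aquifer(value: str) -> AquiferLink:
    """Split one `id|name|resource` string into its three fields."""
    parts = value.split("|")
    if len(parts) != 3:
        raise ValueError(f"expected id|name|resource, got {value!r}")
    return AquiferLink(*(part.strip() for part in parts))
```

Applied to a list, `[parse_aquifer(item) for item in entry_aquifer_values]` would return one `AquiferLink` per string, where `entry_aquifer_values` stands for the list read from an entry.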