ACAI Flora Data

Initial Draft Edition, v1.00, 2024-10-08

Rick Brannan, rick.brannan@biblionexus.org, and Jessica Parks, jessica.parks@biblionexus.org

Context

BiblioNexus and Biblica are working together to create data representing explicit and implicit instances of people and places (and other things) in the Bible. This information is being compiled as part of the ACAI project, the Aquifer Concept Architecture for Information, which itself is a part of the Bible Aquifer project.

Sources

  • Biblica/Clear-Bible's Macula Hebrew
  • Biblica/Clear-Bible's Macula Greek
  • United Bible Societies' UBS Dictionary of Biblical Hebrew (UBSDBH) and UBS Dictionary of the Greek New Testament (UBSDGNT). See SemanticDictionary.org for an implementation and the git repo ubs-open-license for data (English, French, Spanish, and Chinese). Macula Hebrew and Greek encode domains and references from these resources at the word level for most OT and NT words.
  • United Bible Societies' Images. The Images project associates image data with particular words in the Bible text. The UBS images data also includes references to UBS Thematic Lexica for Fauna, Flora, and Realia.
  • Copenhagen Alliance versification-specification. Bible references within this data reflect the 'ORG' scheme specified by the Copenhagen Alliance. This means that Old Testament references assume the versification structure of the Hebrew Bible, and New Testament references assume the structure of the Greek New Testament. For use with translations, the references may need to be converted to the Copenhagen Alliance 'ENG' scheme. The repo cited above has information and sample code for how to achieve that; if assistance is needed, please contact us.

Each of these sources is available as CC-BY-4.0 or CC-BY-SA-4.0 licensed data.

In particular, the UBS Dictionaries include information associating semantic entries with entries from the UBS Thematic Lexica (for Fauna, Flora, and Realia). These references allowed the aggregation of the bulk of the available data. Several of our English descriptions are directly inspired by these definitions. We owe much to the UBS team (and Reinier de Blois in particular) for their work and for their licensing of the UBS Dictionaries under a CC-BY-SA-4.0 license.

In addition to these sources, BiblioNexus have done a significant amount of curation and supplementation in order to account for and model the data according to the needs of the ACAI project.

Status / Completeness of Project

We have aggregated as much Flora data as possible from the UBS Hebrew and Greek Dictionaries and supplemented it with references in the UBS Images. We hope to be able to utilize some of the information in the UBS Thematic Lexicon for Flora to provide a more complete picture of Flora in the Bible. The Flora data encoded here is not comprehensive and should be considered a draft, but it is useful as-is.

Areas Still Under Development

Word-level references, particularly within the Hebrew Bible, may also have omissions. The UBS Dictionary of Biblical Hebrew, on which portions of the word-level analysis are based, remains a work in progress, and not all word tokens of the Hebrew Bible have been analyzed. Also, the Macula analysis of the Hebrew Bible is not as developed as its Greek New Testament counterpart, and the referent data can only be considered an initial draft.

JSON Schema Documentation

This documentation represents the current (as of 2024-06-20) schema and is based on the Python dataclasses used while processing the place-specific data. We will do our best to keep this up to date, but there may be discrepancies.

AcaiFloraEntry

| property | type | description |
| --- | --- | --- |
| id | string | a string representing the unique identifier of this group (e.g. `group:Pharisees`) |
| primary_id | string | a group that has multiple representations has a primary entry identified by this string. The primary entry is identified where `primary_id` == `id`. |
| alternate_sources | AlternateSources dataclass | a dataclass that allows each possible alternate source to provide some information |
| type | string | for flora, `flora` is the only valid type |
| ubsdbg | list | a list of domain.article (##.##) strings representing the UBSDBG annotation |
| ubsdbh | list | a list of strings representing the article identifier(s) from UBSDBH |
| localizations | dict[str, dict[str, list]] | an object for collecting strings and other structures by language for localization purposes. This is where `preferred_label`, `alternate_labels`, and `description` are collected. |
| referred_to_as | list | a list of `id` values for other entities that are considered functionally equivalent |
| only_mentioned_in_apocrypha | bool | If `TRUE`, this group is only mentioned in the apocryphal portions of the Bible (note: here "apocrypha" means the apocryphal books of the Protestant edition of the NRSV with Apocrypha). |
| non_biblical | bool | If `TRUE`, there is no mention whatever of this group in the Bible. |
| lemmas | dict[str, list] | The key is the language (`el`, `he`, or `arc`) with lemmas as values in the list. |
| key_references | list | A list of BCV8 style references, where eight digits encode the zero-padded book (two digits), chapter (three digits), and verse (three digits) of the reference: `BBCCCVVV`. Note these references reflect the versification of the Hebrew Bible and Greek New Testament via the Copenhagen Alliance 'ORG' scheme. |
| references | list | A list of BCV8 style references, where eight digits encode the zero-padded book (two digits), chapter (three digits), and verse (three digits) of the reference: `BBCCCVVV`. Note these references reflect the versification of the Hebrew Bible and Greek New Testament via the Copenhagen Alliance 'ORG' scheme. |
| explicit_instances | dict[str, list] | initial key is edition (SBLGNT, WLC), list is corpus-specific word references where a 13-character string encodes the reference to the word part position: `[on]BBCCCVVVWWWP`. Note these references reflect the versification of the Hebrew Bible and Greek New Testament via the Copenhagen Alliance 'ORG' scheme. |
| pronominal_referents | dict[str, list] | initial key is edition (SBLGNT, WLC), list is corpus-specific word references where a 13-character string encodes the reference to the word part position: `[on]BBCCCVVVWWWP`. Note these references reflect the versification of the Hebrew Bible and Greek New Testament via the Copenhagen Alliance 'ORG' scheme. |
| subject_referents | dict[str, list] | initial key is edition (SBLGNT, WLC), list is corpus-specific word references where a 13-character string encodes the reference to the word part position: `[on]BBCCCVVVWWWP`. Note these references reflect the versification of the Hebrew Bible and Greek New Testament via the Copenhagen Alliance 'ORG' scheme. |
| speeches | dict[str, list[dict[str, obj]]] | initial key is edition (SBLGNT, WLC), list is word-level information regarding reported speech the entity is responsible for. Each speech has a `quote_type` property (usually `Normal` but sometimes `Questioning` or other contextually important data) as well as `words`, a list of corpus-specific word references where a 13-character string encodes the reference to the word part position: `[on]BBCCCVVVWWWP`. Note these references reflect the versification of the Hebrew Bible and Greek New Testament via the Copenhagen Alliance 'ORG' scheme. |
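
The two reference formats described above (`BBCCCVVV` and `[on]BBCCCVVVWWWP`) can be split into their numeric parts with a small helper. The following is only a sketch, not part of the ACAI tooling: the names `VerseRef`, `WordRef`, `parse_bcv8`, and `parse_word_ref` are hypothetical, and it assumes the references are stored as strings (so leading zeros are preserved).

```python
import re
from typing import NamedTuple


class VerseRef(NamedTuple):
    """Hypothetical container for a BBCCCVVV verse reference."""
    book: int
    chapter: int
    verse: int


class WordRef(NamedTuple):
    """Hypothetical container for an [on]BBCCCVVVWWWP word reference."""
    prefix: str  # the leading letter, "o" or "n", as allowed by the [on] pattern
    book: int
    chapter: int
    verse: int
    word: int
    part: int


# Patterns transcribed directly from the formats given in the schema table.
_BCV8 = re.compile(r"(\d{2})(\d{3})(\d{3})")
_WORD = re.compile(r"([on])(\d{2})(\d{3})(\d{3})(\d{3})(\d)")


def parse_bcv8(ref: str) -> VerseRef:
    """Split an eight-digit BBCCCVVV string into book, chapter, and verse."""
    match = _BCV8.fullmatch(ref)
    if match is None:
        raise ValueError(f"not a BBCCCVVV reference: {ref!r}")
    return VerseRef(*(int(group) for group in match.groups()))


def parse_word_ref(ref: str) -> WordRef:
    """Split a 13-character word reference into its prefix and numeric parts."""
    match = _WORD.fullmatch(ref)
    if match is None:
        raise ValueError(f"not an [on]BBCCCVVVWWWP reference: {ref!r}")
    prefix, *digits = match.groups()
    return WordRef(prefix, *(int(d) for d in digits))


if __name__ == "__main__":
    print(parse_bcv8("01001001"))
    print(parse_word_ref("o010010010011"))
```

Both helpers reject strings of the wrong length or shape rather than guessing, which may help surface malformed references early. Note that the parsed values still follow the 'ORG' versification; no conversion to 'ENG' is attempted here.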

AlternateSources

| property | type | description |
| --- | --- | --- |
| aquifer | list | Pipe-delimited strings (id\|name\|resource) derived from aquifer.bible |
| obi | list | Information derived from openbible.info |
| ubsdbh | list | Information derived from the UBS Dictionary of Biblical Hebrew |
| ubsdbg | list | Information derived from the UBS Dictionary of the Greek New Testament |
| ubsfauna | list | Pointer to relevant article in the UBS Thematic Lexicon: Fauna |
| ubsflora | list | Pointer to relevant article in the UBS Thematic Lexicon: Flora |
| ubsrealia | list | Pointer to relevant article in the UBS Thematic Lexicon: Realia |
| digital_atlas_roman_empire | list | Relevant URL and ID from the Digital Atlas of the Roman Empire, extracted from OpenBible.info data |
| pleiades | list | Relevant URL and ID from the Pleiades Project, extracted from OpenBible.info data |
| tipnr | list | Relevant ID from Tyndale House's StepBible project, Translators Individualized Proper Names with all References (TIPNR), extracted from OpenBible.info data |
| wikidata | list | Relevant ID from Wikidata, extracted from OpenBible.info data |
| wikipedia | list | Relevant ID from Wikipedia, extracted from OpenBible.info data |
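
As a rough illustration of how the `id` / `primary_id` rule and the pipe-delimited `aquifer` strings might be consumed, here is a minimal sketch. Everything in `SAMPLE` is invented for illustration (the ids, names, and resource values are hypothetical and do not come from the ACAI data), and the helper names `primary_entries` and `split_aquifer` are likewise assumptions rather than part of any published API.

```python
import json

# Hypothetical sample entries: only the fields needed for this sketch are shown,
# and every value is a placeholder rather than real ACAI data.
SAMPLE = json.loads("""
[
  {"id": "example:a", "primary_id": "example:a", "type": "flora",
   "alternate_sources": {"aquifer": ["123|Example Name|ExampleResource"]}},
  {"id": "example:b", "primary_id": "example:a", "type": "flora",
   "alternate_sources": {"aquifer": []}}
]
""")


def primary_entries(entries: list[dict]) -> list[dict]:
    """Keep only entries whose primary_id equals their own id."""
    return [entry for entry in entries if entry["primary_id"] == entry["id"]]


def split_aquifer(value: str) -> dict[str, str]:
    """Split one id|name|resource string into its three named fields."""
    ident, name, resource = value.split("|", 2)
    return {"id": ident, "name": name, "resource": resource}


if __name__ == "__main__":
    for entry in primary_entries(SAMPLE):
        sources = entry["alternate_sources"]["aquifer"]
        print(entry["id"], [split_aquifer(s) for s in sources])
```

Splitting with a maximum of two splits keeps any further `|` characters inside the final field; whether real `aquifer` values can contain such characters is not stated in this documentation.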