Paratext Key Terms Parsing

Eli C. Lowry edited this page Sep 20, 2024 · 5 revisions

Currently, Paratext key terms are included as training data by default for SmtTransfer and Nmt builds; the key terms themselves are not pretranslated. To disable this feature, specify "use_key_terms": false in the options parameter when starting a build.
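As a minimal sketch, the options value that disables key terms could be assembled as follows. Only the "use_key_terms" key comes from this page; the helper name `build_options` is hypothetical, and how the resulting JSON is passed to the build request depends on the client in use.

```python
import json


def build_options(use_key_terms: bool = True) -> str:
    """Return a JSON string suitable for a build's options parameter.

    `build_options` is a hypothetical helper used for illustration only;
    the "use_key_terms" key is the one described on this page.
    """
    return json.dumps({"use_key_terms": use_key_terms})


# Disable key terms for the next build.
options = build_options(use_key_terms=False)
print(options)  # {"use_key_terms": false}
```

Note that the JSON value is the boolean false, not the string "false".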

Terms are extracted from the Paratext project's "TermRenderings.xml" file. Terms are aligned across projects and used as training data only if both projects use the same Biblical terms list (e.g. "Major::BiblicalTerms"). Only Biblical terms that are proper nouns are included. If both projects use the "Major" Biblical terms list and a project is in English, Spanish, French, Indonesian, or Portuguese, glosses from that language's localization of the Major Biblical terms are used for any terms not already present in the project's "TermRenderings.xml". All terms are cleaned by removing asterisks, question marks, and parenthetical information.
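The cleaning step described above can be illustrated with a short sketch. This is not the actual implementation: the function name `clean_term`, the handling of nested parentheses, and the whitespace normalization are assumptions made for illustration.

```python
import re

# Matches an innermost parenthetical such as "(river)".
_PAREN = re.compile(r"\([^()]*\)")


def clean_term(term: str) -> str:
    """Remove parenthetical information, asterisks, and question marks.

    Hypothetical helper; the real cleaning rules may differ in detail.
    """
    # Strip parentheticals repeatedly so nested ones are removed too.
    previous = None
    while previous != term:
        previous = term
        term = _PAREN.sub("", term)
    # Drop asterisks and question marks.
    term = term.replace("*", "").replace("?", "")
    # Collapse leftover whitespace (an assumption, for tidy output).
    return " ".join(term.split())


for raw in ["Abraham (patriarch)", "Jerusalem*", "Zion?"]:
    print(clean_term(raw))
```

With these assumptions, "Abraham (patriarch)" becomes "Abraham", "Jerusalem*" becomes "Jerusalem", and "Zion?" becomes "Zion".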