Discussion of Dell’Oro 2020 & Vierros 2018 #13

gabrielbodard · 2020-11-17T12:22:07Z

Francesca Dell'Oro, Helena Bermúdez Sabel & Paola Marongiu. 2020. “Implemented to Be Shared: the WoPoss Annotation of Semantic Modality in a Latin Diachronic Corpus.” Sharing the Experience: Workflows for the Digital Humanities. Proceedings of the DARIAH-CH Workshop 2019. Available: https://zenodo.org/record/3739440#.XzqoTZMzZTZ
Vierros, M. 2018. “Linguistic Annotation of the Digital Papyrological Corpus: Sematia.” In Nicola Reggiani (Editor), Digital Papyrology II: Case Studies on the Digital Edition of Ancient Greek Papyri. Berlin, Boston: De Gruyter. Pp. 105–118. Available: https://doi.org/10.1515/9783110547450-006

With both of these, please think about the aims and research questions behind the tools and methods discussed, rather than the technology and implementation.

chiaradimaio · 2020-11-20T16:34:00Z

The article by Francesca Dell'Oro describes the project called A World of Possibilities [WoPoss], aimed at tracking the evolution of modal meanings in the Latin language.

The article explains how this analysis is carried out in three steps:

Modal markers expressing 'necessity', 'possibility' and 'volition' in Latin are first collected from a diachronic corpus that ranges from the 3rd century BCE to the 7th century CE, including literary and documentary texts from different Latin-speaking regions of the ancient world.
The selected texts are checked for philological reliability and must be reusable under a Creative Commons licence.
Then, all text files are converted to plain text, but important structural information is kept (thanks to the so-called pseudo-markup).
The tool INCEpTION (a multi-modular annotation platform) is customized and adapted to the needs of this project: annotating the modal marker, its scope and the relation between them.
Then the WoPoss team carries on with manual annotation, which is particularly useful in cases of ambiguity, since the description of ambiguous passages could allow future users to notice semantic shift.
The annotated files are exported in XMI and transformed according to the TEI standard. This transformation involves several layers: recording the most ancient meaning of each modal marker; converting the pseudo-markup into the corresponding TEI elements; and adding metadata to each text concerning chronology, genre, transmission and authorship.
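
To make the pseudo-markup step more concrete, here is a minimal sketch in Python of how structural markers in plain text could be turned into TEI elements. The article does not give the actual WoPoss pseudo-markup syntax in this summary, so the "@@chapter 1" marker format, the function name pseudo_to_tei and the sample text are all assumptions made for illustration. A real TEI file would also need the TEI namespace and a teiHeader.

```python
import re
import xml.etree.ElementTree as ET

# ASSUMPTION: a hypothetical pseudo-markup in which a line such as
# "@@chapter 1" opens a new structural unit. This is not the WoPoss syntax.
MARKER = re.compile(r"^@@(\w+)\s+(\S+)$")


def pseudo_to_tei(plain_text: str) -> ET.Element:
    """Convert marked-up plain text into a <body> of TEI <div> elements."""
    body = ET.Element("body")
    current = None
    for raw in plain_text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip empty lines
        match = MARKER.match(line)
        if match:
            # A marker line starts a new <div> with @type and @n.
            unit_type, number = match.groups()
            current = ET.SubElement(body, "div", {"type": unit_type, "n": number})
        else:
            # Text before any marker goes into a catch-all <div>.
            if current is None:
                current = ET.SubElement(body, "div", {"type": "unmarked"})
            ET.SubElement(current, "p").text = line
    return body


# Sample text used only for illustration.
sample = (
    "@@chapter 1\n"
    "Gallia est omnis divisa in partes tres.\n"
    "@@chapter 2\n"
    "Apud Helvetios longe nobilissimus fuit et ditissimus Orgetorix.\n"
)
tei_body = pseudo_to_tei(sample)
print(ET.tostring(tei_body, encoding="unicode"))
```

The point of the sketch is only that lightweight markers survive the plain-text stage and can be mapped back to structured elements once the annotation is done.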

The resulting TEI dataset will be freely accessible through a user-friendly interface. The whole WoPoss project is an open science product, stored in an open GitHub repository.

HLBallard44 · 2020-11-21T01:58:16Z

Vierros, M. “Linguistic Annotation of the Digital Papyrological Corpus: Sematia.”
This article focuses on a developing digital papyrological corpus called Sematia and its selected approach.

Corpus Design

Corpora in historical linguistics are usually concerned with how languages have developed and evolved through time. Sematia will be open-ended and include a corpus of Greek used in documentary papyri for a period of about a thousand years.
The users of Sematia will be able to decide what they want to annotate or include in their searches. This way they can contribute to the corpus as a whole, and researchers can select their own subcorpus and perform queries on it, which makes it easier to repeat the research and obtain a consistent result.

How to Annotate Papyri

Sematia will provide the "basic" level of annotation in the hope that the whole corpus will eventually be annotated. Current annotation includes morphological and syntactic annotation using dependency treebanks on Arethusa, which has an API integration in Sematia.
Sematia utilizes the Ancient Greek and Latin Dependency Treebank and the PROIEL treebank for Dependency Grammar. Annotation through Arethusa includes tokenization, followed by automatic lemmatization and morphological tagging, which have to be checked and corrected by a human annotator.
Sematia creates two annotated parallel layers of the same text. This allows researchers to study the version which has been preserved (the original layer) or to compare the preserved text with the standardized version, possibly creating a third layer called variation.
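
As a rough illustration of the layer idea, the sketch below (Python) derives a "variation" layer by comparing an original layer with a standardized layer token by token. This is not Sematia's implementation: the Variation class, the function variation_layer and the assumption that both layers are already aligned one token to one token are mine, and the Greek tokens are an invented example of a non-standard spelling.

```python
from dataclasses import dataclass


@dataclass
class Variation:
    position: int       # index of the token in the text
    original: str       # form as preserved on the papyrus
    standardized: str   # editor's regularized form


def variation_layer(original, standardized):
    """Return the tokens where the two aligned layers differ.

    ASSUMPTION: both layers have the same number of tokens and are
    aligned position by position.
    """
    if len(original) != len(standardized):
        raise ValueError("layers must be aligned token by token")
    return [
        Variation(i, o, s)
        for i, (o, s) in enumerate(zip(original, standardized))
        if o != s
    ]


# Invented example: one token spelled differently in the original layer.
original_layer = ["πλεῖστα", "χέριν"]
standardized_layer = ["πλεῖστα", "χαίρειν"]
variations = variation_layer(original_layer, standardized_layer)
for v in variations:
    print(v.position, v.original, v.standardized)
```

Keeping the two layers separate and deriving the differences means that neither the scribe's spelling nor the editor's regularization is lost.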

Metadata and Its Purpose

The date and place of origin of each papyrus are automatically imported into Sematia from the PN (Papyrological Navigator) metadata fields. Soon, the metadata will also cover aspects of handwriting and the distinction between writers and authors, making it possible to identify texts written by the same hand, to study idiolects, and to compare writers with one another.
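
A small sketch (Python) of why this metadata matters for research questions: with date and place attached to each document, a researcher can define a subcorpus and repeat the same selection later. The Document class, its field names, the function select_subcorpus and the sample records are hypothetical and do not reflect the data model of Sematia or the Papyrological Navigator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    identifier: str   # hypothetical identifier, not a real papyrus reference
    year: int         # negative values stand for BCE
    place: str


def select_subcorpus(documents, start_year, end_year, place=None):
    """Keep documents dated within [start_year, end_year], optionally from one place."""
    return [
        d for d in documents
        if start_year <= d.year <= end_year
        and (place is None or d.place == place)
    ]


# Invented records for illustration only.
corpus = [
    Document("doc-1", -150, "Thebes"),
    Document("doc-2", 120, "Oxyrhynchus"),
    Document("doc-3", 250, "Oxyrhynchus"),
]
roman_oxyrhynchus = select_subcorpus(corpus, 1, 300, place="Oxyrhynchus")
print([d.identifier for d in roman_oxyrhynchus])
```

Because the selection is expressed as explicit criteria rather than a hand-picked list, another researcher can apply the same criteria and should get the same subcorpus.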

The goals for Sematia are to make the whole papyrological corpus available, to support phonological searches, and to provide an automatic morphological parser for Greek.

nicolealexandra33 · 2020-11-22T15:08:52Z

It would be interesting to see how Sematia does with the PGM, especially as multiple languages are used (although, as I understand it, Latin and Greek are currently the ones that the software can process). It could maybe help determine which spells were translated or taken from another linguistic tradition, even when the entry is in a different language.

chiaradimaio · 2020-11-23T11:55:22Z

With regard to the background of the authors, it is worth saying that Francesca Dell'Oro is currently teaching linguistics, but has a background as a classical philologist, while Marja Vierros is a classical philologist and a papyrologist. Both of them are mainly interested in the historical and developmental aspect of ancient languages. Their tools are highly customized and specialised, but they have the same target: offering specialists reliable corpora that make it possible to visualize information about the diachrony of certain linguistic phenomena.

gabrielbodard added discussion ICS02 labels Nov 17, 2020

gabrielbodard self-assigned this Nov 17, 2020

gabrielbodard closed this as completed Jan 11, 2021

gabrielbodard reopened this Jan 11, 2021

gabrielbodard closed this as completed Jan 11, 2021
