Computational Linguistics

Sunoikisis Digital Classics: Fall 2021

Session 9: Computational Linguistics

Thursday Dec 2, 17:15–18:45 CET

Convenors: Alek Keersmaekers (KU Leuven), Martina Astrid Rodda (Oxford)

YouTube link: https://youtu.be/hPGw1yNTZUs

Session outline

This session will introduce some of the questions and approaches of Computational Linguistics. We will start by discussing how the approaches examined today build on what we saw in the previous sessions on search tools, text analysis, treebanking and translation alignment. We will then give a broad overview of Computational Linguistics as applied to ancient languages: the main questions it tries to address, the work that has been done for Greek and Latin, the most important concepts, and the challenges that Greek and Latin present.

We will also include two case studies. The first will illustrate how a computational approach can be used to study literary features, specifically the behaviour of formulae in early Greek epic poetry (Homer, Hesiod, and the Homeric Hymns). Quantitative data show that recurring set phrases in early epic poetry behave measurably differently from both set phrases in non-epic material and recurring expressions in later epic.

The second case study discusses so-called ‘transformer’-based approaches to natural language processing, which use neural networks trained on a large corpus to obtain detailed mathematical representations of the usage of a given word. Through a number of examples, it will show how Electra, a transformer model tailored to languages with a smaller corpus, can considerably improve the state of the art in natural language processing for Greek.
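The idea of a "mathematical representation of the usage of a word" can be illustrated with a much simpler technique than the transformer models discussed in the session. The sketch below is a count-based distributional model, not Electra and not the convenors' method: each word is represented by the counts of the words that occur near it, and two words are compared by cosine similarity. The corpus is invented toy data (English glosses standing in for lemmatised text), and the function names and the window size are assumptions made for illustration only.

```python
from collections import Counter
from math import sqrt

# Toy, already-lemmatised "corpus" (hypothetical data, not taken from any real text).
SENTENCES = [
    ["achilles", "swift", "foot", "fight", "hector"],
    ["hector", "fight", "achilles", "wall"],
    ["odysseus", "sail", "ship", "sea"],
    ["ship", "sail", "sea", "wind"],
]


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words found within `window` positions of it."""
    vectors = {}
    for sentence in sentences:
        for i, word in enumerate(sentence):
            start = max(0, i - window)
            # Neighbours to the left and to the right, excluding the word itself.
            context = sentence[start:i] + sentence[i + 1:i + 1 + window]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


vectors = cooccurrence_vectors(SENTENCES)
# Words used in similar contexts get a higher similarity score.
print("achilles ~ hector:", cosine(vectors["achilles"], vectors["hector"]))
print("achilles ~ ship:  ", cosine(vectors["achilles"], vectors["ship"]))
```

In this toy data "achilles" and "hector" share context words and so score higher than "achilles" and "ship", which share none. A real study would need a large lemmatised corpus and, typically, weighting and dimensionality reduction on top of the raw counts.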

Seminar readings

Yannis Assael, Thea Sommerschield & Jonathan Prag. 2019. “Restoring ancient text using deep learning: a case study on Greek epigraphy.” Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, pp. 6368–6375. Available: https://arxiv.org/abs/1910.06262
Florent Perek. 2016. “Using Distributional Semantics to Study Syntactic Productivity in Diachrony: A Case Study.” Linguistics 54, pp. 149–88. Available: https://doi.org/10.1515/ling-2015-0043

Other resources

DISSECT (Distributional Semantics Composition Toolkit). Available: https://doi.org/10.5281/zenodo.3368837

Exercise

Go back to the exercises for sessions 1 (Philological Search Tools) and 3 (Text Analysis with Voyant). Choose one of the texts you already looked at and discuss:
- How would you apply (one of) the approaches discussed in this session to the analysis of your target text?
- What resources would you need (e.g. corpora, lemmatised/treebanked texts, software)?
- Are these resources available, and where?
- (How) would the approaches discussed in this session allow you to develop your initial research questions from sessions 1 and 3?
