Computational Morphology
Date: Thursday, March 16, 2017, 17h00-18h15 (CET)
Session coordinators: Dag Haug (University of Oslo) and Barbara McGillivray (Alan Turing Institute)
YouTube link: https://www.youtube.com/watch?v=W-mrxsaS1KU
Slides: https://drive.google.com/file/d/0BzocAohSpt3-SGJOa3lDU1UxVGs/view?usp=sharing
This lecture introduces computational morphology for Latin. We cover part-of-speech tagging and the tagging of other morphological features with existing tools, and show how to run morphological tagging on Latin texts.
- Definition of computational morphology and its relevance to digital classicists
- Part-of-speech tagging and morphological tagging
- TreeTagger
- Running TreeTagger on Latin texts
- Barbara McGillivray (2013). Methods in Latin Computational Linguistics. Leiden: Brill (Chapters 1 and 2), available for the students of this course here
- Helmut Schmid (1994). Probabilistic Part-of-Speech Tagging Using Decision Trees. Proceedings of International Conference on New Methods in Language Processing, Manchester, UK. Available: http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/tree-tagger1.pdf
- Daniel Jurafsky and James H. Martin (2009). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. 2nd Edition. Pearson Prentice Hall. Available: http://www.cs.colorado.edu/%7Emartin/SLP/Updates/1.pdf
Discuss the importance and challenges of automatic morphological annotation for the comprehension of historical documents. You may consider the advantages of automatic annotation over manual processes, and related questions, such as scalability.
Create a plain-text file containing a Latin text you are interested in (just the text, no formatting). Tag it with the two different Latin models available at the TreeTagger site. Look at the results and analyze the errors that the models make. How reliable are the results?
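One possible way to start the error analysis is to line up the output of the two models and list the tokens on which they disagree. The sketch below is only an illustration under a few assumptions: that each tagger run was saved as tab-separated `token<TAB>tag<TAB>lemma` lines (the format TreeTagger produces when token and lemma output are both enabled), that both runs tokenized the text identically, and that lemmas are the more comparable column because the two models may use different tagsets. The names `parse_tagged` and `lemma_disagreements` are hypothetical helpers, and the sample rows, including their tags, are invented for the example rather than real tagger output.

```python
from typing import List, NamedTuple, Tuple


class TaggedToken(NamedTuple):
    token: str
    tag: str
    lemma: str


def parse_tagged(output: str) -> List[TaggedToken]:
    """Parse 'token<TAB>tag<TAB>lemma' lines; skip blank or malformed lines."""
    rows = []
    for line in output.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # blank lines or markup passed through untagged
        rows.append(TaggedToken(*parts))
    return rows


def lemma_disagreements(
    a: List[TaggedToken], b: List[TaggedToken]
) -> List[Tuple[TaggedToken, TaggedToken]]:
    """Return the aligned pairs whose lemmas differ between the two runs."""
    if [t.token for t in a] != [t.token for t in b]:
        raise ValueError("the two outputs are not tokenized identically")
    return [(x, y) for x, y in zip(a, b) if x.lemma != y.lemma]


# Invented sample rows (not real tagger output); the tags are placeholders.
SAMPLE_A = "arma\tN\tarma\nvirum\tN\tvir\ncano\tV\tcano\n"
SAMPLE_B = "arma\tN\tarma\nvirum\tN\tvirus\ncano\tV\tcano\n"

if __name__ == "__main__":
    model_a = parse_tagged(SAMPLE_A)
    model_b = parse_tagged(SAMPLE_B)
    for x, y in lemma_disagreements(model_a, model_b):
        print(f"{x.token}: model A lemma={x.lemma!r}, model B lemma={y.lemma!r}")
```

Disagreements only show where at least one model is wrong; tokens on which both models agree still need spot-checking by hand before drawing conclusions about reliability.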