Investigations into Evolutionary Linguistics using the Google Ngrams corpus. Sub-projects include Birth and Death of English Lexemes in Closed Lexical Classes and Lexicon Size.
Updated Sep 14, 2023 - Jupyter Notebook
Cartographic visualization workshop held as part of the Summer School "Phonologie de corpus", UNIL (22-26.07.2019)
A script for processing linguistic data with interlinear glosses from a PDF
PDF, TeX, and data files for corpus linguistics lessons delivered in Cagliari, December 2017
Word2Vec Model for Koine Greek Categorisation
Python class for creating vrt-annotated corpora
Java Software to analyze text files.
A web scraper for the student newspaper of Covenant College.
Custom search engine for a small corpus
R script to calculate the Average Reduced Frequency (ARF) of all words in a corpus
Notebooks for processing various versions of the Switchboard corpus.
A notebook that converts the Fisher Corpus to a relational format and processes it for a language model.
A repository describing the construction of a unigram language model from the Fisher corpus
Exercises for the programming course - Universität Stuttgart - winter semester
DataLad superdataset including all the datasets currently managed by the LAAC/LSCP team
Corpus of unannotated, tokenized, and lemmatized French sentences (22M).
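For readers unfamiliar with the Average Reduced Frequency (ARF) measure named in the R script entry above, here is a minimal Python sketch of the commonly cited definition. It is not taken from that repository. With a corpus of N tokens and a word occurring f times, let v = N / f and let d_i be the cyclic distances between consecutive occurrences; then ARF = (1/v) * sum(min(d_i, v)). The function name `average_reduced_frequency` and the plain token-list input are assumptions made for illustration.

```python
from collections import defaultdict


def average_reduced_frequency(tokens):
    """Return a dict mapping each word to its ARF (illustrative sketch).

    Assumes `tokens` is a list of already tokenized words; the name and
    input format are hypothetical, not those of any listed repository.
    """
    n = len(tokens)

    # Record the positions at which each word occurs.
    positions = defaultdict(list)
    for index, token in enumerate(tokens):
        positions[token].append(index)

    arf = {}
    for word, pos in positions.items():
        f = len(pos)   # raw frequency of the word
        v = n / f      # expected gap if occurrences were evenly spread

        # Cyclic distances: the first gap wraps around from the last
        # occurrence to the first, so the distances sum to n.
        distances = [pos[0] + n - pos[-1]]
        distances += [b - a for a, b in zip(pos, pos[1:])]

        # Gaps longer than v are capped at v, which penalizes clustering.
        arf[word] = sum(min(d, v) for d in distances) / v
    return arf


# Evenly spread occurrences keep ARF equal to the raw frequency;
# clustered occurrences pull it down toward 1.
even = ["a", "x", "x", "x"] * 3
clustered = ["a", "a", "a"] + ["x"] * 9
print(average_reduced_frequency(even)["a"])
print(average_reduced_frequency(clustered)["a"])
```

Under this definition ARF ranges from 1 (all occurrences adjacent) to f (perfectly even dispersion), which is why it is used as a dispersion-adjusted frequency.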