LingConLab/data_oral_besleney_corpus

Spoken Corpus of the Besleney Dialect of East Circassian Data Repository

This repository curates the data from the Spoken Corpus of the Besleney Dialect of East Circassian and provides an alternative way to access the corpus data locally. The data is stored in data_oral_besleney_corpus.csv, which has 9478 rows and 14 columns:

  • filename
  • time_start
  • time_end
  • speaker
  • recorded
  • sentence_id
  • text
  • translation
  • word_forms
  • morphonology
  • gloss
  • language
  • dataset_creator
  • dataset_provider
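
To work with the data locally, the CSV can be read with Python's standard `csv` module. The sketch below is a minimal illustration, not part of the repository: the helper name `load_corpus` is hypothetical, and it assumes the file is comma-delimited, UTF-8 encoded, and has a header row with the 14 column names listed above.

```python
import csv
from pathlib import Path

# Column names as listed in this README.
EXPECTED_COLUMNS = [
    "filename", "time_start", "time_end", "speaker", "recorded",
    "sentence_id", "text", "translation", "word_forms", "morphonology",
    "gloss", "language", "dataset_creator", "dataset_provider",
]


def load_corpus(path):
    """Read the corpus CSV into a list of dicts (hypothetical helper).

    Assumes a comma-delimited, UTF-8 file with a header row.
    Raises ValueError if any expected column is missing.
    """
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        header = reader.fieldnames or []
        missing = [c for c in EXPECTED_COLUMNS if c not in header]
        if missing:
            raise ValueError(f"missing columns: {missing}")
        return list(reader)


if __name__ == "__main__":
    csv_path = Path("data_oral_besleney_corpus.csv")
    if csv_path.exists():
        rows = load_corpus(csv_path)
        # Each row is one record; all values are read as strings.
        print(len(rows), "rows,", len(EXPECTED_COLUMNS), "expected columns")
```

All values are returned as strings, so fields such as time_start and time_end would need to be converted explicitly once their format has been checked in the actual file.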

About the corpus

The spoken Besleney corpus includes texts of a wide variety of genres recorded in the village of Ulyap in the Republic of Adygea. The Besleney dialect is the westernmost dialect of the Kabardian (East Circassian) language, and it differs significantly from the literary standard. The Ulyap subdialect of Besleney is also in active contact with the closely related West Circassian language.

The texts were recorded and glossed during expeditions by the Russian State University for the Humanities (RSUH) and the Higher School of Economics (HSE University) in 2011-2013; Irina Bagirokova (HSE University) also took part in later glossing work on the corpus. Morphological analysis was carried out by expedition participants under the supervision of Peter Arkadiev (Institute for Slavic Studies RAS / RSUH) with the additional participation of Yury Lander (HSE University). Technical work on the corpus was completed by Anna Sorokina and Elena Sokur (HSE University) with support from the Linguistic Convergence Laboratory at HSE University in 2020. The corpus was created within the framework of the HSE University Basic Research Program and funded by the Russian Academic Excellence Project '5-100'.

How to cite the corpus and the data

If you use data from the Spoken Corpus of the Besleney Dialect of East Circassian in your research, please cite as follows:

Peter Arkadiev, Irina Bagirokova, Anna Sorokina, Elena Sokur. 2020. Corpus of oral texts in Besleney Kabardian. Moscow: Linguistic Convergence Laboratory, HSE University. (Available online at https://lingconlab.ru/spoken_besleney/, accessed on ...)

For questions about the corpus data, contact Yury Lander or open an issue in this repository:

yulander@hse.ru (Yury Lander)

For questions about the search platform, contact Elena Sokur or open an issue in the platform's own repository:

elena.o.sokur@gmail.com (Elena Sokur)