Corpus of Spanish Short Stories from 1880-1940: Corpus de Cuentos de la Edad de Plata

This corpus contains 20 texts (810 916 tokens) by 8 Spanish authors (Bazán, Blasco Ibáñez, Clarín, Galdós, Miró, Pereda, Unamuno, and Valle), with a total of 302 short stories. It is part of the corpora compiled for the PhD thesis of José Calvo Tello, a member of the junior research group CLiGS at the University of Würzburg, Germany.

See the "metadata.csv" file for publication information about each text.
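
As a minimal sketch of how the metadata could be loaded for inspection, the snippet below reads "metadata.csv" with Python's standard csv module. The function name load_metadata is hypothetical, and the delimiter is an assumption (a comma by default); this README does not document the column names, so the sketch relies on whatever header row the file provides.

```python
import csv
from pathlib import Path


def load_metadata(path="metadata.csv", delimiter=","):
    """Return the metadata rows as a list of dicts keyed by the CSV header.

    The delimiter is an assumption; pass a different one (for example a tab)
    if the file uses it. No column names are assumed here.
    """
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))
```

Calling load_metadata() from the corpus folder and inspecting the keys of the first row is one way to see which metadata fields are available.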

The TEI schema for the basic and the linguistically annotated TEI files corresponds to the general CLiGS schema, which is available in the CLiGS reference repository.

The metadata keywords used in the text classification section of the TEI header are controlled by an external TEI keywords file and a Schematron file, both stored in the keywords folder.

Formats

  • tei: following the Text Encoding Initiative and valid against the CLiGS schema (File name: id.xml)
  • txt_id: simple plain text of the body (File name: id.txt)
  • txt_author-title: simple plain text of the body (File name: Author_Title-id.txt)
  • annotated: TEI files further annotated with FreeLing and WordNet (keeping teiHeader and the chapter structure of the TEI)
  • pdf: Reading versions generated from the tei files
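
The relationship between the tei and txt formats above can be illustrated with a short sketch that pulls the plain text of the body out of a TEI file. It uses only Python's standard library and the standard TEI namespace; the function name tei_body_text is hypothetical, and this is not necessarily how the plain-text files in this corpus were produced.

```python
import xml.etree.ElementTree as ET

# Standard TEI P5 namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def tei_body_text(path):
    """Return the whitespace-normalized text inside the TEI <body> element.

    Returns an empty string if the file has no <body>. The teiHeader is
    ignored because only descendants of <body> are read.
    """
    root = ET.parse(str(path)).getroot()
    body = root.find(".//tei:body", TEI_NS)
    if body is None:
        return ""
    # Concatenate all text nodes, then collapse runs of whitespace.
    return " ".join("".join(body.itertext()).split())
```

For a file named id.xml in the tei folder, tei_body_text("tei/id.xml") would return the running text of the stories without the header metadata; headings and paragraphs are flattened into a single string.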

Copyright and Citation

  • The authors' copyright on these texts has already expired. This collection is published under a Creative Commons Attribution 4.0 International license.

  • Please provide a reference if you use this data in your teaching or research. The following is a citation suggestion: Corpus de Cuentos de la Edad de Plata, edited by José Calvo Tello. Würzburg: CLiGS, 2018. https://github.com/cligs/textbox/tree/master/spanish/cuentos-espanoles.