Collection of 19th Century Spanish-American Novels (1880-1916)

Contents

This corpus contains 24 novels and short novels by 8 Spanish-American (Argentine and Mexican) authors, with 3 texts per author. It is part of a larger corpus being prepared by Ulrike Henny-Krahmer for her PhD thesis. The corpus was created in the context of the young research group for Computational Literary Genre Stylistics (CLiGS) at the University of Würzburg in Germany. See the metadata.csv file for basic information about the novels.
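
As a minimal sketch of how metadata.csv could be inspected, the Python snippet below parses the file with the standard csv module. The column names are not documented in this README, so the code makes no assumption about them and only reports the header it finds. The helper name `parse_metadata`, the comma delimiter, and the UTF-8 encoding are assumptions.

```python
import csv
import io
from pathlib import Path


def parse_metadata(text, delimiter=","):
    """Parse CSV text into a list of dicts keyed by the header row.

    The delimiter is an assumption; change it if the file uses another one.
    """
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


if __name__ == "__main__":
    path = Path("metadata.csv")  # file named in this README
    if path.exists():
        rows = parse_metadata(path.read_text(encoding="utf-8"))
        print(f"{len(rows)} rows")
        if rows:
            # Print whatever columns the file actually has.
            print("columns:", ", ".join(rows[0].keys()))
```

Each row comes back as a dictionary of strings, so any filtering (for example by author or subgenre) depends on the column names the file actually uses.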

The following tables give overviews of the corpus' characteristics:

Number of texts and words by author:

| author | texts | words |
|---|---|---|
| Bunge, Carlos Octavio | 3 | 101k |
| Cambaceres, Eugenio | 3 | 115k |
| Cuéllar, José Tomás de | 3 | 72k |
| Frías, Heriberto | 3 | 100k |
| Gutiérrez, Eduardo | 3 | 309k |
| Holmberg, Eduardo Ladislao | 3 | 74k |
| Payró, Roberto | 3 | 134k |
| Sicardi, Francisco | 3 | 224k |
| total | 24 | 1,128k |

Number of texts and words by decade:

| decade | texts | words |
|---|---|---|
| 1870s | 1 | 35k |
| 1880s | 8 | 484k |
| 1890s | 7 | 285k |
| 1900s | 6 | 202k |
| 1910s | 2 | 123k |
| total | 24 | 1,128k |

Number of texts and words by subgenre:

| subgenre | texts | words |
|---|---|---|
| autobiographical | 1 | 24k |
| costumbrista | 5 | 184k |
| crime | 1 | 20k |
| fantastic | 1 | 20k |
| gaucho | 3 | 309k |
| historical | 4 | 172k |
| naturalistic | 6 | 340k |
| science fiction | 1 | 35k |
| sentimental | 1 | 13k |
| social | 1 | 14k |
| total | 24 | 1,128k |

Sources

The texts have been compiled from the following sources:

The source formats were HTML, PDF, and image files.

Formats

  • tei: Encoded following the Guidelines of the Text Encoding Initiative and valid against the CLiGS schema (File names: identifier.xml, e.g. nh0002.xml)
  • txt_id: Simple plain text containing only the main text of the novels (File names: identifier.txt, e.g. nh0002.txt)
  • annotated: TEI files further annotated with FreeLing and WordNet
  • pdf: Reading versions generated from the TEI files
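
To illustrate how the TEI files might be processed, here is a minimal Python sketch that pulls the running text out of the `body` element of a TEI document and counts words. The TEI namespace is the standard one; the helper names `tei_body_text` and `count_words`, the regex-based tokenization, and the example path `tei/nh0002.xml` are assumptions, and the resulting counts will not necessarily match the figures in the tables above.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# Standard TEI P5 namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def tei_body_text(xml_source):
    """Return the whitespace-normalized text of the first tei:body element.

    Text nodes are concatenated as they appear, so this assumes that
    block-level elements are separated by whitespace in the file.
    """
    root = ET.fromstring(xml_source)
    body = root.find(".//tei:body", TEI_NS)
    if body is None:
        return ""
    return " ".join("".join(body.itertext()).split())


def count_words(text):
    """Count runs of word characters; a rough proxy for a word count."""
    return len(re.findall(r"\w+", text))


if __name__ == "__main__":
    path = Path("tei/nh0002.xml")  # hypothetical location of an example file
    if path.exists():
        text = tei_body_text(path.read_bytes())
        print(path.name, count_words(text), "words")
```

The header (`teiHeader`) is skipped because only the `body` element is read, which is roughly what the plain-text versions in txt_id are described as containing.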

Schema

  • The TEI schema for the basic and the linguistically annotated TEI files corresponds to the general CLiGS schema which is available in the CLiGS reference repository.
  • The metadata keywords used in the text classification section of the TEI header are controlled by an external TEI keywords file and a schematron file which are stored in the keywords folder.

Data Curation

  • The texts were spellchecked against a dictionary of contemporary Spanish. The results of the check can be found in spellcheck.csv.

Copyright and Citation

  • The authors' copyright on these texts has expired. This collection is published under the Creative Commons Attribution 4.0 International license.
  • Please provide a reference if you use this data in your teaching or research. The following is a citation suggestion: Collection of 19th Century Spanish-American Novels (1880-1916), edited by Ulrike Henny-Krahmer. Würzburg: CLiGS, 2017. https://github.com/cligs/textbox/master/spanish/novela-hispanoamericana/.