This corpus contains 24 novels and short novels by 8 Spanish-American (Argentine and Mexican) authors, with 3 texts per author. It is part of a larger corpus being prepared by Ulrike Henny-Krahmer for her PhD thesis. The corpus was created in the context of the junior research group Computational Literary Genre Stylistics (CLiGS) at the University of Würzburg, Germany. See the metadata.csv file for basic information about the novels.
The following tables give an overview of the corpus by author, decade and subgenre:
author | number of texts (words) |
---|---|
Bunge, Carlos Octavio | 3 (101k) |
Cambaceres, Eugenio | 3 (115k) |
Cuéllar, José Tomás de | 3 (72k) |
Frías, Heriberto | 3 (100k) |
Gutiérrez, Eduardo | 3 (309k) |
Holmberg, Eduardo Ladislao | 3 (74k) |
Payró, Roberto | 3 (134k) |
Sicardi, Francisco | 3 (224k) |
total | 24 (1,128k) |

decade | number of texts (words) |
---|---|
1870s | 1 (35k) |
1880s | 8 (484k) |
1890s | 7 (285k) |
1900s | 6 (202k) |
1910s | 2 (123k) |
total | 24 (1,128k) |

subgenre | number of texts (words) |
---|---|
autobiographical | 1 (24k) |
costumbrista | 5 (184k) |
crime | 1 (20k) |
fantastic | 1 (20k) |
gaucho | 3 (309k) |
historical | 4 (172k) |
naturalistic | 6 (340k) |
science fiction | 1 (35k) |
sentimental | 1 (13k) |
social | 1 (14k) |
total | 24 (1,128k) |
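
Overview tables of this kind can be recomputed from a metadata table. The following is a minimal sketch, assuming hypothetical column names (`idno`, `author`, `words`) that may differ from the actual header of metadata.csv. The sample rows are invented for illustration and are not corpus data.

```python
import csv
import io
from collections import defaultdict


def summarize(rows, key, words_key="words"):
    """Group rows by `key`; return {group: (number of texts, total words)}."""
    counts = defaultdict(int)
    words = defaultdict(int)
    for row in rows:
        group = row[key]
        counts[group] += 1
        words[group] += int(row[words_key])
    return {group: (counts[group], words[group]) for group in counts}


def format_table(summary, label):
    """Render a summary in the pipe-table layout used in this README."""
    lines = [f"{label} | number of texts (words) |", "---|---|"]
    for group in sorted(summary):
        n_texts, n_words = summary[group]
        lines.append(f"{group} | {n_texts} ({round(n_words / 1000)}k) |")
    return "\n".join(lines)


# Invented sample data; the column names are assumptions, not the real header.
SAMPLE = """idno,author,words
x1,Author A,10400
x2,Author A,20600
x3,Author B,5000
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
summary = summarize(rows, "author")
print(format_table(summary, "author"))
```

To try this on the real file, the `io.StringIO(SAMPLE)` argument could be replaced by `open("metadata.csv", encoding="utf-8", newline="")`, with the `key` and `words_key` arguments adjusted to whatever column names the file actually uses.
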
The texts have been compiled from the following sources:
- Wikisource (7 texts)
- Biblioteca Virtual Miguel de Cervantes (5 texts)
- Internet Archive (3 texts)
- Biblioteca Digital Argentina (2 texts)
- La novela corta. Una biblioteca virtual (2 texts)
- Project Gutenberg (2 texts)
- Biblioteca Digital Hispánica (1 text)
- Biblioteca Virtual Antorcha (1 text)
- El Libro Total (1 text)
The source formats were HTML, PDF and image files.
The corpus is available in the following formats:
- tei: Encoded following the Guidelines of the Text Encoding Initiative and valid against the CLiGS schema (file names: identifier.xml, e.g. nh0002.xml)
- txt_id: Simple plain text containing only the main text of the novels (file names: identifier.txt, e.g. nh0002.txt)
- annotated: TEI files further annotated with FreeLing and WordNet
- pdf: Reading versions generated from the TEI files
- Both the basic and the linguistically annotated TEI files follow the general CLiGS schema, which is available in the CLiGS reference repository.
- The metadata keywords used in the text classification section of the TEI header are controlled by an external TEI keywords file and a Schematron file, both stored in the keywords folder.
- The texts have been spellchecked against a dictionary of contemporary Spanish. The results of the check can be found in spellcheck.csv.
- The authors' copyright on these texts has expired. This collection is published under a Creative Commons Attribution 4.0 International license.
- Please cite this collection if you use the data in your teaching or research. Suggested citation: Collection of 19th Century Spanish-American Novels (1880-1916), edited by Ulrike Henny-Krahmer. Würzburg: CLiGS, 2017. https://github.com/cligs/textbox/master/spanish/novela-hispanoamericana/.
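
The tei format described above holds the full encoding, while txt_id holds only the main text. As a rough sketch of how main text could be pulled out of a TEI file with the Python standard library: the namespace below is the standard TEI namespace, but the sample document and the assumption that the main text sits in `<p>` elements under `text/body` are illustrative only. The actual CLiGS files, and the procedure used to produce the txt_id files, may differ.

```python
import xml.etree.ElementTree as ET

TEI_URI = "http://www.tei-c.org/ns/1.0"
TEI_NS = {"tei": TEI_URI}


def body_paragraphs(tei_xml):
    """Return the whitespace-normalized text of each <p> inside <text>/<body>."""
    root = ET.fromstring(tei_xml)
    body = root.find("tei:text/tei:body", TEI_NS)
    if body is None:
        return []
    paragraphs = []
    for p in body.iter(f"{{{TEI_URI}}}p"):
        # itertext() also collects text inside inline children such as <hi>.
        text = " ".join("".join(p.itertext()).split())
        if text:
            paragraphs.append(text)
    return paragraphs


def word_count(paragraphs):
    """Count whitespace-separated tokens across all paragraphs."""
    return sum(len(p.split()) for p in paragraphs)


# Invented sample document, not taken from the corpus.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc><titleStmt><title>Sample</title></titleStmt></fileDesc></teiHeader>
  <text>
    <body>
      <div>
        <head>Capítulo I</head>
        <p>Primera frase   del <hi>texto</hi>.</p>
        <p>Segunda frase.</p>
      </div>
    </body>
  </text>
</TEI>"""

paras = body_paragraphs(SAMPLE)
print(paras)
print(word_count(paras))
```

For a file on disk such as nh0002.xml, the string could be read first with `open("nh0002.xml", encoding="utf-8").read()`. Headings are deliberately skipped in this sketch; whether the txt_id files include them is not stated here.
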