DOC: Add doc/formats/seq/timit.md, fixes #157

vocalpy · Jun 17, 2022 · 30339fc
1 parent 653bc19
Showing 1 changed file with 21 additions and 0 deletions.
diff --git a/doc/formats/seq/timit.md b/doc/formats/seq/timit.md
@@ -0,0 +1,21 @@
+(timit)=
+
+# the TIMIT dataset
+
+Annotations from transcription files in the 
+DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus (TIMIT).
+See the dataset's README in the Linguistic Data Consortium (LDC) catalog, hosted by the University of Pennsylvania:  
+<https://catalog.ldc.upenn.edu/docs/LDC93S1/timit.readme.html>
+
+`crowsetta` can load the formats used 
+by the `.wrd` (word-level) and `.phn` (phone-level) transcription files, 
+where each segment is specified in terms of 
+the sample number in the audio file where it begins, 
+the sample number where it ends,
+and a text label. 
+Each line lists these three values in that order, and the files have no header.
+For more detail, see section 5 of the TIMIT README, 
+"File Types".
+
+The annotations can be loaded with the following class: 
+{py:class}`crowsetta.formats.seq.timit.Timit`.
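
For readers of this diff, the column layout described in the new page (begin sample, end sample, text label, no header) can be sketched in plain Python. This is only an illustration of the file layout, not how `crowsetta` parses these files; the names `Segment` and `parse_timit_transcription` are hypothetical, the sample values are illustrative, and whitespace-separated columns are an assumption.

```python
from typing import Iterable, List, NamedTuple


class Segment(NamedTuple):
    """One annotated segment (hypothetical helper type, not part of crowsetta)."""
    begin_sample: int
    end_sample: int
    label: str


def parse_timit_transcription(lines: Iterable[str]) -> List[Segment]:
    """Parse lines laid out like a .phn or .wrd file.

    Assumes three whitespace-separated columns per line, in the order
    begin sample, end sample, label, with no header row.
    """
    segments = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        begin, end, label = line.split(maxsplit=2)
        segments.append(Segment(int(begin), int(end), label))
    return segments


# Illustrative content in the layout of a .phn file (values are examples only).
example = """0 3050 h#
3050 4559 sh
4559 5723 ix
"""

segments = parse_timit_transcription(example.splitlines())
for seg in segments:
    print(seg.begin_sample, seg.end_sample, seg.label)
```

Because onsets and offsets are sample numbers rather than seconds, converting them to times requires dividing by the sampling rate of the corresponding audio file.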