DOC: Add doc/formats/seq/timit.md, fixes #157
NickleDave committed Jun 17, 2022
1 parent 653bc19 commit 30339fc
Showing 1 changed file with 21 additions and 0 deletions.
21 changes: 21 additions & 0 deletions doc/formats/seq/timit.md
@@ -0,0 +1,21 @@
(timit)=

# The TIMIT dataset

This format represents annotations from transcription files in the
DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus (TIMIT).
See the README for the dataset in the Linguistic Data Consortium (LDC) catalog,
hosted by the University of Pennsylvania:
<https://catalog.ldc.upenn.edu/docs/LDC93S1/timit.readme.html>

The formats that can be loaded with `crowsetta` are those used
by the `.wrd` (word) and `.phn` (phonetic) transcription files,
where each segment is specified in terms of
the sample number in the audio files where it begins,
the sample where it ends,
and a text label.
Columns are in that order, and there is no header.
For more detail, see section 5 of the TIMIT README,
"File Types".
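
To make the three-column layout concrete, here is a minimal sketch in plain Python
that parses text in this format without using `crowsetta`.
The names `Segment` and `parse_timit_transcription` are hypothetical,
and the example lines are made up for illustration;
they are not copied from a file in the corpus.
The conversion to seconds assumes the 16 kHz sampling rate of the TIMIT audio.

```python
from typing import List, NamedTuple

# Assumption: TIMIT audio is sampled at 16 kHz.
TIMIT_SAMPLE_RATE_HZ = 16000


class Segment(NamedTuple):
    """One row of a .wrd or .phn file (hypothetical helper type)."""
    onset_sample: int
    offset_sample: int
    label: str


def parse_timit_transcription(text: str) -> List[Segment]:
    """Parse headerless rows of 'onset offset label' into Segments."""
    segments = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Columns are in the order: begin sample, end sample, text label.
        onset, offset, label = line.split(maxsplit=2)
        segments.append(Segment(int(onset), int(offset), label))
    return segments


# Made-up example content in the style of a .phn file.
example_phn = """\
0 3050 h#
3050 4559 sh
4559 5723 ix
"""

segments = parse_timit_transcription(example_phn)
for seg in segments:
    onset_s = seg.onset_sample / TIMIT_SAMPLE_RATE_HZ
    offset_s = seg.offset_sample / TIMIT_SAMPLE_RATE_HZ
    print(f"{seg.label}: {onset_s:.4f} s to {offset_s:.4f} s")
```

Because onsets and offsets are stored as sample numbers,
dividing by the sampling rate is the only step needed to get times in seconds.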

The annotations can be loaded with the following class:
{py:class}`crowsetta.formats.seq.timit.Timit`.
