asr-tagalog

Project for COSI 136a ASR

Requirements

Command line: ffmpeg, sox/soxi

Python: requires Python 3.9+. Install the dependencies with:

pip install -r requirements.txt

Resampling

Resample all mp3 files in a directory to wav files:

./resample.sh <DIR>
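
For readers who want to see what such a conversion involves, here is a minimal Python sketch that builds one ffmpeg command per mp3 file. It is not the contents of resample.sh: the function names, the 16 kHz sample rate, and the mono downmix are assumptions made for illustration (16 kHz is the rate Whisper models expect). The resampled/ output directory is taken from the soxi command in this README.

```python
from pathlib import Path


def build_ffmpeg_command(src: Path, out_dir: Path, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that converts one mp3 file to a wav file.

    The sample rate and the mono downmix are assumed values, not settings
    read from resample.sh.
    """
    dst = out_dir / (src.stem + ".wav")
    return [
        "ffmpeg",
        "-y",                      # overwrite an existing output file
        "-i", str(src),            # input mp3
        "-ar", str(sample_rate),   # output sample rate in Hz (assumed)
        "-ac", "1",                # single output channel (assumed)
        str(dst),
    ]


def resample_commands(in_dir: Path, out_dir: Path) -> list[list[str]]:
    """Return one ffmpeg command per mp3 file found directly inside in_dir."""
    return [build_ffmpeg_command(p, out_dir) for p in sorted(in_dir.glob("*.mp3"))]
```

Each returned command could be run with subprocess.run(cmd, check=True) after creating the output directory.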

Check the total length of the resampled files:

soxi resampled/ | tail -n1
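
If sox is not available, the total duration can also be approximated in Python. The sketch below uses only the standard-library wave module and assumes the files are uncompressed PCM wav; the function names are hypothetical and not part of this repository.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path: Path) -> float:
    """Return the duration of one PCM wav file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def total_duration_seconds(directory: Path) -> float:
    """Sum the durations of all .wav files directly inside directory."""
    return sum(wav_duration_seconds(p) for p in sorted(directory.glob("*.wav")))
```

For example, total_duration_seconds(Path("resampled")) / 3600 gives the corpus size in hours.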

Split corpus

Split directory of parallel .TextGrid and .wav files into short segments to use in a model:

usage: split_corpus.py [-h] [--max-seconds MAX_SECONDS] indir outdir

positional arguments:
  indir                 Directory of parallel .TextGrid and .wav files to load
  outdir                Directory to write segmented parallel .txt and .wav files

options:
  -h, --help            show this help message and exit
  --max-seconds MAX_SECONDS
                        Maximum duration in seconds of segmented audio files
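
To illustrate how a duration limit like --max-seconds can shape the segments, the sketch below greedily merges consecutive labeled intervals until adding the next one would exceed the limit. This is one possible strategy, not necessarily what split_corpus.py does. The Interval tuple and the function names are hypothetical, and reading the .TextGrid files is left out.

```python
from typing import List, Tuple

# (start in seconds, end in seconds, transcript text); a hypothetical layout
Interval = Tuple[float, float, str]


def _merge(chunk: List[Interval]) -> Interval:
    """Collapse a run of intervals into one segment with joined text."""
    text = " ".join(t.strip() for _, _, t in chunk)
    return (chunk[0][0], chunk[-1][1], text)


def group_intervals(intervals: List[Interval], max_seconds: float) -> List[Interval]:
    """Greedily merge consecutive intervals into segments of at most max_seconds."""
    segments: List[Interval] = []
    current: List[Interval] = []
    for interval in intervals:
        _, end, text = interval
        if not text.strip():
            continue  # skip silent or unlabeled intervals
        # Close the open segment if this interval would push it past the limit.
        if current and end - current[0][0] > max_seconds:
            segments.append(_merge(current))
            current = []
        current.append(interval)
    if current:
        segments.append(_merge(current))
    return segments
```

Note that in this sketch a single interval longer than max_seconds is kept as its own over-long segment; a real pipeline would have to decide whether to drop or split such intervals.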

Statistics

Calculate statistics on the train, dev, and test splits (type/token counts, OOV rate):

usage: corpus_stats.py [-h] [--train TRAIN] [--dev DEV] [--test TEST]

options:
  -h, --help     show this help message and exit
  --train TRAIN  Directory for train partition containing .txt files
  --dev DEV      Directory for dev partition containing .txt files
  --test TEST    Directory for test partition containing .txt files
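
As a rough illustration of these statistics, the sketch below counts types and tokens and computes a token-level OOV rate against the training vocabulary. It assumes whitespace tokenization and lowercasing, which may differ from what corpus_stats.py does; the function names are hypothetical.

```python
from collections import Counter
from typing import Iterable


def count_tokens(lines: Iterable[str]) -> Counter:
    """Count whitespace-separated, lowercased tokens (assumed tokenization)."""
    counts: Counter = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def oov_rate(train: Counter, other: Counter) -> float:
    """Fraction of tokens in other whose type never occurs in train."""
    total = sum(other.values())
    if total == 0:
        return 0.0
    unseen = sum(n for word, n in other.items() if word not in train)
    return unseen / total
```

Here len(counts) is the number of types and sum(counts.values()) is the number of tokens. With train lines "ang bata ay masaya" and "ang aso", and a dev line "ang pusa ay masaya", only "pusa" is unseen, so the OOV rate is 1 of 4 tokens.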

Training a model

Follow the instructions in train.ipynb to fine-tune a pre-trained Whisper model on the newly created data and evaluate the results. Note: this has only been tested in Google Colab on a T4 GPU, so it may fail on other platforms or architectures, including CPU-only machines.
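
This README does not say which metric the notebook reports. Word error rate (WER) is the usual choice for ASR, so the sketch below shows how it can be computed for one reference/hypothesis pair, assuming whitespace tokenization. The function name is hypothetical and the code is not taken from train.ipynb.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)
```

For instance, the hypothesis "ang bata masaya" against the reference "ang bata ay masaya" has one deleted word out of four, giving a WER of 0.25.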

velociburner/asr-tagalog
