CTC LSTM

Spoken word recognition using CTC LSTMs.

Instructions

Create a virtual environment: python -m venv venv
Install the required packages: ./venv/bin/pip install -r requirements.txt
Train the model: ./venv/bin/python main.py train (takes a few hours and needs around 20 GB of disk space and 5 GB of memory)
- Alternatively, download my pre-trained model (25 epochs, not very good) from here and move it to target/model-final.ckpt
Test the final model: ./venv/bin/python main.py test
Infer text from a FLAC file: ./venv/bin/python main.py infer audio.flac
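
For readers new to CTC, the step that turns per-frame network outputs into text can be sketched as below. This is a minimal illustration of greedy (best-path) CTC decoding, not necessarily what main.py does: the function names, the ALPHABET string, and the choice of index 0 as the blank symbol are all assumptions made for the example.

```python
from itertools import groupby

# Hypothetical label set: index 0 is assumed to be the CTC blank symbol.
ALPHABET = "_abcdefghijklmnopqrstuvwxyz "
BLANK = 0


def ctc_greedy_decode(frame_scores, blank=BLANK):
    """Best-path CTC decoding over a list of per-frame score lists."""
    # 1. Pick the highest-scoring label index for each time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # 2. Merge consecutive repeats, then 3. drop blank labels.
    return [label for label, _ in groupby(best_path) if label != blank]


def labels_to_text(labels, alphabet=ALPHABET):
    """Map label indices back to characters."""
    return "".join(alphabet[i] for i in labels)


def one_hot(index, size=len(ALPHABET)):
    """Helper for the demo: a frame that puts all its score on one label."""
    return [1.0 if i == index else 0.0 for i in range(size)]


if __name__ == "__main__":
    # Frames predicting: c c _ a a _ _ t
    frames = [one_hot(i) for i in [3, 3, 0, 1, 1, 0, 0, 20]]
    print(labels_to_text(ctc_greedy_decode(frames)))  # prints "cat"
```

Repeats are merged before blanks are removed, so a blank between two identical labels keeps both of them; this is how double letters survive decoding.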
