In this project we build a deep learning model that processes and converts Amharic (an African language) speech/voice into text.
The World Food Program wants to deploy an intelligent form that collects nutritional information on food bought and sold at markets in two different countries in Africa - Ethiopia and Kenya.
The design of this intelligent form requires selected people to install an app on their mobile phone, and whenever they buy food, they use their voice to activate the app and register the list of items they just bought in their own language. The intelligent systems in the app are expected to live-transcribe the speech to text and organize the information in an easy-to-process way in a database.
Our task is to create a deep learning model capable of converting speech to text. The model should be accurate and robust to background noise. This project was created during the fourth week of the Machine Learning training session at 10Academy.
- Install Required Python Modules
git clone https://github.com/Micky373/speech_to_text
cd speech_to_text
pip install -r requirements.txt
- Jupyter Notebook
cd notebooks
jupyter notebook
- Model Training UI (not implemented yet; see the MLflow sketch after this list)
mlflow ui
- Dashboard (not implemented yet; see the Streamlit sketch after this list)
streamlit run app.py
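
MLflow tracking has not been wired into the training code yet. The sketch below shows one way runs could be logged so that `mlflow ui` has something to display; the experiment name, parameters, and metric values are placeholders, not part of this repository.

```python
import mlflow

# Placeholder experiment name and values; replace with the real training loop's outputs.
mlflow.set_experiment("amharic_stt")

with mlflow.start_run():
    mlflow.log_param("epochs", 20)
    mlflow.log_param("batch_size", 32)
    mlflow.log_metric("val_loss", 0.42)  # e.g. validation loss
    mlflow.log_metric("wer", 0.35)       # e.g. word error rate
```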
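Similarly, app.py does not exist yet. A minimal Streamlit dashboard for the planned demo could look like the following; the commented-out transcribe call is hypothetical and stands in for whatever inference function the trained model will eventually expose.

```python
# app.py - hypothetical minimal dashboard sketch
import streamlit as st

st.title("Amharic Speech-to-Text Demo")

uploaded = st.file_uploader("Upload an Amharic audio file", type=["wav"])

if uploaded is not None:
    st.audio(uploaded, format="audio/wav")
    # transcription = transcribe(uploaded)  # hypothetical call to the trained model
    st.write("Transcription will appear here once the model is integrated.")
```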
The data folder is tracked with DVC, so the files only appear after cloning the repository and pulling the data locally (typically with `dvc pull`). The sub-folder AMHARIC contains the training and testing files for our model. Both splits share the same file structure.
- wav/: a folder containing all audio files
- text: a file containing the metadata (audio file name and corresponding transcription)
- spk2utt, trsTest.txt, utt2spk, wav.scp: files provided with the dataset; they currently serve no purpose here but could be used for future analysis.
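
As an illustration of how this layout can be consumed, the sketch below pairs the metadata with the audio. It assumes each line of the text file has the form `<audio_file_name> <transcription>` and that the split lives under data/AMHARIC/train; both are assumptions to adjust to the actual DVC layout.

```python
import os

import librosa  # any audio loader would do

DATA_DIR = "data/AMHARIC/train"  # hypothetical path; adjust to the actual layout

# Map each audio file name to its transcription from the `text` metadata file.
transcriptions = {}
with open(os.path.join(DATA_DIR, "text"), encoding="utf-8") as f:
    for line in f:
        name, _, transcript = line.strip().partition(" ")
        transcriptions[name] = transcript

# Load one matching audio file from the wav/ folder.
sample_name = next(iter(transcriptions))
audio, sr = librosa.load(os.path.join(DATA_DIR, "wav", sample_name + ".wav"), sr=None)
print(sample_name, sr, len(audio), transcriptions[sample_name])
```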
- Preprocessing.ipynb: all the data preprocessing is done here before model training.
- data_cleaning.py: contains all the data cleaning and modularizing functions.
- data_viz.py: contains all the visualization-related functions.
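
For context, a typical preprocessing step for speech-to-text is turning each clip into log-mel spectrogram features. The snippet below is only illustrative and does not reproduce the actual functions in data_cleaning.py or Preprocessing.ipynb; the file path is a placeholder.

```python
import librosa
import numpy as np

# Illustrative feature-extraction step: load a clip at 16 kHz and compute log-mel features.
audio, sr = librosa.load("data/AMHARIC/train/wav/sample.wav", sr=16000)  # placeholder path
mel = librosa.feature.melspectrogram(y=audio, sr=sr, n_mels=128)
log_mel = librosa.power_to_db(mel, ref=np.max)
print(log_mel.shape)  # (n_mels, time_frames)
```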