This project uses deep learning to classify recordings from the free-spoken-digit-dataset, an audio counterpart to the traditional MNIST dataset.
To run this project, ensure that you have Python 3 installed on your system. Follow these steps:
- Clone this repository and navigate to the root folder.
- Create and activate a Python environment using the following commands:
    conda create -n audio python=3
    conda activate audio
- Install the necessary libraries listed in the requirements.txt file:
    pip install -r requirements.txt
- Run the main.py file:
    python src/main.py
When you execute the main.py file, it will perform the following workflow:
- Folder Preparation:
  - Create the necessary folders for organizing data, models, and reports.
- Data Arrangement
- Production Testing Data Selection
- Graphical Representations
- Feature Extraction
- Data Splitting
- Model Training:
  - Train the models using the prepared dataset.
- Performance Assessment
- Model Saving
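The Folder Preparation and Data Arrangement steps can be pictured with a short sketch. This is not the project's actual code: the folder names and both function names are hypothetical, and the filename parsing assumes the upstream free-spoken-digit-dataset naming scheme `{digit}_{speaker}_{index}.wav`.

```python
from pathlib import Path

# Hypothetical folder names; the project's actual layout may differ.
SUBFOLDERS = ("data", "models", "reports")


def prepare_folders(root):
    """Create each subfolder under root, skipping ones that already exist."""
    created = []
    for name in SUBFOLDERS:
        folder = Path(root) / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


def parse_recording_name(filename):
    """Split a name such as '7_jackson_32.wav' into (digit, speaker, index).

    Assumes the naming scheme {digit}_{speaker}_{index}.wav.
    """
    digit, speaker, index = Path(filename).stem.split("_")
    return int(digit), speaker, int(index)
```

Reading the label from the filename keeps the arrangement step free of a separate metadata file, though it only works while recordings follow that naming scheme.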
In addition to the images, the console logs show how the program progresses through each step.
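As an illustration of the Data Splitting step and of the kind of console output a workflow step might produce, here is a minimal sketch of a per-digit (stratified) split using only the standard library. The function name, the 20% test fraction, and the log format are assumptions for illustration, not the project's actual settings.

```python
import logging
import random
from collections import defaultdict

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("workflow")


def stratified_split(filenames, test_fraction=0.2, seed=0):
    """Split FSDD-style filenames into train/test lists, digit by digit."""
    # Group recordings by their digit label (the part before the first "_").
    by_digit = defaultdict(list)
    for name in filenames:
        by_digit[name.split("_")[0]].append(name)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for digit in sorted(by_digit):
        group = sorted(by_digit[digit])
        rng.shuffle(group)
        # Hold out at least one recording per digit.
        n_test = max(1, round(len(group) * test_fraction))
        test.extend(group[:n_test])
        train.extend(group[n_test:])
        log.info("digit %s: %d train, %d test", digit, len(group) - n_test, n_test)
    return train, test
```

Splitting within each digit keeps the class balance the same in both sets, which matters for a ten-class problem with relatively few recordings per class.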
Let's try the app in production by following these steps:
- Run the Application:
  - Execute the app.py file to launch the application.
- Access the Application:
  - Open your web browser and go to http://127.0.0.1:5000.
- Choose Model for Prediction:
  - Select the desired model for prediction from the available options:
    - CNN Model: http://127.0.0.1:5000/predict_using_cnn/
    - Conv1D Model: http://127.0.0.1:5000/predict_using_conv1d/
    - LSTM Model: http://127.0.0.1:5000/predict_using_lstm/
    - Hybrid Model: http://127.0.0.1:5000/predict_using_hybrid/
- Audio File Prediction:
  - Choose an audio file from the data/production_data folder.
- Insert Parameter:
  - Insert the selected file's name as a parameter in the prediction route URL.
- Run Prediction
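The steps above can also be scripted instead of typed into a browser. The sketch below builds a prediction URL from the routes listed above. It assumes the file name is appended as the final path segment of the route, which this README does not confirm; the helper names and the sample file name are hypothetical.

```python
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:5000"

# Routes as listed in this README.
ROUTES = {
    "cnn": "/predict_using_cnn/",
    "conv1d": "/predict_using_conv1d/",
    "lstm": "/predict_using_lstm/",
    "hybrid": "/predict_using_hybrid/",
}


def build_prediction_url(model, filename, base_url=BASE_URL):
    """Return the prediction URL for a model and a production audio file.

    Assumption: the file name goes directly after the route's trailing slash.
    """
    return base_url + ROUTES[model] + quote(filename)


def fetch_prediction(model, filename):
    """Request a prediction; app.py must already be running locally."""
    with urlopen(build_prediction_url(model, filename)) as response:
        return response.read().decode("utf-8")
```

For example, `build_prediction_url("cnn", "7_jackson_32.wav")` yields `http://127.0.0.1:5000/predict_using_cnn/7_jackson_32.wav`. If the application expects the name in a different position, such as a query string, only `build_prediction_url` needs to change.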
The following resources contributed to this project:
- Audio-Classification with Seth Adams: Seth Adams' repository on audio classification, whose insights and code contributed to the development of this project.
- Deep Learning (Audio) Application: From Design to Deployment: a video tutorial giving a comprehensive overview of designing and deploying deep learning applications for audio, which influenced the development of this project.
- Deep Learning for Audio Classification: a series of videos on deep learning techniques tailored to audio classification, used as a resource during the project.
Feel free to explore these references for deeper insights and guidance on audio classification, deep learning, and related topics.
Happy coding!