aammaan/Audio-Word-Classification-Using-Gaussian-Mixture-Model-GMM-

Audio Classification Project

Overview

This project classifies a dataset of 65,000 one-second audio utterances into 30 distinct words using a Gaussian Mixture Model (GMM). The task follows competition guidelines and relies on classical machine learning techniques. Python scripts preprocess the data, train the GMM, and classify the test audio files.

Features

  • Audio Classification: Uses a Gaussian Mixture Model (GMM) to assign each of the 65,000 one-second utterances to one of 30 words.
  • Data Processing: Preprocesses the data and builds the model with classical machine learning techniques.
  • Python Scripts: Python scripts cover the workflow from data preparation and training through classification of the test audio files.

System Architecture

Model Training and Evaluation

  • Gaussian Mixture Model (GMM): Chosen because it is a classical technique well suited to modeling the distribution of audio features.
  • Feature Extraction: Uses Mel-frequency cepstral coefficients (MFCCs) and their derivatives (delta features) to represent each utterance.
  • Data Handling: The provided Python scripts contain the detailed steps for data preprocessing, feature extraction, and model training.
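The feature pipeline described above (MFCCs, their derivatives, and normalization) can be sketched roughly as follows. This is a minimal illustration, not the code from the repository: it assumes the MFCC matrix has already been computed (for example with a library such as librosa) and has shape `(n_frames, n_coeffs)`. The function names `delta`, `normalize`, and `build_feature_matrix`, the regression width of 2, the use of second-order deltas, and per-utterance mean/variance normalization are all assumptions made for the example.

```python
import numpy as np


def delta(features: np.ndarray, width: int = 2) -> np.ndarray:
    """Regression-based time derivative of a (n_frames, n_coeffs) matrix."""
    n_frames = features.shape[0]
    # Repeat the edge frames so every frame has `width` neighbours on each side.
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    denom = 2 * sum(k * k for k in range(1, width + 1))
    out = np.zeros_like(features, dtype=float)
    for k in range(1, width + 1):
        ahead = padded[width + k : width + k + n_frames]    # frames t + k
        behind = padded[width - k : width - k + n_frames]   # frames t - k
        out += k * (ahead - behind)
    return out / denom


def normalize(features: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Per-utterance mean/variance normalization of each coefficient (assumed scheme)."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    return (features - mean) / (std + eps)


def build_feature_matrix(mfcc: np.ndarray) -> np.ndarray:
    """Stack MFCCs with first- and second-order deltas, then normalize."""
    d1 = delta(mfcc)
    d2 = delta(d1)
    return normalize(np.hstack([mfcc, d1, d2]))


# Synthetic stand-in for the MFCCs of one utterance: 50 frames, 13 coefficients.
rng = np.random.default_rng(0)
mfcc = rng.normal(size=(50, 13))
features = build_feature_matrix(mfcc)
```

With 13 input coefficients, `features` has 39 columns per frame. The actual number of coefficients and the normalization scheme used in the scripts may differ.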

Code Structure

  • Kaggle_2.py: Prepares the audio data (feature extraction and normalization) and trains the Gaussian Mixture Model (GMM) on the preprocessed data.
  • script.py: Classifies test audio files using the trained model.
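One common way to split the work between a training script and a classification script is to fit one GMM per word and label a test utterance with the word whose model gives the highest likelihood. The sketch below illustrates that idea with scikit-learn's `GaussianMixture`. Whether the repository uses one model per word, how many mixture components it uses, and which covariance type it uses are not stated here, so those choices are assumptions, as are the helper names `train_word_models` and `classify`, the word labels, and the synthetic data.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def train_word_models(features_by_word, n_components=4, seed=0):
    """Fit one diagonal-covariance GMM per word on that word's feature frames."""
    models = {}
    for word, frames in features_by_word.items():
        gmm = GaussianMixture(
            n_components=n_components,
            covariance_type="diag",
            random_state=seed,
        )
        models[word] = gmm.fit(frames)
    return models


def classify(models, frames):
    """Return the word whose GMM gives the highest average log-likelihood."""
    # score() returns the mean per-frame log-likelihood under a fitted model.
    scores = {word: gmm.score(frames) for word, gmm in models.items()}
    return max(scores, key=scores.get)


# Synthetic stand-in data: two hypothetical words with well-separated features.
rng = np.random.default_rng(0)
train = {
    "yes": rng.normal(-3.0, 1.0, size=(200, 6)),
    "no": rng.normal(3.0, 1.0, size=(200, 6)),
}
models = train_word_models(train, n_components=2)

test_utterance = rng.normal(3.0, 1.0, size=(40, 6))
predicted = classify(models, test_utterance)
```

In a real run, each entry of `features_by_word` would hold the stacked feature frames of all training utterances for that word, and `classify` would be called once per test file.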

Link to DATASET

Link 2
