Music Genre Classifier for the GTZAN music corpus

fferlito/Music-Genre-Classifier

Music-Genre-Classificator

Different architectures to classify music files by genre, using the GTZAN music corpus, namely:

  • Convolutional Neural Network (CNN)
  • Recurrent Neural Network (RNN)
  • Inception V3
  • MobileNet V2

(Implemented with TensorFlow)

Dataset

The GTZAN music corpus contains 10 genres with 100 songs each (1000 in total): 80% of the songs were used for training (800 songs) and 20% for testing (200 songs). After the split, each 30-second song is cut into 10-second chunks, resulting in 2400 training samples and 600 testing samples.
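The arithmetic above can be checked with a minimal sketch. It assumes a plain random split over song indices (the source does not say whether the split was stratified by genre) and a sample rate of 22050 Hz; the names `split_songs` and `split_into_chunks` are hypothetical helpers, not taken from the repository.

```python
import numpy as np

SAMPLE_RATE = 22050   # assumed sample rate, not stated in the source
CHUNK_SECONDS = 10    # chunk length given in the text


def split_songs(n_songs=1000, train_fraction=0.8, seed=0):
    """Shuffle song indices and split them into train/test before chunking."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_songs)
    n_train = int(n_songs * train_fraction)
    return order[:n_train], order[n_train:]


def split_into_chunks(signal, sample_rate=SAMPLE_RATE, chunk_seconds=CHUNK_SECONDS):
    """Cut a 1-D waveform into equal, non-overlapping chunks, dropping any remainder."""
    chunk_len = sample_rate * chunk_seconds
    n_chunks = len(signal) // chunk_len
    return [signal[i * chunk_len:(i + 1) * chunk_len] for i in range(n_chunks)]


train_ids, test_ids = split_songs()
song = np.zeros(30 * SAMPLE_RATE, dtype=np.float32)  # stand-in for one 30-second clip
chunks = split_into_chunks(song)

# 800 songs * 3 chunks and 200 songs * 3 chunks
print(len(train_ids) * len(chunks), len(test_ids) * len(chunks))  # 2400 600
```

Splitting by song before chunking keeps all chunks of one song on the same side of the split, so no song contributes to both training and testing.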

The dataset can be downloaded here: http://marsyas.info/downloads/datasets.html

Audio augmentation

To further increase the amount of data, some augmentations were applied to the audio files. For each song chunk, we applied:

  • Add light random noise to the waveform
  • Add intense random noise to the waveform
  • Randomly increase the pitch (by 2% at most)
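The three augmentations could look roughly like the sketch below. It is only an illustration under stated assumptions: the noise is Gaussian and scaled to the chunk's peak amplitude, the two noise levels (0.005 and 0.05) are made-up values, and "2%" is read as a frequency increase that is converted to semitones for `librosa.effects.pitch_shift`. The repository may implement these steps differently.

```python
import math

import numpy as np


def add_noise(signal, noise_level, rng):
    """Add Gaussian noise scaled relative to the signal's peak amplitude."""
    noise = rng.standard_normal(len(signal)).astype(signal.dtype)
    return signal + noise_level * np.max(np.abs(signal)) * noise


def percent_to_semitones(percent):
    """Convert a frequency increase in percent to semitones (12 per octave)."""
    return 12.0 * math.log2(1.0 + percent / 100.0)


def raise_pitch(signal, sample_rate, rng, max_percent=2.0):
    """Raise the pitch by a random amount up to max_percent, keeping the duration."""
    import librosa  # imported lazily so the noise helpers work without librosa

    n_steps = percent_to_semitones(rng.uniform(0.0, max_percent))
    return librosa.effects.pitch_shift(signal, sr=sample_rate, n_steps=n_steps)


rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 22050, endpoint=False)
chunk = np.sin(2 * np.pi * 440.0 * t).astype(np.float32)  # synthetic stand-in for a song chunk

light = add_noise(chunk, 0.005, rng)   # hypothetical "light" noise level
intense = add_noise(chunk, 0.05, rng)  # hypothetical "intense" noise level
```

With these three augmentations each original chunk yields three extra variants, so the training set grows fourfold if the originals are kept. A 2% frequency increase corresponds to about a third of a semitone, which is why the shift is barely audible but still changes the spectrogram.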

Audio features extracted

The audio features were extracted with the librosa library.

Example of Mel-frequency Spectrograms
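A minimal sketch of how a Mel spectrogram can be computed with librosa for one 10-second chunk. The parameter values (`n_mels=128`, `hop_length=512`, a 22050 Hz sample rate) are librosa defaults or common choices and are assumptions here; the source does not state which settings were used.

```python
import numpy as np


def expected_frames(n_samples, hop_length=512):
    """Number of frames librosa produces with its default centered framing."""
    return 1 + n_samples // hop_length


def mel_spectrogram_db(signal, sample_rate=22050, n_mels=128, hop_length=512):
    """Return a log-scaled Mel spectrogram of shape (n_mels, frames)."""
    import librosa  # imported lazily so the frame-count helper works without librosa

    mel = librosa.feature.melspectrogram(
        y=signal, sr=sample_rate, n_mels=n_mels, hop_length=hop_length
    )
    # Convert power to decibels relative to the loudest bin
    return librosa.power_to_db(mel, ref=np.max)


# A 10-second chunk at 22050 Hz has 220500 samples
print(expected_frames(10 * 22050))  # 431
```

Under these assumptions each chunk becomes a 128 x 431 array, which can be saved as an image or fed to a network directly. A waveform can be loaded with `librosa.load(path, sr=22050)`, where `path` is a placeholder for one of the GTZAN audio files.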

Results

| Model | Training accuracy | Test accuracy |
|---|---|---|
| MobileNet V2 (TL) | 77% | 77% |
| Inception V3 (TL) | 99% | 84% |
| CNN | 55% | 62% |
| RNN | 77% | 66% |
