CNN-RNN hybrid model using PyTorch for genre classification of diverse music tracks
Ashley Leal, Winnie Hsiang, Victor Deng, Coline Zhang
This project aims to implement a CNN-RNN hybrid model for music genre classification. As the volume of music available online keeps growing, efficient, automated music classification systems become increasingly valuable. Deep learning is well suited to this task: a model implemented in PyTorch can learn from pre-classified music and extract the features needed to accurately classify newly released tracks. The core goals of the project are outlined as follows:
- Collect and preprocess a large dataset of diverse music tracks, covering a wide range of genres and sub-genres (a preprocessing sketch follows this list).
- Design and implement a CNN-RNN hybrid model architecture using PyTorch, taking advantage of the strengths of both convolutional neural networks (CNNs) and recurrent neural networks (RNNs) for music classification (an architecture sketch also follows this list).
- Achieve prediction accuracy above 70% on unseen music tracks. The proposed deep learning model takes a grayscale image of an MFCC spectrogram, stored as a 2D NumPy array, as input and outputs a predicted genre for the input music track.
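As a rough illustration of the preprocessing step, the sketch below converts an audio file into a normalized 2D MFCC array using librosa. The `mfcc_image` helper name, the 40-coefficient and 30-second settings, and the [0, 1] scaling are illustrative assumptions, not decisions fixed by this proposal.

```python
import numpy as np
import librosa


def mfcc_image(path: str, n_mfcc: int = 40, sr: int = 22050,
               duration: float = 30.0) -> np.ndarray:
    """Load an audio file and return a 2D MFCC array (n_mfcc x frames)
    scaled to [0, 1] so it can be treated like a grayscale image."""
    # Load a mono clip of fixed length at a fixed sample rate (assumed values).
    y, sr = librosa.load(path, sr=sr, duration=duration, mono=True)
    # Compute the MFCC spectrogram.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    # Min-max normalize; the epsilon guards against silent clips.
    mfcc = (mfcc - mfcc.min()) / (mfcc.max() - mfcc.min() + 1e-8)
    return mfcc.astype(np.float32)
```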
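The following is a minimal PyTorch sketch of one possible CNN-RNN hybrid of the kind described above: convolutional layers summarize local patterns in the MFCC "image", a GRU models the resulting frame sequence over time, and a linear head outputs genre logits. The `CRNN` class name, the layer sizes, the 10-genre output, and the 40-coefficient input are placeholder assumptions rather than the project's final design.

```python
import torch
import torch.nn as nn


class CRNN(nn.Module):
    """CNN front end over the MFCC image, GRU over time, linear classifier."""

    def __init__(self, num_genres: int = 10, n_mfcc: int = 40):
        super().__init__()
        # Convolutional feature extractor over the (1, n_mfcc, time) input.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(2),  # halves both the MFCC and time axes
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # After two poolings the frequency axis has n_mfcc // 4 bins.
        rnn_input = 32 * (n_mfcc // 4)
        self.rnn = nn.GRU(rnn_input, 64, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * 64, num_genres)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, n_mfcc, time) grayscale MFCC image.
        feats = self.cnn(x)  # (batch, 32, n_mfcc // 4, time // 4)
        b, c, f, t = feats.shape
        # Flatten channels and frequency into one feature vector per frame.
        feats = feats.permute(0, 3, 1, 2).reshape(b, t, c * f)
        out, _ = self.rnn(feats)       # (batch, time // 4, 128)
        return self.fc(out[:, -1, :])  # logits over genres


# Example: a batch of 8 clips, 40 MFCC coefficients x 256 frames.
model = CRNN(num_genres=10, n_mfcc=40)
logits = model(torch.randn(8, 1, 40, 256))
print(logits.shape)  # torch.Size([8, 10])
```

Training such a model against the 70% accuracy target would pair these logits with `nn.CrossEntropyLoss` and integer genre labels; that loop is omitted here.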