Check out the fully deployed project on our website. You can also find a more thorough walk-through of our process on our summary page.
- Jasmeet Bajwa: GitHub | LinkedIn
- Erendiz Tarakci: GitHub | LinkedIn
- Kay Royo: GitHub | LinkedIn
- Niama Bagaga: GitHub | LinkedIn
Our team was curious about the relationship between song genre, song lyrics, and sound features. We set out to predict a song's genre from its lyrics and audio data using several machine learning models. The datasets used for this project were obtained from www.spotify.com and www.azlyrics.com. The initial dataset gathered from the Spotify API contains 114,832 songs from 3,132 artists and 111 song genres, while the AZ Lyrics dataset contains 147,872 songs from 6,464 artists and 111 song genres.
We had two main sources of data for this project: a lyric dataset from AZ Lyrics and a genre and audio dataset from the Spotify API. Below you can explore how each dataset was cleaned and eventually merged for the NLP model.
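As a rough illustration of the merge step, here is a minimal sketch in plain Python. It assumes the two datasets are joined on a normalized artist and song title pair; the field names (`artist`, `title`, `lyrics`, `genre`, `tempo`), the `normalize` and `merge_on_artist_title` helpers, and the example rows are all hypothetical and not taken from our notebooks.

```python
import re

def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace so names match across sources."""
    text = re.sub(r"[^a-z0-9 ]", "", text.lower())
    return " ".join(text.split())

def merge_on_artist_title(lyrics_rows, audio_rows):
    """Inner-join two lists of dicts on the normalized (artist, title) pair."""
    # Index the audio rows by their normalized key for fast lookup.
    audio_by_key = {
        (normalize(r["artist"]), normalize(r["title"])): r for r in audio_rows
    }
    merged = []
    for row in lyrics_rows:
        key = (normalize(row["artist"]), normalize(row["title"]))
        match = audio_by_key.get(key)
        if match is not None:
            # Keep the audio/genre fields and add the lyrics fields on top.
            merged.append({**match, **row})
    return merged

# Made-up rows for illustration only (not real data from either source).
lyrics_rows = [
    {"artist": "Example Artist", "title": "First Song!", "lyrics": "la la la"},
    {"artist": "Other Artist", "title": "Unmatched Song", "lyrics": "na na na"},
]
audio_rows = [
    {"artist": "example artist", "title": "First Song", "genre": "pop", "tempo": 120.0},
]

merged = merge_on_artist_title(lyrics_rows, audio_rows)
print(merged)
```

Only songs found in both sources survive an inner join like this, which is one reason a merged dataset can end up smaller than either input.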
Below you can explore how we trained our two models, one on lyrics and one on audio features:
- A Natural Language Processing (NLP) model that predicts genre from the lyrics. Accuracy: 35%
- A K-Nearest Neighbors (KNN) machine learning model that predicts genre from the audio features. Accuracy: 55%
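To give a feel for how a KNN classifier assigns a genre, here is a minimal sketch in plain Python. It is not our training code: the two features (assumed here to be danceability and energy, both on a 0 to 1 scale), the `knn_predict` helper, the value of `k`, and the toy data are all assumptions for illustration. In practice a library implementation such as scikit-learn's `KNeighborsClassifier` would typically be used instead.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points closest to `query`."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    top_labels = [label for _, label in distances[:k]]
    return Counter(top_labels).most_common(1)[0][0]

# Toy feature vectors (danceability, energy); the values are made up.
train_X = [
    (0.80, 0.70), (0.75, 0.65), (0.85, 0.80),  # labelled "pop"
    (0.30, 0.20), (0.25, 0.30), (0.35, 0.25),  # labelled "classical"
]
train_y = ["pop", "pop", "pop", "classical", "classical", "classical"]

prediction = knn_predict(train_X, train_y, query=(0.78, 0.72), k=3)
print(prediction)
```

Because KNN relies on distances, features on very different scales (for example tempo in beats per minute next to values between 0 and 1) would usually need to be rescaled before training.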
You can test the NLP model yourself by pasting lyrics into the textbox on our prediction page. We also have a quiz under development to predict genre based on audio features that you can preview on our quiz page.