from kitchen import coffee
Lots of people are creative enough to write awesome lyrics, but they lack the musical knowledge to create songs with them. What if we had a program to generate a good song depending on the text input? That's exactly what AutoDJ does.
Using Microsoft's Text Analytics API from Cognitive Services, we extract the key phrases of the text given by the user. We also check whether the text is happy or sad, using the sentiment analysis feature of the API. We compare the resulting features against a large collection of song lyrics, using the cosine similarity measure. The best few matches are then fed into a Restricted Boltzmann Machine that generates a new song from them.
First, we downloaded a large number of popular songs from YouTube (the instrumentals only) and collected their lyrics separately from different websites. We parse the MIDI and lyrics files, then index them in a JSON database. A script then calls Microsoft's Text Analytics API to get the key phrases for the user input and compares them, using the cosine similarity measure, with the key phrases of all the indexed lyrics. We take the instrumentals of the best matches (encoded as .midi) and feed them into the neural network, which generates a new .midi that sounds similar.
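The matching step can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: it assumes key phrases are compared as bag-of-words count vectors, and the names phrase_vector, cosine_similarity and best_matches are hypothetical.

```python
import math
from collections import Counter


def phrase_vector(key_phrases):
    """Turn a list of key phrases into a bag-of-words count vector."""
    words = []
    for phrase in key_phrases:
        words.extend(phrase.lower().split())
    return Counter(words)


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as matching nothing
    return dot / (norm_a * norm_b)


def best_matches(user_phrases, indexed_phrases, top_n=3):
    """Rank indexed songs by similarity to the user's key phrases.

    indexed_phrases maps a song name to its list of key phrases.
    """
    query = phrase_vector(user_phrases)
    scored = [
        (cosine_similarity(query, phrase_vector(phrases)), name)
        for name, phrases in indexed_phrases.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_n]]
```

The instrumentals of the songs returned by best_matches would then be the ones passed on to the neural network.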
It's very hard to generate new music, as this is a cutting-edge research topic. It is also very hard (and thus unreliable) to convert .mp3 into .midi. We had some trouble doing that, and even now it's not very accurate. Of course, we also had the usual issues with permissions, dependencies, UTF-8, etc., but who doesn't have those?
{
"MIDI_PATH": "PATH/TO/ALL/THE/MIDI/FILES/",
"LYRI_PATH": "PATH/TO/ALL/THE/LYRI/FILES",
"MS_CS_API_KEY": "<INSERT KEY HERE>",
"INPUT_FILE": "USERINPUTFILE.txt"
}
By default, MIDI_PATH is data/midi/ and LYRI_PATH is data/lyri/.
The MS_CS_API_KEY is the 32-character string of letters and numbers provided by the Microsoft API. This key can be found on the My Account page.
The default input file for the user input (transferred using PHP from the main website) is input1.txt.
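As a rough sketch of how such a configuration could be read, the snippet below fills in the defaults mentioned above and checks the shape of the key. The file name config.json and the function names are assumptions made for illustration; the project may load its settings differently.

```python
import json

# Defaults taken from the description above; the API key has no default.
DEFAULTS = {
    "MIDI_PATH": "data/midi/",
    "LYRI_PATH": "data/lyri/",
    "INPUT_FILE": "input1.txt",
}


def parse_config(text):
    """Parse the JSON config text and fill in the documented defaults."""
    config = dict(DEFAULTS)
    config.update(json.loads(text))
    key = config.get("MS_CS_API_KEY", "")
    # The key is described as 32 letters and numbers, so reject anything else early.
    if len(key) != 32 or not key.isalnum():
        raise ValueError("MS_CS_API_KEY must be a 32-character alphanumeric string")
    return config


def load_config(path="config.json"):  # hypothetical file name
    """Read the config file from disk and validate it."""
    with open(path, encoding="utf-8") as handle:
        return parse_config(handle.read())
```

Failing early on a placeholder key such as <INSERT KEY HERE> avoids a less obvious error later, when the API rejects the request.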
The indexer.py script, in conjunction with makeuplink.py, creates INDEX.json, which stores all the lyric names, artists and the sentiment scores provided by the Microsoft Cognitive Services Text Analytics API. The indexer script can be run with one of the following arguments:
python indexer.py midi goes through the specified MIDI_PATH and normalizes all file names.
python indexer.py lyri goes through the specified LYRI_PATH and normalizes all file names, then removes any whitespace and empty lines from inside the lyrics files.
python indexer.py index calls the indexer in production mode: it reads all training data, calls the Microsoft API, and indexes each lyrics filename with its sentiment score and the corresponding midi file.
All indexed data is stored in JSON format, representing Python dictionary objects.
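The normalization and cleanup steps might look something like the sketch below. The exact rules used by indexer.py are not stated here, so the lower-case, hyphen-separated naming scheme is an assumption (inferred from file names like awesome-song-by-mozzart-2016.lyri), and both function names are hypothetical.

```python
import re


def normalize_name(filename):
    """Lower-case a file name and join its words with hyphens, keeping the extension."""
    stem, dot, ext = filename.rpartition(".")
    if not dot:  # no extension at all
        stem, ext = filename, ""
    # Assumed rule: every run of non-alphanumeric characters becomes one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", stem.lower()).strip("-")
    return slug + ("." + ext.lower() if ext else "")


def clean_lyrics(text):
    """Strip surrounding whitespace from each line and drop empty lines."""
    lines = (line.strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)
```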
The structure of the JSON INDEX is as follows:
{
"score": 0.4,
"hash": "4b3510047885e8d8a5faff9ce821ee234d5cfd8680aae44ba40e5f749637f8cf",
"filename": "awesome-song-by-mozzart-2016.lyri",
"sentiment": "sad",
"midi": "/SOMEPATH/oxfordhack-2016/data/midi/awesome-song-by-mozzart-2016.midi"
}
The score is the sentiment value given by the Microsoft Text Analytics API, from which we derive sad or hpy. The hash is the SHA256 hash of the content of the .lyri file. The filename points to the origin of the lyrics, while midi is the location of the MIDI instrumental.
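A single INDEX entry with the fields described above could be assembled along these lines. This is only a sketch: the 0.5 cut-off between sad and hpy is an assumption (the actual threshold is not given), and make_index_entry and sentiment_label are hypothetical names.

```python
import hashlib


def sentiment_label(score, threshold=0.5):
    """Map a 0..1 sentiment score to an index label; the 0.5 cut-off is an assumption."""
    return "hpy" if score >= threshold else "sad"


def make_index_entry(lyrics_filename, lyrics_bytes, score, midi_path):
    """Build one INDEX entry with the same fields as the structure above."""
    return {
        "score": score,
        # SHA256 over the raw content of the .lyri file
        "hash": hashlib.sha256(lyrics_bytes).hexdigest(),
        "filename": lyrics_filename,
        "sentiment": sentiment_label(score),
        "midi": midi_path,
    }
```

Because the hash depends only on the file content, two lyrics files with identical text produce the same hash even if their names differ, which makes duplicates easy to spot.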
The Microsoft Cognitive Services Text Analytics API is used by sending text data (lyrics) to the Microsoft API and receiving a response containing key phrases (obtained using Natural Language Processing), the language, and a sentiment score (a number between 0 and 1). Scores close to 1 indicate positive sentiment and scores close to 0 indicate negative sentiment. The sentiment score is generated using classification techniques.
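The request and response handling can be sketched as below. Treat every detail as an assumption rather than the project's code: the endpoint URL (including its region prefix), the v2.0-style JSON shapes and the function names are illustrative and should be checked against the API documentation for your subscription.

```python
import json
import urllib.request

# Assumed v2.0-style endpoint; the region prefix depends on the subscription.
ENDPOINT = "https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment"


def build_sentiment_request(lyrics, api_key, doc_id="1", language="en"):
    """Prepare (but do not send) a POST request carrying one lyrics document."""
    body = {"documents": [{"id": doc_id, "language": language, "text": lyrics}]}
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": api_key,
        },
        method="POST",
    )


def parse_sentiment_response(raw_json):
    """Return {document id: score} from a response body, scores being in 0..1."""
    payload = json.loads(raw_json)
    return {doc["id"]: doc["score"] for doc in payload.get("documents", [])}
```

To actually call the service, the prepared request would be passed to urllib.request.urlopen and the bytes read from the response handed to parse_sentiment_response.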
Devpost Submission - Devpost Link
The Devpost page was created for Oxford Hack 2016, a 24-hour hackathon.
The !AFK License is a derivative of Simple Machine License
Video - Video Link
The video on YouTube contains a 5-minute demonstration and explanation of how the program works.