A Hadoop-based project that consolidates and manipulates contextual music data.
- VirtualBox / Ubuntu Linux OS
- JDK 1.7
- JRE 1.7
- Apache Hadoop 2.7.2
Register yourself as a user in the hadoop group and make sure Hadoop is running, then run the following commands from your terminal:
hadoop jar Table0.jar org.Table0 /input0 /output0
Output format : <artist_name><artist_id>.....<artist_id>............
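A minimal sketch of the grouping behind this output format, written in plain Java 7 with no Hadoop dependency. The class `ArtistIdMerger`, its methods, the {name, id} record shape and the tab delimiter are all assumptions made for illustration; the actual Table0 job may be implemented differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of the project: collects every artist ID
// seen for an artist name, similar to what a reducer does per key.
class ArtistIdMerger {

    // Each record is an {artistName, artistId} pair (assumed input shape).
    static Map<String, Set<String>> merge(List<String[]> records) {
        Map<String, Set<String>> idsByName = new LinkedHashMap<>();
        for (String[] record : records) {
            Set<String> ids = idsByName.get(record[0]);
            if (ids == null) {
                ids = new LinkedHashSet<>();   // keeps insertion order, drops duplicates
                idsByName.put(record[0], ids);
            }
            ids.add(record[1]);
        }
        return idsByName;
    }

    // Builds one output line: the name followed by all of its IDs.
    // The tab delimiter is an assumption, not the project's documented format.
    static String formatLine(String name, Set<String> ids) {
        StringBuilder line = new StringBuilder(name);
        for (String id : ids) {
            line.append('\t').append(id);
        }
        return line.toString();
    }

    public static void main(String[] args) {
        List<String[]> records = new ArrayList<>();
        records.add(new String[] {"Artist A", "ID1"});
        records.add(new String[] {"Artist B", "ID2"});
        records.add(new String[] {"Artist A", "ID3"});
        for (Map.Entry<String, Set<String>> e : merge(records).entrySet()) {
            System.out.println(formatLine(e.getKey(), e.getValue()));
        }
    }
}
```

In a real MapReduce job the framework does the grouping by key, so the reducer would only need the inner loop that appends each ID.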
hadoop jar Table1.jar org.Table1 /output0 /output1
Output format : <artist_name1><artist_name2>................
hadoop jar Table2.jar org.Table2 /output1 /output0 /output2
Arguments, in order: Task1 output, Task0 output, Task2 output
Output format : <artist_id>.....<artist_id>............
- There can be multiple locations, artist IDs, and song titles available, so all of them are included in the Task0 merge.
- Task2 takes three command-line arguments: args[0] is Task1's output, args[1] is Task0's output, and args[2] is where Task2's output is written.
- Locations and artist names are captured with regular expressions.
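
A rough illustration of regex-based capture, using only `java.util.regex` from the JDK. This README does not show the input layout, so the sample line format (`artist_name="..." location="..."`), both patterns, and the `FieldExtractor` class are invented for this example and are not the project's actual expressions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the project.
class FieldExtractor {

    // Assumed record layout: artist_name="..." location="..."
    private static final Pattern ARTIST = Pattern.compile("artist_name=\"([^\"]*)\"");
    private static final Pattern LOCATION = Pattern.compile("location=\"([^\"]*)\"");

    // Returns the first capture group of the first match, or null if there is no match.
    static String firstGroup(Pattern pattern, String line) {
        Matcher m = pattern.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    static String artistName(String line) {
        return firstGroup(ARTIST, line);
    }

    static String location(String line) {
        return firstGroup(LOCATION, line);
    }

    public static void main(String[] args) {
        String line = "artist_name=\"Example Artist\" location=\"Example City\"";
        System.out.println(artistName(line));
        System.out.println(location(line));
    }
}
```

Compiling each `Pattern` once as a static field avoids recompiling it for every input line, which matters when a mapper processes many records.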