The project aims to develop a real-time Twitter feed that is filtered by the Top-N hashtags (N=10 in my case).
The project uses Apache Storm and the Twitter API to design a Storm topology and implement a streaming join that dynamically calculates the Top-N hashtags and displays real-time tweets containing those trending hashtags. The topology starts with Twitter spouts (called tweet spouts), which connect to the Parse Tweet Bolt using shuffle grouping. The Parse Tweet Bolt connects to the Count Bolt using fields grouping, and the Count Bolt connects to an open-source bolt called the Intermediate Rankings Bolt, also via fields grouping. The Intermediate Rankings Bolt connects to the Total Rankings Bolt by global grouping, which in turn connects via global grouping to the Report Bolt. The Report Bolt is connected to Redis and a Flask micro-server, which serves a word-cloud visualization written in D3.js. Flask provides a handy way of running the visualization on streams of data at runtime, while Redis provides an efficient in-memory key-value store.
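The data flow described above can be illustrated without Storm. The following is a minimal plain-Python sketch, not the repository's code: the function names (`parse_tweet`, `fields_partition`, `top_n_hashtags`, `filter_tweets`), the hashtag regex, and the number of counting tasks are all assumptions made for illustration. It mimics the parse step, the fields grouping into per-task counters, the per-task intermediate rankings, and the final merge into total rankings.

```python
import re
from collections import Counter

# Assumed hashtag pattern: '#' followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def parse_tweet(text):
    """Parse step: extract lowercase hashtags from one tweet."""
    return [tag.lower() for tag in HASHTAG_RE.findall(text)]


def fields_partition(tag, num_tasks):
    """Stand-in for fields grouping: a given hashtag always maps to the same task."""
    # A byte sum is used instead of hash() so the mapping is stable across runs.
    return sum(tag.encode("utf-8")) % num_tasks


def top_n_hashtags(tweets, n=10, num_tasks=3):
    """Count hashtags per task, rank locally, then merge into one global ranking."""
    counters = [Counter() for _ in range(num_tasks)]
    for text in tweets:
        for tag in parse_tweet(text):
            counters[fields_partition(tag, num_tasks)][tag] += 1

    # Intermediate rankings: each task reports only its local top n.
    intermediate = [counter.most_common(n) for counter in counters]

    # Total rankings: merge all local rankings in one place (global grouping).
    merged = [pair for ranking in intermediate for pair in ranking]
    merged.sort(key=lambda pair: (-pair[1], pair[0]))
    return merged[:n]


def filter_tweets(tweets, top_tags):
    """Keep only tweets that contain at least one of the top hashtags."""
    wanted = {tag for tag, _ in top_tags}
    return [text for text in tweets if wanted.intersection(parse_tweet(text))]
```

Because each hashtag is counted by exactly one task, merging the local top-n lists yields the exact global top n. For example, `top_n_hashtags(["#Storm is fast", "learning #storm and #redis"], n=1)` returns `[("storm", 2)]`.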
To download the code:
$ git clone https://github.com/AadityaJ/Real_time_Storm
Feel free to fork the code. To send a pull request, please contact:
- Aaditya Jamuar (@AadityaJ)