My exploration of Twitter Storm
Wow, three build trackers (just because I can):
- Storm author Nathan Marz's GitHub repo
- My tutorial on setting up Storm in Eclipse with Maven, Git and GitHub
- My reminder note about integrating a GitHub repo with Travis-CI
- My completely unnecessary guide for setting up a GitHub project build on Drone.io
Storm is a distributed and fault-tolerant realtime computation platform that offers stream processing, continuous computation, distributed RPC, and more.
Storm is a free and open source distributed realtime computation system. Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing. Storm is simple, can be used with any programming language, and is a lot of fun to use!
Storm has many use cases: realtime analytics, online machine learning, continuous computation, distributed RPC, ETL, and more. Storm is fast: a benchmark clocked it at over a million tuples processed per second per node. It is scalable, fault-tolerant, guarantees your data will be processed, and is easy to set up and operate.
Storm integrates with the queueing and database technologies you already use. A Storm topology consumes streams of data and processes those streams in arbitrarily complex ways, repartitioning the streams between each stage of the computation however needed. Read more in the tutorial.
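To make the idea of a topology a bit more concrete, here is a minimal sketch in plain Python. It does not use Storm's API at all; it only imitates the shape of a word-count topology in a single process. The names `sentence_spout`, `split_bolt`, `fields_grouping` and `run_topology` are hypothetical, chosen for illustration, and the two sample sentences stand in for what would be an unbounded stream. The part worth noticing is the repartitioning step between stages: every tuple carrying the same word is routed to the same counting task, which is the idea behind a fields grouping.

```python
import zlib
from collections import Counter


def sentence_spout():
    """Hypothetical source stage: yields a small, finite batch of sentences.

    A real stream would be unbounded; two sentences keep the sketch testable.
    """
    yield from [
        "the cow jumped over the moon",
        "the man went to the store",
    ]


def split_bolt(sentence):
    """First processing stage: split one sentence tuple into word tuples."""
    return sentence.split()


def fields_grouping(word, num_tasks):
    """Repartitioning step: pick a task index from the word itself.

    crc32 is used (an assumption of this sketch) because it is stable across
    runs, so the same word always maps to the same task.
    """
    return zlib.crc32(word.encode("utf-8")) % num_tasks


def run_topology(num_count_tasks=3):
    """Wire the stages together and return one Counter per counting task."""
    counters = [Counter() for _ in range(num_count_tasks)]
    for sentence in sentence_spout():
        for word in split_bolt(sentence):
            task = fields_grouping(word, num_count_tasks)
            counters[task][word] += 1
    return counters


if __name__ == "__main__":
    for task_id, counts in enumerate(run_topology()):
        print(task_id, dict(counts))
```

Because each word always lands on the same task, every task can keep its own local count without coordinating with the others. In a real Storm cluster the stages would run in parallel across machines rather than in one loop.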