LearnStormCrawler

This project was generated by the StormCrawler Maven Archetype as a starting point for building your own crawler. Have a look at the code and resources and modify them to your heart's content.

To run the demo CrawlTopology in local mode, without Storm installed:

mvn clean compile exec:java -Dexec.mainClass=net.pic.crawler.CrawlTopology -Dexec.args="-conf crawler-conf.yaml -local"

With Storm installed, you can generate an uberjar:

mvn clean package

and then submit the topology using the storm command:

storm jar target/stormcrawler-1.0-SNAPSHOT.jar net.pic.crawler.CrawlTopology -conf crawler-conf.yaml -local

to run in local mode. Simply remove the '-local' to run the topology in distributed mode.

You can also use Flux to do the same:

storm jar target/stormcrawler-1.0-SNAPSHOT.jar org.apache.storm.flux.Flux --local crawler.flux
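For the non-Flux submission, the only difference between local and distributed mode is the trailing '-local' flag. The sketch below makes that explicit. It is an illustration rather than part of the generated project: the function name `storm_command` and the variable names are hypothetical, while the jar path, main class, and configuration file are copied from the commands in this README. The function only prints the command and does not run it.

```shell
#!/bin/sh
# Hypothetical helper (not part of the archetype): prints the storm command
# for the requested mode. Values below are taken from this README.
JAR=target/stormcrawler-1.0-SNAPSHOT.jar
MAIN_CLASS=net.pic.crawler.CrawlTopology
CONF=crawler-conf.yaml

storm_command() {
  mode=$1
  case "$mode" in
    # Local mode: keep the -local flag.
    local)       echo "storm jar $JAR $MAIN_CLASS -conf $CONF -local" ;;
    # Distributed mode: same command without -local.
    distributed) echo "storm jar $JAR $MAIN_CLASS -conf $CONF" ;;
    # Anything else is rejected with a non-zero status.
    *)           echo "unknown mode: $mode (expected local or distributed)" >&2
                 return 1 ;;
  esac
}
```

After sourcing this snippet, `storm_command local` prints the local-mode command shown earlier and `storm_command distributed` prints the same command without '-local'.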