Simple spark template project to use as a getting started point when writing new Apache Spark projects.

eric-kimbrel/spark-word-count

spark-word-count

A simple Spark template project to use as a starting point when writing new Apache Spark projects.

To run locally, use an IDE such as IntelliJ IDEA or Eclipse to run the main method. Make sure to set parameters for an input file and an output location.
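Before a local run it can help to prepare a small input file and clear any stale output. The following is a minimal sketch, not part of the project: the helper name `prepare_local_run` and the sample text are hypothetical, and it assumes the job writes through a Hadoop-style output (such as `saveAsTextFile`), which fails when the output directory already exists.

```shell
# Hypothetical helper (not part of the project): create a small input file
# and remove any stale output directory so a local run starts clean.
prepare_local_run() {
  input="$1"
  output="$2"
  mkdir -p "$(dirname "$input")"
  # Two lines of sample text to count words in.
  printf 'to be or not to be\nthat is the question\n' > "$input"
  # Hadoop-style output (e.g. saveAsTextFile) refuses to write into an
  # existing directory, so delete leftovers from earlier runs.
  rm -rf "$output"
}

# Use a throwaway directory so nothing in the project tree is touched.
workdir="$(mktemp -d)"
prepare_local_run "$workdir/input.txt" "$workdir/output"

# These two paths are what you would enter as program arguments in the IDE.
echo "program arguments: $workdir/input.txt $workdir/output"
```

The printed paths can be pasted into the run configuration of the IDE as the input and output parameters.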

To run on a cluster

# on your machine
gradle clean dist
scp -r build/dist <location on your cluster you can launch jobs from>

# on the cluster
cd <location on your cluster you can launch jobs from>/dist
./yarn-runner hdfs://<name node>/<path to input> hdfs://<name node>/<path to output>
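
The contents of `yarn-runner` are not shown here. As a rough sketch of what such a wrapper might do, assuming it delegates to `spark-submit`, the function below only prints the command it would run. The main class `WordCount`, the jar name `spark-word-count.jar`, and the example HDFS paths are hypothetical placeholders, not taken from the project.

```shell
# Hypothetical names: the main class and jar file are assumptions,
# not taken from the project.
MAIN_CLASS="WordCount"
APP_JAR="spark-word-count.jar"

# Build (but do not run) a spark-submit command for a YARN cluster.
# $1 = input path, $2 = output path
build_submit_command() {
  if [ "$#" -ne 2 ]; then
    echo "usage: build_submit_command <input> <output>" >&2
    return 1
  fi
  echo "spark-submit --master yarn --deploy-mode cluster --class $MAIN_CLASS $APP_JAR $1 $2"
}

# Print the command so it can be reviewed before it is executed.
# The paths below are placeholders for real HDFS locations.
build_submit_command "hdfs://namenode/input" "hdfs://namenode/output"
```

Printing the command first makes it easy to check the master, deploy mode, and paths before launching a job on a shared cluster.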
