
PageRank algorithm implementation that makes use of the Apache Hadoop framework


ZNClub-PA-ML-AI/hadoop-pagerank

 
 


Hadoop PageRank

PageRank algorithm implementation that makes use of the Apache Hadoop framework.

Execute the program

  • Install Hadoop on your machine [OSX], [Linux]
  • Pick a dataset from the Stanford web graphs collection
  • Copy the dataset into your Hadoop file system (HDFS)
  • Create the directory which will contain the output
  • Build a JAR using this source code and name it pagerank.jar
  • Launch the software using Hadoop: hadoop jar pagerank.jar --input <in> --output <out>
  • Browse the PageRank results, which are written to the output directory in the Hadoop FS
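
To make the input step more concrete, here is a minimal sketch of reading an edge list of the kind the Stanford web graph datasets are usually distributed in: plain text, comment lines starting with `#`, and one whitespace-separated `FromNodeId ToNodeId` pair per line. The class name `EdgeListSketch` and the method `parse` are hypothetical and are not taken from this repository; the actual job reads its input through Hadoop and may expect a different layout.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of this repository: illustrates the assumed
// "FromNodeId<whitespace>ToNodeId" edge list layout with '#' comment lines.
final class EdgeListSketch {

    // Builds an adjacency list (source id -> list of target ids) from raw lines.
    static Map<Integer, List<Integer>> parse(List<String> lines) {
        Map<Integer, List<Integer>> adjacency = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            // Skip blank lines and header/comment lines.
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            String[] parts = trimmed.split("\\s+");
            if (parts.length < 2) {
                continue; // ignore malformed lines in this sketch
            }
            int from = Integer.parseInt(parts[0]);
            int to = Integer.parseInt(parts[1]);
            adjacency.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
            // Register the target too, so nodes without outgoing links still appear.
            adjacency.computeIfAbsent(to, k -> new ArrayList<>());
        }
        return adjacency;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
                "# FromNodeId\tToNodeId",
                "0\t1",
                "0\t2",
                "1\t2",
                "2\t0");
        System.out.println(parse(sample)); // prints {0=[1, 2], 1=[2], 2=[0]}
    }
}
```

Checking a few lines of the chosen dataset against this layout before copying it into the Hadoop FS can save a failed job run.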

Usage reference

  • --help (-h): display the help text
  • --damping (-d) : the damping factor [OPTIONAL] [DEFAULT = 0.85]
  • --count (-c) : the number of iterations [OPTIONAL] [DEFAULT = 2]
  • --input (-i) : the directory of the input graph [REQUIRED]
  • --output (-o) : the directory of the output result [REQUIRED]
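
To illustrate what `--damping` and `--count` control, here is a small in-memory sketch of the PageRank iteration. It is not the MapReduce code from this repository: the class `PageRankSketch`, the method `pageRank`, the normalized update `(1 - d) / n + d * sum`, and the even redistribution of rank from nodes without outgoing links are all assumptions made for the example, and the actual implementation may differ on each of these points.

```java
import java.util.Arrays;

// Hypothetical in-memory illustration, not the repository's Hadoop job.
final class PageRankSketch {

    // links[i] holds the ids of the nodes that node i links to.
    // damping corresponds to --damping, count to --count.
    static double[] pageRank(int[][] links, double damping, int count) {
        int n = links.length;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n); // start from a uniform distribution

        for (int iter = 0; iter < count; iter++) {
            double[] next = new double[n];
            double danglingMass = 0.0;

            // Each node splits its current rank evenly among its outgoing links.
            for (int i = 0; i < n; i++) {
                if (links[i].length == 0) {
                    danglingMass += rank[i]; // assumption: spread evenly below
                    continue;
                }
                double share = rank[i] / links[i].length;
                for (int target : links[i]) {
                    next[target] += share;
                }
            }

            // Apply the damping factor to the received rank.
            for (int i = 0; i < n; i++) {
                next[i] = (1.0 - damping) / n
                        + damping * (next[i] + danglingMass / n);
            }
            rank = next;
        }
        return rank;
    }

    public static void main(String[] args) {
        int[][] links = {{1, 2}, {2}, {0}};
        // Uses the documented defaults: damping 0.85, 2 iterations.
        System.out.println(Arrays.toString(pageRank(links, 0.85, 2)));
    }
}
```

With this normalization the ranks sum to 1 after every iteration. Two iterations, the documented default, are rarely enough for the values to settle on a large web graph, so a higher `--count` is worth trying when the ordering of pages matters.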

Google Cloud Platform

  • Installation Guide
  • Run Map Reduce Jobs
  • Cloudera Simple Tutorial


Languages

  • Java 100.0%