MPI-MapReduce-PageRank

Usage and Implementation Details

Compilation Step - make PG{implementation}
where {implementation} is one of:

Serial (Uses No MapReduce library and is Vanilla C++)
C++ (Uses MapReduce C++ Library from https://github.com/cdmh/mapreduce which has been made GCC 7.1 Compatible at https://github.com/jainvasu631/mapreduce)
MPIBase (Uses MPI MapReduce C++ Library from https://mapreduce.sandia.gov)
MPISelf (Uses our own MPI MapReduce C++ Library)

Run Algorithm - For Serial and C++ the {implementation}.o can be run directly as ./{implementation}.o Tests/{Testfile} output-{Testfile} For MPIBase and MPISelf mpirun must be used - mpirun -np {NUM_PROC} ./{implementation}.o Tests/{Testfile} output-{Testfile} where {NUM_PROC} is the number of processors.

Example
make PGC++
./C++.o Tests/erdos-20000.txt output-erdos-20000.txt make PGMPIBase
mpirun -np 4 ./MPIBase.o Tests/erdos-100000.txt output-erdos-100000.txt

The PageRank Algorithm

We will assign to each web page P a measure of its importance I(P), called the page's PageRank.

Suppose that page Pj has lj links. If one of those links is to page Pi, then Pj will pass on 1/lj of its importance to Pi. The importance ranking of Pi is then the sum of all the contributions made by pages linking to it. That is, if we denote the set of pages linking to Pi by Bi, then I(P_i)=\sum_{P_j\in B_i} \frac{I(P_j)}{l_j}

Hyperlink Matrix H := [Hij] = {if Pj in Bi then 1/lj else 0} H>=0
H := [Hij] = {if Pj -> Pi is Edge then 1/lj else 0} H>=0
H is a stochastic matrix. The sum of all entries in a column is 1/0

Importance Vector I := [I(Pi)] whose components are the PageRanks or the importance rankings of all the pages.
I = HI i.e. I is an eigenvector of H with eigenvalue 1. This is the stationary vector of H

Computing I H is very very large matrix of the order of 10e10. However most entries of H are zero.
Therefore its a sparse matrix.
I(k+1) = HI(k). Sequence of I(k) converge to I.

Problems with Convergence of I:

Dangling Node which has no out links. H isn't perfectly stochastic.
Circular Reference Problem. S isn't Primitive
Importance Sink. S is reducible

Google Matrix:

Replace all 0 columns in matrix H with 1/n to create S. S = H + A where A := [Aij] = {if Hi==0 then 1/n else 0}
Choose a parameter alpha between 0 and 1. G := alpha x S + (1-alpha) x 1/n x 1.

Choosing alpha The parameter alpha has an important role. When alpha is 1 we get G:=S and when alpha is 0 G:=(1-alpha)/n x 1.
The rate of convergence of I depends on alpha. Therefore as a compromise we use alpha = 1.

Simplifying Formula I(k+1) = alpha x H x I(k) + alpha x A x I(k) + (1-alpha)/n x 1 x I(k).
This can be further simplified. A = J/n where J = [Ji] = {1 if Corresponding Column is 0 else 0}.Then,
beta := (1/alpha-1) and now M:= (J + beta x 1)/n.
Then I(k+1) = alpha x (H x I(k) + (J + (1/alpha-1) x 1)/n x I(k)) which simplifies to. I(k+1) = alpha x (H x I(k) + M x I(k)).

Calculating M x I(k) As all rows of M are identical, we can write M = 1 x N and M x I(k) = 1 x N x I(k).
factor:= N x I(k) = (1 + beta if Corresponding column is 0 else beta) x I(k)/n Now the expression simplifies to. I(k+1) = alpha x H x I(k) + 1 x factor.

Name		Name	Last commit message	Last commit date
Latest commit History 179 Commits
C++		C++
Data_Graphs		Data_Graphs
MPI Base		MPI Base
MPI Self		MPI Self
Report		Report
Serial		Serial
Tests		Tests
.gitignore		.gitignore
Graph.h		Graph.h
Makefile		Makefile
README.md		README.md
Submission_A2.2.zip		Submission_A2.2.zip
Utility.cpp		Utility.cpp
runscript.sh		runscript.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MPI-MapReduce-PageRank

Usage and Implementation Details

The PageRank Algorithm

About

Releases

Packages

Contributors 2

Languages

jainvasu631/MPI-MapReduce-PageRank

Folders and files

Latest commit

History

Repository files navigation

MPI-MapReduce-PageRank

Usage and Implementation Details

The PageRank Algorithm

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages