MapReduce

Description

Distributed MapReduce algorithm for indexing words from a list of files.

Details

Made in Visual Studio IDE using C/C++ and MPI.

Implementation

Algorithm has a MASTER process and a few WORKERS.

The MASTER process reads the file names from folder and sends them to workers.

Indexing algorithm with two phases:

Map phase
The WORKER receives the file name, opens it and parses it. It adds the words found into a Hash table. The Hash table has 500 records and the dispersion function calculates the sum of ascii codes. It adds all the words from file into the Hash table and it saves the words with frequencyes into a temporary file. After that it tells to MASTER that it finished the processing.
Reduce phase
The MASTER adds the processed words (from workers when finished) to it's Hash table, where would be stored all the words from parsed files. The MASTER then can find or display all words and it's occurencies.

Documentation

See Documentation for mode details.

Example result

Input
Text with English words.
Output
<antiquity, {16.txt:4,25.txt:3}> <possesses, {21.txt:11,14.txt:1,5.txt:2,24.txt:5,15.txt:1,7.txt:2,25.txt:58}> <intellectionem, {21.txt:2}> <contrasedative, {12.txt:1}> <faded, {17.txt:1,12.txt:5,20.txt:2,13.txt:7,18.txt:1,10.txt:2,2.txt:1,15.txt:6,19.txt:3}> <injurious, {23.txt:1,22.txt:1,14.txt:2,16.txt:1,24.txt:3,25.txt:2}> <supposing, {23.txt:1,22.txt:1,21.txt:5,14.txt:3,5.txt:1,24.txt:8,7.txt:8,25.txt:6,19.txt:1}> <ourselves, {6.txt:8,23.txt:10,22.txt:10,17.txt:2,12.txt:3,21.txt:2,14.txt:29,5.txt:18,16.txt:4,20.txt:2,11.txt:6,1.txt:7,13.txt:1,18.txt:7,24.txt:29,3.txt:2,10.txt:23,2.txt:13,15.txt:11,7.txt:22,25.txt:132,19.txt:19}> <positions, {6.txt:2,17.txt:3,11.txt:1,1.txt:2,18.txt:2,24.txt:2,3.txt:2,2.txt:4,25.txt:2,19.txt:9}>

License

MIT License.

Name		Name	Last commit message	Last commit date
Latest commit History 24 Commits
MapReduce_MPI		MapReduce_MPI
LICENSE		LICENSE
README.md		README.md
documentation.pdf		documentation.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MapReduce

Description

Details

Implementation

Documentation

Example result

License

About

Releases 1

Languages

License

alexgrigoras/map_reduce_indexer

Folders and files

Latest commit

History

Repository files navigation

MapReduce

Description

Details

Implementation

Documentation

Example result

License

About

Topics

Resources

License

Stars

Watchers

Forks

Releases 1

Languages