Dwarf is a domain-specific language (DSL) for representative-based clustering algorithms. Its parallelizing compiler takes sequential Dwarf code and produces either distributed-memory MPI C++ code or hybrid-memory (MPI and OpenMP) C++ code.
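For readers new to the term, the following is a minimal sketch of one iteration of K-means, a typical representative-based clustering algorithm, written as plain sequential C++11. It is not Dwarf source and not the compiler's output; the names `Pt`, `sq_dist`, `nearest`, and `kmeans_step` are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical 2-D point type for illustration; Dwarf's own Point type is not shown here.
struct Pt { double x, y; };

// Squared Euclidean distance between two points.
double sq_dist(const Pt& a, const Pt& b) {
    double dx = a.x - b.x, dy = a.y - b.y;
    return dx * dx + dy * dy;
}

// Index of the representative closest to p.
std::size_t nearest(const Pt& p, const std::vector<Pt>& reps) {
    assert(!reps.empty());
    std::size_t best = 0;
    double best_d = std::numeric_limits<double>::max();
    for (std::size_t k = 0; k < reps.size(); ++k) {
        double d = sq_dist(p, reps[k]);
        if (d < best_d) { best_d = d; best = k; }
    }
    return best;
}

// One K-means iteration: assign each point to its nearest representative,
// then move each representative to the mean of its assigned points.
// Representatives with no assigned points are left unchanged.
std::vector<Pt> kmeans_step(const std::vector<Pt>& pts, const std::vector<Pt>& reps) {
    std::vector<Pt> sum(reps.size(), Pt{0.0, 0.0});
    std::vector<std::size_t> count(reps.size(), 0);
    for (const Pt& p : pts) {  // each point is processed independently of the others
        std::size_t k = nearest(p, reps);
        sum[k].x += p.x;
        sum[k].y += p.y;
        ++count[k];
    }
    std::vector<Pt> next = reps;
    for (std::size_t k = 0; k < reps.size(); ++k) {
        if (count[k] > 0) {
            next[k] = Pt{sum[k].x / count[k], sum[k].y / count[k]};
        }
    }
    return next;
}
```

The loop over points does the same independent work for every point and then combines per-representative sums and counts, which is the kind of structure that lends itself to splitting across processes and threads.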
`sh dwarf.sh <dwarf file name>.dw`
- Compiles the Dwarf source code to generate C++ code according to the Dwarf compiler flags
  - Master's C++ Code: output.cpp
  - Slave's C++ Code: outputslave.cpp
  - C++ Header File: output.h
  - Point Type Implementation C++ Code: Point.cpp
  - Point.h in the cppsrc directory
- Compiles the C++ code to generate the executables
  - master
  - slave
- Run the executables using one of the make targets
  - localserial : Sequential Execution on the local machine
  - localpar p=4 : Distributed-memory Execution by 4 processes on the local machine
  - localhybrid p=4 t=2 : Hybrid-memory Execution by 4 processes and 2 threads on the local machine
  - serial : Sequential Execution on the given host machine
  - mpircluster p=4 : Distributed-memory Execution by 4 processes on the cluster (assumes a hostlist for MPI)
  - hybridrcluster p=4 t=2 : Hybrid-memory Execution by 4 processes and 2 threads on the cluster (assumes a hostlist for MPI)
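To make the `p` and `t` parameters more concrete, here is a small sketch of how a data set of `n` points could be divided first among `p` processes and then among `t` threads within each process. Contiguous block partitioning is an assumption made for illustration; this README does not state how the generated code distributes points, and the helpers `block_range` and `hybrid_range` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Half-open range [first, second) of items owned by part `idx` when `n` items
// are split into `parts` nearly equal contiguous blocks (hypothetical helper).
// The first `n % parts` blocks receive one extra item each.
std::pair<std::size_t, std::size_t> block_range(std::size_t n, std::size_t parts,
                                                std::size_t idx) {
    assert(parts > 0 && idx < parts);
    std::size_t base = n / parts;
    std::size_t extra = n % parts;
    std::size_t begin = idx * base + (idx < extra ? idx : extra);
    std::size_t len = base + (idx < extra ? 1 : 0);
    return std::make_pair(begin, begin + len);
}

// Range handled by thread `thread` of process `rank` under a two-level split:
// first across `p` processes, then across `t` threads inside that process.
std::pair<std::size_t, std::size_t> hybrid_range(std::size_t n, std::size_t p,
                                                 std::size_t rank, std::size_t t,
                                                 std::size_t thread) {
    std::pair<std::size_t, std::size_t> proc = block_range(n, p, rank);
    std::pair<std::size_t, std::size_t> local =
        block_range(proc.second - proc.first, t, thread);
    return std::make_pair(proc.first + local.first, proc.first + local.second);
}
```

Under this assumed scheme, with `p=4` and `t=2` every point belongs to exactly one of the 8 (process, thread) pairs, and block sizes differ by at most one point at each level.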
- Java 1.8
- C++11
- Weka Java API : weka.jar available in the dependencies directory.
- Dwarf Compiler Flag File : config.txt available in the dependencies directory.
The compiler generates code in the cppsrc directory, which already contains some generated code.
| Platform \ Algorithm | K-means Algorithm | EM Algorithm |
|---|---|---|
| Serial C++ Code | KDSeq | EDSeq |
| Distributed-memory Parallel MPI C++ Code | KDDis | EDDis |
| Hybrid-memory Parallel MPI OpenMP C++ Code | KDHyb | EDHyb |
The automatically parallelized code generated by the Dwarf compiler is compared with the following manual implementations of representative-based clustering algorithms.
- K-means Algorithm
  - KMSeq (Manual C): Prof. Wei-keng Liao's Repository
  - KMDis (Manual MPI C): Prof. Wei-keng Liao's Repository
  - KMHyb (Manual MPI C): KMHyb
  - KSJav (Manual Spark Java): AMP Camp Two - Big Data Bootcamp Strata 2013
  - KSScl (Manual Spark Scala): AMP Camp Two - Big Data Bootcamp Strata 2013
- EM Algorithm