Simple project that implements a graph using CPP, and a greedy algorithm to approximate the location of K centroids in the graph.
The algorithm attempts to minimize the maximum distance to the set 'centers' for each node in the graph.
To run the algorithm simply clone the repository to your machine and run 'make', './main'.
(proof by Gonzalez[Theoretical
Computer Science])
We will denote S* as the optimal solution. (A set with K centroids)
We will denote S as the solution chosen by the greedy algorithm.
For a vertex x and a group A we will denote the minimum distance between x and one of the vertices in A as r* = d(x, S*).
r = d(x, S).
Let us observe two arbitrary vertices that share the same centriod, i and j and they share s. The distance between them is less then 2r*:
$$ d(i,j) <= d(i, s) + d(s, j) <= r* + r* = 2r* $$
Consider the first iteration of the greedy algorithm where it chosen a vertex i even though it has already chosen a vertex i' s.t in the optimal solution they share the same cluster.
Up until that point, suppose each iteration chosen a vertex that is in another cluster.
So, for every vertex j, the following argument holds:
$$ d(j, S) <= d(i, i') $$
Otherwise j would have been added before i' (greedy choice)
Recall that for each two vertices i, j that share the same centroids d(i, j) <= 2r*. Implying that d(i, i') <= 2r*.
Therefore, for all the vertices added after i' the distance between the new vertex and the closest cluster is also bounded by 2r*.
We can conclude that: $$ r* <= r <= 2r* $$