This repository contains examples showing how to use the code in the caponetto/bayesian-hierarchical-clustering repository.
-
Create an Anaconda environment from the environment.yml file.
$ conda env create -f environment.yml
-
Activate the environment after the installation is completed.
$ conda activate bayesian-hierarchical-clustering-examples
-
Run the file example.py.
$ python example.py
-
Check out the output images in the results folder.
Note: You can optionally supply your own data in the file data.csv, but the hyperparameters must then be optimized for that data.
A plot of the input data (2D). Suppose we want to find two clusters of data (orange and blue).
Dendrograms obtained from linkage algorithms. Notice that none of them reveals the presence of the two clusters.
A binary hierarchy obtained from the Bayesian hierarchical clustering algorithm. Notice that two clusters have been identified, each containing the expected data points (leaves).
A non-binary hierarchy obtained from the Bayesian rose trees algorithm. Notice that two clusters have been identified, each containing the expected data points (leaves).
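For context on the linkage dendrograms mentioned above: a linkage algorithm builds its hierarchy by repeatedly merging the two closest clusters and recording the distance at which they merge. It returns the full merge history and leaves the choice of where to cut the tree to the reader. The following is a minimal single-linkage sketch in plain Python. The function name `single_linkage` and the toy points are hypothetical and are not taken from this repository, whose example script may well rely on a library implementation instead.

```python
from itertools import combinations
from math import dist


def single_linkage(points):
    """Return the merge history of single-linkage agglomerative clustering.

    Each entry is (members_a, members_b, distance), listed in merge order.
    Hypothetical helper for illustration only; not part of this repository.
    """
    # Start with every point in its own cluster (clusters hold point indices).
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters whose closest members are nearest.
        best = None
        for a, b in combinations(clusters, 2):
            d = min(dist(points[i], points[j]) for i in a for j in b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        # Replace the two clusters with their union and record the merge.
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
        merges.append((a, b, d))
    return merges


# Assumed toy data: two tight pairs of points, far apart from each other.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
merges = single_linkage(points)
for members_a, members_b, distance in merges:
    print(members_a, members_b, distance)
```

The output is only a list of merges and distances. Deciding that the data contains two clusters still requires picking a cut height by eye, which is the step the Bayesian approaches in this example aim to make principled.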
All contributions are welcome, so don't hesitate to submit a pull request. ;-)
If you run into issues with graphviz when running the example, you might need to install it system-wide. On Debian or Ubuntu:
$ sudo apt-get install graphviz
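To check whether Graphviz is already installed before running the example, a small sketch like the following may help. It assumes the failure comes from the Graphviz `dot` executable missing from PATH, which is a common cause but is not confirmed by this repository. The function name `graphviz_available` is hypothetical.

```python
import shutil


def graphviz_available(executable="dot"):
    """Return True if the given Graphviz executable is found on PATH.

    Assumption: the example renders its trees through the `dot` binary.
    """
    return shutil.which(executable) is not None


if not graphviz_available():
    print("Graphviz 'dot' was not found on PATH; install graphviz first.")
```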
This code is released under the GPL 3.0 License.
Check the LICENSE file for more information.