Use unsupervised machine learning, PCA algorithm, and K-Means clustering to analyze and classify a database of cryptocurrencies.

dw251414/Cryptocurrencies

Cryptocurrencies

Analysis Overview

The purpose of this project is to use unsupervised machine learning to analyze a database of cryptocurrencies and produce a report that groups the traded cryptocurrencies according to their features. In practice, an investment bank could use this classification report to propose a new cryptocurrency investment portfolio to its clients.

Methods for the analysis:


  • preprocessing the database
  • reducing the data dimension using Principal Component Analysis
  • clustering cryptocurrencies using K-Means
  • visualizing classification results with 2D plots

Results

After preprocessing and cleaning, we have a total of 532 tradable cryptocurrencies.
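
The cleaning rules themselves are not listed here, so the following is only a minimal sketch of what such a step might look like. The `CoinName` and `IsTrading` fields, the sample rows, and the three filter rules are all assumptions made for illustration; only `TotalCoinMined` and `TotalCoinSupply` are names that appear in this report.

```python
# Hypothetical records: CoinName, IsTrading and every value below are made up.
raw_rows = [
    {"CoinName": "CoinA", "IsTrading": True, "TotalCoinMined": 41.0, "TotalCoinSupply": 42.0},
    {"CoinName": "CoinB", "IsTrading": False, "TotalCoinMined": 10.0, "TotalCoinSupply": 20.0},
    {"CoinName": "CoinC", "IsTrading": True, "TotalCoinMined": None, "TotalCoinSupply": 50.0},
    {"CoinName": "CoinD", "IsTrading": True, "TotalCoinMined": 0.0, "TotalCoinSupply": 100.0},
    {"CoinName": "CoinE", "IsTrading": True, "TotalCoinMined": 5.0, "TotalCoinSupply": 8.0},
]


def clean(rows):
    """Keep rows that are trading, have no missing values, and have mined coins.

    These three rules are assumed for the sketch, not taken from the project.
    """
    kept = []
    for row in rows:
        if not row["IsTrading"]:
            continue  # not tradable
        if any(value is None for value in row.values()):
            continue  # missing data
        if row["TotalCoinMined"] <= 0:
            continue  # nothing mined yet
        kept.append(row)
    return kept


tradable = clean(raw_rows)
print(len(tradable), [row["CoinName"] for row in tradable])
```

On the real database, a filter of this kind is what would bring the row count down to the 532 tradable cryptocurrencies reported above.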

Clustering Cryptocurrencies using K-Means - Elbow Curve

We used unsupervised machine learning to identify clusters of cryptocurrencies. The elbow curve below was generated by running K-Means for k values from 1 to 10.


Screen Shot 2021-08-28 at 11 33 45 PM

The best k value appears to be 4, so we group the cryptocurrencies into 4 clusters.
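
The elbow computation can be sketched as follows. This is a from-scratch illustration in plain Python, not the project's code: the synthetic points, the `kmeans_inertia` helper, and the fixed seeds are assumptions. In practice a library would normally be used; scikit-learn's `KMeans`, for example, exposes the same quantity as `inertia_`.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_inertia(points, k, iterations=50, seed=0):
    """Run a basic Lloyd's K-Means and return the final inertia.

    Inertia is the sum of squared distances from each point to its
    nearest cluster center.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise centers from the data
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[i])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return sum(min(squared_distance(p, c) for c in centers) for p in points)


# Synthetic data: four well-separated blobs (made up for illustration).
rng = random.Random(42)
blob_centers = [(0, 0), (10, 0), (0, 10), (10, 10)]
points = [
    (cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
    for cx, cy in blob_centers
    for _ in range(15)
]

# Inertia for k = 1..10; plotting these values against k gives the elbow curve.
inertias = {k: kmeans_inertia(points, k) for k in range(1, 11)}
for k, inertia in inertias.items():
    print(k, round(inertia, 2))
```

The "elbow" is the k after which inertia stops dropping sharply. Because this sketch uses a single random initialisation per k, individual values can land in a local minimum; library implementations usually rerun with several initialisations.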

Visualizing Cryptocurrencies Results

2D-Scatter plot with clusters


Screen Shot 2021-08-28 at 11 34 37 PM

This 2D scatter plot was obtained by using the PCA algorithm to reduce the cryptocurrency features to two principal components. It shows the distribution of the cryptocurrencies across the four clusters. Among other variation, we can identify outliers such as the single cryptocurrency in class #2.
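
To make the reduction step concrete, here is a minimal from-scratch sketch of projecting data onto its first two principal components, using power iteration with deflation on the covariance matrix. The feature rows and all helper names are hypothetical, and this is not the project's implementation; a library routine such as scikit-learn's `PCA(n_components=2)` would be the usual choice.

```python
import math


def mean_center(rows):
    """Subtract the column mean from every feature."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of mean-centered rows."""
    n, d = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def top_eigenvector(matrix, iterations=500):
    """Power iteration: return the dominant eigenvalue and eigenvector."""
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[i] * sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    return eigenvalue, v


def pca_2d(rows):
    """Project rows onto their first two principal components."""
    centered = mean_center(rows)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for _ in range(2):
        value, vec = top_eigenvector(cov)
        components.append(vec)
        # Deflation: remove the component just found before finding the next.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(d)] for i in range(d)]
    return [[sum(x * c for x, c in zip(row, comp)) for comp in components] for row in centered]


# Hypothetical feature rows with three features each (values are made up).
features = [
    [2.0, 1.0, 0.5],
    [4.0, 2.1, 0.4],
    [6.0, 2.9, 0.7],
    [8.0, 4.2, 0.3],
    [10.0, 5.0, 0.6],
]
projected = pca_2d(features)
for pc1, pc2 in projected:
    print(round(pc1, 3), round(pc2, 3))
```

Plotting the two resulting columns against each other, with each point colored by its K-Means class, gives a scatter plot of the kind shown above.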

Tradable Cryptocurrencies Table


Screen Shot 2021-08-28 at 11 15 06 PM

Most of the cryptocurrencies belong to classes #0 and #1. The snapshot above shows that BitTorrent is the only cryptocurrency in class #2.
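
Class sizes like these can be checked by counting the labels. The sketch below uses made-up (name, class) pairs; only BitTorrent's membership in class #2 comes from the table above, and the other names and assignments are placeholders.

```python
from collections import Counter

# Hypothetical labelled coins; every pair except BitTorrent's is invented.
labelled = [
    ("CoinA", 0),
    ("CoinB", 0),
    ("CoinC", 0),
    ("CoinD", 1),
    ("CoinE", 1),
    ("CoinF", 3),
    ("CoinG", 3),
    ("BitTorrent", 2),
]

# Number of coins per class.
class_sizes = Counter(label for _, label in labelled)

# Coins that are alone in their class, i.e. candidate outliers.
singletons = [name for name, label in labelled if class_sizes[label] == 1]

print(sorted(class_sizes.items()))
print(singletons)
```

A class containing a single member, as with class #2 here, is a quick signal that the coin is an outlier worth inspecting separately.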

2D-Scatter plot with TotalCoinMined vs TotalCoinSupply


Screen Shot 2021-08-28 at 11 34 12 PM

Compared with this plot of TotalCoinMined against TotalCoinSupply, the PCA scatter plot separates the cryptocurrencies by class more clearly and is the better visualization.

Summary

We have classified 532 cryptocurrencies based on feature similarities. To assess their performance and their potential interest to an investment bank, further analysis of the traits unique to each group should be conducted.
