This project is a comparative study of six machine learning classification algorithms. The algorithms are evaluated on four synthetic datasets, created with the scikit-learn `sklearn.datasets` functions `make_circles`, `make_blobs`, and `make_moons`, plus a composition of the three. The aim is to measure how accurately each algorithm classifies each dataset and to identify which algorithm works best on which dataset.
The four datasets are named `blob_classes`, `circle_classes`, `moons_classes`, and `nl_blob_classes`. Each dataset has a different class geometry, so together they test the classification algorithms in different scenarios.
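
As a rough illustration of how such datasets could be generated, the sketch below builds one dataset per name. The sample count, noise levels, seed, and the `make_datasets` helper are assumptions for illustration, not the project's actual settings. In particular, the way `nl_blob_classes` is composed here (stacking the points from the three generators and keeping their binary labels) is only a guess at what "a composition of the three" means.

```python
import numpy as np
from sklearn.datasets import make_blobs, make_circles, make_moons


def make_datasets(n_samples=300, seed=0):
    """Return a dict mapping dataset name to an (X, y) pair.

    All parameter values are illustrative assumptions.
    """
    blobs = make_blobs(n_samples=n_samples, centers=2, random_state=seed)
    circles = make_circles(
        n_samples=n_samples, noise=0.1, factor=0.5, random_state=seed
    )
    moons = make_moons(n_samples=n_samples, noise=0.1, random_state=seed)

    # Assumption: the composite dataset stacks the three point sets
    # and keeps each generator's binary labels unchanged.
    X_mix = np.vstack([blobs[0], circles[0], moons[0]])
    y_mix = np.concatenate([blobs[1], circles[1], moons[1]])

    return {
        "blob_classes": blobs,
        "circle_classes": circles,
        "moons_classes": moons,
        "nl_blob_classes": (X_mix, y_mix),
    }


datasets = make_datasets()
for name, (X, y) in datasets.items():
    print(name, X.shape, np.unique(y))
```

Fixing `random_state` keeps the generated points identical between notebook runs, which makes the comparison repeatable.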
The six machine learning classification algorithms compared in this project are:
- Support Vector Machine (SVM)
- Random Forest
- Decision Tree
- K-Nearest Neighbor (KNN)
- Quadratic Discriminant Analysis (QDA)
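
The algorithms named in the list above all have scikit-learn implementations, so one possible setup is a dictionary of estimators that share the same `fit`/`score` interface. This is a minimal sketch: the hyperparameters are mostly scikit-learn defaults and the `make_moons` sample is a stand-in, both chosen here for illustration and not taken from the project.

```python
from sklearn.datasets import make_moons
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: mostly default hyperparameters; the project may tune these.
classifiers = {
    "SVM": SVC(kernel="rbf"),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "QDA": QuadraticDiscriminantAnalysis(),
}

# Stand-in data, used only to show that every estimator fits the same way.
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

train_scores = {}
for name, clf in classifiers.items():
    clf.fit(X, y)
    # Training accuracy only; this is not a held-out estimate.
    train_scores[name] = clf.score(X, y)
    print(f"{name}: {train_scores[name]:.3f}")
```

Keeping the estimators in one dictionary lets the same loop run over every dataset without per-model code.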
The performance of the algorithms is evaluated using the following metrics:
- Accuracy
- F1 score
- Precision
- Recall

The evaluation metrics are used to compare the algorithms across the different datasets and to determine which algorithm performs best in each scenario.
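
The four metrics can be computed with functions from `sklearn.metrics`. The sketch below assumes a single stratified train/test split and binary labels; the `evaluate` helper, the 70/30 split, and the example model and data are hypothetical choices for illustration, and the project may use a different protocol such as cross-validation.

```python
from sklearn.datasets import make_moons
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def evaluate(model, X, y, test_size=0.3, seed=0):
    """Fit `model` on a training split and score it on the held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # Default averaging assumes binary labels (positive class = 1).
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
    }


# Example usage on stand-in data.
X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
scores = evaluate(KNeighborsClassifier(n_neighbors=5), X, y)
for metric, value in scores.items():
    print(f"{metric}: {value:.3f}")
```

For datasets with more than two classes, `f1_score`, `precision_score`, and `recall_score` need an explicit `average` argument such as `"macro"`.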
The results show how the different machine learning algorithms perform on synthetic datasets and can help guide the choice of algorithm for a given classification problem. The code is written in Python and runs in Jupyter notebooks, which makes the experiments easy to reproduce and extend.