This repository contains the project for the Machine Learning course (2024/2025). It is structured as follows:
- code/: Contains all the scripts and Jupyter notebooks used for data processing, feature selection, model training, and evaluation.
  - adaboost.ipynb: Implementation of the AdaBoost model.
  - data.ipynb: Information about the dataframes and their columns.
  - dataset_functions.py: Utility functions for dataset handling, mainly transformations and feature engineering.
  - feature_importance.ipynb: Analysis of feature importance.
  - feature_selection.ipynb: Feature selection methods.
  - final_predictions.ipynb: Final model predictions.
  - gradboost.ipynb: Implementation of the Gradient Boosting model.
  - knn.ipynb: Implementation of the k-Nearest Neighbors (k-NN) model.
  - random_forest.ipynb: Implementation of the Random Forest model.
  - SVM.ipynb: Implementation of the Support Vector Machine (SVM) model.
  - visualization.ipynb: Data visualization and exploratory analysis.
- dataset/: Contains the dataset used in the project.
- doc/: Includes documentation related to the project.
- submission.csv: The final predictions generated by the model.

To run this project, ensure you have all dependencies installed. Open the Jupyter notebooks inside the code/ folder to explore data, train models, and generate predictions.
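
Before opening the notebooks, it can help to check that the dependencies are importable. The sketch below is only an illustration: the package list is an assumption (this README does not name the dependencies), and `missing_packages` is a hypothetical helper, not part of the repository. Adjust the list to match the imports actually used in the notebooks.

```python
from importlib.util import find_spec

# Assumed package list: the README does not name the dependencies,
# so edit this to match the imports used in the notebooks.
ASSUMED_PACKAGES = ["numpy", "pandas", "sklearn", "matplotlib", "notebook"]


def missing_packages(packages):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in packages if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(ASSUMED_PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All assumed packages are importable.")
```

Any name reported as missing would need to be installed before the corresponding notebook can run.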
Luca Panariello & Enrico Loda