This mini-project is based on a Kaggle competition whose objective is to predict which clients are more likely to default on their loans.
Competition Link: https://www.kaggle.com/competitions/home-credit-credit-risk-model-stability
To run the code, download the datasets from Kaggle: https://www.kaggle.com/competitions/home-credit-credit-risk-model-stability/data and open the notebook in Google Colab or Jupyter Notebook to visualize and analyze the models' performance.
The emergence of machine learning (ML) presents a powerful tool for financial institutions to bolster their credit risk assessment capabilities. ML models can analyze vast datasets, encompassing traditional credit history information and alternative data sources. By doing so, they can identify complex patterns and relationships that might be missed by conventional methods. These patterns can then be leveraged to predict loan defaults with greater accuracy, enabling lenders to make more informed decisions.
This research investigates the potential of machine learning models in predicting loan defaults. We aim to achieve the following objectives:
● Evaluate the performance of different machine learning models
● Explore the benefits of ensemble learning
● Highlight the significance of data science in finance
We evaluated four models: three gradient-boosted tree models and an ensemble. The AUC score obtained by each is listed below:
CatBoost - 0.7535
LightGBM - 0.7632
XGBoost - 0.7454
Ensemble - 0.7633
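
For readers unfamiliar with the metric, the sketch below shows one way an AUC score can be computed and how predictions from several models might be combined. It is illustrative only: the equal-weight averaging in average_ensemble, the function names, and the toy numbers are assumptions, not the notebook's actual ensemble method or data.

```python
from typing import List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via the rank-sum (Mann-Whitney) formula; tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the end of the current group of tied scores.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")

    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def average_ensemble(predictions: Sequence[Sequence[float]]) -> List[float]:
    """Equal-weight average of per-model predicted default probabilities (assumed scheme)."""
    n_models = len(predictions)
    return [sum(column) / n_models for column in zip(*predictions)]


if __name__ == "__main__":
    # Toy data, made up for illustration: 1 = client defaulted, 0 = client repaid.
    labels = [0, 0, 1, 1, 0, 1]
    model_a = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]
    model_b = [0.20, 0.30, 0.60, 0.50, 0.55, 0.90]
    model_c = [0.15, 0.25, 0.45, 0.65, 0.50, 0.60]

    for name, preds in [("model_a", model_a), ("model_b", model_b), ("model_c", model_c)]:
        print(name, round(roc_auc(labels, preds), 4))

    blended = average_ensemble([model_a, model_b, model_c])
    print("ensemble", round(roc_auc(labels, blended), 4))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of defaulters above non-defaulters, which gives context for the scores listed above.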