Cassandra Udyam Defaulter Prediction

This repository contains the code and an explanation of our approach to building a machine learning model that predicts loan defaulters for a bank. This was the problem statement of Cassandra, a data science event of Udyam, the annual technical fest of the Electronics Engineering Society of IIT-BHU. Our solution secured 2nd place.

Our Approach in Brief

Extensive analysis of the data revealed several key properties, including temporal consistency in the last_update column, a relationship between the last_update and recent_payment_activity columns, and an imbalance of labels in the dataset. We cleaned the data and engineered features, then aggregated the features and merged the two datasets. Next, we split the dataset with StratifiedKFold and applied SMOTE to the training data. We validated our models with the ROC-AUC score and tuned hyperparameters with the Optuna framework. Our final model is an ensemble of a Decision Tree Classifier and an AdaBoost Classifier.
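The validation scheme described above (stratified splitting, oversampling the training data, scoring with ROC-AUC) can be illustrated with a minimal, dependency-free sketch. This is not the repository's code: the helper names `stratified_folds`, `smote_like` and `roc_auc` are hypothetical, the toy data is invented, and the oversampler is a simplified binary-only version of the SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours). Model fitting and Optuna tuning are left out.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_splits, seed=0):
    """Yield (train_idx, val_idx) pairs that keep the class ratio in each fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    folds = [[] for _ in range(n_splits)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for pos, i in enumerate(idxs):
            folds[pos % n_splits].append(i)  # deal each class round-robin
    for k in range(n_splits):
        val = sorted(folds[k])
        train = sorted(i for j, f in enumerate(folds) if j != k for i in f)
        yield train, val


def smote_like(X, y, minority=1, k=3, seed=0):
    """Simplified SMOTE for binary labels: add interpolated minority rows
    until both classes have the same count. Needs >= 2 minority rows."""
    rng = random.Random(seed)
    minority_rows = [x for x, label in zip(X, y) if label == minority]
    n_needed = (len(y) - len(minority_rows)) - len(minority_rows)
    X_new, y_new = list(X), list(y)
    for _ in range(max(n_needed, 0)):
        base = rng.choice(minority_rows)
        # k nearest minority neighbours of `base` (squared Euclidean distance)
        neighbours = sorted(
            (r for r in minority_rows if r is not base),
            key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position on the segment base -> other
        X_new.append([a + gap * (b - a) for a, b in zip(base, other)])
        y_new.append(minority)
    return X_new, y_new


def roc_auc(y_true, scores):
    """Probability that a random positive outranks a random negative
    (ties count as half a win)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy data: 40 non-defaulters (0) and 10 defaulters (1).
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(40)] + \
    [[rng.gauss(2, 1), rng.gauss(2, 1)] for _ in range(10)]
y = [0] * 40 + [1] * 10

for train_idx, val_idx in stratified_folds(y, n_splits=5):
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    # Oversample the training fold only; the validation fold stays untouched.
    X_bal, y_bal = smote_like(X_train, y_train)
    print(len(val_idx), sum(y[i] for i in val_idx),
          y_bal.count(0), y_bal.count(1))
```

Oversampling inside the loop, after the split, keeps synthetic rows out of the validation fold, so the ROC-AUC measured there reflects the original class balance. In practice, scikit-learn's `StratifiedKFold` and imbalanced-learn's `SMOTE` would typically take the place of the two helpers.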


Figure: Feature Aggregation

Team Members ✨


Yash Sahijwani


Somnath Sendhil Kumar


Vikhyath Venkatraman