Cassandra Udyam Defaulter Prediction

This repository contains the code and an explanation of our approach to building a machine learning model that predicts loan defaulters for a bank. This was the problem statement for Cassandra, a data science event of Udyam, the annual technical fest of the Electronics Engineering Society of IIT-BHU. Our solution secured 2nd position.

Our Approach in Brief

Extensive analysis of the data revealed several key properties, including temporal consistency in the last_update column, a relationship between the last_update and recent_payment_activity columns, and an imbalance of labels in the dataset. We applied data cleaning and feature engineering, then aggregated features and merged the two datasets. Next, we split the dataset with StratifiedKFold and applied SMOTE to the training dataset. We validated our models with the ROC-AUC score and tuned hyperparameters with the Optuna framework. Our final model was an ensemble of a Decision Tree Classifier and an AdaBoost Classifier.
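
The validation loop described above can be sketched roughly as follows. This is an illustrative outline, not the code from this repository: the data is synthetic, the hyperparameters are placeholders rather than tuned values, soft voting is only one possible way to combine the two classifiers, and `smote_like_oversample` is a hypothetical, simplified stand-in for a library SMOTE implementation (such as the one in imbalanced-learn) so that the example depends only on NumPy and scikit-learn. Optuna tuning is omitted.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, VotingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.neighbors import NearestNeighbors
from sklearn.tree import DecisionTreeClassifier


def smote_like_oversample(X, y, k=5, seed=0):
    """Hypothetical simplified SMOTE for binary labels: create synthetic
    minority rows by interpolating between a minority point and one of
    its k nearest minority neighbours, until both classes are equal in size."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    n_new = counts.max() - counts.min()
    X_min = X[y == minority]
    if n_new == 0 or len(X_min) < 2:
        return X, y
    k = min(k, len(X_min) - 1)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    # Column 0 is the point itself, so keep only the k true neighbours.
    neighbours = nn.kneighbors(X_min, return_distance=False)[:, 1:]
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    X_new = X_min[base] + gap * (X_min[picked] - X_min[base])
    y_new = np.full(n_new, minority)
    return np.vstack([X, X_new]), np.concatenate([y, y_new])


def cross_validated_auc(X, y, n_splits=5, seed=0):
    """Return one ROC-AUC score per stratified fold."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, valid_idx in skf.split(X, y):
        # Oversample the training fold only; the validation fold stays untouched.
        X_tr, y_tr = smote_like_oversample(X[train_idx], y[train_idx], seed=seed)
        model = VotingClassifier(
            estimators=[
                ("tree", DecisionTreeClassifier(max_depth=5, random_state=seed)),
                ("ada", AdaBoostClassifier(n_estimators=50, random_state=seed)),
            ],
            voting="soft",  # average the predicted probabilities
        )
        model.fit(X_tr, y_tr)
        proba = model.predict_proba(X[valid_idx])[:, 1]
        scores.append(roc_auc_score(y[valid_idx], proba))
    return scores


# Synthetic imbalanced data standing in for the merged competition dataset.
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.9, 0.1], random_state=0
)
scores = cross_validated_auc(X, y)
print(f"mean ROC-AUC over {len(scores)} folds: {np.mean(scores):.3f}")
```

Oversampling inside the loop, after the split, keeps synthetic rows out of the validation fold, so the ROC-AUC is measured on real samples only.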

Feature Aggregation

Team Members ✨

Yash Sahijwani

Somnath Sendhil Kumar

Vikhyath Venkatraman
