In this Mini Project we explore pre-processing methods and Gradient Boosting on the popular Lending Club dataset. We are provided with two files: loan_train.csv and loan_test.csv.
We have to pre-process the data appropriately, and then apply gradient boosting to classify whether a customer should be given a loan or not.
The target attribute is in the column loan_status. We assign +1 to the value “Fully Paid” and -1 to the value “Charged off”. Records with the loan_status value “Current” (in both train and test) are not relevant to this problem and are excluded.
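The label mapping described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the helper name encode_target, the column name loan_amnt, and the toy rows are assumptions made up for the example, and the status strings are normalised to title case on the assumption that capitalisation may vary in the raw files.

```python
import pandas as pd

# Label mapping from the project description; any other status is dropped.
LABELS = {"Fully Paid": 1, "Charged Off": -1}


def encode_target(df, status_col="loan_status"):
    """Keep only resolved loans and add a +1/-1 `target` column.

    `encode_target` is a hypothetical helper written for this sketch.
    """
    # Normalise whitespace and capitalisation ("Charged off" -> "Charged Off").
    status = df[status_col].str.strip().str.title()
    # Rows such as "Current" are not in LABELS and are filtered out here.
    kept = df[status.isin(list(LABELS))].copy()
    kept["target"] = status[kept.index].map(LABELS).astype(int)
    return kept.drop(columns=[status_col])


# Tiny made-up frame standing in for loan_train.csv (values are illustrative).
toy = pd.DataFrame({
    "loan_amnt": [5000, 12000, 8000, 3000],
    "loan_status": ["Fully Paid", "Charged off", "Current", "Fully Paid"],
})

encoded = encode_target(toy)
print(encoded)
```

The same helper would be applied to both the training and the test frame, so that the two sets are filtered and labelled identically.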
The project consists of two main tasks:
(a) Pre-process the data as needed to apply the classifier to the training data.
(b) Apply gradient boosting using sklearn.ensemble.GradientBoostingClassifier to train the model.
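As a rough sketch of step (b), the snippet below fits sklearn.ensemble.GradientBoostingClassifier on +1/-1 labels. The data is synthetic and stands in for the pre-processed Lending Club features, and the hyperparameter values are only the library defaults written out explicitly; they are assumptions for illustration, not the settings used in this project.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Synthetic stand-in for the pre-processed numeric feature matrices.
X_train = rng.normal(size=(400, 5))
X_test = rng.normal(size=(100, 5))

# Labels follow the +1 / -1 convention used for loan_status.
y_train = np.where(X_train[:, 0] + 0.5 * X_train[:, 1] > 0, 1, -1)
y_test = np.where(X_test[:, 0] + 0.5 * X_test[:, 1] > 0, 1, -1)

# Illustrative hyperparameters (the scikit-learn defaults, stated explicitly).
clf = GradientBoostingClassifier(
    n_estimators=100,
    learning_rate=0.1,
    max_depth=3,
    random_state=0,
)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy on synthetic data: {test_accuracy:.3f}")
```

On the real data, X_train and X_test would be the feature matrices produced in step (a), with categorical columns encoded numerically and missing values handled beforehand, since the classifier expects numeric input.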
Training Data : loan_train.csv
Testing Data : loan_test.csv
A detailed analysis can be found in the report.