MSE246-2018-Project

This project implements and compares a number of different models for predicting loan default and loss at default. We use data from the U.S. SBA 504 loan program, consisting of 150,000 loans issued between 1990 and 2014. We augment our data with several macroeconomic factors, including the Consumer Price Index and yearly S&P 500 returns.

Data Processing

The data_processed_final folder contains our final processed data, created with data_processed_hujia.ipynb. data_exploration.ipynb contains code for preliminary analysis and generating exploratory graphs.

Logistic Model

The logistic model.ipynb notebook in the logistic_model folder contains code for tuning and analyzing our logistic model. logistic_roc.csv contains the validation ROC curve.
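For readers unfamiliar with how a validation ROC curve is built, here is a minimal sketch in plain Python. It is not the notebook's code: the function names `roc_curve_points` and `auc` are hypothetical, and the column layout of logistic_roc.csv is not assumed. It sweeps a decision threshold over predicted default scores and records the false-positive and true-positive rates at each step.

```python
def roc_curve_points(y_true, scores):
    """Return a list of (false positive rate, true positive rate) points.

    y_true holds 0/1 labels (1 = default); scores holds predicted
    probabilities or any score where higher means more likely to default.
    """
    pos = sum(1 for y in y_true if y == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")

    # Rank observations from highest to lowest score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])

    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every observation tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
```

Calling `auc(roc_curve_points(labels, predicted_probs))` on a validation set gives a single summary number, where 0.5 corresponds to random ranking and 1.0 to a perfect ranking.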

Neural Network

The neural_network folder contains our attempts at implementing a neural network for binary classification. NNprocessing.py contains preprocessing specific to the neural network. static_net.py and dynamic_net.py are first attempts that explore PyTorch's support for dynamic computational graphs. default_net.py contains our final implementation, which uses batch normalization, dropout, and the Adam optimizer. nn_eval.py analyzes the model's parameters and tests its validation performance. Unfortunately, we were unable to implement a fully functioning neural network.

Hazard Model

The hazard model is in the hazard_lifelines_michelle.ipynb notebook in the data_processed_final folder.

Loss Model

The loss model is in loss_model_michelle.ipynb in the loss folder. The 1_and_5_year_loss_michelle.ipynb notebook contains the tranche loss simulation code. Screenshots of the graphs generated in that notebook are saved in the graphs folder.
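As a rough illustration of what a tranche loss simulation computes, the sketch below applies the standard attachment/detachment formula to simulated portfolio losses. It is not the notebook's implementation. The function names, the equal loan weighting, the independent defaults, and the single fixed loss-given-default are all simplifying assumptions made here for illustration; the project's own models supply default probabilities and losses differently.

```python
import random


def tranche_loss(portfolio_loss, attachment, detachment):
    """Loss absorbed by a tranche, in the same units as portfolio_loss.

    The tranche takes losses above `attachment`, capped at its width
    (`detachment - attachment`).
    """
    if not 0 <= attachment < detachment:
        raise ValueError("require 0 <= attachment < detachment")
    return min(max(portfolio_loss - attachment, 0.0), detachment - attachment)


def simulate_tranche_losses(default_probs, lgd, attachment, detachment,
                            n_sims=1000, seed=0):
    """Monte Carlo tranche losses as fractions of total portfolio notional.

    Assumptions (illustrative only): loans are equally weighted, default
    independently with the given probabilities, and share one
    loss-given-default `lgd`.
    """
    rng = random.Random(seed)
    n_loans = len(default_probs)
    results = []
    for _ in range(n_sims):
        # Fraction of the portfolio lost in this scenario.
        loss = sum(lgd for p in default_probs if rng.random() < p) / n_loans
        results.append(tranche_loss(loss, attachment, detachment))
    return results
```

Averaging the returned list estimates the expected tranche loss over the horizon that the default probabilities refer to, for example one or five years.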