
I have experience in supervised, unsupervised, and reinforcement learning, with a focus on a loan prediction project using a Kaggle dataset. I trained and tested the model in Python, using Pandas, NumPy, and Matplotlib for data analysis and visualization, and achieved high accuracy. My primary development environment is Jupyter Notebook.


Ash0508/Amazon_ML_Summer_School_Loan-Prediction


Loan Prediction Using Machine Learning and Python

Aim

The primary objective of this project is to leverage Python libraries such as pandas, matplotlib, and seaborn to extract valuable insights from the data. Additionally, we aim to utilize xgboost and scikit-learn libraries for machine learning.

A secondary objective is to learn how to fine-tune the parameters using grid search cross-validation for the xgboost machine learning model.
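As a rough illustration of grid search cross-validation, the sketch below tunes a gradient-boosted classifier with scikit-learn's GridSearchCV. It is not the project's actual notebook code: the data is synthetic, the parameter grid is hypothetical, and scikit-learn's GradientBoostingClassifier is swapped in for xgboost so the example runs with scikit-learn alone. xgboost's XGBClassifier accepts the same three parameter names and can be passed to GridSearchCV in the same way.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the preprocessed loan features (assumption, not the Kaggle data).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Hypothetical grid; the values actually searched in the project are not stated.
param_grid = {
    "n_estimators": [25, 50],
    "max_depth": [2, 3],
    "learning_rate": [0.1, 0.3],
}

# Stand-in for xgboost.XGBClassifier, which takes the same parameter names.
model = GradientBoostingClassifier(random_state=0)

# 3-fold cross-validation over every combination in the grid, scored by accuracy.
search = GridSearchCV(model, param_grid, cv=3, scoring="accuracy")
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_
print(best_params, round(best_score, 3))
```

By default GridSearchCV refits the best combination on all of the data passed to fit, so search.best_estimator_ can be used directly for prediction afterwards.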

Ultimately, the goal is to predict whether a loan applicant can repay the loan using voting ensemble techniques that combine predictions from multiple machine learning algorithms.
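A voting ensemble of this kind might look like the following minimal sketch, which uses scikit-learn's VotingClassifier with hard (majority) voting. The three base models and the synthetic data are assumptions chosen for illustration; the README does not say which algorithms the project combines.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the encoded loan data (assumption).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Hypothetical choice of base models; each one casts a single vote per applicant.
ensemble = VotingClassifier(
    estimators=[
        ("logreg", LogisticRegression(max_iter=1000)),
        ("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
        ("forest", RandomForestClassifier(n_estimators=50, random_state=0)),
    ],
    voting="hard",  # predict the class chosen by the majority of models
)
ensemble.fit(X_train, y_train)

predictions = ensemble.predict(X_test)
test_accuracy = ensemble.score(X_test, y_test)
print(round(test_accuracy, 3))
```

Setting voting="soft" would average predicted class probabilities instead of counting votes, which requires every base model to implement predict_proba.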

Dataset Attributes

The dataset contains the following attributes: Loan ID, Gender, Marital Status, Dependents, Education, Self-Employment Status, Applicant Income, Coapplicant Income, Loan Amount, Credit History, Property Area, and Loan Status.
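To show how attributes like these could be prepared for a model, here is a small sketch using pandas. The column spellings (for example Loan_ID and Self_Employed), the four sample rows, and the Y/N coding of the loan status are assumptions made for illustration, and the imputation choices (mode for categories, median for numbers) are one common option rather than the project's confirmed approach.

```python
import pandas as pd

# Hypothetical rows; column names are assumed spellings of the attributes listed above.
df = pd.DataFrame({
    "Loan_ID": ["LP001", "LP002", "LP003", "LP004"],
    "Gender": ["Male", "Female", "Male", None],
    "Married": ["Yes", "No", "Yes", "Yes"],
    "Dependents": ["0", "1", "3+", "0"],
    "Education": ["Graduate", "Not Graduate", "Graduate", "Graduate"],
    "Self_Employed": ["No", "No", "Yes", "No"],
    "ApplicantIncome": [5000, 3000, 4200, 6100],
    "CoapplicantIncome": [0.0, 1500.0, 0.0, 2000.0],
    "LoanAmount": [130.0, 100.0, float("nan"), 160.0],
    "Credit_History": [1.0, 0.0, 1.0, 1.0],
    "Property_Area": ["Urban", "Rural", "Semiurban", "Urban"],
    "Loan_Status": ["Y", "N", "Y", "N"],
})

categorical_cols = ["Gender", "Married", "Dependents", "Education",
                    "Self_Employed", "Property_Area"]
numeric_cols = ["ApplicantIncome", "CoapplicantIncome", "LoanAmount", "Credit_History"]

# The ID carries no predictive signal; the status column is the target.
features = df.drop(columns=["Loan_ID", "Loan_Status"])
target = df["Loan_Status"].map({"Y": 1, "N": 0})

# Fill gaps: most frequent value for categories, median for numeric columns.
for col in categorical_cols:
    features[col] = features[col].fillna(features[col].mode()[0])
for col in numeric_cols:
    features[col] = features[col].fillna(features[col].median())

# One-hot encode the categorical columns so every feature is numeric.
encoded = pd.get_dummies(features, columns=categorical_cols)
print(encoded.shape)
```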

Key Observations from the Data

Income Trends: Married male applicants tend to have the highest incomes, while married female applicants have the lowest.

Education Impact: Male graduates have higher incomes than non-graduates.

Marital and Educational Impact: Married graduates have the highest incomes among all groups.

Employment Status: Non-self-employed applicants have higher incomes than self-employed ones.

Dependents: Applicants with more dependents have the lowest incomes, while those with no dependents have the highest.

Property and Credit History: Applicants with property in urban areas and a credit history tend to have the highest incomes.

Education and Credit History: Graduates with a credit history earn more than those without.

Income and Loan Amount: Loan amounts increase roughly linearly with applicant income.

Correlation: Heatmaps indicate a strong positive correlation between applicant income and loan amount.

Gender Distribution: There are more male applicants than female applicants.

Marital Status: More applicants are married than unmarried.

Dependents: The majority of applicants have no dependents.

Education: There are more graduates than non-graduates among the applicants.

Property Area: Most properties are located in semi-urban areas, with the least in rural areas.
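Observations like the ones above typically come from grouped averages, value counts, and a correlation matrix. The sketch below shows those three pandas operations on a tiny hypothetical table; the numbers are made up for illustration and are not the project's data, and the column names are assumed.

```python
import pandas as pd

# Hypothetical sample; values are illustrative only.
df = pd.DataFrame({
    "Gender": ["Male", "Male", "Female", "Female", "Male", "Female"],
    "Married": ["Yes", "No", "Yes", "No", "Yes", "No"],
    "ApplicantIncome": [6000, 4000, 2500, 3500, 7000, 3000],
    "LoanAmount": [180, 120, 80, 100, 210, 95],
})

# Average income for each gender and marital status combination.
mean_income = df.groupby(["Gender", "Married"])["ApplicantIncome"].mean()

# How many applicants fall into each category.
gender_counts = df["Gender"].value_counts()

# Pearson correlation between income and loan amount; this matrix is
# what a heatmap (for example seaborn.heatmap) would visualize.
corr = df[["ApplicantIncome", "LoanAmount"]].corr()

print(mean_income)
print(gender_counts)
print(corr)
```

The same pattern extends to the other comparisons: swap the groupby keys for Education, Self_Employed, Dependents, or Property_Area, assuming those columns exist under these names.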
