INTERNSHIP PROJECT
Problem Statement:
The dataset contains features such as age, gender, education, occupation, capital-gain, capital-loss, race, hours worked per week, and country. The approach applies several techniques and algorithms, including Random Forest and boosting techniques. Random Forest performed well, with 86% accuracy.
Approach Used:
This is a binary classification problem: each person is classified into either the >50K income group or the <=50K income group.
Data Exploration : I started by exploring the dataset using pandas, numpy and pandas-profiling.
Data Visualization : Plotted graphs to gain insights into the dependent and independent variables.
Feature Engineering : Removed missing values and created new features based on those insights.
Model Selection I : Tested all base models to check their baseline accuracy, and also plotted residual plots to check whether each model is a good fit.
Pickle File : Selected the model with the best accuracy and saved it as a pickle file.
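The steps above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the data is synthetic, and the column names, the labelling rule, the derived net_capital feature, the choice of base models, and the file name income_model.pkl are all assumptions made for the example.

```python
import os
import pickle
import tempfile

import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the census data (hypothetical columns and values).
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "age": rng.integers(18, 70, n).astype(float),
    "hours_per_week": rng.integers(10, 60, n).astype(float),
    "capital_gain": rng.integers(0, 5000, n).astype(float),
    "capital_loss": rng.integers(0, 2000, n).astype(float),
})

# Binary target: 1 stands for ">50K", 0 for "<=50K" (arbitrary rule for the demo).
score = 0.05 * df["age"] + 0.04 * df["hours_per_week"] + 0.001 * df["capital_gain"]
df["income_gt_50k"] = (score > score.median()).astype(int)

# Inject a few missing values so the cleaning step has something to remove.
df.loc[::25, "age"] = np.nan

# Feature engineering: drop rows with missing values, add one derived feature.
df = df.dropna().reset_index(drop=True)
df["net_capital"] = df["capital_gain"] - df["capital_loss"]

X = df.drop(columns="income_gt_50k")
y = df["income_gt_50k"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Model selection: fit each base model and record its test accuracy.
models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

# Keep the most accurate model and save it as a pickle file.
best_name = max(accuracies, key=accuracies.get)
best_model = models[best_name]
model_path = os.path.join(tempfile.gettempdir(), "income_model.pkl")
with open(model_path, "wb") as f:
    pickle.dump(best_model, f)

# Reload the file the way a serving app would before making predictions.
with open(model_path, "rb") as f:
    restored = pickle.load(f)
```

Comparing models on a held-out test split, rather than on the training data, keeps the accuracy figures from being inflated by overfitting. The reloaded object can then be used for prediction without retraining.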
Tools and Libraries: Python, Sklearn, Flask, HTML, Pandas, Numpy, pandas-profiling
Project Title: Adult Census Income Prediction
Technologies: Machine Learning Technology
Domain: Finance
Project Difficulty Level: Intermediate
Video link of deployment: https://github.com/sriphaniN/adult-income-prediction/blob/4057050a051344dc3c16fb4a65cf1f0c00e0edde/project1/prediction.webm