A classification methodology to determine whether a customer is placing a fraudulent vehicle insurance claim.

swarnava-96/Insurance-Fraud-Detection-MLOPS

Insurance-Fraud-Detection-MLOPS

Goal: To build a classification methodology to determine whether a customer is placing a fraudulent vehicle insurance claim.

About the Dataset:

The client sent the data in batches, as multiple sets of files placed at a given location. The data has been extracted from the census bureau. The dataset has 38 features (including the target feature).

Project Description:

The client also sent a schema file containing the relevant information about the training files. Data validation was performed as the initial step, followed by data insertion into a database (SQLite). During validation, the data is divided into good and bad data based on the schema file and sent to the respective folders. The rest of the data science project lifecycle was then followed: exporting the data from the database, data preprocessing (the imbalanced dataset was handled using imblearn's RandomOverSampler), clustering using KMeans, model selection (multiple models were tested and the top 2 were selected based on accuracy score and AUC score), model building (XGBoost for the first cluster and SVC for the second), hyperparameter optimization (using GridSearchCV), and finally model deployment (on GCP). API testing was done using Postman. Logs were maintained at every step. A similar set of actions was performed for the prediction data. The code was written following object-oriented programming (OOP) concepts.
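The validation and insertion steps described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the schema contents, the column names, the table name `good_raw_data`, the batch file names, and the helper functions `is_good_batch` and `load_batches` are all assumptions made for the example, and the "good" check here only compares column names and row widths.

```python
import sqlite3

# Hypothetical schema: expected column names mapped to SQLite types.
# The real schema file sent by the client is not reproduced here.
SCHEMA = {
    "policy_number": "INTEGER",
    "incident_type": "TEXT",
    "total_claim_amount": "REAL",
    "fraud_reported": "TEXT",
}


def is_good_batch(header, rows, schema):
    """Treat a batch as good if its header matches the schema exactly
    and every row has one value per schema column."""
    if list(header) != list(schema):
        return False
    return all(len(row) == len(schema) for row in rows)


def load_batches(conn, table, schema, batches):
    """Insert the rows of good batches into SQLite.

    Returns two lists with the names of the good and the bad batches.
    """
    columns = ", ".join(f'"{name}" {sql_type}' for name, sql_type in schema.items())
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
    placeholders = ", ".join("?" for _ in schema)
    good, bad = [], []
    for name, (header, rows) in batches.items():
        if is_good_batch(header, rows, schema):
            conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
            good.append(name)
        else:
            bad.append(name)
    conn.commit()
    return good, bad


# Two made-up batches: the first matches the schema, the second is missing columns.
batches = {
    "batch_1.csv": (
        ["policy_number", "incident_type", "total_claim_amount", "fraud_reported"],
        [(1001, "Single Vehicle Collision", 71610.0, "Y"),
         (1002, "Vehicle Theft", 5070.0, "N")],
    ),
    "batch_2.csv": (
        ["policy_number", "incident_type"],
        [(1003, "Parked Car")],
    ),
}

conn = sqlite3.connect(":memory:")
good, bad = load_batches(conn, "good_raw_data", SCHEMA, batches)
row_count = conn.execute('SELECT COUNT(*) FROM "good_raw_data"').fetchone()[0]
print("good:", good, "bad:", bad, "rows inserted:", row_count)
```

In a real pipeline the bad batches would presumably be moved to their own folder and logged rather than just listed, and the check would likely also cover file names and column types.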

For more details about the project click here

Project Architecture:

(Image: project architecture diagram)
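As a rough illustration of the modelling part of this architecture (oversampling, clustering, then one tuned model per cluster), the sketch below uses scikit-learn and NumPy on synthetic data. Several things here are assumptions rather than the project's code: the toy data from `make_toy_data`, the parameter grid, and the helper names are hypothetical; `sklearn.utils.resample` stands in for imblearn's RandomOverSampler to keep dependencies small; and SVC is used for both clusters, whereas the project uses XGBoost for the first cluster.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
from sklearn.utils import resample


def make_toy_data(n_samples=400, seed=0):
    """Synthetic, imbalanced binary data (a stand-in for the real claims data)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, 4))
    y = (0.5 * X[:, 0] + rng.normal(size=n_samples) > 1.0).astype(int)
    return X, y


def oversample_minority(X, y, random_state=0):
    """Random oversampling: add duplicated rows until all classes are equal in size."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [], []
    for cls, count in zip(classes, counts):
        X_cls = X[y == cls]
        if count < target:
            # Draw the missing rows with replacement from the smaller class.
            extra = resample(X_cls, replace=True,
                             n_samples=int(target - count),
                             random_state=random_state)
            X_cls = np.vstack([X_cls, extra])
        X_parts.append(X_cls)
        y_parts.append(np.full(len(X_cls), cls))
    return np.vstack(X_parts), np.concatenate(y_parts)


def train_per_cluster(X, y, n_clusters=2, random_state=0):
    """Cluster the rows with KMeans, then tune one SVC per cluster with GridSearchCV."""
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state).fit(X)
    param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}  # hypothetical grid
    models = {}
    for cluster_id in range(n_clusters):
        mask = kmeans.labels_ == cluster_id
        search = GridSearchCV(SVC(), param_grid, cv=3, scoring="roc_auc")
        search.fit(X[mask], y[mask])
        models[cluster_id] = search.best_estimator_
    return kmeans, models


def predict(kmeans, models, X_new):
    """Route each new row to its cluster and use that cluster's model."""
    cluster_ids = kmeans.predict(X_new)
    preds = np.empty(len(X_new), dtype=int)
    for cluster_id, model in models.items():
        mask = cluster_ids == cluster_id
        if mask.any():
            preds[mask] = model.predict(X_new[mask])
    return preds


X, y = make_toy_data()
X_bal, y_bal = oversample_minority(X, y)
kmeans, models = train_per_cluster(X_bal, y_bal)
print("class counts after oversampling:", np.bincount(y_bal))
print("predictions for first 5 rows:", predict(kmeans, models, X[:5]))
```

Training one model per cluster means each model only has to fit a more homogeneous slice of the data; at prediction time the same KMeans object decides which model handles a given row.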

Installation:

The code is written in Python 3.7.3. If you don't have Python installed, you can find it here. If you are using an older version of Python, upgrade it and make sure you have the latest version of pip. To install the required packages and libraries, run the following commands in the project directory after cloning the repository:

1. First, create a virtual environment using this command:
conda create -n myenv python=3.7
2. Activate the environment using the command below:
conda activate myenv
3. Then install all the packages using the following command:
pip install -r requirements.txt
4. Then, in cmd or Anaconda Prompt, run the following command:
python main.py
Make sure you have changed the directory to the project's root folder before running these commands.

Deployment on GCP

Log in or sign up on the Google Cloud console in order to create an app. A free-tier account on the Google Cloud console provides $300 credit for one year. For application deployment, download the Google Cloud SDK installer.

Frontend using HTML and Backend using Flask:

Demo: https://insurancefraud.de.r.appspot.com/

The app is currently disabled, since GCP is chargeable.

Technology Stack:

PyCharm SciPy Seaborn

Further Changes to be Done:

  • Deploying the Web Application on Cloud.
    • Heroku
    • Azure
    • AWS EC2 Instance
