About the Project
This Customer Churn Prediction application predicts whether a customer is likely to churn based on historical data. It follows a pipeline-based workflow for preprocessing, model building, and evaluation. It also includes a Flask web application for user interaction and is containerized with Docker for deployment.
Workflow
The application workflow is structured as follows:
- Data Ingestion:
  - Load the dataset from a CSV file.
- Data Cleaning and Preprocessing:
  - Handle missing values.
  - Encode categorical variables.
  - Normalize/standardize numerical features.
- Pipeline Creation:
  - Build a pipeline using scikit-learn to streamline preprocessing and model training.
  - Save the pipeline using joblib for reuse.
- Model Building and Evaluation:
  - Train multiple classification models.
  - Evaluate models using metrics such as accuracy, precision, recall, and F1-score.
  - Select the best-performing model.
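The workflow steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`tenure`, `monthly_charges`, `contract`, `churn`), the synthetic data generator, the two candidate models, and the choice of F1-score as the selection criterion are all assumptions made for the example.

```python
import io
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def make_demo_csv(n_rows=200, seed=0):
    """Return CSV text for a small synthetic dataset (hypothetical columns)."""
    rng = np.random.default_rng(seed)
    tenure = rng.integers(1, 72, size=n_rows).astype(float)
    charges = rng.uniform(20, 120, size=n_rows)
    contract = rng.choice(["monthly", "yearly"], size=n_rows)
    # Toy rule: short-tenure monthly customers churn.
    churn = ((tenure < 24) & (contract == "monthly")).astype(int)
    # Inject a few missing values so the imputer has work to do.
    tenure[rng.choice(n_rows, size=10, replace=False)] = np.nan
    df = pd.DataFrame({"tenure": tenure, "monthly_charges": charges,
                       "contract": contract, "churn": churn})
    return df.to_csv(index=False)


def build_pipeline(model):
    """Combine preprocessing and a classifier into one scikit-learn Pipeline."""
    numeric = ["tenure", "monthly_charges"]
    categorical = ["contract"]
    preprocess = ColumnTransformer([
        # Numeric columns: fill missing values, then standardize.
        ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                          ("scale", StandardScaler())]), numeric),
        # Categorical columns: one-hot encode, ignoring unseen categories.
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ])
    return Pipeline([("preprocess", preprocess), ("model", model)])


def evaluate(pipeline, X_test, y_test):
    """Compute the four metrics named in the workflow."""
    pred = pipeline.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred, zero_division=0),
        "f1": f1_score(y_test, pred, zero_division=0),
    }


def run_workflow(csv_text, model_path):
    """Ingest data, train candidate models, save the best pipeline with joblib."""
    df = pd.read_csv(io.StringIO(csv_text))  # data ingestion
    X, y = df.drop(columns="churn"), df["churn"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=0)
    candidates = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    }
    fitted, results = {}, {}
    for name, model in candidates.items():
        fitted[name] = build_pipeline(model).fit(X_train, y_train)
        results[name] = evaluate(fitted[name], X_test, y_test)
    # Assumption: pick the model with the highest F1-score.
    best = max(results, key=lambda name: results[name]["f1"])
    joblib.dump(fitted[best], model_path)
    return best, results


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "churn_pipeline.joblib")
    best_name, scores = run_workflow(make_demo_csv(), path)
    print(best_name, scores[best_name])
```

Because preprocessing lives inside the saved pipeline, a later `joblib.load(path).predict(new_rows)` call applies the same imputation, encoding, and scaling that were fitted on the training data.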
Key Features
- End-to-end machine learning pipeline for classification tasks.
- Modular and reusable code for data preprocessing and model training.
- Simple and intuitive web interface built with Flask.
- Fully containerized setup for deployment using Docker.
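As a rough sketch of how a Flask front end might expose predictions, the example below defines a single JSON endpoint. The `/predict` route, the `create_app` factory, and the `predict_fn` callable are hypothetical names chosen for illustration; the project's actual interface may use HTML forms or different routes. In practice `predict_fn` would wrap a pipeline loaded with joblib; here a stand-in rule keeps the example self-contained.

```python
from flask import Flask, jsonify, request


def create_app(predict_fn):
    """Build a Flask app around a prediction callable (hypothetical design)."""
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        # Parse the request body as JSON; None if missing or malformed.
        payload = request.get_json(silent=True)
        if not isinstance(payload, dict):
            return jsonify(error="expected a JSON object"), 400
        return jsonify(churn=bool(predict_fn(payload)))

    return app


def demo_predict(features):
    """Stand-in for a trained model: flags customers with under 12 months tenure."""
    return features.get("tenure", 0) < 12


app = create_app(demo_predict)

if __name__ == "__main__":
    # Bind to all interfaces so the app is reachable from outside a container.
    app.run(host="0.0.0.0", port=5000)
```

Passing the predictor into an app factory keeps the web layer separate from model loading, which makes the endpoint easy to exercise with Flask's built-in test client.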