Telco Customer Churn Prediction

Problem Statement

The goal of this project is to develop a machine learning model that predicts which customers will churn, that is, leave the company. Data analysis and feature engineering are performed before the model is built.

Dataset Description

The Telco customer churn dataset contains information about 7,043 customers of a fictional telecom company that provided home phone and internet services in California in the third quarter. It indicates which customers have left, stayed, or signed up for the service.

Variables

CustomerId: Customer ID
Gender: Gender
SeniorCitizen: Whether the customer is a senior citizen (1, 0)
Partner: Whether the customer has a partner (Yes, No)
Dependents: Whether the customer has dependents (Yes, No)
Tenure: Number of months the customer has stayed with the company
PhoneService: Whether the customer has phone service (Yes, No)
MultipleLines: Whether the customer has multiple lines (Yes, No, No phone service)
InternetService: Type of internet service the customer has (DSL, Fiber optic, No)
OnlineSecurity: Whether the customer has online security (Yes, No, No internet service)
OnlineBackup: Whether the customer has online backup (Yes, No, No internet service)
DeviceProtection: Whether the customer has device protection (Yes, No, No internet service)
TechSupport: Whether the customer has technical support (Yes, No, No internet service)
StreamingTV: Whether the customer has streaming TV (Yes, No, No internet service)
StreamingMovies: Whether the customer has streaming movies (Yes, No, No internet service)
Contract: Customer's contract term (Month-to-month, One year, Two year)
PaperlessBilling: Whether the customer has paperless billing (Yes, No)
PaymentMethod: Customer's payment method (Electronic check, Mailed check, Bank transfer (automatic), Credit card (automatic))
MonthlyCharges: Monthly charges collected from the customer
TotalCharges: Total charges collected from the customer
Churn: Whether the customer has churned (Yes, No), that is, whether the customer left within the last month or quarter

Data Preparation

Data is read from the provided CSV file.
Data types and missing values are checked and corrected.
Encoding is applied to binary categorical variables.
Standardization is performed for numeric variables.
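
The preparation steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the inline sample stands in for the real CSV (its values are made up), the column names are assumed to match the Variables list, and `prepare` is a hypothetical helper. It assumes TotalCharges may be read as text when some entries are blank.

```python
import io

import pandas as pd

# Tiny inline sample standing in for the real CSV file (values are made up).
SAMPLE_CSV = """CustomerId,Gender,SeniorCitizen,Partner,Tenure,MonthlyCharges,TotalCharges,Churn
0001,Female,0,Yes,1,29.85,29.85,No
0002,Male,0,No,34,56.95,1889.5,No
0003,Male,1,No,0,53.85, ,Yes
0004,Female,0,Yes,45,42.30,1840.75,Yes
"""


def prepare(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: fix dtypes and encode Yes/No columns as 1/0."""
    df = df.copy()
    # A blank entry makes the whole column text; coerce unparseable values to NaN.
    df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce")
    # Encode two-valued Yes/No columns (assumed subset of the binary variables).
    for col in ["Partner", "Churn"]:
        df[col] = df[col].map({"Yes": 1, "No": 0})
    return df


raw = pd.read_csv(io.StringIO(SAMPLE_CSV), dtype={"CustomerId": str})
prepared = prepare(raw)

print(prepared.dtypes)          # data types after correction
print(prepared.isnull().sum())  # missing values per column
```

With the real data, `io.StringIO(SAMPLE_CSV)` would be replaced by the path to the CSV file.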

Exploratory Data Analysis (EDA)

A general overview of the dataset is provided.
Numeric and categorical variables are identified and analyzed.
The target variable is analyzed: its mean is computed for each category of the categorical variables, and the mean of each numeric variable is computed for each target class.
Outlier analysis is performed.
Missing observation analysis is conducted.
Correlation analysis is done.
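
As a rough illustration of the target, outlier, and correlation analyses listed above, the sketch below works on a small made-up frame. The IQR rule with a 1.5 multiplier and the `has_outliers` helper are assumptions for illustration; the project may use different thresholds.

```python
import pandas as pd

# Made-up rows; Churn is assumed to be already encoded as 1/0.
df = pd.DataFrame({
    "Contract": ["Month-to-month", "Month-to-month", "One year",
                 "Two year", "Month-to-month", "One year"],
    "Tenure": [1, 5, 24, 60, 2, 36],
    "MonthlyCharges": [70.0, 85.5, 55.0, 25.0, 90.0, 60.0],
    "Churn": [1, 1, 0, 0, 1, 0],
})

# Mean of the target for each category of a categorical variable.
churn_by_contract = df.groupby("Contract")["Churn"].mean()

# Mean of each numeric variable for each target class.
numeric_by_churn = df.groupby("Churn")[["Tenure", "MonthlyCharges"]].mean()


def has_outliers(frame: pd.DataFrame, col: str, q1: float = 0.25, q3: float = 0.75) -> bool:
    """Hypothetical helper: flag values outside the 1.5 * IQR fences."""
    low_q, high_q = frame[col].quantile([q1, q3])
    iqr = high_q - low_q
    lower, upper = low_q - 1.5 * iqr, high_q + 1.5 * iqr
    return bool(((frame[col] < lower) | (frame[col] > upper)).any())


# Pairwise Pearson correlation between numeric columns and the target.
corr = df[["Tenure", "MonthlyCharges", "Churn"]].corr()

print(churn_by_contract)
print(numeric_by_churn)
print({col: has_outliers(df, col) for col in ["Tenure", "MonthlyCharges"]})
print(corr)
```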

Feature Engineering

Missing values and outliers are handled.
Encoding operations are performed for categorical variables.
Standardization is applied to numeric variables.
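
One way these three steps might look in code is sketched below. The median fill, the one-hot encoding with `drop_first=True`, and the choice of columns are assumptions made for the example, not confirmed details of the project.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Made-up rows with one missing TotalCharges value.
df = pd.DataFrame({
    "Tenure": [1, 34, 2, 45],
    "MonthlyCharges": [29.85, 56.95, 53.85, 42.30],
    "TotalCharges": [29.85, 1889.50, np.nan, 1840.75],
    "Contract": ["Month-to-month", "One year", "Month-to-month", "Two year"],
})
num_cols = ["Tenure", "MonthlyCharges", "TotalCharges"]

# 1. Missing values: fill with the column median (an assumed strategy).
df["TotalCharges"] = df["TotalCharges"].fillna(df["TotalCharges"].median())

# 2. Encoding: one-hot encode the multi-class column, dropping the first level.
df = pd.get_dummies(df, columns=["Contract"], drop_first=True, dtype=int)

# 3. Standardization: rescale numeric columns to zero mean and unit variance.
df[num_cols] = df[num_cols].astype(float)
df[num_cols] = StandardScaler().fit_transform(df[num_cols])

print(df)
```

When models are later compared with cross-validation, fitting the scaler inside each training fold (for example with a scikit-learn `Pipeline`) keeps validation data from influencing the scaling.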

Model Building

Several machine learning models are built and evaluated using cross-validation:

Logistic Regression
K-Nearest Neighbors (KNN)
Decision Tree
Random Forest
CatBoost
LightGBM
XGBoost

The models are tuned using hyperparameter optimization and then compared on accuracy, precision, recall, and F1 score, with the confusion matrix inspected alongside these metrics.
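
Hyperparameter optimization of a single model could look like the sketch below. It assumes a grid search with `GridSearchCV`, uses synthetic data from `make_classification` in place of the churn features, and the grid values are arbitrary; the project may use a different search method and grid.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the prepared feature matrix and churn labels.
X, y = make_classification(n_samples=300, n_features=10, random_state=42)

# Small, arbitrary grid chosen only to keep the example fast.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    scoring="f1",  # optimize the F1 score
    cv=5,          # 5-fold cross-validation
)
search.fit(X, y)

best_params = search.best_params_
best_model = search.best_estimator_  # refit on all data with the best parameters

print(best_params)
print(round(search.best_score_, 3))
```

The same pattern applies to the other models in the list by swapping the estimator and its parameter grid.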

Model Evaluation

The performance of each model is evaluated, and the results are stored in a dataframe. The best model is selected based on the F1 score.
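
A minimal sketch of collecting cross-validated metrics in a dataframe and picking the best model by F1 score is shown below. It uses synthetic data with an arbitrary class imbalance and only the scikit-learn models; CatBoost, LightGBM, and XGBoost provide scikit-learn-compatible classifiers that could be added to the same dictionary. The variable names are assumptions for the example.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with an arbitrary class imbalance.
X, y = make_classification(
    n_samples=300, n_features=10, weights=[0.73, 0.27], random_state=42
)

models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
    "DecisionTree": DecisionTreeClassifier(random_state=42),
    "RandomForest": RandomForestClassifier(n_estimators=50, random_state=42),
}
scoring = ["accuracy", "precision", "recall", "f1"]

rows = []
for name, model in models.items():
    # 5-fold cross-validation; keep the mean of each test metric.
    cv = cross_validate(model, X, y, cv=5, scoring=scoring)
    rows.append({"model": name, **{m: cv[f"test_{m}"].mean() for m in scoring}})

# Store the results in a dataframe, best F1 score first.
results = (
    pd.DataFrame(rows).sort_values("f1", ascending=False).reset_index(drop=True)
)
best_name = results.loc[0, "model"]

print(results)
print("Best model by F1:", best_name)
```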

Model Export

The best-performing model (XGBoost) is saved to a file named "best_model.pkl" using pickle for future use.
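
Saving and reloading a fitted model with pickle might look like the following. The logistic regression is only a stand-in for the selected model, and the file is written to a temporary directory so the example leaves the working directory untouched.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Stand-in for the selected best model, fitted on synthetic data.
X, y = make_classification(n_samples=100, n_features=5, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Write the fitted model to best_model.pkl inside a temporary directory.
model_path = Path(tempfile.mkdtemp()) / "best_model.pkl"
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Load it back and confirm the predictions are unchanged.
with open(model_path, "rb") as f:
    loaded = pickle.load(f)

same = bool(np.array_equal(model.predict(X), loaded.predict(X)))
print("Predictions match after reload:", same)
```

Pickle files can execute arbitrary code when loaded, so only files from a trusted source should be unpickled, and the library versions used for loading should match those used for saving.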
