This project applies RFM (Recency, Frequency, Monetary) Analysis to segment customers based on their purchase behavior. RFM scoring helps identify high-value customers, flag inactive ones, and plan targeted marketing strategies.
- Source: Simulated customer data (1,000 records)
- Fields Included: `Customer ID`, `Name`, `Email`, `Subscription Date`
- Simulated:
  - `LastPurchaseDate` (randomized from Jan 2023 – Apr 2024)
  - `TotalPurchases` (Frequency)
  - `TotalSpend` (Monetary)
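
A minimal sketch of how the three simulated fields could be generated with NumPy. The seed, the exact end date (30 Apr 2024), the value ranges for `TotalPurchases` and `TotalSpend`, and the stand-in `df` are all assumptions made for illustration; the project's actual distributions are not stated here.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)  # fixed seed (assumption) so the sketch is reproducible
n_customers = 1000

# Hypothetical stand-in for the loaded dataset; only an ID column is sketched.
df = pd.DataFrame({"Customer ID": np.arange(1001, 1001 + n_customers)})

# LastPurchaseDate: a uniformly random day between 1 Jan 2023 and 30 Apr 2024.
start = pd.Timestamp("2023-01-01")
end = pd.Timestamp("2024-04-30")
span_days = (end - start).days
offsets = rng.integers(0, span_days + 1, size=n_customers)
df["LastPurchaseDate"] = start + pd.to_timedelta(offsets, unit="D")

# TotalPurchases (Frequency): assumed range of 1 to 20 orders per customer.
df["TotalPurchases"] = rng.integers(1, 21, size=n_customers)

# TotalSpend (Monetary): assumed range of 50 to 5000 currency units.
df["TotalSpend"] = rng.uniform(50, 5000, size=n_customers).round(2)
```

Uniform draws keep the sketch short; a skewed distribution would probably resemble real spending more closely.
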
- Python
- Pandas, NumPy
- (Optional) Matplotlib, Seaborn
- Jupyter Notebook / Kaggle
- Recency: How recently a customer purchased
- Frequency: How often they purchase
- Monetary: How much they spend
- Data Loading
  - Loaded and explored the dataset using Pandas
- Feature Simulation
  - Generated realistic values for `LastPurchaseDate`, `TotalPurchases`, and `TotalSpend` using NumPy
- RFM Metric Calculation
  - Calculated Recency as the days since the last purchase
  - Used Frequency and Monetary values as simulated
- Scoring
  - Used `pd.qcut()` to assign scores (1–5) for each R, F, M metric:
    - Lower `Recency` = higher score
    - Higher `Frequency` = higher score
    - Higher `Monetary` = higher score
- RFM Score Combination
  - Combined `R_Score`, `F_Score`, and `M_Score` to generate a single RFM score like `555`, `431`, etc.
- Customer Segmentation
  - Based on RFM_Score, segmented customers as: `Champions`, `Loyal Customers`, `At Risk`, `Need Attention`, etc.
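
The metric calculation, scoring, and score combination steps can be sketched roughly as follows. The ten-row sample, the snapshot date of 1 May 2024, and the `score` helper are assumptions for illustration. Ranking before `pd.qcut()` is also an added choice, not necessarily what the notebook does: it breaks ties so that five distinct bin edges always exist.

```python
import pandas as pd

# Tiny hypothetical sample; values are illustrative, not the project's data.
df = pd.DataFrame({
    "Customer ID": range(1001, 1011),
    "LastPurchaseDate": pd.to_datetime([
        "2024-04-23", "2024-02-03", "2024-03-15", "2023-06-10", "2024-04-01",
        "2023-11-20", "2024-04-28", "2023-02-14", "2024-01-05", "2023-09-30",
    ]),
    "TotalPurchases": [9, 2, 5, 1, 7, 3, 10, 1, 4, 6],
    "TotalSpend": [4000, 300, 1200, 150, 2500, 640, 4800, 90, 980, 1750],
})

# Recency: days between an assumed snapshot date and the last purchase.
snapshot = pd.Timestamp("2024-05-01")
df["Recency"] = (snapshot - df["LastPurchaseDate"]).dt.days
df["Frequency"] = df["TotalPurchases"]
df["Monetary"] = df["TotalSpend"]


def score(series, labels):
    """Split a metric into quintiles and return the matching 1-5 labels."""
    # rank(method="first") breaks ties so qcut never hits duplicate bin edges.
    ranked = series.rank(method="first")
    return pd.qcut(ranked, q=5, labels=labels).astype(int)


# Lower Recency is better, so its labels run from 5 down to 1.
df["R_Score"] = score(df["Recency"], [5, 4, 3, 2, 1])
df["F_Score"] = score(df["Frequency"], [1, 2, 3, 4, 5])
df["M_Score"] = score(df["Monetary"], [1, 2, 3, 4, 5])

# Concatenate the three digits into a single string such as "555".
df["RFM_Score"] = (
    df["R_Score"].astype(str) + df["F_Score"].astype(str) + df["M_Score"].astype(str)
)
```

Keeping `RFM_Score` as a string preserves the three digits separately, which makes it easy to split back into R, F, and M during segmentation.
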
Customer ID | Recency | Frequency | Monetary | R_Score | F_Score | M_Score | RFM_Score | Segment |
---|---|---|---|---|---|---|---|---|
1001 | 8 | 9 | 4000 | 5 | 5 | 5 | 555 | Champions |
1040 | 88 | 2 | 300 | 1 | 2 | 2 | 122 | At Risk |
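
One possible way to turn an `RFM_Score` into a segment label is a small rule function. The thresholds below are hypothetical, chosen only so that the two sample scores above land in the segments shown; the project's actual rules are not given here.

```python
import pandas as pd


def segment(rfm_score: str) -> str:
    """Map a three-digit RFM score string to a segment (hypothetical rules)."""
    r, f, m = (int(ch) for ch in rfm_score)
    if r >= 4 and f >= 4 and m >= 4:
        return "Champions"        # recent, frequent, high spend
    if r <= 2:
        return "At Risk"          # has not purchased in a long time
    if f >= 4:
        return "Loyal Customers"  # buys often, reasonably recent
    return "Need Attention"       # everything else


# Apply the rules to the two sample scores from the table.
scores = pd.DataFrame({"Customer ID": [1001, 1040], "RFM_Score": ["555", "122"]})
scores["Segment"] = scores["RFM_Score"].apply(segment)
```

The order of the checks matters: the recency test runs before the frequency test, so a once-frequent customer who has gone quiet is labeled `At Risk` and not `Loyal Customers`.
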
- Real-world business application
- Teaches scoring and segmentation logic
- Builds strong data manipulation and analysis skills
- Easily extendable to dashboards (Power BI, Tableau)
- Create an interactive dashboard in Power BI
- Use real-world transactional data
- Apply clustering or ML for deeper analysis