This repository predicts whether a Kickstarter campaign succeeds or fails using supervised machine learning. It covers Exploratory Data Analysis (EDA), feature engineering, and the evaluation and comparison of several classification algorithms.
Using cleaned Kickstarter data, the project:
- Performs binary classification (successful vs. not successful)
- Applies Recursive Feature Elimination (RFE) for feature selection
- Trains and evaluates models:
  - ✅ Logistic Regression
  - ✅ Decision Tree
  - ✅ K-Nearest Neighbors
  - ✅ Random Forest
  - ✅ Support Vector Machine (SVM)
- Implements cross-validation for robust accuracy comparison
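
The workflow listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is a synthetic stand-in generated with `make_classification`, and the choice of Logistic Regression as the RFE base estimator, the 6 retained features, and the 5 folds are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the cleaned Kickstarter features (assumption).
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=5, random_state=0
)

# The five classifiers named in the list above, with default-like settings.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Support Vector Machine": SVC(),
}

results = {}
for name, model in models.items():
    # RFE repeatedly fits its base estimator and drops the weakest feature
    # until 6 remain (an arbitrary number chosen for this sketch).
    selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=6)
    # Putting scaling and RFE inside the pipeline refits them on each
    # training fold, so the held-out fold never influences the selection.
    pipeline = make_pipeline(StandardScaler(), selector, model)
    scores = cross_val_score(pipeline, X, y, cv=5, scoring="accuracy")
    results[name] = (scores.mean(), scores.std())
    print(f"{name}: {scores.mean():.2f} ± {scores.std():.2f}")
```

The numbers printed by this sketch come from synthetic data and will not match the results reported for the Kickstarter dataset.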
Model | Accuracy (mean ± std) |
---|---|
🟢 Random Forest | 0.97 ± 0.01 |
🟢 K-Nearest Neighbors | 0.97 ± 0.01 |
🟡 Decision Tree | 0.90 ± 0.03 |
🟠 Logistic Regression | 0.87 ± 0.07 |
🔴 SVM | 0.80 ± 0.05 |
- ⚙️ Feature Selection using RFE
- 🔁 Pre-Cross-Validation vs. Cross-Validated Accuracy Comparison
- 📏 Model performance evaluated using accuracy & standard deviation
- 📉 Accuracy distributions visualized using box plots
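
One way to produce the comparison and box plots described above is sketched below. It assumes that "pre-cross-validation" accuracy means the score from a single train/test split, which may differ from the notebook's definition. The data is synthetic, only two classifiers are shown for brevity, and the 10-fold setting is an arbitrary choice.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the cleaned Kickstarter features (assumption).
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=4, random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=1
)

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=1),
    "K-Nearest Neighbors": KNeighborsClassifier(),
}

holdout_accuracy = {}  # one number per model from a single split
cv_scores = {}         # one accuracy per fold per model

for name, model in models.items():
    holdout_accuracy[name] = model.fit(X_train, y_train).score(X_test, y_test)
    # cross_val_score clones the model, so the fit above does not leak in.
    cv_scores[name] = cross_val_score(model, X, y, cv=10, scoring="accuracy")
    print(
        f"{name}: hold-out {holdout_accuracy[name]:.2f}, "
        f"cross-validated {cv_scores[name].mean():.2f} "
        f"± {cv_scores[name].std():.2f}"
    )

# Box plot of the per-fold accuracies, one box per classifier.
fig, ax = plt.subplots(figsize=(6, 4))
ax.boxplot(list(cv_scores.values()))
ax.set_xticklabels(list(cv_scores.keys()))
ax.set_ylabel("Accuracy per fold")
ax.set_title("Cross-validated accuracy by classifier")
```

Calling `fig.savefig("cv_accuracy_boxplot.png")` (a hypothetical file name) would write the figure to disk. A single split gives one accuracy figure per model, while the per-fold scores show how much that figure moves with the choice of split.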
- Python
- scikit-learn
- Matplotlib & Seaborn
- RFE (Recursive Feature Elimination)
- Jupyter Notebook
- Clone the repository
git clone https://github.com/saivivek55/Modelling_Kickstarter-Data.git && cd Modelling_Kickstarter-Data
- Install dependencies
- Launch the notebook
✅ Random Forest and KNN outperformed the other models, each reaching 0.97 ± 0.01 accuracy
✅ RFE helped reduce noise and improve classification reliability
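🔁 SVM struggled with generalization and had the lowest accuracy (0.80 ± 0.05)
📊 Box plots showed how accuracy varied across classifiers: Logistic Regression had the widest spread (± 0.07), while Random Forest and KNN were the most stable (± 0.01)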
Licensed under the Apache 2.0 License – see the LICENSE file for details.