Predictioneer

Predictive modeling for Ebola outbreaks to forecast cases, deaths, and fatality ratios using geographical and epidemiological data. This repository includes predictive models, implementation scripts, and a detailed report on methodology, results, and applications in outbreak management.

The project was developed as part of Predictioneer, a hackathon organized by the Indian Institute of Technology (IIT) Bombay.

This project aims to predict the number of deaths and the Case Fatality Rate (CFR) due to Ebola. The dataset was first cleaned and preprocessed, followed by model building for predictions. The Random Forest Single-Task Model was selected for its superior performance in predicting these metrics.

Directory Structure

Predictioneer/
│
├── Codes/                              # 📂 Contains all code files
│   ├── images/                         # 🖼️ Contains image files used in the project
│   ├── models/                         # 🤖 Contains model-related scripts for training and evaluation
│   └── data/                           # 📊 Contains all data-related folders
│       ├── raw/                        # 📑 Raw data files (e.g., original dataset)
│       ├── interim/                    # 🛠️ Interim processed data before final cleaning
│       └── final/                      # ✅ Final cleaned and preprocessed data ready for modeling
├── Problem Statement/                  # 📄 Folder containing the project's problem statement
├── Documents/                          # 📚 Contains explanation and reports
│   └── Predictioneer Model Report.pdf  # 📑 Visit this file for detailed project insights and results

Project Description

This project focuses on predicting the number of deaths and Case Fatality Rate (CFR) due to Ebola. The dataset used was first cleaned to ensure accuracy and consistency. Subsequently, a series of models were trained and evaluated to determine the best approach for these predictions.

Model Selection and Evaluation

We trained and evaluated seven regression models (Linear Regression, Random Forest, SVR, Gradient Boosting, Decision Tree, K-Nearest Neighbors, and AdaBoost) and finalized the Random Forest Single-Task Model based on its performance. The table below reports MAE, MSE, and R² for each model on both targets, Deaths and CFR:

Model                 Deaths_MAE   Deaths_MSE   Deaths_R²   CFR_MAE    CFR_MSE    CFR_R²
Linear Regression     43.26389     2625.766     0.004933    1.006661   123.8043   0.00341
Random Forest         13.02522     284.476      0.892194    0.338237   26.52387   0.78649
SVR                   41.12771     2504.951     0.050717    0.729752   124.6278   0.00322
Gradient Boosting     33.4908      1697.821     0.356589    0.442241   0.47705    0.99616
Decision Tree         0            0            1           0          0          1
K-Nearest Neighbors   28.56804     1348.983     0.488786    0.734198   100.1167   0.19409
AdaBoost              39.70689     2198.235     0.166951    0.675545   0.88858    0.99284

Working of the Random Forest Single-Task Model

The Random Forest Single-Task Model is designed to predict two key metrics: Deaths and Case Fatality Rate (CFR). It consists of two separate models:

Deaths Prediction Model – Trained to predict the number of deaths due to Ebola in a given region.
CFR Prediction Model – Trained to predict the Case Fatality Rate (CFR), which is the proportion of confirmed deaths among confirmed cases of Ebola.
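
As a small illustration of the CFR definition above (confirmed deaths divided by confirmed cases), here is a minimal sketch. The function name `case_fatality_rate`, the choice to return a fraction rather than a percentage, and the handling of zero cases are all assumptions made for this example; they are not taken from the project code.

```python
def case_fatality_rate(confirmed_deaths: int, confirmed_cases: int) -> float:
    """Return CFR as a fraction in [0, 1]: confirmed deaths / confirmed cases."""
    if confirmed_deaths < 0 or confirmed_cases < 0:
        raise ValueError("counts must be non-negative")
    if confirmed_cases == 0:
        # CFR is undefined with no confirmed cases; returning 0.0 is a
        # convention chosen for this sketch, not a project rule.
        return 0.0
    return confirmed_deaths / confirmed_cases


# Illustrative numbers only, not taken from the project dataset.
cfr = case_fatality_rate(confirmed_deaths=30, confirmed_cases=120)
print(f"CFR = {cfr:.2%}")  # CFR = 25.00%
```

Whether the dataset stores CFR as a fraction or a percentage affects the scale of the MAE and MSE values, so it is worth checking before comparing numbers.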

Both models are trained and evaluated on their performance metrics, with R² (coefficient of determination) and MAE (Mean Absolute Error) as the key evaluation criteria. For each target, we selected the best-performing model by R² score, which measures the share of variance in the target that the predictions explain.
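
For reference, the three metrics used in this README (MAE, MSE, and R²) can be computed by hand as in the sketch below. It uses only the standard library and the textbook definitions; the function names and sample values are illustrative, and the project itself may well compute these with a library instead.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of the absolute differences between truth and prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of the squared differences between truth and prediction."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


# Illustrative values only, not project data.
y_true = [10, 20, 30, 40]
y_pred = [12, 18, 33, 39]
print(mean_absolute_error(y_true, y_pred))  # 2.0
print(mean_squared_error(y_true, y_pred))   # 4.5
print(round(r2_score(y_true, y_pred), 3))   # 0.964
```

An R² of 1 means the predictions match the true values exactly, while a value near 0 means the model does little better than always predicting the mean.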

Model Selection Process

Deaths Prediction Model: Multiple models were trained, and the one with the highest R² score on the Deaths target was selected.
CFR Prediction Model: Likewise, the model with the highest R² score on the CFR target was selected.
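
The per-target selection rule described above can be sketched as follows. The helper `pick_best_by_r2`, the candidate names, and the scores are hypothetical and exist only to illustrate the rule; they are not the project's code or results.

```python
from typing import Dict


def pick_best_by_r2(scores: Dict[str, Dict[str, float]], target: str) -> str:
    """Return the name of the candidate with the highest R² for one target."""
    key = f"{target}_R2"
    return max(scores, key=lambda name: scores[name][key])


# Hypothetical scores for three candidates (illustrative values only).
scores = {
    "model_a": {"Deaths_R2": 0.41, "CFR_R2": 0.77},
    "model_b": {"Deaths_R2": 0.88, "CFR_R2": 0.63},
    "model_c": {"Deaths_R2": 0.52, "CFR_R2": 0.70},
}

best_for_deaths = pick_best_by_r2(scores, "Deaths")
best_for_cfr = pick_best_by_r2(scores, "CFR")
print(best_for_deaths, best_for_cfr)  # model_b model_a
```

Because each target is scored independently, the two winners need not be the same candidate, which is one reason to keep the Deaths and CFR models separate.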

Final Prediction Process

For new predictions, we use the model with the best R² score for each metric (Deaths and CFR). The final Confirmed Deaths value is calculated by combining the outputs of the Deaths Prediction Model and the CFR Prediction Model, following the given formulation. This approach relies on Random Forest's ability to capture complex, non-linear relationships between the features (latitude, longitude, etc.) and the predicted outcomes (Deaths and CFR).

Usage

Clone this repository to your local machine.
Install the dependencies listed in the requirements.txt file inside the Codes folder.
Run the code in the Codes/ folder to train and evaluate the models.
See Documents/Predictioneer Model Report.pdf for the detailed final report.

Resources and Links

Illustrations and Logo:

Illustration (Designed using GPT)

License

This project is licensed under the MIT License - see the LICENSE file for details.

Thanks for Visiting 😄

Drop a 🌟 if you find this repository useful.
If you have any questions or suggestions, feel free to reach out.

📫 How to reach me:
Contribute and Discuss: Feel free to open issues 🐛, submit pull requests 🛠️, or start discussions 💬 to help improve this repository!

madhurimarawat/Predictioneer
