GitHub - stefagnone/-Birthweight-Prediction-Modeling-Case-Study: Data-driven analysis to predict birthweight, leveraging computational analytics and machine learning techniques to identify critical factors, evaluate correlations, and implement actionable public health insights.

Project Overview

This project focuses on leveraging computational analytics and machine learning techniques to analyze and predict birthweight, a crucial metric for neonatal health. The analysis aims to identify key factors influencing birthweight, assess strong correlations, and implement advanced machine learning models to derive actionable insights for improving public health outcomes.

The study employs data-driven methods to address the following:

Identifying correlations with birthweight, including strong positive and negative relationships.
Transforming birthweight data to evaluate improvements in correlation metrics.
Implementing classification models to predict low birthweight and evaluate their performance.
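The first two goals above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the column names (`bwght`, `cigs`, `prenatal_visits`) are hypothetical, the data is synthetic, and a log transform is assumed as the transformation being compared.

```python
import numpy as np
import pandas as pd


def correlation_comparison(df, target="bwght"):
    """Pearson correlation of each feature with the raw and log-transformed target."""
    features = df.drop(columns=[target])
    raw = features.corrwith(df[target])
    logged = features.corrwith(np.log(df[target]))
    out = pd.DataFrame({"raw": raw, "log": logged})
    # Positive values mean the log transform strengthened the linear relationship.
    out["abs_change"] = out["log"].abs() - out["raw"].abs()
    return out.sort_values("raw")


# Synthetic stand-in data; NOT the repository's birthweight.csv.
rng = np.random.default_rng(0)
n = 500
demo = pd.DataFrame({
    "cigs": rng.poisson(2, n),                  # hypothetical column
    "prenatal_visits": rng.integers(0, 15, n),  # hypothetical column
})
demo["bwght"] = (
    3300 - 150 * demo["cigs"] + 30 * demo["prenatal_visits"] + rng.normal(0, 400, n)
).clip(lower=500)  # keep weights positive so the log is defined

result = correlation_comparison(demo)
print(result)
```

Sorting by the raw correlation puts the strongest negative relationships first and the strongest positive ones last, which makes both ends of the ranking easy to read.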

Key deliverables include exploratory data analysis (EDA), feature engineering, multiple model implementations, and the final predictions submitted to Kaggle.

Key Features

Exploratory Data Analysis (EDA): Insights into the dataset through descriptive statistics, histograms, and correlation matrices.
Feature Engineering: Designed and transformed features to enhance predictive power and address data challenges.
Modeling Techniques: Developed and evaluated multiple classification models, including Logistic Regression, Ridge Classification, and Random Forest.
Confusion Matrix Analysis: Analyzed model performance and errors to prioritize correct predictions for low birthweight cases.
Actionable Insights: Provided recommendations based on model results to inform public health strategies.
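The modeling and confusion matrix steps listed above might look roughly like the sketch below. It uses synthetic data from scikit-learn rather than the project's dataset, and the hyperparameters, the `class_weight="balanced"` setting, and the `evaluate` helper are assumptions made for illustration, not the notebook's configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, RidgeClassifier
from sklearn.metrics import confusion_matrix, recall_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced data: class 1 stands in for "low birthweight".
X, y = make_classification(
    n_samples=1000, n_features=8, n_informative=4,
    weights=[0.8, 0.2], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# class_weight="balanced" is an assumed way to favor the minority class.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000, class_weight="balanced"),
    "ridge_classifier": RidgeClassifier(class_weight="balanced"),
    "random_forest": RandomForestClassifier(
        n_estimators=100, class_weight="balanced", random_state=0
    ),
}


def evaluate(models, X_train, y_train, X_test, y_test):
    """Fit each model and collect confusion matrix counts plus positive-class recall."""
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        tn, fp, fn, tp = confusion_matrix(y_test, pred, labels=[0, 1]).ravel()
        results[name] = {
            "tn": int(tn), "fp": int(fp), "fn": int(fn), "tp": int(tp),
            # Recall on class 1: share of low-birthweight cases that were caught.
            "recall_low": recall_score(y_test, pred),
        }
    return results


results = evaluate(models, X_train, y_train, X_test, y_test)
for name, scores in results.items():
    print(name, scores)
```

Reporting false negatives and recall alongside the raw counts keeps the focus on missed low-birthweight cases, which is the error type the confusion matrix analysis is meant to minimize.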

Repository Structure

|-- Compagnone_Stefano_A2.ipynb       # Jupyter Notebook containing analysis and model development
|-- Compagnone_Stefano_A2.html        # HTML version of the notebook for easy viewing
|-- birthweight.csv                   # Dataset used for training and analysis
|-- submission.csv                    # Final predictions submitted to Kaggle
|-- Images/
    |-- correlation_matrix.png        # Heatmap of feature correlations
    |-- confusion_matrix.png          # Confusion matrix of the final model
    |-- feature_histograms.png        # Histogram visualizations for continuous variables

Key Insights

Correlation Analysis: Strong correlations identified between birthweight and factors such as parental age, education level, and health-related behaviors (e.g., smoking and drinking).
Threshold Analysis: Explored birthweight thresholds distinguishing healthy and non-healthy categories, supported by public health research.
Model Interpretability: Highlighted impactful features, such as maternal education and prenatal visits, to provide actionable insights for healthcare professionals.
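As an illustration of the threshold analysis, the sketch below labels births under a cutoff as low birthweight and shows how the share of positive cases shifts with the cutoff. The 2,500 g default follows the commonly used WHO definition of low birthweight; whether the notebook uses exactly this value is an assumption, and the function names and sample weights are made up for the example.

```python
import pandas as pd

LOW_BIRTHWEIGHT_G = 2500  # WHO definition: under 2,500 g; assumed here, not taken from the notebook


def label_low_birthweight(weights_g, threshold_g=LOW_BIRTHWEIGHT_G):
    """Return 1 for weights strictly below the threshold, else 0."""
    return (pd.Series(weights_g) < threshold_g).astype(int)


def class_share_by_threshold(weights_g, thresholds):
    """Share of births labeled low birthweight at each candidate threshold."""
    weights = pd.Series(weights_g)
    return pd.Series({t: float((weights < t).mean()) for t in thresholds})


# Made-up weights in grams, for illustration only.
sample = [1900, 2499, 2500, 3100, 3600, 4200]
labels = label_low_birthweight(sample)
shares = class_share_by_threshold(sample, [2000, 2500, 3000])
print(labels.tolist())
print(shares)
```

Checking the class share at several cutoffs shows how quickly the positive class shrinks or grows, which matters for both class imbalance and the interpretation of model errors.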

Technologies Used

Programming Language: Python
Libraries: pandas, numpy, matplotlib, seaborn, scikit-learn, PHIK
Models: Logistic Regression, Ridge Classifier, K-Nearest Neighbors, Decision Tree, Random Forest, Gradient Boosting Machine (GBM)

Contact

For inquiries or further collaboration, please contact:
Stefano Compagnone
Email: stefanocompagnone98@gmail.com

