
tyler-dardis/NHANES-kmeans-Clustering



NHANES k-means Clustering Analysis

To view the entire project, open the full notebook here.


Objective: Identify clusters of survey respondents in the NHANES dataset.

Data: NHANES 2017-March 2020 Pre-Pandemic Questionnaire Data

Model Type: k-means clustering (unsupervised machine learning)

Tools/Libraries: Pandas, Scikit-Learn, Matplotlib, Seaborn

Summary

For this unsupervised machine learning project, I built a k-means model that clusters individuals based on medical and demographic data. NHANES (the National Health and Nutrition Examination Survey) suits this task because it is publicly available and captures many aspects of survey participants' health: demographics, physical exam and blood test results, dietary observations, and lifestyle questionnaires. The dataset also works well with k-means because much of the data consists of continuous numeric values, and many of the categorical variables are coded as numbers with a natural ordering, so the distances k-means relies on remain reasonably meaningful.
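As a rough illustration of the approach described above, here is a minimal sketch of k-means on mixed continuous and ordinal features with Scikit-Learn. It is not the notebook's actual code: the data is synthetic, the column names (`age_years`, `bmi`, `general_health`) are hypothetical stand-ins rather than real NHANES variables, and the choice of three clusters and of standardizing the features first are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for respondent data (NOT real NHANES values).
# Column names are hypothetical and chosen only for illustration.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age_years": rng.integers(18, 80, size=300),      # continuous-like
    "bmi": rng.normal(28, 6, size=300),               # continuous
    "general_health": rng.integers(1, 6, size=300),   # ordinal code, 1 to 5
})

# Standardize so that no feature dominates the Euclidean distances
# purely because of its units or range.
X = StandardScaler().fit_transform(df)

# k=3 is an arbitrary choice for this sketch.
model = KMeans(n_clusters=3, n_init=10, random_state=0)
df["cluster"] = model.fit_predict(X)

# Per-cluster feature means give a first look at what separates the groups.
print(df.groupby("cluster").mean().round(2))
```

In practice the number of clusters would usually be chosen by comparing several values of `n_clusters`, for example with an elbow plot of `model.inertia_` or with silhouette scores.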
