NHANES k-means Clustering Analysis

To view the entire project, open the full notebook, NHANES_k_means_Clustering.ipynb.

Objective: Identify clusters of survey respondents in the NHANES dataset.

Data: NHANES 2017-March 2020 Pre-Pandemic Questionnaire Data

Model Type: k-means clustering (unsupervised machine learning)

Tools/Libraries: Pandas, Scikit-Learn, Matplotlib, Seaborn

Summary

For this unsupervised machine learning project, I built a k-means model that clusters individuals based on medical and demographic data. NHANES (the National Health and Nutrition Examination Survey) is a good data source for this because it is publicly available and captures many aspects of participants' health: demographics, physical exam and blood test results, dietary observations, and lifestyle questionnaires. The dataset also suits k-means, which groups observations by the distance between their numeric feature values: much of the data is continuous, and many of the categorical variables are coded as numbers with a natural order.
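As a rough illustration of the approach described above, here is a minimal sketch of a k-means workflow with Pandas and Scikit-Learn. It is not the code from the notebook: the data is synthetic, the column names (`age_years`, `bmi`, `systolic_bp`, `general_health`) are hypothetical stand-ins for NHANES variables, and the choice of three clusters is arbitrary.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for NHANES features. Column names and values are
# hypothetical, not taken from the real survey files.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "age_years": rng.integers(18, 80, size=n).astype(float),
    "bmi": rng.normal(28, 6, size=n),
    "systolic_bp": rng.normal(122, 15, size=n),
    # Ordinal questionnaire item coded 1 (excellent) to 5 (poor).
    "general_health": rng.integers(1, 6, size=n).astype(float),
})

# k-means relies on Euclidean distance, so put every feature on a
# common scale before clustering.
X = StandardScaler().fit_transform(df)

# k=3 is an arbitrary choice for illustration only.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
df["cluster"] = kmeans.fit_predict(X)

# Summarize each cluster by its feature means, in the original units.
profile = df.groupby("cluster").mean()
print(profile.round(1))
```

Scaling matters here because an unscaled variable with a large range, such as blood pressure, would otherwise dominate the distance calculation over a 1-to-5 ordinal item.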

Repository: tyler-dardis/NHANES-kmeans-Clustering
