andrewnana/R-Machine-Learning-TCR

R-Machine-Learning

https://andrewnana.github.io/R-Machine-Learning-TCR/

This project investigates recurrence in thyroid cancer patients using both classical logistic regression and machine learning methods. With logistic regression, we explore the relationship between a history of smoking (ever versus never smoking) and the likelihood of recurrence. We then shift to machine learning techniques for risk stratification and evaluate the predictive performance of the following algorithms:

Logistic Regression,

K-Nearest Neighbors (KNN),

Decision Tree,

Random Forest,

Support Vector Machines (SVM), and

Artificial Neural Networks (ANN)

These algorithms are evaluated based on the following performance metrics:

Sensitivity (ability to correctly identify patients who experience recurrence),

Specificity (ability to correctly identify those who do not),

Positive Predictive Value (PPV) (the probability that predicted recurrences are true),

Negative Predictive Value (NPV) (the probability that predicted non-recurrences are true),

Area Under the ROC Curve (AUC) (overall discriminatory ability),

Accuracy (the proportion of correct classifications across all cases).
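
The metrics listed above can be made concrete with a small worked example. The project itself is written in R; the Python sketch below is only an illustration and is not taken from the repository. The function names (`confusion_counts`, `classification_metrics`, `auc`), the 0.5 threshold, and the toy labels and scores are all hypothetical, and recurrence is assumed to be coded as 1. The AUC is computed here with the pairwise (rank-based) definition: the fraction of recurrence/non-recurrence pairs in which the recurrence case receives the higher score, with ties counted as one half.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary outcome."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def classification_metrics(tp, fp, tn, fn):
    """Derive the threshold-based metrics from confusion-matrix counts."""
    def ratio(num, den):
        # An undefined ratio (zero denominator) is reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # recurrences correctly flagged
        "specificity": ratio(tn, tn + fp),  # non-recurrences correctly cleared
        "ppv": ratio(tp, tp + fp),          # predicted recurrences that are real
        "npv": ratio(tn, tn + fn),          # predicted non-recurrences that are real
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


def auc(y_true, scores, positive=1):
    """Pairwise AUC: P(score of a positive > score of a negative), ties = 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        return float("nan")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the thyroid dataset): 1 = recurrence.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1, 0.4, 0.05]

# Assumed decision threshold of 0.5 on the predicted probability.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
metrics = classification_metrics(tp, fp, tn, fn)
auc_value = auc(y_true, scores)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print(f"auc: {auc_value:.3f}")
```

Sensitivity, specificity, PPV, NPV, and accuracy all depend on the chosen threshold. The AUC does not, because it summarizes how well the scores rank cases across all thresholds.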

The dataset used in this study is publicly accessible through the UCI (University of California, Irvine) Machine Learning Repository and was generously provided by Borzooei et al., who published their original findings in 2024.

Disclaimer: This analysis is intended for educational and research purposes only and has not been peer-reviewed. While efforts have been made to ensure the accuracy of the methods and results, the author does not guarantee the correctness or completeness of the analysis. The author bears no responsibility or liability for any errors, omissions, or outcomes resulting from the use of this material. Use at your own discretion.