This assignment, written in the form of a research paper, was carried out as part of the module ‘Machine Learning’ for the MSc ‘Data Science and Machine Learning’. The objective was to compare different Machine Learning models on the task of predicting the vital status of patients with high-grade serous ovarian cancer. The classifiers trained were K-Nearest-Neighbors, Support Vector Machine, Logistic Regression, Random Forest and XGBoost. The methodology comprised K-Nearest-Neighbors imputation for filling missing values in the dataset, PCA and a variance threshold for reducing the number of attributes, Min-Max scaling and Z-score standardization for normalization, 5-fold Cross Validation for validating the models and Grid Search for hyperparameter selection. Model performance was evaluated using Accuracy and Area Under the Curve (AUC).
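The workflow described above can be sketched with scikit-learn roughly as follows. This is a minimal illustration under stated assumptions, not the code from the repository: the data are synthetic (the patient dataset is not shown here), only Logistic Regression is included out of the five classifiers, the hyperparameter grid is made up for the example, the folds are stratified, and AUC is taken to mean the area under the ROC curve.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import VarianceThreshold
from sklearn.impute import KNNImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the patient data (assumption: not the real dataset).
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=5, random_state=0
)

# Blank out roughly 5% of the entries to mimic missing values.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# Every preprocessing step lives inside the pipeline, so it is re-fitted
# on the training part of each fold and never sees the held-out part.
pipeline = Pipeline([
    ("impute", KNNImputer(n_neighbors=5)),           # KNN imputation
    ("variance", VarianceThreshold(threshold=0.0)),  # drop constant attributes
    ("scale", StandardScaler()),                     # Z-score; MinMaxScaler is the Min-Max variant
    ("pca", PCA(n_components=10)),                   # projection onto principal components
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hypothetical grid, chosen only for illustration.
param_grid = {
    "pca__n_components": [5, 10],
    "clf__C": [0.1, 1.0, 10.0],
}

# 5-fold cross validation (stratification is an assumption of this sketch).
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

search = GridSearchCV(
    pipeline,
    param_grid,
    cv=cv,
    scoring={"accuracy": "accuracy", "auc": "roc_auc"},
    refit="auc",  # select the configuration with the best mean AUC
)
search.fit(X, y)

best = search.best_index_
mean_accuracy = search.cv_results_["mean_test_accuracy"][best]
mean_auc = search.cv_results_["mean_test_auc"][best]
print("best parameters:", search.best_params_)
print(f"mean accuracy: {mean_accuracy:.3f}, mean AUC: {mean_auc:.3f}")
```

Swapping the `clf` step for another estimator (for example `KNeighborsClassifier`, `SVC` or `RandomForestClassifier`) and adjusting `param_grid` to match would repeat the same comparison for the other classifiers.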
amoustakis/Estimation-of-vital-status-of-cancer-patients-using-Machine-Learning
About
Estimation of vital status of patients with ovarian cancer using Machine Learning models (K-Nearest-Neighbors, Support Vector Machine, Logistic Regression, Random Forest and XGBoost)