GitHub - steve12512/thesis: This repository was used for my thesis. The goal was to find a biased dataset, and mitigate its bias. That is done under the patients directory. Check the README file for more.

The purpose of this project was to find a biased dataset and try different methods to mitigate its bias. That work is done under the patients directory, in the patients.ipynb file. The dataset used is: https://www.kaggle.com/datasets/majdmustafa/diabetes-hospital-readmission-dataset

We first train a Random Forest Classifier on the dataset to predict whether or not a patient will be readmitted to a hospital. We then observe disparities in the predicted outcomes, and in the true positive/negative and false positive/negative results, across genders and especially across races. We then try to mitigate these unfair, biased predictions using different algorithms. We first use preprocessing techniques, such as reweighting and resampling. We then use in-processing techniques, such as fairness constraints (demographic parity and equalized odds) and an Adversarial Debiasing model. Lastly, we use post-processing techniques, such as a Threshold Optimizer. We then compare the results and the different trade-offs they induce between accuracy and fairness.
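To make the disparity check and the reweighting step more concrete, here is a minimal sketch in plain Python. It is not the code from patients.ipynb: the function names `group_rates` and `reweighing_weights` and the toy labels, predictions and groups are hypothetical, chosen only for illustration. The weights follow the common reweighing rule w(g, y) = P(g) * P(y) / P(g, y), which is assumed here and may differ from what the notebook actually does.

```python
from collections import Counter


def group_rates(y_true, y_pred, groups):
    """Per-group true positive rate, false positive rate and selection rate."""
    counts = {}
    for yt, yp, g in zip(y_true, y_pred, groups):
        c = counts.setdefault(g, Counter())
        if yt == 1 and yp == 1:
            c["tp"] += 1
        elif yt == 1:
            c["fn"] += 1
        elif yp == 1:
            c["fp"] += 1
        else:
            c["tn"] += 1

    rates = {}
    for g, c in counts.items():
        pos = c["tp"] + c["fn"]  # actual positives in this group
        neg = c["fp"] + c["tn"]  # actual negatives in this group
        rates[g] = {
            "tpr": c["tp"] / pos if pos else float("nan"),
            "fpr": c["fp"] / neg if neg else float("nan"),
            "selection_rate": (c["tp"] + c["fp"]) / (pos + neg),
        }
    return rates


def reweighing_weights(y_true, groups):
    """One weight per sample: P(group) * P(label) / P(group, label).

    After weighting, the label is statistically independent of the group.
    """
    n = len(y_true)
    group_counts = Counter(groups)
    label_counts = Counter(y_true)
    joint_counts = Counter(zip(groups, y_true))
    return [
        group_counts[g] * label_counts[y] / (n * joint_counts[(g, y)])
        for g, y in zip(groups, y_true)
    ]


# Hypothetical toy data (not taken from the readmission dataset).
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = group_rates(y_true, y_pred, groups)
weights = reweighing_weights(y_true, groups)

for g, r in sorted(rates.items()):
    print(g, r)
print(weights)
```

Comparing `tpr` and `fpr` between groups is the kind of check that equalized odds formalizes, while comparing `selection_rate` corresponds to demographic parity. Weights of this kind could then be passed as `sample_weight` when fitting a scikit-learn `RandomForestClassifier`, so that under-represented (group, label) combinations count more during training.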

The rest of the directories contain different attempts to find bias. The greek directory contains the training of an NLP model (word2vec) on a corpus of classical Greek literature.

Folders and files (24 commits):
Thesis
__pycache__
adult
cancer
col
greek
income2
japan
jobs
korean
misc_to_be_deleted
patients
tries
youth
.gitignore
Cleaned_Students_Performance.csv
README.md
car_insurance.csv
covid.ipynb
covid_data_log_200908.csv
covid_data_log_200922.csv
deepseek.ipynb
income.ipynb
students.ipynb
x
x.ipynb
