The dataset analysed in this project was obtained from Kaggle, where it is available for download.
It contains information collected from 110,527 medical appointments in Brazil.
The central question for the analysis is: how do the different variables affect whether a patient shows up for or misses an appointment?
The dataset has 14 variables, covering type of disease, the patient's neighborhood, age, gender, alcoholism, whether an SMS was received, and the variable that records whether the patient showed up.
- Python programming language (version 3.6 or higher)
- Pandas
- Matplotlib
- Seaborn
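
With these libraries installed, the central question can be approached by grouping appointments on one variable and comparing the share of missed appointments in each group. The following is only a minimal sketch: the helper `no_show_rate_by` is hypothetical, the column names `SMS_received` and `No-show` and the `"Yes"`/`"No"` coding are assumptions about the CSV header, and the six-row table is made-up sample data, not taken from the dataset.

```python
import pandas as pd


def no_show_rate_by(df, column, target="No-show"):
    """Return the fraction of missed appointments for each value of `column`.

    Assumes `target` holds the strings "Yes" (missed) and "No" (attended).
    """
    missed = df[target].eq("Yes")           # boolean Series: True = missed
    return missed.groupby(df[column]).mean()  # mean of booleans = share missed


# Made-up sample data for illustration only.
# With the real file this would be something like:
#   df = pd.read_csv("path/to/downloaded_file.csv")  # hypothetical path
toy = pd.DataFrame({
    "SMS_received": [0, 0, 1, 1, 0, 1],
    "No-show": ["No", "Yes", "No", "Yes", "No", "Yes"],
})

rates = no_show_rate_by(toy, "SMS_received")
print(rates)
```

The same helper could be called with any other column (for example gender or neighborhood) to compare groups, and the resulting Series could be passed to Matplotlib or Seaborn for a bar plot.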