This is an end-to-end machine learning project. It predicts whether a person has a particular disease based on the input data they provide. Predictions are made with machine learning (ML) classification algorithms, and the project is deployed as a Flask web app on Heroku. The web app currently covers three diseases: diabetes, Parkinson's disease and heart disease.
Visit: https://multiple-disease-predictor-ml.herokuapp.com/ (This link may stop working if Heroku discontinues its free tier. In that case, scroll down to see how the deployed web app looks on Heroku.)
Home Page
Diabetes Prediction Page
Diabetes Prediction Page With Inputs Provided by the User
Diabetes Prediction Page Displaying the Output for the Inputs Provided by the User
Parkinson's Prediction Page
Heart Disease Prediction Page
This project requires Python and the following Python libraries installed:
- NumPy
- Pandas
- matplotlib
- Seaborn
- scikit-learn
- Flask
Step-1 : Built and trained an ML model for each of the 3 diseases. The code is in the diabetes.py, heart.py and parkinsons.py files, and the trained models are saved in the pickle files diabetes.pkl, heart.pkl and parkinsons.pkl respectively.
Step-2 : Created the Flask web app, whose code is in the app.py file. The interactive user interface is built with HTML and CSS. The HTML files are stored in the templates directory, while the CSS files and the web app's background image are stored in the static directory.
Step-3 : Uploaded the project on GitHub and deployed the web app using Heroku.
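The train-and-save part of Step-1 can be sketched as follows. This is a minimal illustration rather than the code in diabetes.py, heart.py or parkinsons.py: the synthetic data, the choice of LogisticRegression and the file name example_model.pkl are all assumptions made so that the snippet runs on its own.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 200 rows, 8 numeric features, binary label.
# The real project trains on the CSV datasets described in this README.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Hold out 20% of the rows for evaluation, keeping the class ratio in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# LogisticRegression is an assumed choice; other scikit-learn classifiers fit the same pattern.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)

# Save the fitted model to a pickle file (hypothetical name, written to a temp directory).
model_path = Path(tempfile.mkdtemp()) / "example_model.pkl"
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Load it back the way a web app would before serving predictions.
with open(model_path, "rb") as f:
    loaded_model = pickle.load(f)

predictions = loaded_model.predict(X_test)
```

A pickled model should only be loaded from a trusted source, and ideally with the same scikit-learn version that was used to save it.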
A Procfile declares the commands that a Heroku app runs on startup. For this project, the Procfile contains web: gunicorn app:app, where the first app is the name of the Python file (app.py) that runs the whole application, and the second app is the name of the Flask application object (app=Flask(__name__)) defined inside app.py.
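To make the app:app naming concrete, here is a minimal sketch of what a file like app.py could contain. It is not the project's actual app.py: the /predict route, the JSON response and the StubModel class are hypothetical, and the real app renders HTML templates and loads its models from the .pkl files.

```python
from flask import Flask, jsonify, request

# Field names follow the diabetes features listed in this README.
FEATURES = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age",
]


class StubModel:
    """Hypothetical stand-in for a model loaded from diabetes.pkl."""

    def predict(self, rows):
        # Toy rule for illustration only: flag rows whose Glucose value exceeds 140.
        return [1 if row[1] > 140 else 0 for row in rows]


# The second "app" in "gunicorn app:app" refers to this variable.
app = Flask(__name__)
model = StubModel()


@app.route("/predict", methods=["POST"])
def predict():
    # Read the form fields in a fixed order so they match the training columns.
    try:
        row = [float(request.form[name]) for name in FEATURES]
    except (KeyError, ValueError):
        return jsonify(error="all fields must be present and numeric"), 400
    prediction = int(model.predict([row])[0])
    return jsonify(prediction=prediction)
```

With a file like this saved as app.py, gunicorn app:app imports the app module and serves the object named app.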
Step-1 : Log in to Heroku, then create a new app.
Step-2 : Connect to GitHub, then connect to the repository sidroy9/Multiple-Disease-Predictor-ML-Flask-WebApp where this project exists.
Step-3 : Go to the Manual Deploy section, choose the main branch, and click Deploy Branch. The build of main will then start.
Step-4 : After some time, the app will be deployed successfully. Click View to see the live web app.
In a terminal or command window, navigate to the top-level project directory Multiple-Disease-Predictor-ML-Flask-WebApp/ (which contains this README) and run the following command:
python app.py
This prints the localhost address. Open that address in your browser to use the web app.
The datasets that are used for training the ML models are:
- The diabetes dataset consists of 768 data points, each with 8 features. It is the Pima Indians Diabetes Database, available on Kaggle.
Features
1. Pregnancies: Number of times pregnant
2. Glucose: Plasma glucose concentration at 2 hours in an oral glucose tolerance test
3. BloodPressure: Diastolic blood pressure (mm Hg)
4. SkinThickness: Triceps skin fold thickness (mm)
5. Insulin: 2-hour serum insulin (mu U/ml)
6. BMI: Body mass index (weight in kg/(height in m)^2)
7. DiabetesPedigreeFunction: Diabetes pedigree function
8. Age: Age (years)
Target Variable
9. Outcome: Class variable (0 or 1). 268 of 768 are 1, the others are 0.
- The heart dataset consists of 1025 data points, each with 13 features. It is the Heart Disease Dataset, available on Kaggle.
Features
1. age: age in years
2. sex: (1 = male; 0 = female)
3. cp: chest pain type
4. trestbps: resting blood pressure (in mm Hg on admission to the hospital)
5. chol: serum cholesterol in mg/dl
6. fbs: (fasting blood sugar > 120 mg/dl) (1 = true; 0 = false)
7. restecg: resting electrocardiographic results
8. thalach: maximum heart rate achieved
9. exang: exercise induced angina (1 = yes; 0 = no)
10. oldpeak: ST depression induced by exercise relative to rest
11. slope: the slope of the peak exercise ST segment
12. ca: number of major vessels (0-3) colored by fluoroscopy
13. thal: 0 = normal; 1 = fixed defect; 2 = reversible defect
Target Variable
14. target: Class variable (0 or 1). 526 of 1025 are 1, the others are 0. Value 0 = no heart disease and 1 = heart disease.
- The Parkinson's disease dataset consists of 195 data points, each with 22 features. It is the Parkinsons Disease Dataset, available on Kaggle.
Features
1. MDVP:Fo(Hz): Average vocal fundamental frequency
2. MDVP:Fhi(Hz): Maximum vocal fundamental frequency
3. MDVP:Flo(Hz): Minimum vocal fundamental frequency
4. MDVP:Jitter(%)
5. MDVP:Jitter(Abs)
6. MDVP:RAP
7. MDVP:PPQ
8. Jitter:DDP: Several measures of variation in fundamental frequency (features 4 to 8)
9. MDVP:Shimmer
10. MDVP:Shimmer(dB)
11. Shimmer:APQ3
12. Shimmer:APQ5
13. MDVP:APQ
14. Shimmer:DDA: Several measures of variation in amplitude (features 9 to 14)
15. NHR
16. HNR: Two measures of the ratio of noise to tonal components in the voice (features 15 and 16)
17. RPDE: Nonlinear dynamical complexity measure (one of two, with D2)
18. DFA: Signal fractal scaling exponent
19. spread1
20. spread2
21. PPE: Three nonlinear measures of fundamental frequency variation (features 19 to 21)
22. D2: Nonlinear dynamical complexity measure (one of two, with RPDE)
Target Variable
23. status: Class variable (0 or 1). 147 of 195 are 1, the others are 0. Value 1 = Parkinson's, 0 = healthy.
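The class counts given for the three datasets show how balanced each one is, which matters when reading a model's accuracy. The sketch below is an illustrative calculation using only the counts stated above. The class_balance helper is a hypothetical name introduced for this example and is not part of the project code.

```python
def class_balance(positives, total):
    """Return (positive rate, accuracy of always predicting the majority class)."""
    positive_rate = positives / total
    majority_baseline = max(positive_rate, 1 - positive_rate)
    return positive_rate, majority_baseline


# (positive cases, total rows) as stated in the dataset descriptions above.
datasets = {
    "diabetes": (268, 768),
    "heart": (526, 1025),
    "parkinsons": (147, 195),
}

for name, (positives, total) in datasets.items():
    rate, baseline = class_balance(positives, total)
    print(f"{name}: positive rate {rate:.1%}, majority baseline {baseline:.1%}")
```

The heart dataset is close to balanced. For the Parkinson's dataset, a model that always predicts class 1 would already be right on roughly three quarters of the rows, so accuracy alone says little there and per-class metrics are worth checking.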