This project is a Python implementation of various machine learning classifiers for predicting the presence of heart disease in patients based on their clinical data. The classifiers used in this project include Support Vector Machine (SVM), Naive Bayes, Decision Tree, and K-Nearest Neighbors (KNN).
The project uses the "Heart Disease Classification Dataset" from Kaggle. The dataset contains clinical features such as age, sex, chest pain type, resting blood pressure, and cholesterol level. The target variable indicates the presence or absence of heart disease. A CSV file of the dataset is included in the repository.
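As a quick illustration of the layout described above, the sketch below loads a few rows of that shape with Pandas and counts the two classes. The column names (`age`, `sex`, `cp`, `trestbps`, `chol`, `target`) and the inline rows are assumptions made for illustration, not values taken from the repository's CSV. To inspect the real data, pass the path of the CSV file instead of the in-memory buffer.

```python
import io

import pandas as pd

# Hypothetical rows: the column names and values are illustrative assumptions,
# not copied from the repository's CSV file.
SAMPLE_CSV = """age,sex,cp,trestbps,chol,target
63,1,3,145,233,1
37,1,2,130,250,1
41,0,1,130,204,1
57,1,0,140,192,0
56,0,0,134,409,0
"""


def summarize(csv_source):
    """Load the data and return (row count, feature names, class counts)."""
    df = pd.read_csv(csv_source)
    features = [col for col in df.columns if col != "target"]
    counts = df["target"].value_counts().to_dict()
    return len(df), features, counts


n_rows, features, counts = summarize(io.StringIO(SAMPLE_CSV))
print(n_rows, features, counts)
```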
- Python 3.x
- NumPy
- Pandas
- Scikit-learn
- Clone the repository:
  git clone https://github.com/yashshah035/Classifier_Models.git
- Install the required dependencies:
  pip install numpy pandas scikit-learn
- Navigate to the project directory:
  cd Classifier_Models
- Run the main Python file:
  python demo_ml_classifier.py
- The script will load the dataset, preprocess the data, and present you with a menu to choose the classifier you want to use.
- Select the classifier by entering the corresponding number:
  - Support Vector Machine
  - Naive Bayes
  - Decision Tree
  - K-Nearest Neighbors
- The script will train the chosen classifier on the training data and evaluate its performance on the test data, displaying the accuracy score.
- You can repeat the process by selecting a different classifier or exit the program by entering 'q'.
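The menu flow described above can be sketched roughly as follows. This is not the code from `demo_ml_classifier.py`: the `MENU` mapping, the `run_menu` function, and the numbering 1 to 4 are assumptions chosen to match the order of the options listed. The `read` and `write` parameters exist only so the loop can be driven by scripted input instead of the keyboard.

```python
# Menu options in the order listed above; the keys "1" to "4" are an assumption.
MENU = {
    "1": "Support Vector Machine",
    "2": "Naive Bayes",
    "3": "Decision Tree",
    "4": "K-Nearest Neighbors",
}


def run_menu(read=input, write=print):
    """Prompt until the user enters 'q'; return the classifiers chosen so far."""
    chosen = []
    while True:
        for key, name in MENU.items():
            write(f"{key}. {name}")
        choice = read("Select a classifier (or 'q' to quit): ").strip().lower()
        if choice == "q":
            break
        if choice not in MENU:
            write("Invalid choice, please try again.")
            continue
        # The real script would train and evaluate the classifier here.
        chosen.append(MENU[choice])
        write(f"Selected: {MENU[choice]}")
    return chosen


# Scripted session: pick Naive Bayes, enter an invalid option, then quit.
scripted = iter(["2", "9", "q"])
selected = run_menu(read=lambda prompt: next(scripted), write=lambda *args: None)
print(selected)
```

Calling `run_menu()` with no arguments falls back to `input` and `print`, which gives the interactive behaviour.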
The provided code performs the following steps:
- Imports the necessary libraries: Pandas, Scikit-learn classifiers, and other required modules.
- Loads the dataset from the CSV file.
- Performs data preprocessing steps, such as handling missing values and encoding categorical features.
- Splits the data into training and testing sets.
- Defines a `Classifier` class with methods for each classifier (SVM, Naive Bayes, Decision Tree, and KNN).
- Implements a loop that prompts the user to select a classifier.
- Trains the chosen classifier on the training data and evaluates its performance on the test data, printing the accuracy score.
- Allows the user to select a different classifier or exit the program.
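The steps above can be sketched in a few dozen lines of Scikit-learn. This is a minimal illustration, not the repository's code. The method names on `Classifier`, the default hyperparameters, the 80/20 split, and the use of `dropna` as the preprocessing step are all assumptions. Synthetic data from `make_classification` stands in for the heart disease CSV so that the example runs on its own.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


class Classifier:
    """Hypothetical stand-in for the script's Classifier class."""

    def __init__(self, X_train, X_test, y_train, y_test):
        self.X_train, self.X_test = X_train, X_test
        self.y_train, self.y_test = y_train, y_test

    def _fit_and_score(self, model):
        # Train on the training split, then score on the held-out test split.
        model.fit(self.X_train, self.y_train)
        return accuracy_score(self.y_test, model.predict(self.X_test))

    def svm(self):
        return self._fit_and_score(SVC())

    def naive_bayes(self):
        return self._fit_and_score(GaussianNB())

    def decision_tree(self):
        return self._fit_and_score(DecisionTreeClassifier(random_state=0))

    def knn(self):
        return self._fit_and_score(KNeighborsClassifier(n_neighbors=5))


# Synthetic stand-in for the CSV (assumption: numeric features, binary target).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
df = pd.DataFrame(X, columns=[f"f{i}" for i in range(8)]).assign(target=y)

# Preprocessing placeholder: drop any rows with missing values.
df = df.dropna()

X_train, X_test, y_train, y_test = train_test_split(
    df.drop(columns="target"), df["target"], test_size=0.2, random_state=0
)

clf = Classifier(X_train, X_test, y_train, y_test)
method_names = ("svm", "naive_bayes", "decision_tree", "knn")
scores = {name: getattr(clf, name)() for name in method_names}
for name, acc in scores.items():
    print(f"{name}: accuracy = {acc:.3f}")
```

Routing every model through one `_fit_and_score` helper keeps the four methods identical apart from the estimator they construct, which makes the reported accuracies directly comparable.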
Contributions are welcome! If you find any bugs or have suggestions for improvements, feel free to open an issue or create a pull request.