A comprehensive machine learning project implementing and comparing Decision Trees and k-Nearest Neighbors (k-NN) algorithms for classifying Iris flowers. This project focuses on binary classification between Versicolor and Virginica species using their petal measurements.
## Table of Contents

- Project Overview
- Key Features
- 📂 Project Structure
- Installation
- 📊 Results and Analysis
- Usage
- 🔬 Technical Details
- 🤝 Contributing
- 📄 License
## Project Overview

This project implements and analyzes two fundamental machine learning algorithms:

- k-Nearest Neighbors (k-NN) with various distance metrics
- Decision Trees with two different splitting strategies (brute-force and binary entropy)

The implementation uses the Iris dataset, focusing on distinguishing between the Versicolor and Virginica species using only their second and third features (the petal measurements).
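The snippet below is a minimal sketch of this data selection using scikit-learn; the variable names and loading code are illustrative and may differ from what `data_utils.py` actually does.

```python
# Minimal sketch of the data setup described above (illustrative only;
# data_utils.py may load and filter the data differently).
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

# Keep only Versicolor (label 1) and Virginica (label 2).
mask = np.isin(y, [1, 2])
X, y = X[mask], y[mask]

# Keep only the petal measurements (feature indices 2 and 3:
# petal length and petal width).
X = X[:, [2, 3]]

print(X.shape)  # (100, 2): 50 Versicolor and 50 Virginica samples
```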
## Key Features

- Advanced k-NN Implementation:
  - Multiple k values (1, 3, 5, 7, 9)
  - Different distance metrics (L1, L2, L∞), sketched below
  - Comprehensive error analysis across parameters
- Dual Decision Tree Approaches:
  - Brute-force approach constructing all possible trees
  - Binary entropy-based splitting strategy
  - Visualizations of tree structures and decision boundaries
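As a rough illustration of the three distance metrics listed above, the helpers below show one common way to compute them; `models/knn.py` may implement them differently.

```python
# Illustrative definitions of the three distance metrics named above
# (a sketch only; models/knn.py may implement them differently).
import numpy as np

def l1_distance(a, b):
    """Manhattan distance: sum of absolute coordinate differences."""
    return np.sum(np.abs(a - b))

def l2_distance(a, b):
    """Euclidean distance: square root of the sum of squared differences."""
    return np.sqrt(np.sum((a - b) ** 2))

def linf_distance(a, b):
    """Chebyshev distance: largest absolute coordinate difference."""
    return np.max(np.abs(a - b))
```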
## 📂 Project Structure

```text
.
├── models/                  # Core ML model implementations
│   ├── __init__.py
│   ├── decision_trees.py    # Decision tree algorithms
│   └── knn.py               # k-NN implementation
├── results/                 # Generated visualizations
│   ├── decision_tree_errors.png
│   ├── decision_tree_figure1_visualization.png
│   ├── decision_tree_figure2_visualization.png
│   └── k-NN_errors.png
├── data_utils.py            # Data handling utilities
├── main.py                  # Main execution script
├── metrics.py               # Evaluation metrics
└── visualization.py         # Visualization tools
```
## Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/yourusername/iris-classification.git
   cd iris-classification
   ```

2. Set up a virtual environment (recommended):

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows use: venv\Scripts\activate
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```
## 📊 Results and Analysis

### k-Nearest Neighbors

The k-NN implementation was tested with various parameters (a sketch of the parameter sweep follows the findings below):

- k values: 1, 3, 5, 7, 9
- Distance metrics: L1 (Manhattan), L2 (Euclidean), L∞ (Chebyshev)

💡 Key Findings:

- Higher k values generally resulted in more stable predictions
- The L2 distance metric showed slightly better performance
- The best performance was achieved with k=9 using the L2 distance
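The loop below sketches one way to run such a sweep, using scikit-learn's `KNeighborsClassifier` and a single random train/test split for brevity; the project's own `models/knn.py` and `main.py` drive the actual experiments.

```python
# Hypothetical parameter sweep over the k values and distance metrics above,
# using scikit-learn for brevity; the project uses its own k-NN implementation.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X, y = X[y != 0][:, [2, 3]], y[y != 0]  # Versicolor vs. Virginica, petal features

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

for metric in ("manhattan", "euclidean", "chebyshev"):  # L1, L2, L-infinity
    for k in (1, 3, 5, 7, 9):
        clf = KNeighborsClassifier(n_neighbors=k, metric=metric)
        clf.fit(X_train, y_train)
        error = 1.0 - clf.score(X_test, y_test)
        print(f"metric={metric:<10} k={k}: test error = {error:.2%}")
```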
### Decision Trees

Two decision tree implementations were compared:

- Brute-Force Approach 🔍:
  - Error rate: 5.00%
- Entropy-Based Approach 🎯:
  - Error rate: 7.00%
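For reference, the snippet below sketches the binary entropy criterion that the entropy-based splitting strategy relies on; the helper names are hypothetical and `models/decision_trees.py` may organize this differently.

```python
# Sketch of the binary entropy criterion behind entropy-based splitting
# (hypothetical helpers; models/decision_trees.py may differ in detail).
import numpy as np

def binary_entropy(p):
    """Entropy (in bits) of a Bernoulli(p) label distribution."""
    if p == 0.0 or p == 1.0:
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def split_entropy(y_left, y_right):
    """Weighted entropy after splitting 0/1 labels into two child nodes."""
    n_left, n_right = len(y_left), len(y_right)
    n = n_left + n_right
    h_left = binary_entropy(np.mean(y_left)) if n_left else 0.0
    h_right = binary_entropy(np.mean(y_right)) if n_right else 0.0
    return (n_left / n) * h_left + (n_right / n) * h_right

# An entropy-based tree greedily picks the split minimizing split_entropy,
# while the brute-force variant enumerates candidate trees exhaustively.
```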
## Usage

Run the main analysis script:

```bash
python main.py
```

This will:

- 📥 Load and preprocess the Iris dataset
- 📊 Perform k-NN analysis with various parameters
- 🌳 Generate decision trees using both approaches
- 📈 Create visualizations and error analysis
## 🔬 Technical Details

- k-Nearest Neighbors:
  - Custom implementation with multiple distance metrics
  - Parameter evaluation framework
  - Cross-validation with 100 iterations
- Decision Trees:
  - Brute-force tree construction
  - Entropy-based splitting
  - Visualization of tree structures and decision boundaries
The project employs several metrics for evaluation (sketched after the list below):
- Classification error rates
- Training vs. Test set performance
- Error difference analysis
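A minimal sketch of these metrics is shown below; the function names are hypothetical and the project's `metrics.py` may expose a different interface.

```python
# Hypothetical helpers illustrating the evaluation metrics listed above;
# the project's metrics.py may expose a different interface.
import numpy as np

def classification_error(y_true, y_pred):
    """Fraction of misclassified samples."""
    return float(np.mean(np.asarray(y_true) != np.asarray(y_pred)))

def error_difference(train_error, test_error):
    """Gap between test and training error, a rough overfitting indicator."""
    return test_error - train_error

# Example: training error 0.00, test error 0.25 -> difference 0.25.
train_err = classification_error([1, 2, 1, 2], [1, 2, 1, 2])
test_err = classification_error([1, 2, 2, 1], [1, 2, 1, 1])
print(error_difference(train_err, test_err))
```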
## 🤝 Contributing

We welcome contributions! Please feel free to submit a Pull Request. For major changes:

- 🍴 Fork the repository.
- 🌿 Create a new branch (`git checkout -b feature-branch`).
- 💡 Commit your changes (`git commit -m 'Add new feature'`).
- 📤 Push to the branch (`git push origin feature-branch`).
- 🔍 Open a Pull Request.
## 📄 License

This project is licensed under the MIT License - see the LICENSE file for details.