Spam Classification Project

Overview

This project classifies SMS messages as spam or not spam using several machine learning models. The models are trained and evaluated on a dataset of SMS messages, each labeled as spam or not spam.

Features

Data preprocessing: The SMS messages are preprocessed to remove noise and irrelevant information.
Exploratory Data Analysis (EDA): Various visualizations are used to analyze the distribution of spam and not spam messages.
Feature Engineering: Additional features such as the number of characters, words, and sentences are extracted from the SMS messages.
Model Building: Several machine learning models, including Naive Bayes, Logistic Regression, Support Vector Machines (SVM), and Random Forest, are trained and evaluated.
Model Improvement: Hyperparameter tuning and ensemble methods such as a Voting Classifier are used to improve model performance.
Saving the Model: The trained model and vectorizer used for feature extraction are saved for future use.
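
As a rough illustration of the preprocessing and feature engineering steps listed above, here is a minimal sketch that uses only the Python standard library. The function names, the tiny stopword list, and the sample message are assumptions made for this example and are not taken from the project code; the notebook most likely relies on nltk for tokenization and stopwords instead.

```python
import re
import string

# Hypothetical, deliberately tiny stopword list for illustration only;
# nltk's stopword corpus would normally be used instead.
STOPWORDS = {"a", "an", "the", "to", "is", "you", "your", "and", "of", "for"}


def preprocess(text: str) -> str:
    """Lowercase the message, strip punctuation, and drop stopwords."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = [tok for tok in text.split() if tok not in STOPWORDS]
    return " ".join(tokens)


def count_features(text: str) -> dict:
    """Count characters, words, and sentences in the raw message."""
    words = re.findall(r"\w+", text)
    # Naive sentence split on runs of '.', '!' or '?'; a real tokenizer
    # would also handle abbreviations and similar edge cases.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "num_characters": len(text),
        "num_words": len(words),
        "num_sentences": len(sentences),
    }


# Made-up sample message, not taken from the dataset.
message = "Congratulations! You won a FREE prize. Call now."
cleaned = preprocess(message)
features = count_features(message)
print(cleaned)
print(features)
```

The counts are computed on the raw message rather than the cleaned one, since punctuation is needed to detect sentence boundaries.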

Requirements

Python 3.x
Libraries: numpy, pandas, matplotlib, seaborn, nltk, scikit-learn, xgboost, wordcloud

Usage

Clone the repository:

git clone https://github.com/SyedFahad7/Spam-or-Ham.git

Install the required libraries:

pip install -r requirements.txt

Run the Jupyter Notebook or Python script:

jupyter notebook sms-classifier.ipynb

Follow the instructions in the notebook/script to preprocess the data, train the models, and evaluate their performance.
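
For orientation, the train, evaluate, and save workflow could look roughly like the sketch below. It assumes scikit-learn is installed and uses a handful of made-up messages in place of the real dataset. The choice of TF-IDF features, the soft-voting ensemble of three models, and the file names `vectorizer.pkl` and `model.pkl` are assumptions for this example, not necessarily what the notebook does.

```python
import os
import pickle
import tempfile

from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Toy, made-up messages standing in for the real dataset (1 = spam, 0 = not spam).
texts = [
    "win a free prize now", "claim your cash reward today",
    "free entry in a weekly competition", "urgent call this number to win",
    "you have won a free holiday", "cheap loans approved instantly",
    "congratulations you are a winner", "free ringtone text now",
    "are we still meeting for lunch", "see you at the office tomorrow",
    "can you send me the report", "thanks for the help yesterday",
    "i will call you after work", "happy birthday have a great day",
    "running late be there soon", "did you finish the homework",
]
labels = [1] * 8 + [0] * 8

X_train_txt, X_test_txt, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=42, stratify=labels
)

# Fit the vectorizer on the training split only, then reuse it for the test split.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(X_train_txt)
X_test = vectorizer.transform(X_test_txt)

# Soft voting averages the predicted class probabilities of the base models.
ensemble = VotingClassifier(
    estimators=[
        ("nb", MultinomialNB()),
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=42)),
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)

predictions = ensemble.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
precision = precision_score(y_test, predictions, zero_division=0)
print(f"accuracy={accuracy:.2f} precision={precision:.2f}")

# Save both the vectorizer and the model: new messages must be transformed
# with the same fitted vocabulary before prediction.
with tempfile.TemporaryDirectory() as tmp_dir:
    vec_path = os.path.join(tmp_dir, "vectorizer.pkl")  # hypothetical file name
    model_path = os.path.join(tmp_dir, "model.pkl")     # hypothetical file name
    with open(vec_path, "wb") as f:
        pickle.dump(vectorizer, f)
    with open(model_path, "wb") as f:
        pickle.dump(ensemble, f)

    # Reload and confirm the restored pair reproduces the predictions.
    with open(vec_path, "rb") as f:
        restored_vectorizer = pickle.load(f)
    with open(model_path, "rb") as f:
        restored_model = pickle.load(f)

restored_predictions = restored_model.predict(
    restored_vectorizer.transform(X_test_txt)
)
```

Scores on a toy set this small say nothing about real performance. Precision is reported alongside accuracy because flagging a legitimate message as spam is usually the more costly mistake in spam filtering.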

File Structure

sms-classifier.ipynb: Jupyter Notebook containing the project code.
spam_or_not_spam.csv: Dataset containing labeled SMS messages.
README.md: Documentation providing an overview of the project, usage instructions, and file structure.
requirements.txt: Text file listing all the required libraries and their versions.

Author

Your Name

License

This project is licensed under the MIT License - see the LICENSE file for details.


