
🛡️ Spam classifier using Python, capable of accurately categorizing messages as spam or non-spam. Leveraging machine learning techniques and natural language processing, it's a robust tool for filtering unwanted messages.


SyedFahad7/Spam-or-Ham


Spam Classification Project

Overview

This project classifies SMS messages as spam or not spam (ham) using several machine learning models. The dataset contains SMS messages, each labeled as spam or not spam.

Features

  • Data preprocessing: The SMS messages are preprocessed to remove noise and irrelevant information.
  • Exploratory Data Analysis (EDA): Various visualizations are used to analyze the distribution of spam and not spam messages.
  • Feature Engineering: Additional features such as the number of characters, words, and sentences are extracted from the SMS messages.
  • Model Building: Several machine learning models, including Naive Bayes, Logistic Regression, Support Vector Machines (SVM), and Random Forest, are trained and evaluated.
  • Model Improvement: Techniques such as hyperparameter tuning and ensemble methods like Voting Classifier are employed to improve model performance.
  • Saving the Model: The trained model and vectorizer used for feature extraction are saved for future use.
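
The preprocessing and feature engineering steps listed above can be sketched roughly as follows. This is a minimal, dependency-free illustration rather than the notebook's actual code: the function names `preprocess` and `count_features`, the tiny stop-word set, and the sample message are all hypothetical, and regular expressions stand in for the nltk tokenizers that the project most likely relies on.

```python
import re
import string

# Small illustrative stop-word list (assumption); a real run would more
# likely use a full list such as the one shipped with nltk.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "you", "for"}


def preprocess(text: str) -> str:
    """Lowercase the text, strip ASCII punctuation, and drop stop words."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return " ".join(tokens)


def count_features(text: str) -> dict:
    """Return the three count features named in the Features list."""
    words = re.findall(r"\w+", text)
    # Treat each run of '.', '!' or '?' as one sentence boundary
    # (a rough heuristic, not a real sentence tokenizer).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "num_characters": len(text),
        "num_words": len(words),
        "num_sentences": len(sentences),
    }


# Made-up sample message, for illustration only.
message = "Congratulations! You won a FREE prize. Call now to claim."
print(preprocess(message))
print(count_features(message))
```

The count features are computed on the raw message, before cleaning, because punctuation and sentence breaks are lost once `preprocess` has run.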

Requirements

  • Python 3.x
  • Libraries: numpy, pandas, matplotlib, seaborn, nltk, scikit-learn, xgboost, wordcloud

Usage

  1. Clone the repository:
git clone https://github.com/SyedFahad7/Spam-or-Ham.git
  2. Install the required libraries:
pip install -r requirements.txt
  3. Run the Jupyter Notebook or Python script:
jupyter notebook sms-classifier.ipynb
  4. Follow the instructions in the notebook/script to preprocess the data, train the models, and evaluate their performance.
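
To give a feel for what the train-and-evaluate step does, here is a small from-scratch sketch of a multinomial Naive Bayes classifier with add-one smoothing, followed by an accuracy check. It is an illustration under stated assumptions, not the notebook's implementation: the project lists scikit-learn as a requirement and presumably uses its estimators, the class name `TinyNaiveBayes` is hypothetical, and the handful of training and test messages are made up for the example.

```python
import math
from collections import Counter


class TinyNaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        # Per-class word frequencies gathered from the training texts.
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for c in self.classes:
            # Start from the log prior, then add log likelihoods per word.
            score = math.log(self.class_counts[c] / total_docs)
            total_words = sum(self.word_counts[c].values())
            for word in text.lower().split():
                if word not in self.vocab:
                    continue  # skip words never seen during training
                smoothed = (self.word_counts[c][word] + 1) / (total_words + len(self.vocab))
                score += math.log(smoothed)
            if score > best_score:
                best_label, best_score = c, score
        return best_label


# Made-up toy data, for illustration only.
train_texts = [
    "win a free prize now",
    "free cash call now",
    "are we meeting for lunch",
    "see you at home tonight",
]
train_labels = ["spam", "spam", "ham", "ham"]
test_set = [("free prize call", "spam"), ("see you at lunch", "ham")]

model = TinyNaiveBayes().fit(train_texts, train_labels)
correct = sum(model.predict(text) == label for text, label in test_set)
accuracy = correct / len(test_set)
print(f"accuracy: {accuracy:.2f}")
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the add-one smoothing keeps a word that is absent from one class from driving that class's probability to zero.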

File Structure

  • sms-classifier.ipynb: Jupyter Notebook containing the project code.
  • spam_or_not_spam.csv: Dataset containing labeled SMS messages.
  • README.md: Documentation providing an overview of the project, usage instructions, and file structure.
  • requirements.txt: Text file listing all the required libraries and their versions.

Author

Your Name

License

This project is licensed under the MIT License - see the LICENSE file for details.

