Email Spam Detection

We all receive many emails every day, and some of them are meaningless or irrelevant. Such emails are called "spam", while legitimate emails are called "ham". This project classifies an email as spam or ham.

DATASET

The dataset consists of two classes, "ham" and "spam". It contains 4825 ham messages and 747 spam messages, so it is heavily imbalanced: spam makes up about 13% of the 5572 messages.
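
To make the imbalance concrete, the short sketch below computes the class proportions from the counts given above. The variable names are illustrative and are not taken from the notebook.

```python
# Class counts as stated in this README.
counts = {"ham": 4825, "spam": 747}

total = sum(counts.values())
proportions = {label: n / total for label, n in counts.items()}

# A classifier that always predicts the majority class would reach this
# accuracy, so it is a useful baseline when judging the trained models.
majority_baseline = max(proportions.values())

for label, share in proportions.items():
    print(f"{label}: {counts[label]} messages ({share:.1%})")
print(f"majority-class baseline accuracy: {majority_baseline:.1%}")
```

Because about 86.6% of the messages are ham, an accuracy close to that value would mean a model is doing little better than always answering "ham".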

The following two figures show word cloud representations of the spam and ham messages.
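
A word cloud sizes each word by how often it occurs. As a rough sketch of that counting step only (not the plotting, and not necessarily how the notebook does it), the hypothetical helper `word_frequencies` below tallies words using the standard library. The example messages are made up and are not rows from the dataset.

```python
import re
from collections import Counter


def word_frequencies(messages, top_n=10):
    """Return the top_n most common lowercase words across the messages."""
    counter = Counter()
    for text in messages:
        # Keep runs of letters and apostrophes; drop digits and punctuation.
        counter.update(re.findall(r"[a-z']+", text.lower()))
    return counter.most_common(top_n)


# Made-up example messages for illustration.
spam_examples = ["Free prize! Call now to claim your free prize", "Win cash now"]
print(word_frequencies(spam_examples, top_n=3))
```

Running the same helper on the ham messages and on the spam messages separately would give the two frequency tables that the two figures visualize.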

TRAINING

We trained the following machine learning algorithms on the dataset:

Naive Bayes
Support Vector Machine
KNN
Decision Tree
Random Forest
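
The sketch below shows one way the five algorithms listed above could be trained with scikit-learn. Several details are assumptions: the README does not say how text is turned into features, so bag-of-words counts (`CountVectorizer`) are used here; the tiny corpus is made up; and `random_state=0` is added only so repeated runs agree. The hyperparameters for the SVM and KNN mirror the estimator names printed in the prediction output of this README.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Tiny made-up corpus for illustration only; the real dataset is far larger.
texts = [
    "Win a free prize now", "Free cash, call now",
    "Claim your free reward today", "Urgent: you won a prize",
    "Are we still meeting for lunch", "See you at home tonight",
    "Can you send me the report", "Thanks, talk to you later",
]
labels = ["spam", "spam", "spam", "spam", "ham", "ham", "ham", "ham"]

classifiers = {
    "Naive Bayes": MultinomialNB(),
    "Support Vector Machine": SVC(C=1000, gamma=0.001),
    "KNN": KNeighborsClassifier(n_neighbors=3),
    # random_state is an addition for reproducibility, not from the README.
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
}

# Assumption: each model sees bag-of-words counts produced by CountVectorizer.
models = {
    name: make_pipeline(CountVectorizer(), clf).fit(texts, labels)
    for name, clf in classifiers.items()
}

for name, model in models.items():
    print(name, "->", model.predict(["free prize, call now"])[0])
```

Wrapping the vectorizer and the classifier in one pipeline means raw text can be passed straight to `predict`, which keeps the prediction step identical for all five models.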

The accuracy of each algorithm is shown below.
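
For reference, accuracy is the fraction of messages whose predicted label matches the true label. The minimal sketch below computes it by hand and, because the dataset is imbalanced, also reports recall on the spam class. The function names and the labels are illustrative; they are not results from the notebook.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, positive="spam"):
    """Fraction of actual positive examples that were predicted as positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


# Made-up labels: 8 ham and 2 spam; the model misses one spam message.
y_true = ["ham"] * 8 + ["spam"] * 2
y_pred = ["ham"] * 8 + ["spam", "ham"]
print(accuracy(y_true, y_pred))  # 0.9
print(recall(y_true, y_pred))    # 0.5
```

In this toy case the accuracy looks high even though half of the spam is missed, which is why a per-class measure is worth checking alongside accuracy on imbalanced data.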

You can also make predictions with every algorithm at once, or choose a single algorithm for prediction.

MultinomialNB() This is a Real email 

SVC(C=1000, gamma=0.001) This is a Real email 

KNeighborsClassifier(n_neighbors=3) This is a Real email 

DecisionTreeClassifier() This is a Real email 

RandomForestClassifier() This is a Real email
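
Output lines like the ones above pair an estimator's printed name with a verdict. A minimal sketch of such formatting follows. The helper `describe` and the stand-in class `AlwaysHam` are hypothetical, and the wording used for the spam case is a guess, since only the "Real email" message appears in this README.

```python
def describe(model, prediction):
    """Format one result line as '<estimator repr> <verdict>'."""
    # "This is a Real email" appears in the README output;
    # the spam wording below is an assumption.
    if prediction == "ham":
        verdict = "This is a Real email"
    else:
        verdict = "This is a Spam email"
    return f"{model!r} {verdict}"


class AlwaysHam:
    """Hypothetical stand-in for a fitted classifier, so the sketch runs alone."""

    def predict(self, texts):
        return ["ham" for _ in texts]

    def __repr__(self):
        return "AlwaysHam()"


model = AlwaysHam()
message = "See you at lunch tomorrow"
print(describe(model, model.predict([message])[0]))
```

Any fitted scikit-learn estimator or pipeline could take the place of `AlwaysHam`, since the helper relies only on `repr` and on a label returned by `predict`.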
