
This project is an implementation of the Naive Bayes algorithm to classify the 20 Newsgroups data set, which was originally collected by Ken Lang, probably for his paper Newsweeder: Learning to filter netnews.


AmirNiaraki/Naive-Bayes-Classifier


Naive Bayes Network

This code classifies documents from the 20 Newsgroups data set using a Naive Bayes classifier. The data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across 20 different newsgroups. It was originally collected by Ken Lang, probably for his Newsweeder paper, though he did not explicitly mention this collection: Ken Lang, Newsweeder: Learning to filter netnews, Proceedings of the Twelfth International Conference on Machine Learning, 331-339 (1995).

The 20 Newsgroups collection has become a popular data set for experiments in text applications of machine learning techniques, such as text classification and text clustering. The data is organized into 20 different newsgroups, each corresponding to a different topic.

The original data set is available at http://qwone.com/~jason/20Newsgroups/.

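Required packages for Python 3.7 are numpy, pandas and scikit-learn (for sklearn.metrics). The time module is also used, but it is part of the Python standard library and needs no installation.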

The code should be placed next to the '20Newsgroups' folder. This folder should contain these CSV files:

./20newsgroups/train_data.csv

./20newsgroups/train_label.csv

./20newsgroups/test_data.csv

./20newsgroups/test_label.csv
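Before running the classifier it can help to confirm that the four files listed above are in place. The following is a minimal sketch, not the repository's own loading code: the helper name `load_newsgroups_csvs` is hypothetical, and reading the files as headerless CSVs (`header=None`) is an assumption, since the column layout is not described here.

```python
from pathlib import Path

import pandas as pd

# File names taken from the list above.
EXPECTED_FILES = [
    "train_data.csv",
    "train_label.csv",
    "test_data.csv",
    "test_label.csv",
]


def load_newsgroups_csvs(data_dir="./20newsgroups"):
    """Hypothetical helper: load the four CSV files into a dict of DataFrames.

    Keys are the file names without the .csv extension.
    """
    data_dir = Path(data_dir)

    # Fail early with a readable message if any file is missing.
    missing = [name for name in EXPECTED_FILES if not (data_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"Missing files in {data_dir}: {missing}")

    # Assumption: the files have no header row, so header=None is used.
    return {
        Path(name).stem: pd.read_csv(data_dir / name, header=None)
        for name in EXPECTED_FILES
    }
```

Calling `load_newsgroups_csvs()` from the directory that contains the data folder returns a dictionary with the keys `train_data`, `train_label`, `test_data` and `test_label`.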

A short report on the performance comparison of Maximum Likelihood Estimator and Naive Bayes Estimator is attached.
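To illustrate the kind of comparison the report makes, here is a minimal sketch of a multinomial Naive Bayes classifier on a toy word-count matrix. It is not the repository's implementation. It assumes that the second estimator corresponds to additive (Laplace) smoothing, which is one common Bayesian estimate of the word probabilities; the function names and the toy data are hypothetical. Setting `alpha=0` gives the plain maximum likelihood estimate, and `alpha=1` gives the smoothed one.

```python
import numpy as np


def fit_multinomial_nb(counts, labels, n_classes, alpha):
    """Estimate log priors and log word probabilities per class.

    counts: (n_docs, vocab_size) array of word counts.
    labels: (n_docs,) integer class ids in [0, n_classes).
    alpha:  additive smoothing; 0 is the maximum likelihood estimate.
    """
    n_docs, vocab_size = counts.shape
    class_counts = np.bincount(labels, minlength=n_classes)

    # Total count of each word within each class.
    word_totals = np.zeros((n_classes, vocab_size))
    for c in range(n_classes):
        word_totals[c] = counts[labels == c].sum(axis=0)

    smoothed = word_totals + alpha
    with np.errstate(divide="ignore"):
        log_prior = np.log(class_counts / n_docs)
        # With alpha=0, words unseen in a class get probability 0 (log = -inf).
        log_likelihood = np.log(smoothed / smoothed.sum(axis=1, keepdims=True))
    return log_prior, log_likelihood


def class_scores(counts, log_prior, log_likelihood):
    """Return the (n_docs, n_classes) matrix of unnormalized log posteriors."""
    impossible = np.isinf(log_likelihood)
    # Replace -inf by 0 so that 0 * -inf does not produce NaN in the product.
    finite_ll = np.where(impossible, 0.0, log_likelihood)
    scores = log_prior + counts @ finite_ll.T
    # A document containing a zero-probability word gets score -inf for that class.
    hits = (counts > 0).astype(float) @ impossible.T.astype(float)
    scores[hits > 0] = -np.inf
    return scores


# Toy data (hypothetical): 4 documents, 3 vocabulary words, 2 classes.
train_counts = np.array([[2, 1, 0], [3, 0, 0], [0, 1, 2], [0, 2, 3]])
train_labels = np.array([0, 0, 1, 1])
# This document mixes a word seen only in class 0 with one seen only in class 1.
test_counts = np.array([[1, 0, 1]])

mle = fit_multinomial_nb(train_counts, train_labels, n_classes=2, alpha=0.0)
smoothed = fit_multinomial_nb(train_counts, train_labels, n_classes=2, alpha=1.0)

print("MLE scores:     ", class_scores(test_counts, *mle))
print("Smoothed scores:", class_scores(test_counts, *smoothed))
```

In this toy case the maximum likelihood estimate assigns the test document a score of negative infinity for both classes, because each class has never seen one of its words, while the smoothed estimate still yields finite scores that can be compared.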
