Explore the world of text classification with this project that showcases a text classifier built from the ground up using a self-implemented Naive Bayes algorithm. Leveraging the 20 Newsgroups dataset from scikit-learn, this project guides you through the process of data exploration, preprocessing, and model training.

halfdeb/Text-Classifier-using-self-implemented-Naive-Bayes-

Text Classification Project (Using Self-Implemented Naive Bayes) 📗

Overview 👀

Welcome to the Text Classification Project! In this project, I implement a text classification model using a Naive Bayes algorithm written from scratch, trained on the 20 Newsgroups dataset from scikit-learn.
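
To make the idea concrete, here is a minimal sketch of a multinomial Naive Bayes classifier in plain Python. It is not the code from this repository: the class name `NaiveBayes`, the whitespace `tokenize` helper, the Laplace smoothing parameter `alpha`, and the four-document toy corpus are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


class NaiveBayes:
    """Illustrative multinomial Naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_doc_counts = Counter(labels)   # documents per class
        self.word_counts = defaultdict(Counter)   # word counts per class
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = tokenize(doc)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, doc):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(doc) if t in self.vocab]
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_doc_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / self.total_docs)
            total_words = sum(self.word_counts[label].values())
            for t in tokens:
                count = self.word_counts[label][t]
                score += math.log(
                    (count + self.alpha) / (total_words + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy corpus, invented for this example only.
docs = [
    "the team won the hockey game",
    "the goalie made a great save",
    "the new graphics card renders fast",
    "install the driver for the video card",
]
labels = ["sport", "sport", "tech", "tech"]

model = NaiveBayes().fit(docs, labels)
print(model.predict("hockey game tonight"))
print(model.predict("graphics driver update"))
```

Scores are summed in log space so that multiplying many small probabilities does not underflow, and the smoothing term keeps a word that is absent from one class from forcing that class's probability to zero.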

The Dataset 📦

20 Newsgroups Dataset

The 20 Newsgroups dataset is a collection of approximately 20,000 newsgroup documents spanning 20 different newsgroups (the version distributed through scikit-learn contains about 18,800 posts). It is often used for text classification and clustering tasks. The dataset covers a wide range of topics, including politics, sports, technology, and more.

Key Information:

  • Classes/Topics: 20
  • Data Split: Training and Testing
  • Dataset Source: scikit-learn

Dataset Exploration:

The documents are spread across the 20 newsgroups, each of which serves as one category. The dataset comes with separate training and testing sets, so the model can be trained on one and evaluated on the other. Each document is labeled with its corresponding newsgroup, which makes this a supervised learning task.
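
One way to explore the data, sketched under assumptions: `fetch_20newsgroups` is the scikit-learn loader for this dataset, while the helper names `load_split` and `documents_per_class` are hypothetical and not taken from this repository. The loader downloads the data on first use, so it is wrapped in a function rather than called at import time.

```python
from collections import Counter


def load_split(subset="train"):
    # Imported lazily so the counting helper below works without scikit-learn.
    from sklearn.datasets import fetch_20newsgroups

    # subset may be "train", "test", or "all"; the result exposes
    # .data (list of texts), .target (class indices), and .target_names.
    return fetch_20newsgroups(subset=subset)


def documents_per_class(targets, target_names):
    # Map each newsgroup name to the number of documents labeled with it.
    counts = Counter(int(t) for t in targets)
    return {name: counts[i] for i, name in enumerate(target_names)}
```

With scikit-learn installed, calling `train = load_split("train")` followed by `documents_per_class(train.target, train.target_names)` gives the class distribution of the training set, which is a quick check for class imbalance before training.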

Acknowledgments 🙏🏻

This project is inspired by the scikit-learn community and the 20 Newsgroups dataset contributors.

Happy coding and text classifying! 🚀
