Document Classification using NLP, Machine Learning

Objective

This project classifies documents into four predefined categories: World, Sports, Business, and Sci/Tech. Several models, ranging from Naïve Bayes to a Convolutional Neural Network (CNN) and an RCNN, were trained and their accuracies compared. Different feature engineering techniques and Natural Language Processing (NLP) features were used to build an accurate text classifier.

Tech Stack

  • Language: Python
  • Libraries: Pandas, NumPy, Matplotlib, scikit-learn, NLTK, Keras (TensorFlow backend)
  • Models: Naïve Bayes, Logistic Regression, Random Forest, XGBoost, Shallow Neural Network, Convolutional Neural Network (CNN), RCNN
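
As a rough illustration of the kind of model comparison described in the Objective, the sketch below trains two of the listed models on TF-IDF features with scikit-learn and reports their accuracy. The toy sentences, the variable names, and the choice of TF-IDF are assumptions made for this example; the notebook's actual data, features, and hyperparameters may differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny hand-written corpus (hypothetical, NOT the project's dataset).
train_texts = [
    "government election president vote parliament",
    "minister treaty war peace summit talks",
    "team match goal score season coach",
    "player tournament championship win league",
    "stocks market shares profit investors",
    "company earnings bank trade economy",
    "software computer internet chip technology",
    "scientists research space nasa satellite",
]
train_labels = [
    "World", "World",
    "Sports", "Sports",
    "Business", "Business",
    "Sci/Tech", "Sci/Tech",
]

test_texts = [
    "president wins election vote",
    "coach praises team after league win",
    "bank shares fall as investors worry",
    "nasa launches research satellite",
]
test_labels = ["World", "Sports", "Business", "Sci/Tech"]

# Each candidate is a pipeline: TF-IDF features followed by a classifier.
candidates = {
    "Naive Bayes": make_pipeline(TfidfVectorizer(), MultinomialNB()),
    "Logistic Regression": make_pipeline(
        TfidfVectorizer(), LogisticRegression(max_iter=1000)
    ),
}

# Fit every candidate on the same data and record its test accuracy.
results = {}
for name, model in candidates.items():
    model.fit(train_texts, train_labels)
    predictions = model.predict(test_texts)
    results[name] = accuracy_score(test_labels, predictions)

for name, accuracy in results.items():
    print(f"{name}: accuracy = {accuracy:.2f}")
```

Wrapping the vectorizer and the classifier in one pipeline keeps the feature extraction tied to the model, so every candidate is evaluated on the same raw text.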

Implementation

Open the Jupyter notebook document_classifier.ipynb for the implementation details.

The model can be downloaded from the link below.

https://drive.google.com/drive/folders/10Ivt175DEkILxwHsF2Ltti8IZpVLtOyo?usp=sharing

The notebook also demonstrates loading and using the model for real-time predictions.
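
The general load-then-predict pattern can be sketched as follows. This is only an illustration under stated assumptions: the format of the downloadable model is not described here, so the example uses a small scikit-learn pipeline saved with `pickle` as a stand-in, and `classifier.pkl` and `predict_category` are hypothetical names. See the notebook for the actual loading code.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Stand-in model trained on one toy sentence per category (assumption:
# the real downloadable model is a different, fully trained classifier).
toy_texts = [
    "government election president parliament treaty",
    "team match coach player league",
    "stocks market shares investors profit",
    "software computer internet research satellite",
]
toy_labels = ["World", "Sports", "Business", "Sci/Tech"]
trained = make_pipeline(TfidfVectorizer(), MultinomialNB())
trained.fit(toy_texts, toy_labels)


def predict_category(model, text):
    """Return (predicted label, highest class probability) for one document."""
    label = str(model.predict([text])[0])
    confidence = float(model.predict_proba([text]).max())
    return label, confidence


# Save the model to disk, then load it back as a separate object,
# mimicking "download once, load, and serve predictions".
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "classifier.pkl"  # hypothetical file name
    model_path.write_bytes(pickle.dumps(trained))
    loaded_model = pickle.loads(model_path.read_bytes())

label, confidence = predict_category(
    loaded_model, "stocks and shares rally as investors cheer profit"
)
print(f"Predicted category: {label} (confidence {confidence:.2f})")
```

Loading the model once and reusing it for each incoming document avoids repeating the load cost on every prediction.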