NLP-tools-in-Dash

An interactive Plotly Dash tool for Natural Language Processing (NLP) that processes text data, from tokenizing and lemmatizing all the way to Machine Learning (ML) classification and word prediction.

About this app

NLP analysis in a single app: 11 figures, dropdown and slider analysis controls, and ML training and classification.

About dash

Dash and how to use it

Here is a direct quote:

Dash is the most downloaded, trusted Python framework for building ML & data science web apps. Built on top of Plotly.js, React and Flask, Dash ties modern UI elements like dropdowns, sliders, and graphs directly to your analytical Python code. Read our tutorial (proudly crafted ❤️ with Dash itself).

Getting Started in Python

Prerequisites and usage

Make sure that Dash, its dependencies, and the other libraries listed below are correctly installed (using pip or conda; pip is shown here):

pip install dash
pip install dash-bootstrap-components
pip install dash-loading-spinners
pip install matplotlib
pip install networkx
pip install nltk
pip install numpy 
pip install pandas
pip install seaborn
pip install wordcloud
pip install yellowbrick
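
As a quick sanity check after installing, the snippet below reports which of the packages above cannot be found in the current environment. It is a minimal sketch, not part of the app: the function name `missing_modules` is hypothetical, and the import names are assumed to be the pip names with hyphens replaced by underscores.

```python
from importlib.util import find_spec

# Import names for the pip packages listed above (assumption: pip names
# with hyphens map to import names with underscores).
REQUIRED_MODULES = [
    "dash",
    "dash_bootstrap_components",
    "dash_loading_spinners",
    "matplotlib",
    "networkx",
    "nltk",
    "numpy",
    "pandas",
    "seaborn",
    "wordcloud",
    "yellowbrick",
]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are installed.")
```

Any name it prints can then be installed with the matching `pip install` line above.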

Features

Written entirely in Python, with an interactive Plotly Dash web application
Load a text dataframe, then parse, tokenize, lemmatize, and analyze it, train a naive Bayes classification model, and predict word class
Tabbed, interactive and visually pleasing environment which is easy to use
Support for exploring word relationships using bigram market basket analysis
Automatic file processing, with a dropdown for categories and sliders for how many top words (by frequency) to plot and to display in the basket analysis
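
To make the tokenize-and-count idea concrete, here is a minimal sketch using only the standard library. It is not the app's code: the regex tokenizer and the tiny stop-word set are simplified stand-ins for what a library such as NLTK would provide, it skips lemmatization entirely, and the names `tokenize`, `top_words`, and the sample headlines are made up for illustration.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list (assumption); a real app would
# likely use a much longer list from an NLP library.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "on"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_words(texts, n=3):
    """Count non-stop-word tokens across all texts; return the n most common."""
    counts = Counter(
        token
        for text in texts
        for token in tokenize(text)
        if token not in STOP_WORDS
    )
    return counts.most_common(n)


# Made-up headlines, for illustration only.
headlines = [
    "Time for a walk in the park",
    "The best time of life",
    "Life is a walk, one step at a time",
]
print(top_words(headlines, n=3))
```

The `n` argument plays the role of the "top words" slider: raising it returns more of the frequency-ranked words.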

Algorithm steps

Panels (tabs)

DATA & FREQUENCY - has word frequency plots in different formats, and a datatable
TREEMAP - Treemap of headline length distributions
WORD RELATIONSHIPS - Basket analysis (network and heatmap), top 5 word relationships. Calculated from lemmatized word co-occurrence
ML (NAIVE BAYES) - detailed frequency distribution for all categories, train and predict words using multinomial naive Bayes
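
The word-relationship idea can be illustrated with a small co-occurrence counter. This is a sketch under assumptions, not the app's implementation: the README does not say whether pairs must be adjacent, so this version counts any two distinct words that appear in the same document, and the function name `top_cooccurrences` and the sample token lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def top_cooccurrences(token_lists, n=5):
    """Count unordered word pairs that appear in the same document."""
    pair_counts = Counter()
    for tokens in token_lists:
        # Sorting the unique tokens means each pair is counted once per
        # document and ("a", "b") is the same key as ("b", "a").
        unique_tokens = sorted(set(tokens))
        pair_counts.update(combinations(unique_tokens, 2))
    return pair_counts.most_common(n)


# Made-up, already-tokenized documents, for illustration only.
documents = [
    ["life", "time", "people"],
    ["life", "time"],
    ["people", "time"],
]
print(top_cooccurrences(documents))
```

The resulting pair counts are the kind of data that could feed a network graph (pairs as edges) or a heatmap (pairs as cells).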

DATA & FREQUENCY

TREEMAP

WORD RELATIONSHIPS

ML (NAIVE BAYES)

Controls

How to use

Install Python 3.8 or newer and the packages mentioned above.
Run the app from the command line with the Python file name followed by the dataframe to use.

python3 nlp_dash_tool.py assets/News_Category_Dataset_v3.json
Use the dropdown and sliders in the first panel (tab), named "DATA & FREQUENCY", to control the analysis.
The slider for sampling the data is set at 30% by default to give enough data for ML algorithm training.

self.sample_percent = 30 #percent
Use your command line to follow app loading and analysis results. A few printouts are intentionally added to monitor performance. You will see changes as you play with the sliders and the dropdown. It will look like this:

WELLNESS
Length of all words:  85439
FreqDist:
life     628
time     561
one      557
peopl    539
dtype: int64
...........class built
Dash is running on http://127.0.0.1:9132/

 * Serving Flask app 'nlp_dash_tool'
 * Debug mode: on
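
The loading and sampling steps described above can be sketched as follows. This is an illustration, not the app's code: it assumes the input file holds one JSON object per line, and the function names `load_records` and `sample_records` are hypothetical. The default of 30 mirrors the `sample_percent` value shown earlier.

```python
import json
import random


def load_records(path):
    """Read one JSON object per non-empty line (JSON Lines layout, assumed)."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def sample_records(records, percent=30, seed=0):
    """Return a random sample holding roughly `percent` percent of the records."""
    count = round(len(records) * percent / 100)
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    return rng.sample(records, count)


# Made-up in-memory records, for illustration only.
records = [{"id": i} for i in range(10)]
subset = sample_records(records, percent=30)
print(len(subset), "of", len(records), "records sampled")
```

A smaller percentage speeds up the analysis, while a larger one gives the classifier more training data.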

And if you press "Run model" in the "ML (NAIVE BAYES)" tab, something like this shows up:

Train accuracy score: 84.41%
Test accuracy score: 80.82%

After which, if you type in a word to predict, you will see something like this:

Your input
tel

Prediction
ENTERTAINMENT

Your input
tele

Prediction
WELLNESS
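
For readers curious how a multinomial naive Bayes prediction like the ones above is computed, here is a minimal pure-Python sketch with add-one smoothing. It is not the app's model, which presumably relies on a library implementation: the class name `TinyMultinomialNB` is hypothetical, and the four training examples are made up (only the category names come from the output above).

```python
import math
from collections import Counter, defaultdict


class TinyMultinomialNB:
    """Minimal multinomial naive Bayes over token lists, with add-one smoothing."""

    def fit(self, token_lists, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(token_lists, labels):
            self.word_counts[label].update(tokens)
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Score = log prior + sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training data, for illustration only.
train_tokens = [
    ["movie", "star", "award"],
    ["film", "star", "premiere"],
    ["sleep", "health", "stress"],
    ["health", "diet", "exercise"],
]
train_labels = ["ENTERTAINMENT", "ENTERTAINMENT", "WELLNESS", "WELLNESS"]

model = TinyMultinomialNB().fit(train_tokens, train_labels)
print(model.predict(["star"]))
print(model.predict(["health"]))
```

Because the score depends only on per-class word counts, short or partial inputs can flip between categories, which is consistent with the "tel" and "tele" examples above landing in different classes.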

Documentation

The Dash documentation contains everything you need to know about the library. It includes useful information on the core Dash components and how to use callbacks, along with examples and functioning code, and it is fully interactive. You can also use the Press & news for a complete and concise specification of the API.

More references

💻 Github Repository
🗺 Component Reference

Contributing and Permissions

Please do not directly copy anything without my consent. Feel free to reach out to me at https://www.linkedin.com/in/mulugeta-semework-abebe/ for ways to collaborate or use some components.

License

Dash is licensed under MIT. Please view LICENSE for more details. For other packages click on corresponding links at the top of this page (first line).

Acknowledgments

Huge thanks to the following contributors on Kaggle. This app would not have been possible without their massive work!

