This file works with a piece of text scraped from a website and applies basic text pre-processing techniques: lemmatization, word and sentence tokenization, stopword removal, punctuation removal, upper-to-lower-case conversion, and digit removal. The pre-processed text is then used to build a frequency distribution of words, and the text is ranked and summarized using both TF-IDF and Gensim, with the two results compared.
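A minimal sketch of such a pipeline, assuming NLTK for the pre-processing and scikit-learn's `TfidfVectorizer` for the TF-IDF sentence ranking (the sample text, variable names, and top-2 sentence cutoff are illustrative, not taken from this repo):

```python
import string

import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import sent_tokenize, word_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer

nltk.download("punkt", quiet=True)
nltk.download("stopwords", quiet=True)
nltk.download("wordnet", quiet=True)

# Placeholder standing in for the scraped page text.
text = "NLTK is a leading NLP platform. It was first released in 2001. Over 100 corpora ship with it."

# Sentence and word tokenization, with upper-to-lower-case conversion.
sentences = sent_tokenize(text)
words = word_tokenize(text.lower())

# Stopword, punctuation, and digit removal, followed by lemmatization.
stop_words = set(stopwords.words("english"))
lemmatizer = WordNetLemmatizer()
cleaned = [
    lemmatizer.lemmatize(w)
    for w in words
    if w not in stop_words and w not in string.punctuation and not w.isdigit()
]

# Frequency distribution of the cleaned words.
freq = nltk.FreqDist(cleaned)
print(freq.most_common(5))

# TF-IDF sentence ranking: score each sentence by the sum of its term
# weights and keep the top-scoring sentences, in original order, as the summary.
tfidf = TfidfVectorizer(stop_words="english").fit_transform(sentences)
scores = tfidf.sum(axis=1).A1
top = sorted(sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:2])
print(" ".join(sentences[i] for i in top))
```

For the Gensim side of the comparison, the extractive summarizer lived in `gensim.summarization.summarize`; note that this module was removed in Gensim 4.0, so reproducing that comparison requires `gensim<4.0`.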
A similar approach is applied to this text data, but N-grams are used for the word-frequency calculation and summarization. Unigrams, bigrams, and trigrams are created first and then used for the frequency counts, as in the sketch below.
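A short sketch of the N-gram step, assuming `nltk.ngrams`; the token list is a placeholder for the pre-processed tokens from the pipeline above:

```python
from nltk import FreqDist, ngrams

# Placeholder tokens; in the notebook these come from the pre-processing step.
tokens = ["the", "cat", "sat", "on", "the", "mat", "near", "the", "cat"]

# Unigrams, bigrams, and trigrams as tuples of adjacent tokens.
unigrams = list(ngrams(tokens, 1))
bigrams = list(ngrams(tokens, 2))
trigrams = list(ngrams(tokens, 3))

# Frequency counts per n-gram, e.g. the most common unigrams and bigrams.
print(FreqDist(unigrams).most_common(3))
print(FreqDist(bigrams).most_common(3))
```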
Created word tokens from the sentences, computed the frequency of each unigram and the relative frequency of each bigram, then performed next-word prediction using the relative frequencies as probabilities.
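A sketch of the prediction step under the standard bigram model: the relative frequency count(w1, w2) / count(w1) estimates P(w2 | w1), and the predicted next word is the candidate with the highest probability. The training sentence and the `predict_next` helper are illustrative assumptions:

```python
from collections import Counter

from nltk import bigrams

# Placeholder training tokens (word tokenization via split() for brevity;
# the notebook would use a proper word tokenizer).
tokens = "the cat sat on the mat and the cat slept".lower().split()

unigram_counts = Counter(tokens)
bigram_counts = Counter(bigrams(tokens))

def predict_next(word, k=3):
    # Relative frequency of each bigram starting with `word`,
    # i.e. P(next | word) = count(word, next) / count(word).
    probs = {
        w2: count / unigram_counts[w1]
        for (w1, w2), count in bigram_counts.items()
        if w1 == word
    }
    return sorted(probs.items(), key=lambda p: p[1], reverse=True)[:k]

print(predict_next("the"))  # e.g. [('cat', 0.67), ('mat', 0.33)]
```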
Used the spaCy library to perform Named Entity Recognition on a web-scraped news article.
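A minimal spaCy NER sketch; the article text is a placeholder, and the small English model is assumed to be installed (`python -m spacy download en_core_web_sm`):

```python
import spacy

# Assumes the small English pipeline has been downloaded.
nlp = spacy.load("en_core_web_sm")

# Placeholder standing in for the scraped news article.
article = "Reuters reported that Apple will open a new office in London in 2025."

doc = nlp(article)
for ent in doc.ents:
    print(ent.text, ent.label_)  # e.g. Apple ORG, London GPE, 2025 DATE
```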