Implementations of NLP algorithms, written without external NLP libraries.
• Byte-Pair Encoding | BPE
• LISA | ELIZA-like program
• Levenshtein Minimum Edit Distance
• Alternative Levenshtein Minimum Edit Distance
• Minimum Edit Distance with backtrace
• Unigram and Bigram Language Models
For further details, see the source files in this repo.
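As an illustration of the kind of algorithm listed above, here is a minimal sketch of minimum edit distance using dynamic programming and only the Python standard library. It is not copied from this repo's files: the function name `min_edit_distance` and the `sub_cost` parameter are assumptions made for this example. Setting `sub_cost=1` gives the classic Levenshtein distance, while `sub_cost=2` gives the variant used in Jurafsky and Martin, where a substitution counts as a deletion plus an insertion.

```python
def min_edit_distance(source, target, sub_cost=1):
    """Return the minimum edit distance between two strings.

    Insertions and deletions cost 1. Substitutions cost `sub_cost`
    (hypothetical parameter for this sketch): 1 for classic Levenshtein,
    2 for the variant where a substitution is a delete plus an insert.
    """
    n, m = len(source), len(target)

    # dist[i][j] holds the cost of editing source[:i] into target[:j].
    dist = [[0] * (m + 1) for _ in range(n + 1)]

    # Turning a prefix into the empty string takes only deletions.
    for i in range(1, n + 1):
        dist[i][0] = i
    # Building a prefix from the empty string takes only insertions.
    for j in range(1, m + 1):
        dist[0][j] = j

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = source[i - 1] == target[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + 1,                             # deletion
                dist[i][j - 1] + 1,                             # insertion
                dist[i - 1][j - 1] + (0 if same else sub_cost)  # match or substitution
            )
    return dist[n][m]
```

For example, `min_edit_distance("intention", "execution")` evaluates to 5, and the same call with `sub_cost=2` evaluates to 8. Keeping the full table, rather than only the last two rows, is what makes a backtrace possible later.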
• Text generation using bigram
• Naive Bayes Algorithm for text classification
-> Work on these will continue after my finals.
@article = { Byte-Pair Encoding },
author = { Rico Sennrich, Barry Haddow, Alexandra Birch (2016); Philip Gage (1994) },
title = { Neural Machine Translation of Rare Words with Subword Units (2016); A New Algorithm for Data Compression (1994) }
@article = { Minimum Edit Distance },
author = { Dan Jurafsky, James H. Martin },
title = { Speech and Language Processing (3rd ed. draft) },
url = { https://web.stanford.edu/class/cs124/lec/med.pdf }
@article = { ELIZA },
author = { Joseph Weizenbaum },
title = { ELIZA: A Computer Program For the Study of Natural
Language Communication Between Man And Machine }
@article = { N-grams Language Models },
author = { Dan Jurafsky, James H. Martin },
title = { Speech and Language Processing (3rd ed. draft) }
The main motivation behind this repo is to gain a deep understanding of the essential structures of NLP.
I follow "Speech and Language Processing" (3rd ed. draft) by Dan Jurafsky and James H. Martin.