The CBoW model architecture tries to predict the current target word (the center word) based on
the source context words (surrounding words). Consider the simple sentence
"the quick brown fox jumps over the lazy dog". With a context window of size 2
(one word on each side of the target), we can form (context_window, target_word)
pairs such as ([quick, fox], brown), ([the, brown], quick), ([the, dog], lazy), and so on.
Thus, the model tries to predict the target_word based on the context_window words.
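
As a quick illustration, here is a minimal sketch (not the notebook's own code) of how such (context_window, target_word) pairs can be generated from the example sentence:

```python
sentence = "the quick brown fox jumps over the lazy dog".split()
window = 1  # one context word on each side, i.e. a context window of size 2

pairs = []
for i, target in enumerate(sentence):
    # gather the words within `window` positions to the left and right of the target
    context = sentence[max(0, i - window):i] + sentence[i + 1:i + 1 + window]
    pairs.append((context, target))

print(pairs[:3])
# [(['quick'], 'the'), (['the', 'brown'], 'quick'), (['quick', 'fox'], 'brown')]
```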
We will first introduce the Continuous Bag of Words (CBoW) model in this notebook, then implement it on a small dataset of text from Shakespeare's works and create word embeddings for a few words.
We will then use pre-trained word embeddings from Google's standard word2vec implementation and show how to perform PCA (Principal Component Analysis) on the embeddings. We also show how to perform logical comparisons and language translation using word embeddings.
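
As a rough sketch of what loading the pre-trained vectors and projecting them with PCA can look like (the model file name and the gensim/scikit-learn calls are assumptions, not the notebook's exact code):

```python
import numpy as np
from gensim.models import KeyedVectors
from sklearn.decomposition import PCA

# Assumes Google's pre-trained word2vec vectors have been downloaded locally;
# the file name below is the usual distribution name.
vectors = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

words = ["king", "queen", "man", "woman"]
embeddings = np.array([vectors[w] for w in words])  # shape (4, 300)

# project the 300-dimensional embeddings down to 2 components for plotting
reduced = PCA(n_components=2).fit_transform(embeddings)
for word, point in zip(words, reduced):
    print(word, point)
```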
The overall pipeline consists of the following steps: