This is an efficient implementation of Word2vec trained on the Game of Thrones books.
Note: This was implemented on Windows. If you are running on Linux or macOS, find all path strings and replace \ with /.
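One way to avoid editing separators by hand is to build paths with `pathlib`. The snippet below is a minimal sketch, not code from this repository; the helper name `normalize` is hypothetical.

```python
from pathlib import Path


def normalize(raw: str) -> Path:
    """Turn a Windows-style string such as 'data\\' into a portable Path.

    Hypothetical helper for illustration; it is not part of main.py.
    """
    # Forward slashes are accepted by pathlib on Windows, Linux and macOS.
    return Path(raw.replace("\\", "/"))


# Joining with "/" lets pathlib pick the right separator for the current OS.
data_dir = normalize("data\\")
example_file = data_dir / "book1.txt"  # "book1.txt" is a made-up file name
```

Passing `Path` objects (or `str(path)`) to file-handling code then works unchanged on all three platforms.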
Requirements:
- TensorFlow
- Python 3
- NumPy
- os
- argparse
- glob
If you would like to run the model yourself and configure the hyper-parameters specified in main.py, first delete the following folders to avoid conflicts when running TensorFlow:
- visualizations
- graph
- checkpoints
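The cleanup can also be scripted. This is a small sketch that assumes the three folders sit in the directory you pass as `root`; the function name `remove_stale_dirs` is hypothetical and not part of this repository. Note that it deletes everything inside those folders, including any previously trained model.

```python
import shutil
from pathlib import Path

# Folder names taken from the list above.
STALE_DIRS = ("visualizations", "graph", "checkpoints")


def remove_stale_dirs(root=".", names=STALE_DIRS):
    """Delete each listed folder under root if it exists; return the names removed."""
    removed = []
    for name in names:
        target = Path(root) / name
        if target.is_dir():
            shutil.rmtree(target)  # removes the folder and all of its contents
            removed.append(name)
    return removed
```

Calling `remove_stale_dirs()` from the project root before training would clear all three folders; folders that do not exist are skipped silently.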
To train from scratch, have a look at main.py and choose the hyper-parameters you would like to experiment with. There are only two mandatory arguments, --data-dir and --vocab-dir:
python main.py --data-dir data\\ --vocab-dir vocab\\
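For readers unfamiliar with `argparse`, the sketch below shows how two mandatory arguments like these are typically declared. It is an illustration only and may not match main.py; in particular, `--embed-size` and its default are hypothetical stand-ins for whatever optional hyper-parameters main.py actually defines.

```python
import argparse


def build_parser():
    """Illustrative parser; the real options live in main.py."""
    parser = argparse.ArgumentParser(description="Word2vec training (sketch)")
    # The two mandatory arguments named in this README.
    parser.add_argument("--data-dir", required=True, help="folder with the training text")
    parser.add_argument("--vocab-dir", required=True, help="folder for vocabulary files")
    # Hypothetical optional hyper-parameter, shown only to illustrate defaults.
    parser.add_argument("--embed-size", type=int, default=128)
    return parser


# argparse converts dashes to underscores: --data-dir becomes args.data_dir.
args = build_parser().parse_args(["--data-dir", "data/", "--vocab-dir", "vocab/"])
```

Omitting either required flag makes `argparse` print a usage message and exit, which is why the command above always passes both.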
You can also use my trained model and run TensorBoard to visualize the generated word vectors. To do so:
1- Open a terminal (cmd on Windows)
2- Navigate to the visualizations folder
3- Run
tensorboard --logdir=visualizations
4- Copy the URL provided by TensorBoard and paste it into Chrome
5- Load the vocab_3000.tsv file located in the visualizations folder into TensorBoard to label each word
You can also run evaluate.py to find analogies and nearest words from Game of Thrones.
My favourite one is:
Mother is to Joffrey as "Ghost/Sam" is to Jon
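Analogies of this kind are usually computed with vector arithmetic followed by a cosine-similarity search. The sketch below shows that general technique on toy 2-D vectors; it is not the code in evaluate.py, and the vocabulary, vector values, and the function name `analogy` are made up for illustration.

```python
import numpy as np


def analogy(a, b, c, embeddings, vocab):
    """Solve 'a is to b as ? is to c' via vec(a) - vec(b) + vec(c)."""
    index = {word: i for i, word in enumerate(vocab)}
    query = embeddings[index[a]] - embeddings[index[b]] + embeddings[index[c]]
    # Cosine similarity between the query and every word vector.
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    scores = unit @ (query / np.linalg.norm(query))
    # Exclude the three input words so they cannot be returned as the answer.
    for word in (a, b, c):
        scores[index[word]] = -np.inf
    return vocab[int(np.argmax(scores))]


# Toy embeddings (made-up values): axis 0 ~ "royalty", axis 1 ~ "gender".
vocab = ["king", "queen", "man", "woman", "dragon"]
embeddings = np.array([
    [1.0, 1.0],
    [1.0, -1.0],
    [0.2, 1.0],
    [0.2, -1.0],
    [-1.0, 0.1],
])

result = analogy("king", "man", "woman", embeddings, vocab)
```

With trained vectors, the same call shape would be `analogy("mother", "joffrey", "jon", ...)`, assuming those lowercase tokens exist in the vocabulary.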