Multimodal Skip-gram Model

This is an unofficial implementation of the multimodal Skip-gram model, forked from Word2Vec in C++11.

Prerequisites

g++ >= 4.8.5
Boost >= 1.53.0
OpenBLAS >= 0.3.3
HDF5 (library) >= 1.8.12

Usage

# compile sources
$ make

# Use the -h option to list all options and their defaults
$ ./word2vec -h
Usage : ./word2vec [options] input_path
Allowed options:
  -h [ --help ]                         help.
  -m [ --mode ] arg (=train)            Mode train/test.
  -o [ --output ] arg (=./vectors.bin)  Output path.
  -d [ --dim ] arg (=300)               Dimensionality of word embedding.
  -w [ --window ] arg (=5)              Window size.
  -s [ --sample ] arg (=0.00100000005)  Subsampling probability.
  -c [ --min-count ] arg (=5)           The minimum frequency of words.
  -n [ --negative ] arg (=5)            The number of negative samples.
  -a [ --alpha ] arg (=0.0250000004)    The initial learning rate.
  -b [ --min-alpha ] arg (=9.99999975e-05)
                                        The minimum learning rate.
  -p [ --n_workers ] arg (=0)           The number of threads
  -f [ --format ] arg (=bin)            Output file format: bin/text
  -i [ --iteration ] arg (=5)           The number of iterations
  -M [ --method ] arg (=HS)             Methos: HierarchicalSoftmax(HS)/Negativ
                                        eSampling(NS)
  -I [ --multimodal-input ] arg         Path to multimodal feature file
  --input_path arg                      Path to input file
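
The defaults printed as 0.00100000005, 0.0250000004 and 9.99999975e-05 are the single-precision representations of 1e-3, 0.025 and 1e-4. As a rough guide to what --alpha, --min-alpha and --sample usually control in word2vec-style trainers, here is a minimal sketch. It assumes a linear learning-rate decay clamped at the minimum and the subsampling rule from the word2vec paper; the function names are hypothetical and this fork's actual code may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Assumed schedule: decay linearly from `alpha`, never going below `min_alpha`.
// `progress` is the fraction of training already completed, in [0, 1].
inline float decayed_alpha(float alpha, float min_alpha, float progress) {
  assert(progress >= 0.0f && progress <= 1.0f);
  return std::max(min_alpha, alpha * (1.0f - progress));
}

// Assumed subsampling rule: a word with relative corpus frequency `freq`
// is discarded with probability 1 - sqrt(sample / freq), so it is kept
// with probability min(1, sqrt(sample / freq)).
inline float keep_probability(float freq, float sample) {
  assert(freq > 0.0f && sample > 0.0f);
  return std::min(1.0f, std::sqrt(sample / freq));
}
```

Under these assumptions, a word that makes up 10% of the corpus is kept about one time in ten with the default -s value, while words rarer than the threshold are always kept.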

With the --multimodal-input option, the program trains a multimodal skip-gram model; without it, the program behaves the same as ordinary word2vec. An image search demo can be found in notebook/image_search.ipynb.
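
To illustrate the idea behind an image search over the learned vectors, here is a minimal sketch that ranks image feature vectors by cosine similarity to a query word vector. It assumes the word vector and the image features live in the same space and have the same dimensionality; the names cosine and top_k_images are hypothetical, and the notebook may do this differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Cosine similarity between two equal-length, non-zero vectors.
inline float cosine(const std::vector<float>& a, const std::vector<float>& b) {
  assert(a.size() == b.size());
  float dot = 0.0f, na = 0.0f, nb = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (std::sqrt(na) * std::sqrt(nb));
}

// Indices of the `k` image vectors most similar to `query`, best match first.
inline std::vector<std::size_t> top_k_images(
    const std::vector<float>& query,
    const std::vector<std::vector<float> >& images, std::size_t k) {
  std::vector<float> score(images.size());
  for (std::size_t i = 0; i < images.size(); ++i) {
    score[i] = cosine(query, images[i]);
  }
  std::vector<std::size_t> idx(images.size());
  std::iota(idx.begin(), idx.end(), std::size_t(0));
  k = std::min(k, idx.size());
  // Only the first k positions need to be fully ordered.
  std::partial_sort(idx.begin(), idx.begin() + k, idx.end(),
                    [&score](std::size_t l, std::size_t r) {
                      return score[l] > score[r];
                    });
  idx.resize(k);
  return idx;
}
```

Calling top_k_images(word_vector, image_features, 10) would return the positions of the ten closest images, where word_vector and image_features are placeholders for data loaded from the trained model and the feature file.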

Note

The parameter learning scheme may be different from that of the original MM-Skipgram.
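
For comparison, the objective of the original model (MMSkip-gram-A), as we understand it from the paper listed under Reference, is sketched below. Here u_{w_t} is the embedding of the target word, v_{w_t} its visual feature vector, v_{w'} the visual vector of a randomly sampled other word, and \gamma the margin. This summarizes the paper and is not a description of what this code computes.

```latex
% Overall objective to maximize: the linguistic skip-gram term plus a visual term.
% L_vision(w_t) is 0 for words that have no visual features.
\frac{1}{T} \sum_{t=1}^{T} \bigl( \mathcal{L}_{\text{ling}}(w_t) + \mathcal{L}_{\text{vision}}(w_t) \bigr)

% Visual term: a max-margin criterion over negative visual vectors w' ~ P_n(w).
\mathcal{L}_{\text{vision}}(w_t) = - \sum_{w' \sim P_n(w)}
  \max\bigl(0,\ \gamma - \cos(u_{w_t}, v_{w_t}) + \cos(u_{w_t}, v_{w'})\bigr)
```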

Reference

Combining Language and Vision with a Multimodal Skip-gram Model
