Description
gensim.models.wrappers.FastText returns inconsistent dtypes.

Steps/Code/Corpus to Reproduce
For an existing word:
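The snippet from the original report is not preserved in this copy. A minimal stand-in, assuming a float32 embedding matrix (the dtype gensim stores word vectors in); `vocab` and `syn0` here are hypothetical stand-ins for the wrapper's internals, not its actual API:

```python
import numpy as np

# Hypothetical stand-in for a fastText embedding matrix:
# gensim keeps word vectors as 32-bit floats (its REAL dtype).
vocab = {'night': 0}
syn0 = np.random.rand(1, 5).astype(np.float32)

# Looking up an in-vocabulary word is a plain row read, so the
# dtype of the stored matrix is preserved.
vec = syn0[vocab['night']]
print(vec.dtype)  # float32
```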
For an "imputed" word (one missing from the vocabulary), the embedding is computed as the sum of the embeddings of its n-grams, and comes back as a 64-bit float:

embeds['ttttt'].dtype == dtype('float64')

The problem is in models/wrappers/fasttext.py::FastTextKeyedVectors.word_vec: for a missing word, the zero vector is initialised as a 64-bit float array, and the 32-bit n-gram embeddings are then added to it, promoting the result to float64.

Versions
Linux-4.4.0-97-generic-x86_64-with-Ubuntu-16.04-xenial
Python 3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609]
NumPy 1.13.3
SciPy 0.19.1
gensim 3.0.1
FAST_VERSION 1
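The root cause described above can be reproduced with plain NumPy, independent of gensim: np.zeros defaults to float64, and adding float32 arrays into a float64 accumulator keeps the result float64. Initialising the accumulator with the embeddings' own dtype (roughly what a fix would do) restores float32. A minimal sketch:

```python
import numpy as np

# Stand-in for the float32 n-gram embeddings of a missing word.
ngram_vecs = np.random.rand(3, 5).astype(np.float32)

# Buggy pattern: np.zeros defaults to float64, so the accumulated
# sum is silently promoted to float64.
word_vec = np.zeros(5)
for v in ngram_vecs:
    word_vec += v
print(word_vec.dtype)  # float64 -- the reported inconsistency

# Fixed pattern: initialise the accumulator with the matrix dtype,
# so the sum stays float32 like an in-vocabulary lookup.
word_vec = np.zeros(5, dtype=ngram_vecs.dtype)
for v in ngram_vecs:
    word_vec += v
print(word_vec.dtype)  # float32
```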