Updating the vocabulary unintentionally changes the dtype of model.wv.syn0_vocab from float32 to float64. The primary cause is that numpy.random.uniform returns a float64 NumPy array, which, when vstacked with a float32 array, upcasts the result to float64. This also produces unpredictable segmentation faults in the Cython implementation (see #1742).
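The upcast described above can be reproduced with NumPy alone, without gensim. The following is a minimal sketch: the array shapes, the uniform bounds, and the variable names are illustrative assumptions, not values taken from gensim's code. It also shows that casting the new rows with `astype(np.float32)` before stacking keeps the result in float32, which is one possible way to avoid the problem, not necessarily the fix the library adopts.

```python
import numpy as np

# Stand-in for an existing float32 weight matrix (shape is illustrative).
existing = np.zeros((3, 10), dtype=np.float32)

# numpy.random.uniform returns float64 values; bounds here are arbitrary.
new_rows = np.random.uniform(-0.1, 0.1, (2, 10))

# Stacking float32 with float64 promotes the result to float64.
stacked = np.vstack([existing, new_rows])
print(existing.dtype, new_rows.dtype, stacked.dtype)

# Casting the new rows first keeps the stacked array in float32.
stacked_fixed = np.vstack([existing, new_rows.astype(np.float32)])
print(stacked_fixed.dtype)
```

The first print shows the promotion to float64 and the second shows float32 preserved after the explicit cast.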
Steps/Code/Corpus to Reproduce
from gensim.models.word2vec import LineSentence
from gensim.models.fasttext import FastText as FT_gensim
from gensim.test.utils import common_texts as sentences

# Additional sentences used to update the vocabulary after the initial build.
new_sentences = [
    ['computer', 'artificial', 'intelligence'],
    ['artificial', 'trees'],
    ['human', 'intelligence'],
    ['artificial', 'graph'],
    ['intelligence'],
    ['artificial', 'intelligence', 'system']
]

model = FT_gensim(size=10, min_count=1)
model.build_vocab(sentences)
print(model.wv.syn0_vocab.dtype)  # dtype after the initial vocabulary build
model.build_vocab(new_sentences, update=True)
print(model.wv.syn0_vocab.dtype)  # dtype after the vocabulary update
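Expected Results
model.wv.syn0_vocab.dtype remains float32 after the vocabulary update.
Actual Results
model.wv.syn0_vocab.dtype is float32 after the initial build_vocab call and float64 after the update.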
Versions
Linux-4.10.0-40-generic-x86_64-with-Ubuntu-16.04-xenial
('Python', '2.7.12 (default, Nov 19 2016, 06:48:10) \n[GCC 5.4.0 20160609]')
('NumPy', '1.13.3')
('SciPy', '1.0.0')
('gensim', '3.1.0')
('FAST_VERSION', 1)