fastText models from 2.3.0 can't be loaded in 3.0.0 #1642

Liebeck · 2017-10-22T17:40:41Z

Description

I have a compatibility issue with fastText in gensim 3.0.0. In version 2.3.0, I used the fastText C++ wrapper to train a model, based on the code available at that time from
https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/FastText_Tutorial.ipynb

This code works in 2.3.0:

from gensim.models.wrappers.fasttext import FastText as FT_wrapper
model = FT_wrapper.load(model_path)
if key in model:
    character_embedding = model[key]

In 3.0.0 it fails with:

  File "scripts/foo.py", line 43, in reduce_fasttext_embedding
    character_embedding = model[key]
  File "/usr/local/lib/python3.5/dist-packages/gensim/models/word2vec.py", line 1345, in __getitem__
    return self.wv.__getitem__(words)
  File "/usr/local/lib/python3.5/dist-packages/gensim/models/keyedvectors.py", line 602, in __getitem__
    return self.word_vec(words)
  File "/usr/local/lib/python3.5/dist-packages/gensim/models/wrappers/fasttext.py", line 94, in word_vec
    word_vec = np.zeros(self.syn0_ngrams.shape[1])
AttributeError: 'FastTextKeyedVectors' object has no attribute 'syn0_ngrams'

Expected Results

I expected the model from 2.3.0 to be loadable in 3.0.0. I was able to get my code working by downgrading to 2.3.0. I ran some evaluations with the trained models and would be happy to keep using them. Otherwise, I'm stuck at gensim 2.3.0.

@menshikh-iv
I guess this has something to do with commit 6e51156#diff-cd6e655ec64f5b3927aa96ce5d006207, which split 'syn0_all' into 'syn0_vocab' and 'syn0_ngrams'. I'm guessing that models trained with 2.3.0 aren't compatible with version 3. Could the load method check whether the model was trained in 2.3.0, load it the 2.3.0 way, and then make the same split internally?
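To make the proposed load-time check concrete, here is a minimal sketch of the idea. It is not gensim code and not necessarily what the eventual fix does: `upgrade_legacy_vectors` is a hypothetical helper, a `SimpleNamespace` stands in for the unpickled vectors object, plain lists stand in for NumPy arrays, and the assumption that 'syn0_all' stores the vocabulary rows first and the n-gram rows after them is only a guess.

```python
from types import SimpleNamespace


def upgrade_legacy_vectors(wv, vocab_size):
    """Add the attributes newer code reads to an old-style vectors object.

    Hypothetical helper, not part of gensim. ASSUMPTION: `wv.syn0_all`
    holds the `vocab_size` word rows first, followed by the n-gram rows.
    """
    if hasattr(wv, "syn0_ngrams"):
        return wv  # already in the new layout, nothing to do
    if not hasattr(wv, "syn0_all"):
        raise AttributeError("neither 'syn0_ngrams' nor 'syn0_all' is present")
    # Slicing creates new lists, so the two halves are independent copies.
    wv.syn0_vocab = wv.syn0_all[:vocab_size]
    wv.syn0_ngrams = wv.syn0_all[vocab_size:]
    del wv.syn0_all
    return wv


# Stand-in for a model saved with the old layout:
# 3 word rows followed by 5 n-gram rows, each of dimension 4.
legacy = SimpleNamespace(syn0_all=[[float(i)] * 4 for i in range(8)])
upgraded = upgrade_legacy_vectors(legacy, vocab_size=3)
print(len(upgraded.syn0_vocab), len(upgraded.syn0_ngrams))
```

A check like this could run once right after unpickling, so models already saved in the new layout pass through untouched.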


Liebeck · 2017-10-22T17:50:25Z

Or another idea to solve this: could you create a utility script that transforms a 2.3.0 model into a 3.0.0 model?

menshikh-iv · 2017-10-23T08:03:09Z

@Liebeck Thanks for the report.

I think it's possible to check this in the load method, wdyt @chinmayapancholi13?

Can you fix this bug and create a PR, @Liebeck @chinmayapancholi13?

Liebeck · 2017-10-25T07:14:34Z

I'm not sure I understand enough of gensim's architecture to contribute a quick fix. I might be able to take a further look at it in January 😐

chinmayapancholi13 · 2017-10-25T09:28:38Z

@Liebeck Thanks for reporting this issue! Seems to be a problem in the load function.

@menshikh-iv Hey Ivan! I am a little occupied this week, so I can take a look at this and try to get it resolved next week. I hope this is fine. I'll give an update about my progress here. :)

menshikh-iv · 2017-10-25T10:02:30Z

That would be great @chinmayapancholi13, I'm glad to see you here again :)

menshikh-iv · 2017-11-20T08:26:06Z

Fixed in #1723

piskvorky added the "bug" label (issue describes a bug) on Oct 22, 2017

menshikh-iv added the "difficulty medium" label (medium issue: requires good gensim understanding & Python skills) on Oct 23, 2017

chinmayapancholi13 mentioned this issue Nov 17, 2017

Fixed incompatability in persistence for older versions #1723

Merged

menshikh-iv closed this as completed Nov 20, 2017
