wmdistance returns inf for single words in vocabulary #1073

Closed
Tixierae opened this issue Jan 4, 2017 · 5 comments

Comments

@Tixierae

Tixierae commented Jan 4, 2017

import gensim
# loading Google News vectors
model = gensim.models.word2vec.Word2Vec.load_word2vec_format('E:\\GoogleNews-vectors-negative300.bin.gz', binary=True)

The following:

# computing the Word Mover's Distance of an in-vocab word to itself
model.wmdistance(['obama'],['obama'])

returns inf, when it should be zero (according to the documentation, float('inf') is only returned for out-of-vocabulary words).

Interestingly, model.wmdistance('obama','obama') does return zero, and both model.wmdistance(['obama'],['obama','illinois']) and model.wmdistance(['obama'],['illinois']) work. The problem only seems to arise when both documents consist of the same single word.
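
One possible explanation for the string case, offered as an assumption rather than something confirmed in this thread: a Python string passed as a document is iterated character by character, so 'obama' would be treated as a five-token document with four distinct tokens, not as a single word. The sketch below only illustrates that iteration behaviour; the helper name as_tokens is hypothetical and is not part of gensim.

```python
def as_tokens(document):
    """Return the tokens obtained by iterating over `document`.

    Hypothetical helper: it mimics how any code that loops over a
    document would see a list of words versus a bare string.
    """
    return [token for token in document]


list_doc = as_tokens(['obama'])  # a document made of one word
str_doc = as_tokens('obama')     # a bare string: iterated per character

print(list_doc)
print(str_doc)
print(len(set(list_doc)), len(set(str_doc)))  # distinct tokens in each
```

If this is what happens, the string call never reaches the single-token situation that the list call does, which would be consistent with the two calls behaving differently.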

@tmylk
Contributor

tmylk commented Jan 5, 2017

Hmm... There was a fix for when the dictionary size is 1 by @rbahumi and it should have covered this case.
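
For readers unfamiliar with that kind of fix, here is a minimal sketch of a guard for the single-token case. It is an illustration under stated assumptions, not gensim's actual code: the function name guarded_distance, the vocab set, and the solver callback are all hypothetical stand-ins.

```python
def guarded_distance(document1, document2, vocab, solver):
    """Handle degenerate inputs before calling a distance solver.

    `vocab` is any container supporting `in`; `solver` is a
    hypothetical callable that computes the distance in the
    general case.
    """
    # Drop out-of-vocabulary tokens from both documents.
    doc1 = [t for t in document1 if t in vocab]
    doc2 = [t for t in document2 if t in vocab]

    # Nothing left to compare: report an infinite distance.
    if not doc1 or not doc2:
        return float('inf')

    # Both documents reduce to the same single unique token,
    # so the distance is zero and the solver is skipped.
    if len(set(doc1) | set(doc2)) == 1:
        return 0.0

    return solver(doc1, doc2)


# Usage with a placeholder solver that always returns 1.0.
vocab = {'obama', 'illinois'}
placeholder = lambda a, b: 1.0
same_word = guarded_distance(['obama'], ['obama'], vocab, placeholder)
oov_only = guarded_distance(['zzz'], ['obama'], vocab, placeholder)
general = guarded_distance(['obama'], ['illinois'], vocab, placeholder)
print(same_word, oov_only, general)
```

Handling the one-word dictionary up front keeps the general solver from ever seeing a 1x1 problem, which is the case the report above describes.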

@Tixierae
Author

Tixierae commented Jan 5, 2017

Thanks! I will update my installation of the module and try again.

@tmylk
Contributor

tmylk commented Jan 20, 2017

@Tixierae did an update fix the issue?

@tmylk
Contributor

tmylk commented Jan 25, 2017

Closing as abandoned.

@tmylk closed this as completed Jan 25, 2017

@Tixierae
Author

Tixierae commented Feb 8, 2017

Sorry for the delay in response. Updating indeed fixed the issue. Thanks!
