Tests for the evaluate_word_pairs function #1061

Merged
merged 62 commits into from
Dec 28, 2016

akutuzov

Test for evaluating model against semantic similarity datasets (#1047).
Also fixes an error in the function call.

tmylk and others added 30 commits November 5, 2015 19:07
Conflicts:
	CHANGELOG.txt
	gensim/models/word2vec.py
@akutuzov

@tmylk the tests are ready.

@tmylk tmylk left a comment

Thanks for the tests. An oov_ratio sanity test would be great

pearson = correlation[0][0]
spearman = correlation[1][0]
self.assertTrue(0.1 < pearson < 1.0)
self.assertTrue(0.1 < spearman < 1.0)

Could we please test for oov_ratio in correlation[2] too?
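
A sanity check along these lines could look like the sketch below. It assumes the result is a 3-tuple indexed as in the snippet above (correlation[0][0] for Pearson, correlation[1][0] for Spearman, correlation[2] for the OOV ratio) and that the OOV ratio is a percentage between 0 and 100. The helper name check_eval_result, the thresholds, and the sample values are hypothetical and not taken from the PR.

```python
def check_eval_result(correlation, min_corr=0.1, max_oov=90.0):
    """Return True if an evaluation result looks sane.

    Assumed layout (based on the indexing in the test above):
    correlation[0][0] is Pearson, correlation[1][0] is Spearman,
    correlation[2] is the OOV ratio, taken here to be a percentage.
    """
    pearson = correlation[0][0]
    spearman = correlation[1][0]
    oov_ratio = correlation[2]
    return (min_corr < pearson < 1.0
            and min_corr < spearman < 1.0
            and 0.0 <= oov_ratio < max_oov)


# Made-up values, only to show the expected shape of the result.
sample = ((0.62, 1e-30), (0.65, 1e-33), 0.57)
print(check_eval_result(sample))
```

In a unittest case the same three conditions would become assertTrue calls on the real result rather than on a hand-made tuple.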

@akutuzov

Sure, done.

@tmylk tmylk merged commit 88d032b into piskvorky:develop Dec 28, 2016
@tmylk

tmylk commented Dec 28, 2016

Thanks for the improvement!

@tmylk

tmylk commented Dec 30, 2016

By the way, how is it better than using https://github.com/mfaruqui/eval-word-vectors ?
@anmol01gulati what code did you use to convert gensim word2vec to that format? A short script for that would be useful

@akutuzov

It's better in that this code works directly from Gensim :)
In fact, my code is simpler as it uses Scipy functions for Pearson and Spearman coefficients (eval-word-vectors implements Spearman from scratch). Also, it features some useful options, like case-(in)sensitivity and smart handling of OOV pairs.
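
To make the comparison concrete, here is a rough, standard-library-only sketch of this kind of evaluation: Pearson and Spearman correlation between gold scores and model similarities, with optional lowercasing and with OOV pairs skipped and counted. It is not the Gensim implementation (which, as noted above, relies on Scipy). The names evaluate_pairs, toy_similarity, and the toy data are hypothetical, and reporting the OOV ratio as a percentage is an assumption.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


def evaluate_pairs(similarity, vocab, gold_pairs, case_insensitive=True):
    """Return (pearson, spearman, oov_ratio) for (word1, word2, score) triples."""
    gold, predicted, oov = [], [], 0
    for a, b, score in gold_pairs:
        if case_insensitive:
            a, b = a.lower(), b.lower()
        if a not in vocab or b not in vocab:
            oov += 1  # skip pairs with an out-of-vocabulary word
            continue
        gold.append(score)
        predicted.append(similarity(a, b))
    oov_ratio = 100.0 * oov / len(gold_pairs)  # assumed to be a percentage
    return pearson(gold, predicted), spearman(gold, predicted), oov_ratio


# Toy stand-ins for a model; the scores are invented for illustration.
toy_scores = {("cat", "dog"): 0.9, ("car", "truck"): 0.8, ("cat", "car"): 0.1}
toy_vocab = {"cat", "dog", "car", "truck"}


def toy_similarity(a, b):
    return toy_scores[(a, b)]


gold_pairs = [("Cat", "dog", 8.0), ("car", "truck", 7.0),
              ("cat", "car", 2.0), ("dog", "unicorn", 1.0)]
pearson_r, spearman_r, oov_ratio = evaluate_pairs(
    toy_similarity, toy_vocab, gold_pairs)
print(pearson_r, spearman_r, oov_ratio)
```

Here "Cat" is matched only because of the lowercasing step, and the pair containing "unicorn" is counted as OOV instead of being scored.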

jayantj pushed a commit to jayantj/gensim that referenced this pull request Jan 4, 2017
@anmolgulati

I agree with @akutuzov. The code currently in gensim for Pearson and Spearman coefficients is shorter. But I feel we could also include the whole dataset collection for evaluating word vectors from https://github.com/mfaruqui/eval-word-vectors. It's just 205 KB and contains all the major gold standards. It would be good to integrate them into gensim itself and have one method to evaluate word2vec models directly, right inside gensim. What do you think?

The script I used to convert word2vec into the format for evaluating word vectors is quite small actually:

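import gensim

# On gensim 1.0 and later, call gensim.models.KeyedVectors.load_word2vec_format instead.
model = gensim.models.Word2Vec.load_word2vec_format(
    'GoogleNews-vectors-negative300.bin', binary=True)

# The first whitespace-separated token on each line of vocab.txt is the word.
with open('eval-word-vectors/vocab.txt', 'r') as vocab_file:
    words = [line.split()[0] for line in vocab_file if line.strip()]

# Text mode, since the values written below are str, not bytes.
with open('output_vecs.txt', 'w') as f:
    for word in words:
        if word in model:  # skip out-of-vocabulary words
            word_vector = model[word]
            # One line per word: the word, then its space-separated components.
            f.write("%s " % word)
            f.write(" ".join(str(x) for x in word_vector))
            f.write("\n")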
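
A quick way to check the output is to read it back. The sketch below assumes the layout the script writes (one word per line followed by space-separated float components) and parses an in-memory sample rather than the real output_vecs.txt; the function name read_vectors and the sample values are hypothetical.

```python
import io


def read_vectors(handle):
    """Parse lines of the form 'word v1 v2 ... vn' into a dict of float lists."""
    vectors = {}
    for line in handle:
        parts = line.split()
        if not parts:
            continue  # ignore blank lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


# In-memory stand-in for a file produced by the script above.
sample = io.StringIO(u"cat 0.1 -0.2 0.3\ndog 0.4 0.5 -0.6\n")
vectors = read_vectors(sample)
print(sorted(vectors), len(vectors["cat"]))
```

Against a real file, the same function can be called on a handle from open('output_vecs.txt'), and every vector should have the same length.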

@akutuzov

akutuzov commented Jan 7, 2017

I am not sure it's a good idea to overload Gensim with various semantic similarity datasets included in the distribution.
Most people would use their own gold datasets anyway, either because they deal with non-English data or because their text preprocessing differs from the preprocessing in SimLex999 or WS353 (lemmatization/stemming, POS-tagging, etc).
So I think it's better to leave WS353 as an example (and for testing), and maybe put a couple of links to other datasets in the documentation.

@anmolgulati

Yeah, you are right. Sounds good.

5 participants