Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] API for Static Topic for word in Vocabulary. #706

Merged
merged 11 commits into from
May 31, 2016

Conversation

bhargavvader
Copy link
Contributor

@tmylk to complement the dynamic word document topic, this is a PR for static word topics.

@bhargavvader bhargavvader changed the title [WIP] API for Static Topic for word in Vocal. [WIP] API for Static Topic for word in Vocal May 25, 2016
@bhargavvader bhargavvader changed the title [WIP] API for Static Topic for word in Vocal [WIP] API for Static Topic for word in Vocabulary. May 25, 2016
@tmylk
Copy link
Contributor

tmylk commented May 25, 2016

Thanks. Can the output be in same format as get_document_topics?

@bhargavvader
Copy link
Contributor Author

Would that mean a tuple with that word and corresponding topic?

@bhargavvader
Copy link
Contributor Author

bhargavvader commented May 26, 2016

@tmylk made it the same format as get_document_topic. Again, glove2word2vec fails. How do I restart it so that it passes? The static word tests work fine.

@bhargavvader bhargavvader changed the title [WIP] API for Static Topic for word in Vocabulary. [MRG] API for Static Topic for word in Vocabulary. May 26, 2016
@@ -905,6 +905,28 @@ def get_document_topics(self, bow, minimum_probability=None):
return [(topicid, topicvalue) for topicid, topicvalue in enumerate(topic_dist)
if topicvalue >= minimum_probability]

def get_static_topic(self, word_id, minimum_probability=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default minimum_probability=0 is most intuitive

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree. 0.0 is a specific value to use, whereas None stands for "use default". Different semantics.

@bhargavvader
Copy link
Contributor Author

@tmylk , @piskvorky , could you please review? I've finished and wrote tests.

@@ -905,6 +905,28 @@ def get_document_topics(self, bow, minimum_probability=None):
return [(topicid, topicvalue) for topicid, topicvalue in enumerate(topic_dist)
if topicvalue >= minimum_probability]

def get_static_topic(self, word_id, minimum_probability=None):
"""
Returns static topics for word in vocabulary.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs better docs -- what is a "static topic"?

@piskvorky
Copy link
Owner

piskvorky commented May 30, 2016

The method name doesn't seem right; @cscorley any better ideas?

Maybe get_term_topics()? This is essentially a reverse method to get_topic_terms(), if I understand correctly.

@bhargavvader
Copy link
Contributor Author

bhargavvader commented May 30, 2016

@piskvorky , yeah get_term_topics() does sound better. That or get_word_topics.

@piskvorky
Copy link
Owner

Looks good, thanks @bhargavvader .

I leave the final sanity checks & merge touches to @tmylk

@tmylk
Copy link
Contributor

tmylk commented May 30, 2016

Just a note in the changelog and then ready to merge.

@bhargavvader
Copy link
Contributor Author

bhargavvader commented May 30, 2016

@tmylk done!
(glove2word2vec failing again, other tests are ok)

@cscorley
Copy link
Contributor

Yep, get_term_topics sounds great for this.

@piskvorky
Copy link
Owner

piskvorky commented May 31, 2016

@tmylk when are you planning to release? This version is getting scarily bulky again.

Does it make sense to release before the PyCon coding sprint, so people can work off a fresh version?

@tmylk tmylk merged commit 97c8455 into piskvorky:develop May 31, 2016
@bhargavvader bhargavvader deleted the StaticWord branch June 22, 2016 04:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants