Documentation error in Doc2Vec #1302

Closed
jlorince opened this issue May 4, 2017 · 1 comment

jlorince commented May 4, 2017

This is super simple, but the documentation says the default value of the sample parameter in Doc2Vec is 0:

sample = threshold for configuring which higher-frequency words are randomly downsampled;
default is 0 (off), useful value is 1e-5.

But the actual default value is 1e-3 (0.001), because Doc2Vec inherits from Word2Vec, where sample is set to 1e-3. See: https://github.com/RaRe-Technologies/gensim/blob/develop/gensim/models/word2vec.py#L367

I can confirm this by simply instantiating a model with all defaults:

>>> from gensim.models import Doc2Vec
>>> model = Doc2Vec()
>>> print(model.sample)
0.001

Not sure if the desired behavior is what's reflected in the documentation, in which case the default should be changed, or if the documentation should be updated. Simple fix either way, I imagine, but as is it causes some confusion.
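
For anyone who wants to see the mechanism in isolation, here is a minimal sketch of how a subclass can end up with a parent's default while its own docstring says otherwise. `BaseModel`, `DerivedModel`, and `inherited_default` are hypothetical stand-ins made up for illustration, not gensim's actual code; the only thing taken from this issue is the `sample` name and the 1e-3 vs. 0 mismatch.

```python
import inspect


class BaseModel:
    """Hypothetical stand-in for the parent class (not gensim's actual code)."""

    def __init__(self, sample=1e-3):
        self.sample = sample


class DerivedModel(BaseModel):
    """Hypothetical stand-in for the subclass.

    sample = threshold ...; default is 0 (off)   <- stale docstring, as in this issue
    """

    def __init__(self, **kwargs):
        # No `sample` default is declared here, so the parent's default applies.
        super().__init__(**kwargs)


def inherited_default(cls, name):
    """Return (class name, default) for the first __init__ in the MRO declaring `name`."""
    for klass in inspect.getmro(cls):
        init = klass.__dict__.get("__init__")
        if init is None:
            continue
        param = inspect.signature(init).parameters.get(name)
        if param is not None and param.default is not inspect.Parameter.empty:
            return klass.__name__, param.default
    return None, None


owner, default = inherited_default(DerivedModel, "sample")
print(owner, default)         # which class supplies the default, and its value
print(DerivedModel().sample)  # the value an all-defaults instance actually gets
```

Walking the MRO like this shows which class actually declares the default, which is the thing a docstring on the subclass would need to match.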

gojomo (Collaborator) commented May 5, 2017

The comment should accurately reflect the current inherited default.

@tmylk tmylk closed this as completed in 7b012a7 May 7, 2017